Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

For a while I have been interested in seeing if some tasks work well when split across multiple threads.

On the one project I have been busy with I have been creating a lot of small utility apps, to do single tasks for me, so this morning while modifying one I thought I would give it a try.

What I wrote seems to work, although I am not sure if it is the best way, or recommended way to do something like this, so I want to ask what is wrong with my code and how could it be done better. What it is doing is not really important, and I am not interested if this sort of task should be split up, but more along the lines of how good or bad is the code that I have written.

private static void ExportDocTypes(IEnumerable<string> docTypes)
{
    var queue = new Queue<string>(docTypes);
    var tasks = new List<Task>();
    for (byte i = 0; i < threadCount; i++)  //threadCount is a const set to 4.
    {
        Action a = () =>
        {
            while(queue.Count() > 0) {
                var type = queue.Dequeue();
                ExportDocuments(type, i);  //i is only sent for response.write
            }
        };
        var t = new Task(a);
        t.Start();
        tasks.Add(t);
        System.Threading.Thread.Sleep(10);
    }
    while (tasks.Count() > 0)
    {
        foreach (var t in tasks)
        {
            if (t.IsCompleted)
            {
                t.Dispose();
                tasks.Remove(t);
            }
        }
        System.Threading.Thread.Sleep(1000);
    }
}
share|improve this question
2  
Does that code even work? I'd expect it to crash if two tasks complete withing the same second. –  CodesInChaos 5 hours ago
    
It actually does. It is very unlikly that two would ever end at the same time, but I still don't see how that would be an issue. The Sleep(10) was so that two tasks didn't try grab the same docType at the same time. –  JonathanPeel 4 hours ago
    
When you remove t from tasks, the size of tasks changes, which invalidates the foreach loop. Imagine it was a for loop, you would iterate outside the bounds of the list once it shrinks. You could break from the foreach loop, but Task.WaitAll() or even linq .All(t => t.IsCompleted) would be more elegant. –  patchandthat 3 hours ago
    
That is a very good point, I didn't even notice that. –  JonathanPeel 1 hour ago
add comment

3 Answers

I would do something like this:

private static void ExportDocTypes(IEnumerable<string> docTypes)
{
    var queue = new ConcurrentQueue<string>(docTypes);
    var tasks = new List<Task>();

    for (byte i = 0; i < threadCount; i++)  //threadCount is a const set to 4.
    {
        byte iteration = i;
        Task t = Task.Factory.StartNew(() =>
        {
            string type;
            bool queueNotEmpty = queue.TryDequeue(out type);
            while(queueNotEmpty) 
            {
                ExportDocuments(type, iteration);  //i is only sent for response.write
                bool queueEmpty = queue.TryDequeue(out type);
            }
        });
        tasks.Add(t);
    }

    Task.WaitAll(tasks.ToArray());
    tasks.ForEach(task => task.Dispose());
}

Making use of things like ConcurrentQueue as many threads will be simultaneously accessing it. Also making greater use of Linq and static methods on Task to check for completion instead of using Thread.Sleep.

Edit: Also, totally missed this previously, you need to make a local copy of i within the for loop, otherwise i will point to the current value, instead of the captured value at that iteration. Something to be aware of while spawning Tasks in loops.

share|improve this answer
    
Thank you would while(queue.TryDequeue(out type)) work, is there a reason for not doing that? –  JonathanPeel 8 hours ago
    
That would indeed be an improvement. I only wrote it that way for clarity. –  patchandthat 8 hours ago
    
See edits made. –  patchandthat 7 hours ago
    
I noticed the iteration issue as well. Thank you. –  JonathanPeel 7 hours ago
    
That variable isn't needed at all, I was response.writing it just for my own interest. –  JonathanPeel 7 hours ago
show 2 more comments

You can use Parallel class:

private static void ExportDocTypes(IEnumerable<string> docTypes)
{
    int i = -1;
    Parallel.ForEach(docTypes, docType => ExportDocuments(docType, Interlocked.Increment(ref i)));
}

http://msdn.microsoft.com/ru-ru/library/system.threading.tasks.parallel_methods%28v=vs.110%29.aspx

share|improve this answer
    
I have not seen the Parallel class before, but that does look useful. The list of doxType could have 100 or maybe 1000 items. Would these all run together, or does it handle a thread limit automatically? –  JonathanPeel 5 hours ago
1  
By default it tries to calculate the optimal number of threads (usually it's a number of cores your CPU has). You can, however, set this number manually. Check the overloaded method which takes ParallelOptions class as a parameter. –  Nikita Brizhak 5 hours ago
    
That is pretty cool, I like the idea of it being able to determine what the best is (I am pretty sure it will do a better job than I will). I will give it a try. Do you think this is better than the other method? It is less code, possibly neater code, other than that does it have other advantages? –  JonathanPeel 4 hours ago
1  
You can use Reflector or some other utility to check the actual implementation, but i am pretty sure that it is going to be quite similar to what you are trying to do: with tasks, thread pool and stuff. And with no "magic". Now, if you can come up with better implementation than microsoft or not - that is not for me to judge. :) You may certainly try. –  Nikita Brizhak 3 hours ago
add comment

Rather than reviewing your specific code, let me answer your question directly:

What are the best practices with multithreading in C#?

  1. Don't. Multithreading is a bad idea. Programs are hard enough to understand already with a single point of control flow in them; why on earth would you want to add a second, third, fourth... ? Do you enjoy tracking down incredibly difficult-to-reproduce, impossible-to-understand bugs? Do you have an encyclopaedic knowledge of the differences between weak and strong memory models? No? Then don't write multithreaded programs! Processes are a perfectly good unit of work; if you have work to do, spawn a process.

  2. Oh, so you want to do multithreaded programming anyways. In that case use the highest level of abstraction available. Programming against threads directly means writing programs about workers when you could be writing programs about jobs. Remember, workers are just means to an end; what you really want to get done are the jobs. Use the Task Parallel Library: tasks are jobs, threads are workers. Let the TPL figure out for you how to make efficient use of workers.

  3. Wait, let's go back to point one again. Ask yourself do you really need concurrency? With true concurrency we have two or more points of control in a program actually active at the same time. With simulated concurrency the two points of control take turns using the processor, so they are not actually concurrent. Most of the time we end up with simulated concurrency, because there are more threads than there are processors. This leads us naturally to the conclusion that perhaps actual concurrency is not what we need. the Much of the time what people really want is asynchrony; concurrency is one way to achieve asynchrony, but it is not the only way. Use the new async/await functionality in C# 5 to represent asynchronous workflows.

  4. If you are confused by the previous point, and think wait a minute, there is no way to achieve asynchrony without multiple threads of control in a process, read Stephen Cleary's article There Is No Thread. Don't try to write multithreaded programs until you understand it.

  5. Again, let me go back to use the highest level tool available to you. Those tools were written by experts. You're not an expert. Do you know how to make a lazily-computed value written to a field such that you are absolutely guaranteed that the lazy computation executes no more than once and the field is never observed to be in an inconsistent state no matter what is in any given processor cache? I don't. You probably don't. Joe Duffy does, which is why you should use the Lazy<T> primitive he wrote rather than trying to write your own.

  6. If you use Thread.Sleep with an argument other than zero or one in any production code, you are doing something deeply wrong. Threads are expensive; you don't pay a worker to sleep, so don't pay a thread to sleep either.

share
add comment

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.