What are the performance differences between using Parallel.ForEach and starting a Task inside a foreach loop?

I would like to know which approach is best, or whether there are any documents or articles that explain the differences between using Parallel.ForEach and starting a Task in each iteration of a regular foreach loop, as shown below:

Case 1 - Parallel.ForEach:

    // "items" stands for whatever collection is being processed
    Parallel.ForEach(items, item =>
    {
        // Do something thread-safe: parse an XML document and then save it
        // to a DB server through a repository
    });

Case 2 - Task inside foreach:

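    // "items" stands for whatever collection is being processed
    var tasks = new List<Task>(); // keep the tasks so WaitAll has something to wait on
    foreach (var item in items)
    {
        tasks.Add(Task.Factory.StartNew(() =>
        {
            // Do the same thread-safe work as in case 1
        }));
    }
    Task.WaitAll(tasks.ToArray());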

I ran my own tests, and case 1 performed much better than case 2. The timings were roughly sequential : case 1 : case 2 = 5 s : 1 s : 4 s.

So the ratio between case 1 and case 2 is almost 1:4. Does this mean that we should always use Parallel.ForEach or Parallel.For when we want to run a loop in parallel?

3 answers

First, the best documentation on this is Part V of CLR via C# by Jeffrey Richter.

http://www.amazon.com/CLR-via-C-Developer-Reference/dp/0735667454/

Secondly, I would expect Parallel.ForEach to perform better, because it not only creates Tasks but also groups them. In his book, Jeffrey Richter explains that tasks started individually are put on the thread pool's global queue, and there is some overhead in locking that queue. To combat this, tasks created from inside a running task go into a local queue (in .NET, one per thread pool worker thread) instead of the global one. Work in this local sub-queue can be picked up without locking!

I would need to read this chapter again (chapter 27), so I'm not sure whether Parallel.ForEach works this way, but it is what I would expect it to do.

Locking, he explains, is expensive because it requires accessing a kernel-level construct.

In either case, do not expect the items to be processed in order. With Parallel.ForEach, in-order processing is even less likely than with Tasks started from a plain foreach loop, because of the internals described above.
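
The grouping idea from this answer can be made explicit with a range partitioner. The following is only a sketch, not a description of what Parallel.ForEach does internally: the class and method names (ChunkedLoop, SumInChunks) are made up for illustration, and the summing workload is a stand-in for real work. It uses Partitioner.Create from System.Collections.Concurrent, so that each worker receives a contiguous index range and synchronizes once per range instead of once per element.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, named for illustration only.
static class ChunkedLoop
{
    // Sums 0..count-1 by handing each worker a contiguous index range.
    // Assumes count > 0 (Partitioner.Create rejects an empty range).
    public static long SumInChunks(int count)
    {
        long total = 0;

        // Each range is a Tuple<int, int>: Item1 inclusive, Item2 exclusive.
        Parallel.ForEach(Partitioner.Create(0, count), range =>
        {
            long local = 0;
            for (int i = range.Item1; i < range.Item2; i++)
                local += i;

            // One synchronized update per chunk, not per element.
            Interlocked.Add(ref total, local);
        });

        return total;
    }

    static void Main()
    {
        Console.WriteLine(SumInChunks(1000000)); // prints 499999500000
    }
}
```

Whether this helps depends on the loop body: for very cheap bodies the per-element delegate call and synchronization tend to dominate, while for heavier work such as parsing XML and writing to a database the difference is likely to be small.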


What Parallel.ForEach() does is create a small number of Tasks to process the iterations of your loop. Tasks are relatively cheap, but they are not free, so this improves performance. And if the body of your loop is fast, the improvement can be really big. This is the most likely explanation for the behavior you observe.
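
One way to check this claim on your own machine is to record which task runs each iteration. The sketch below assumes that Task.CurrentId is set inside the loop body (it falls back to -1 if it is not); the names TaskCountProbe and CountTasksUsed are invented for this example. The exact number it prints depends on the runtime, the core count, and the current load, so no particular value should be expected.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical probe, named for illustration only.
static class TaskCountProbe
{
    // Returns how many distinct tasks executed the loop body.
    public static int CountTasksUsed(int itemCount)
    {
        var seen = new ConcurrentDictionary<int, bool>();

        Parallel.ForEach(Enumerable.Range(0, itemCount), item =>
        {
            // Id of the task running this iteration (-1 if there is none).
            int id = Task.CurrentId ?? -1;
            seen.TryAdd(id, true);
        });

        return seen.Count;
    }

    static void Main()
    {
        const int items = 100000;
        Console.WriteLine("items: " + items + ", distinct tasks: " + CountTasksUsed(items));
    }
}
```

With one Task per item, as in case 2, the same count would equal the number of items.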


How many tasks are you running? Just creating a new task can take a considerable amount of time if you loop often enough. For example, the following runs in 15 ms for the first block and more than 1 second for the second block, and the second block does not even start the tasks. Uncomment Start, and the time increases to almost 3 seconds. WaitAll adds only a small amount.

    using System;
    using System.Diagnostics;
    using System.Linq;
    using System.Threading.Tasks;

    static class Program
    {
        static void Main()
        {
            const int max = 3000000;
            var range = Enumerable.Range(0, max).ToArray();

            // Block 1: Parallel.ForEach with an empty body
            {
                var sw = new Stopwatch();
                sw.Start();
                Parallel.ForEach(range, i => { });
                sw.Stop();
                Console.WriteLine(sw.ElapsedMilliseconds);
            }

            // Block 2: one Task object per item, created but not started
            {
                var tasks = new Task[max];
                var sw = new Stopwatch();
                sw.Start();
                foreach (var i in range)
                {
                    tasks[i] = new Task(() => { });
                    //tasks[i].Start();
                }
                //Task.WaitAll(tasks);
                sw.Stop();
                Console.WriteLine(sw.ElapsedMilliseconds);
            }
        }
    }
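
If the per-item Task is what costs the time, one option is to keep explicit Tasks but create only a handful of them, each looping over a slice of the indices. This is a minimal sketch under that assumption, not something taken from the answer: BatchedTasks and Run are hypothetical names, and the split is static, so unlike Parallel.ForEach it does not rebalance when some slices take longer than others.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, named for illustration only.
static class BatchedTasks
{
    // Calls body(i) for every i in [0, count) using exactly workerCount tasks.
    // Assumes workerCount >= 1 and count >= 0.
    public static void Run(int count, int workerCount, Action<int> body)
    {
        int chunk = (count + workerCount - 1) / workerCount; // ceiling division
        var tasks = new Task[workerCount];

        for (int w = 0; w < workerCount; w++)
        {
            // Per-task copies, so the lambda does not capture the loop variable w.
            int start = w * chunk;
            int end = Math.Min(start + chunk, count);

            tasks[w] = Task.Run(() =>
            {
                for (int i = start; i < end; i++)
                    body(i);
            });
        }

        Task.WaitAll(tasks);
    }

    static void Main()
    {
        long processed = 0;
        Run(3000000, Environment.ProcessorCount, i => Interlocked.Increment(ref processed));
        Console.WriteLine(processed); // prints 3000000
    }
}
```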

Source: https://habr.com/ru/post/1496326/

