Parallel.ForEach uses very little CPU as time goes on

I have the following code, and after it has been running for a while (an hour or two) I notice that it takes longer and longer to iterate through the items. Am I doing something that causes this? If so, how can I fix it?

    int totalProcessed = 0;
    int totalRecords = MyList.Count();

    Parallel.ForEach(Partitioner.Create(0, totalRecords), (range, loopState) =>
    {
        for (int index = range.Item1; index < range.Item2; index++)
        {
            DoStuff(MyList.ElementAt(index));
            Interlocked.Increment(ref totalProcessed);
            if (totalProcessed % 1000 == 0)
                Log(String.Format("Processed {0} of {1} records", totalProcessed, totalRecords));
        }
    });

    public void DoStuff(IEntity entity)
    {
        foreach (var client in Clients)
        {
            // Add entity to a db using EF
            client.Add(entity);
        }
    }

Thanks for any help

+4
2 answers

ElementAt can be a very slow extension method. For a sequence that does not implement IList<T>, it behaves roughly like this simplified implementation:

    public static T ElementAt<T>(this IEnumerable<T> collection, int index)
    {
        int i = 0;
        // Walk the sequence from the start until the requested position.
        foreach (T e in collection)
        {
            if (i == index)
            {
                return e;
            }
            i++;
        }
        throw new ArgumentOutOfRangeException("index");
    }

Obviously, the larger the index, the longer each call takes. If MyList supports it, use the indexer MyList[index] instead of ElementAt.
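As a rough sketch of that change, the loop from the question could copy the sequence into a List<T> once and then index into it. Everything here is an assumption for illustration: the items are plain ints, DoStuff is a hypothetical stand-in that only counts calls, and Console.WriteLine replaces the question's Log.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class IndexedLoopSketch
{
    // Hypothetical stand-in for the question's DoStuff; it only counts calls.
    private static int _handled;
    private static void DoStuff(int item) => Interlocked.Increment(ref _handled);

    public static int ProcessAll(IEnumerable<int> source)
    {
        // Materialize once so that every later lookup is an O(1) indexer call.
        List<int> items = source.ToList();
        int totalProcessed = 0;

        Parallel.ForEach(Partitioner.Create(0, items.Count), range =>
        {
            for (int index = range.Item1; index < range.Item2; index++)
            {
                DoStuff(items[index]);

                // Use the value returned by Increment so the progress check
                // does not re-read a counter other threads are changing.
                int done = Interlocked.Increment(ref totalProcessed);
                if (done % 1000 == 0)
                    Console.WriteLine("Processed {0} of {1} records", done, items.Count);
            }
        });

        return totalProcessed;
    }

    public static void Main()
    {
        // The Where call keeps the input lazy, standing in for a non-list MyList.
        int processed = ProcessAll(Enumerable.Range(0, 5000).Where(n => n >= 0));
        Console.WriteLine(processed);
    }
}
```

The one-off ToList() costs memory for the whole sequence, so this only fits when the data can be held in memory.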

+10

As @mace noted, using ElementAt has performance issues. Each time you call it, the iterator starts at the beginning of MyList and skips n elements until it reaches the desired index. This gets progressively worse as the index grows.
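To make the cumulative cost concrete, here is a small sketch that counts how many items a lazy sequence hands out when every element is fetched with ElementAt. LazyItems and the Yielded counter are hypothetical helpers, and the sequence is deliberately not an IList<T>, since the cost described here applies to sequences that have to be enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementAtCostSketch
{
    // Total number of items the lazy sequence has produced so far.
    public static long Yielded;

    // A lazy iterator (not an IList<T>), so ElementAt must enumerate it.
    private static IEnumerable<int> LazyItems(int count)
    {
        for (int i = 0; i < count; i++)
        {
            Yielded++;
            yield return i;
        }
    }

    // Fetches every element by position and reports how many items were produced.
    public static long StepsForElementAtLoop(int count)
    {
        Yielded = 0;
        IEnumerable<int> items = LazyItems(count);
        long sum = 0;
        for (int index = 0; index < count; index++)
            sum += items.ElementAt(index); // restarts enumeration on every call
        return Yielded;
    }

    // Visits every element once with a plain foreach.
    public static long StepsForSinglePass(int count)
    {
        Yielded = 0;
        long sum = 0;
        foreach (int item in LazyItems(count))
            sum += item;
        return Yielded;
    }

    public static void Main()
    {
        Console.WriteLine(StepsForElementAtLoop(1000));
        Console.WriteLine(StepsForSinglePass(1000));
    }
}
```

For n items the ElementAt loop produces n(n+1)/2 items in total, while the single pass produces n, which is why the slowdown only becomes noticeable after the loop has been running for some time.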

If you still need streaming access to MyList, you can reduce the performance hit by using Skip and Take. Skip still has to walk through MyList to reach the starting position, so there is still a cost, but Take then gives you a whole batch of elements for that one seek, instead of paying it again for every element.

I also notice that you use the partitioned style of Parallel.ForEach, but over the full range without specifying a range size. In the example below, I use the partitioner with a batch size.

    int totalRecords = MyList.Count();
    int batchSize = 250;

    Parallel.ForEach(Partitioner.Create(0, totalRecords, batchSize), range =>
    {
        foreach (var thing in MyList.Skip(range.Item1).Take(batchSize))
        {
            DoStuff(thing);
            //logging and stuff...
        }
    });

Update

Having read the question again, you may also have a problem with too many threads being used for what is probably an IO-bound workload, i.e. the network first, and then the DB / disk. I say this because you mention low CPU usage, which makes me think you are blocked on IO and that this is getting worse over time.

If it were purely down to ElementAt, you would still see high CPU usage.

Set MaxDegreeOfParallelism to limit the maximum number of threads used:

    const int BatchSize = 250;
    int totalRecords = MyList.Count();

    var partitioner = Partitioner.Create(0, totalRecords, BatchSize);
    var options = new ParallelOptions { MaxDegreeOfParallelism = 2 };

    Parallel.ForEach(partitioner, options, range =>
    {
        // var rather than int: the items are the entities passed to DoStuff
        foreach (var thing in MyList.Skip(range.Item1).Take(BatchSize))
        {
            DoStuff(thing);
            //logging and stuff...
        }
    });
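One way to check that the cap is doing what you expect is to record how many batches run at the same time. This is only a sketch under assumptions: MaxObservedConcurrency is a hypothetical helper, and Thread.Sleep stands in for the IO-bound work done by DoStuff.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelismCapSketch
{
    // Runs the batched loop and returns the highest number of batches
    // that were observed executing at the same moment.
    public static int MaxObservedConcurrency(int totalRecords, int batchSize, int maxDegree)
    {
        object gate = new object();
        int current = 0;
        int peak = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.ForEach(Partitioner.Create(0, totalRecords, batchSize), options, range =>
        {
            lock (gate)
            {
                current++;
                if (current > peak) peak = current;
            }

            Thread.Sleep(5); // hypothetical stand-in for IO-bound work on one batch

            lock (gate)
            {
                current--;
            }
        });

        return peak;
    }

    public static void Main()
    {
        // Same batch size and cap as in the example above.
        Console.WriteLine(MaxObservedConcurrency(5000, 250, 2));
    }
}
```

The value that works best depends on what the database and network can sustain, so it is worth trying a few settings and measuring throughput rather than assuming 2 is right.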
+4

Source: https://habr.com/ru/post/1335164/

