Optimize general list performance

I am trying to optimize an arithmetic operation over shared lists. I have three lists of nullable doubles (`double?`), defined below.

```csharp
List<double?> list1 = new List<double?>();
List<double?> list2 = new List<double?>();
List<double?> listResult = new List<double?>();

int recordCount = list1.Count > list2.Count ? list2.Count : list1.Count;

for (int index = 0; index < recordCount; index++)
{
    double? result = list1[index] + list2[index];
    listResult.Add(result);
}
```

Is there a way to speed up this operation if I have huge lists?

Thanks for your input.

+4
5 answers

Is there a way to speed up this operation if I have huge lists?

You can defer creating the result list until you know the required count:

```csharp
List<double?> list1 = new List<double?>();
List<double?> list2 = new List<double?>();

int recordCount = list1.Count > list2.Count ? list2.Count : list1.Count;
List<double?> listResult = new List<double?>(recordCount);
```

This lets you specify the exact capacity needed for the results and avoids reallocations inside the list itself. For "huge lists" this is probably one of the slowest parts, since allocating a new backing array and copying the existing elements each time the list grows is the most expensive operation here.

In addition, you can spread the calculation across several cores:

```csharp
List<double?> list1 = new List<double?>();
List<double?> list2 = new List<double?>();

int recordCount = list1.Count > list2.Count ? list2.Count : list1.Count;
var results = new double?[recordCount]; // Use an array here

Parallel.For(0, recordCount, index =>
{
    double? result = list1[index] + list2[index];
    results[index] = result;
});
```

Given that the "work" per element here is so small, you will probably need a custom partitioner to get the most out of the parallelism (see How to: Speed Up Small Loop Bodies):

```csharp
var results = new double?[recordCount]; // Use an array here
var rangePartitioner = Partitioner.Create(0, recordCount);

Parallel.ForEach(rangePartitioner, range =>
{
    for (int index = range.Item1; index < range.Item2; index++)
    {
        results[index] = list1[index] + list2[index];
    }
});
```

If this is not actually a bottleneck, you can use LINQ to do it as a one-liner:

```csharp
var results = list1.Zip(list2, (one, two) => one + two).ToList();
```

However, this will be (very slightly) less efficient than the explicit loop, which matters if performance really is the bottleneck.

+9

If you know the size of the lists in advance, plain arrays should be faster. They are created like this:

```csharp
double?[] Array1 = new double?[10];
```
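
For completeness, here is a sketch of what the full summation might look like with pre-sized arrays (the size of 10 above and the names below are illustrative, not from the original post):

```csharp
int recordCount = 10; // illustrative; use the real record count
double?[] array1 = new double?[recordCount];
double?[] array2 = new double?[recordCount];
double?[] arrayResult = new double?[recordCount];

for (int index = 0; index < recordCount; index++)
{
    // Plain array indexing avoids List<T>'s extra indirection and growth logic
    arrayResult[index] = array1[index] + array2[index];
}
```

Note that `null + x` yields `null` under nullable arithmetic, matching the List-based version's behavior.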
0

The first thing you could do is specify the capacity of the result list:

```csharp
List<double?> listResult = new List<double?>(recordCount);
```

This pre-allocates space for the results, saving the cost of growing the list during each of the calls to List.Add().

If you are really concerned about performance, you can break the lists into chunks and use multiple threads to compute partial result sets, then combine them into the complete set when they are done.
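
A minimal sketch of that chunk-and-combine approach, using Task.Run (the helper name and chunking scheme are my illustration, not the answerer's code; the Partitioner-based answer above achieves the same thing with less ceremony):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class ChunkedSum
{
    // Sums list1[i] + list2[i] into an array, one disjoint chunk per task.
    public static double?[] Sum(List<double?> list1, List<double?> list2)
    {
        int recordCount = Math.Min(list1.Count, list2.Count);
        var results = new double?[recordCount];
        int chunkCount = Environment.ProcessorCount;
        int chunkSize = (recordCount + chunkCount - 1) / chunkCount;

        var tasks = new List<Task>();
        for (int c = 0; c < chunkCount; c++)
        {
            int start = c * chunkSize;
            int end = Math.Min(start + chunkSize, recordCount);
            if (start >= end) break;
            tasks.Add(Task.Run(() =>
            {
                // Each task writes only its own slice, so no locking is needed
                for (int i = start; i < end; i++)
                    results[i] = list1[i] + list2[i];
            }));
        }
        Task.WaitAll(tasks.ToArray());
        return results;
    }
}
```

Because each task owns a disjoint index range of the shared result array, the partial results "combine" for free; no merge step or synchronization beyond `WaitAll` is required.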

0
```csharp
var result = from i in Enumerable.Range(0, Math.Min(list1.Count, list2.Count))
             select list1.ElementAtOrDefault(i) + list2.ElementAtOrDefault(i);

foreach (var item in result)
{
    Debug.Write(item.Value);
}
```
0

If you have the option of using raw arrays instead of lists, you can make this faster; how much faster depends on how low-level you want to go. After correcting several errors in your source, I wrote three different versions. One works the way your code does, creating a new list of results (I took the liberty of using the constructor that takes a capacity, avoiding a bunch of intermediate allocations).

I also wrote a version that sums two arrays into a third, on the theory that removing the list entirely could improve efficiency.

Finally, I wrote a version that uses unsafe code to add the first array to the second using pointers.
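
Since the gist isn't reproduced here, this is a rough reconstruction of what such a pointer-based version could look like (my sketch, not the answerer's exact code; it assumes plain `double[]` rather than `double?[]`, and requires compiling with `/unsafe`):

```csharp
// Adds each element of 'source' into the corresponding element of 'target'.
static unsafe void AddInto(double[] source, double[] target)
{
    fixed (double* pSrc = source, pDst = target)
    {
        double* s = pSrc;
        double* d = pDst;
        double* end = pSrc + source.Length;
        while (s < end)
        {
            *d += *s; // raw pointer walk: no per-element bounds checks
            s++;
            d++;
        }
    }
}
```

Skipping the per-element bounds checks (and the nullable lifting) is where most of the speedup in the unsafe timing below would come from.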

The comparison results are below:

```
Timings: 500000 iterations of 10000-item lists

Lists:           00:00:13.9184166
Arrays (safe):   00:00:08.4868231
Arrays (unsafe): 00:00:03.0901603

Press any key to continue...
```

The code I used can be found in this GitHub gist.

The unsafe code may be a step too far, but it is pretty striking to see the difference between the three samples. For the sake of clarity, I recommend sticking with safe code and using arrays.

0

Source: https://habr.com/ru/post/1437920/

