For LINQ Performance Gurus - Search Through Non-Shared ICollection

I need to search the collection of Item objects that contain the time property. I have a quick solution, but it is very dirty (will publish if nothing better appears). Below are my attempts to perform a search using LINQ and something else.

In my particular case, I know that the elements are ordered from low to high depending on time. When I repeat them, it is 9/12, 9/13, 9/14. I would like to quickly find a solution, even if it is not streamlined, but not important now.

//ICollection c = GetCollection(); //25,000+ items DateTime TIME = DateTime.Now.AddDays(-1); EventLog scan = new EventLog("Application", "Server", "N/A"); EventLogCollection c = scan.Entries; Console.WriteLine(logs.Count); // All entries already in list here // 64 sec - SLOW List<Item> l1 = new List<Item>(); foreach (Item i in c) { if (i.time > TIME) { l1.Add(i); } } // 93 sec - SLOWER var l2 = c.Cast<Item>().AsParallel().Select(n => n.time > TIME); var i = l2.Count(); // 98 sec - EVEN SLOWER! List<Item> l3 = new List<Item>(); Parallel.ForEach(c.Cast<Item>(), n => { if (n.time > TIME) { l3.add(n); } }); 

My current solution is BinarySearch for start and end times and an ICollection loop based on these indices. It is very fast (1-2 sec), but very dirty. I do not need another solution, but I thought that I would throw it to you performance specialists.

Is there a faster and more elegant way to search for ICollection? BTW I have no control over my collection and cannot change its structure .. NET 4.0

And no, I cannot use System.Diagnostics.Eventing.Reader because I am stuck in Windows XP

+4
source share
3 answers

EDIT

According to the statement that dasblinkenlight's answer , as a rule, I wrote a short helper helper function.

 public static int BinaryFirstIndexOf<T>( Func<int, T> indexer, Func<T, boo> predicate, int count) { var low = 0; var high = count - 1; while (low < (high - 1)) { var mid = low + ((high - low) / 2); if (predicate(indexer(mid)) { high = mid; } else { low = mid; } } if (predicate(indexer(low))) { return low; } if (low != high && predicate(indexer(high))) { return high; } return -1; } 

What will I use so

 var time = DateTime.Now.AddDays(-1); var c = scan.Entries; var first = BinaryFirstIndexOf( i => c[i], e => e.TimeGenerated > time, c.Count); if (first >= 0) { var result = new List<Item>(c.Count - first); Enumerable.Range(first, c.Count - first).AsParallel() .ForAll(i => { var j = i - first; result[j] = (Item)c[i]; }); } 

Do not want to do something like this

 var time = DateTime.Now.AddDays(-1); var c = scan.Entries; var cutOff = 0; for (var i = c.Count - 1; i > - 1; i--) { if (c[i].TimeGenerated < time) { cutOff = i; break; } } var result = new List<Item>(c.Count - cutOff - 1); var j = 0; for (var i = cutOff + 1; i < c.Count; i ++) { result[j] = (Item)c[i]; j++; } 

I assume that the data you want is at the end and occupies less than half of this collection.

Or perhaps using linq for parallel processing,

 var time = DateTime.Now.AddDays(-1); var c = scan.Entries; var cutOff = 0; for (var i = c.Count - 1; i > - 1; i--) { if (c[i].TimeGenerated < time) { cutOff = i; break; } } var result = new List<Item>(c.Count - cutOff - 1); Enumerable.Range(cutOff + 1, c.Count - cutOff - 1).AsParallel() .ForAll(i => { var j = i - cutOff - 1; result[j] = (Item)c[i]; }); 
+1
source

Your "request" is incorrect. You actually select a bunch of Boolean elements, but don't filter. Do you want to:

 var l2 = c.Cast<EventLogEntry>().AsParallel().Where(n => n.time > TIME); 

It may not be faster, but worth checking out. Damn, I wouldn’t even do it in parallel to start with.

In fact, Count has an overload that takes a predicate, so you can use this:

 var count = c.Cast<EventLogEntry>().Count(n => n.time > TIME); 

Or parallel version:

 var count = c.Cast<EventLogEntry>().AsParallel().Count(n => n.time > TIME); 

Like Chris, I am shocked that it will take more than a minute for any implementation ...

+2
source

This is the first statement:

 List<Item> l1 = new List<Item>(); foreach (Item i in c) { if (i.time > TIME) { l1.Add(i); } } 

Does the change improve the performance (it will filter the list before starting foreach):

 List<Item> l1 = new List<Item>(); foreach (Item i in c.Where(a => a.time > TIME)) { l1.Add(i); } 
0
source

Source: https://habr.com/ru/post/1437194/


All Articles