C # HashSet <T> search performance (compared to ObservableCollection <T>)?

C # The overall search performance of a HashSet <T> should be O (1), and the search performance of an ObservableCollection <T> should be O (n).

I have a large number of unique elements, each element has a DateTime property that is not unique.

Each element evaluates its HashCode by simply returning its DateTime.GetHashCode ().

Now I want to get a subset of my data, for example. all items having a date that is between March 2012 and June 2012.

var result = from p in this.Elements where p.Date >= new DateTime(2012, 03, 01) && p.Date <= new DateTime(2012, 30, 06 select p; 

If I run this LINQ query in a collection of 300,000 elements, it takes ~ 25 ms to return 80 elements in a given range - it doesn't matter if I use HashSet <T> or ObservableCollection <T>.

If I scroll through all the elements manually and check them, it takes the same time, ~ 25 ms.

But I know the HashCode of all dates that are in a given range. Is it possible to get all elements with HashCodes data from my HashSet <T>? I think it will be much faster ...

Is it possible to speed up a LINQ query? I assume that it does not use the special features of my HashSet <T>?

+6
source share
2 answers

As indicated, a hash set is very effective at determining whether a given hash is in a set. Your query uses only the fact that hashset implements IEnumerable to iterate over the entire set and compare the date. He will not use hashes at all. This is why the manual path is executed at the same time as the request.

You cannot get a hash-based element from hashset, you can only check for the presence of an element in a set. A dictionary is what you want if you need to get it (which seems like you are not doing it)

Determine what you need to do with your data and use a structure optimized for this. It can be your own class that supports several internal structures, each of which is effective in one thing (for example, to search for ranges and another to check for the presence of several fields), or there may be an existing structure that meets your needs. But, not knowing what exactly you want to do with your data, it is difficult for you to advise.

Another thing to keep in mind is that you are optimizing prematurely. If 25 ms for manual search is fast enough, perhaps any structure that implements IEnumerable will be good enough. In this case, you can choose one based on other criteria that you need.

+4
source

You are not using the correct data structure. You should use something like a sorted list (sorted by the Date property), where you can then do a binary search to start and end the range.

+4
source

Source: https://habr.com/ru/post/916012/


All Articles