What is the most efficient collection class in C# for finding strings?

    string[] words = System.IO.File.ReadAllLines("word.txt");
    var query = from word in words
                where word.Length > "abe".Length && word.StartsWith("abe")
                select word;
    foreach (var w in query.AsParallel())
    {
        Console.WriteLine(w);
    }

Basically, word.txt contains 170,000 English words. Is there a collection class in C# that is faster than a string array for the above query? There will be no insertions or deletions, only searches for lines that start with a prefix such as "abe" or "abdi".

Each word in the file is unique.

EDIT 1: This search will potentially run millions of times in my application. I also want to stick to LINQ for querying the collection, because I might need an aggregate function.

EDIT 2: The words in the file are already sorted, and the file will not change.

+6
3 answers

I would create a Dictionary<char, List<string>> that groups the words by their first letter. Each search then only has to scan the words that share the prefix's first letter, which significantly reduces the work per query.
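A minimal sketch of that idea, assuming ordinal (culture-insensitive) prefix matching is acceptable for the word list. The class and method names `FirstLetterIndex`, `Build`, and `StartingWith` are hypothetical and only serve the illustration. The length check mirrors the query in the question, so the prefix itself is not returned.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstLetterIndex
{
    // Build the index once, up front: one bucket of words per first character.
    public static Dictionary<char, List<string>> Build(IEnumerable<string> words)
    {
        return words
            .Where(w => !string.IsNullOrEmpty(w))
            .GroupBy(w => w[0])
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    // Scan only the bucket that shares the prefix's first letter.
    public static IEnumerable<string> StartingWith(
        Dictionary<char, List<string>> index, string prefix)
    {
        if (string.IsNullOrEmpty(prefix) || !index.TryGetValue(prefix[0], out var bucket))
            return Enumerable.Empty<string>();

        // Assumption: ordinal comparison is what the application wants.
        return bucket.Where(w => w.Length > prefix.Length
                              && w.StartsWith(prefix, StringComparison.Ordinal));
    }
}
```

The result is still an `IEnumerable<string>`, so LINQ operators such as `Count()` or `Aggregate()` can be chained onto it. Each query remains a linear scan, only over a smaller bucket.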

+4

If you need to perform a search once, there is nothing better than a linear search - an array is great for it.

If you need to search repeatedly, consider sorting the array (O(n log n)); searching by any prefix is then fast (O(log n)). Depending on the kind of search, a dictionary of string lists keyed by prefix may be another good option.
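One way to get the O(log n) prefix lookup is to run two binary searches over the sorted array: one for the first word that is not less than the prefix, and one for the end of the run of words that start with it. The sketch below assumes the array is sorted in ordinal order (for example with `Array.Sort(words, StringComparer.Ordinal)`); the names `SortedPrefixSearch` and `StartingWith` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class SortedPrefixSearch
{
    // Returns the contiguous slice of `sorted` whose words start with `prefix`.
    // Assumption: `sorted` is in ordinal order.
    public static ArraySegment<string> StartingWith(string[] sorted, string prefix)
    {
        // First binary search: index of the first word >= prefix.
        int lo = 0, hi = sorted.Length;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (string.CompareOrdinal(sorted[mid], prefix) < 0) lo = mid + 1;
            else hi = mid;
        }
        int start = lo;

        // Second binary search: from `start` on, the matching words form one
        // unbroken run, so find the first word that no longer matches.
        hi = sorted.Length;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (sorted[mid].StartsWith(prefix, StringComparison.Ordinal)) lo = mid + 1;
            else hi = mid;
        }

        return new ArraySegment<string>(sorted, start, lo - start);
    }
}
```

`ArraySegment<string>` implements `IEnumerable<string>`, so the slice can be queried with LINQ without copying. If the prefix itself is a word in the list, it is included in the slice; a `Where(w => w.Length > prefix.Length)` on the result reproduces the filter from the question.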

+1

If you search much more often than you change the word file, you can sort the words each time the list changes and then use binary search. That way you need at most about 20 comparisons to find a word matching your key, plus a few additional comparisons against its neighbors.
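The binary search plus neighbor comparisons can be sketched with the built-in `Array.BinarySearch`, which returns the bitwise complement of the insertion point when the key is not found. The sketch assumes the array is sorted in ordinal order and that the words are unique, as stated in the question; `BinarySearchThenScan` and `StartingWith` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class BinarySearchThenScan
{
    public static IEnumerable<string> StartingWith(string[] sorted, string prefix)
    {
        // Locate the prefix itself, or the position where it would be inserted.
        int i = Array.BinarySearch(sorted, prefix, StringComparer.Ordinal);
        if (i < 0) i = ~i; // not found: the complement is the insertion point

        // Walk forward over the neighbors for as long as they still match.
        for (; i < sorted.Length
               && sorted[i].StartsWith(prefix, StringComparison.Ordinal); i++)
        {
            yield return sorted[i];
        }
    }
}
```

The forward scan costs one comparison per match, so it is cheap when a prefix matches only a handful of words and grows linearly with the number of matches.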

0

Source: https://habr.com/ru/post/887038/

