C ++ Data Structure, which will best contain a large list of names

Can you share your thoughts on what the best STL data structure would be to store a large list of names and search for those names?

Edit: Names are not unique, and the list can grow as new names can be continuously added to it. And by and large, I speak from 1 million to 10 million names.

+5
source share
2 answers

Since you want to search for names, you need a structure that supports fast random access. This means that the vector, deque and list are out of the question. In addition, the vector / array is slow at random additions / inserts for sorted sets, because they must move the elements to make room for each inserted element. However, adding to the end is very fast.

Consider std::map , std::unordered_map or std::unordered_multimap (or their siblings std::set , std::unordered_set and std::unordered_multiset if you only save the keys).

If you are going to make unique, random access, I would start with one of the containers unordered_ *.

If you need to keep an ordered list of names and you need to search / iterate ranges and sorted operations, a tree-based container such as std::map or std::set should work better with an iterative operation than a hash-based container because the first will store items next to their logical predecessors and successors. For random access, this is O (log N), which is still decent.

Prior to std :: unordered_ *, I used std::map to store a large number of objects for the object cache, and although there are faster containers with random access, it scales well enough for our purposes. The new unordered_map has an O (1) access time, so it is a hashed structure and should give you the best access time.

+4
source

You might consider using the concatenation of these names with a separator, but a search can lead to a hit. You will need to adjust the adjusted binary search.

But first you need to try the obvious solution, which is a hash map called unordered_map in stl. See if it suits your needs. Searching there should be pretty fast, but at a price for memory.

0
source

Source: https://habr.com/ru/post/1203758/


All Articles