Data structure for recognizing repeated values

I am running a fairly large search and getting a System.OutOfMemoryException.

The problem is that I store a string key for every state I have already visited, in a HashSet<string>. Once it reaches about 7 million elements, it fails. My thinking is that I don't need to retrieve the strings, only to find out whether a given string exists in the set.

I seem to remember a specialized data structure for this kind of thing, but I can't for the life of me remember its name. If I recall correctly, it had fairly constant memory requirements: you add elements to it, and it can tell you with some degree of certainty whether a given value has been added. Am I imagining it, or does it exist? Any tips?

+3
5 answers

There is no standard collection in .NET for this, but you can store many strings in a trie, which can use much less memory than, for example, a hash table or hash set.
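As a rough illustration of the idea, here is a minimal trie sketch in C#. The class name `StringTrie` and its members are hypothetical, not part of .NET. It uses one `Dictionary<char, Node>` per node for simplicity, which is not the most compact layout. How much memory a trie actually saves depends on how many prefixes your keys share.

```csharp
using System.Collections.Generic;

// Hypothetical minimal trie: stores a set of strings and answers exact membership.
public sealed class StringTrie
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsTerminal; // true if some added string ends at this node
    }

    private readonly Node _root = new Node();

    // Adds the key; returns false if it was already present (like HashSet<T>.Add).
    public bool Add(string key)
    {
        Node node = _root;
        foreach (char c in key)
        {
            if (!node.Children.TryGetValue(c, out Node next))
            {
                next = new Node();
                node.Children.Add(c, next);
            }
            node = next;
        }
        if (node.IsTerminal) return false;
        node.IsTerminal = true;
        return true;
    }

    // Returns true only if exactly this key was added earlier.
    public bool Contains(string key)
    {
        Node node = _root;
        foreach (char c in key)
        {
            if (!node.Children.TryGetValue(c, out node)) return false;
        }
        return node.IsTerminal;
    }
}
```

Keys that share a prefix share the nodes for that prefix, so "state-1" and "state-2" store the characters of "state-" only once.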

+2

You are probably thinking of a Bloom filter. It gives a probabilistic answer when you check whether a string is in the set. If the string was added, the filter always reports it as present (no false negatives). If it was not added, the filter may still report it as present, depending on what else has been added (false positives). The filter's size is fixed when you create it; how large it needs to be depends on the number of unique elements you expect to add and the false-positive rate you can accept, but it is much smaller than what a HashSet would take.
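To make the behaviour concrete, here is a minimal Bloom filter sketch in C#. The class name `BloomFilter`, its members, and the two simple hash functions (an FNV-1a-style hash and a multiply-by-33 hash) are assumptions chosen for illustration, not a tuned implementation. As a rule of thumb, about 10 bits per element with 7 hash functions gives roughly a 1% false-positive rate, so 7 million keys would need on the order of 70 million bits, a little under 9 MB.

```csharp
using System.Collections;

// Hypothetical minimal Bloom filter over strings.
public sealed class BloomFilter
{
    private readonly BitArray _bits;
    private readonly int _hashCount;

    public BloomFilter(int bitCount, int hashCount)
    {
        _bits = new BitArray(bitCount); // fixed size, all bits start as false
        _hashCount = hashCount;
    }

    public void Add(string item)
    {
        Hash(item, out uint h1, out uint h2);
        for (int i = 0; i < _hashCount; i++)
            _bits[Index(h1, h2, i)] = true;
    }

    // false: the item was definitely never added.
    // true: the item was probably added (false positives are possible).
    public bool MightContain(string item)
    {
        Hash(item, out uint h1, out uint h2);
        for (int i = 0; i < _hashCount; i++)
            if (!_bits[Index(h1, h2, i)]) return false;
        return true;
    }

    // Double hashing: derive the i-th bit position from two base hashes.
    private int Index(uint h1, uint h2, int i)
    {
        ulong combined = h1 + (ulong)i * h2;
        return (int)(combined % (ulong)_bits.Length);
    }

    private static void Hash(string s, out uint h1, out uint h2)
    {
        h1 = 2166136261; // FNV-1a offset basis
        h2 = 5381;       // seed for the multiply-by-33 hash
        foreach (char c in s)
        {
            h1 = unchecked((h1 ^ c) * 16777619);
            h2 = unchecked(h2 * 33 + c);
        }
        h2 |= 1; // keep the stride non-zero
    }
}
```

In the search loop you would call `MightContain(key)` before expanding a state and `Add(key)` afterwards. A false positive means a state that was never visited is occasionally skipped, so this only fits searches that can tolerate that.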

+3

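Yes, use a trie. The main advantages of a trie over a hash table:

  • Looking up data in a trie is faster in the worst case, O(m) time (where m is the length of the search string), than in an imperfect hash table. An imperfect hash table can have key collisions. A key collision is when the hash function maps different keys to the same position in the table. The worst-case lookup time in an imperfect hash table is O(N), but far more typically it is O(1), with O(m) time spent evaluating the hash.
  • There are no collisions of different keys in a trie.
  • Buckets in a trie, which are analogous to the hash-table buckets that store key collisions, are needed only if a single key is associated with more than one value.
  • There is no need to provide a hash function, or to change hash functions as more keys are added to the trie.
  • A trie can provide an alphabetical ordering of the entries by key.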
+2

Are you talking about the Dictionary class?

http://msdn.microsoft.com/en-us/library/xfhwa508.aspx

Excerpt from MSDN:

Every key in a Dictionary<TKey, TValue> must be unique according to the dictionary's equality comparer. A key cannot be null, but a value can be, if the value type TValue is a reference type.

You can use the ContainsKey method to check whether an entry already exists before inserting a new one.
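A short sketch of that check, with `HashSet<string>.Add` shown next to it for comparison (it returns false when the value is already present). The class name `VisitedDemo`, the method `Run`, and the key "state-42" are made up for illustration. Note that both collections still keep every key in memory, so on its own this does not reduce the memory footprint described in the question.

```csharp
using System.Collections.Generic;

public static class VisitedDemo
{
    // Returns how many distinct keys end up stored in the dictionary.
    public static int Run()
    {
        var visited = new Dictionary<string, bool>();
        string key = "state-42"; // hypothetical state key

        // Check first, then insert: Dictionary.Add throws on a duplicate key.
        for (int attempt = 0; attempt < 2; attempt++)
        {
            if (!visited.ContainsKey(key))
            {
                visited.Add(key, true);
            }
        }

        // With a HashSet the check and the insert are a single call.
        var seen = new HashSet<string>();
        bool firstTime = seen.Add(key);  // true: the key was not present
        bool secondTime = seen.Add(key); // false: the key was already present

        return firstTime && !secondTime ? visited.Count : -1;
    }
}
```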

0

Source: https://habr.com/ru/post/1759214/

