Using a hash code as the key in a keyed collection

As far as I know (or thought), Dictionary is implemented as a hash table, where the hash code identifies the bucket and the bucket is then searched for the key.

In my opinion, this means that the hash code of an object must remain stable during a single run of my program (roughly speaking).

Now here

http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx

I read

"A hash code is intended for efficient insertion and lookup in collections that are based on a hash table. A hash code is not a permanent value. For this reason: [...] Do not use the hash code as the key to retrieve an object from a keyed collection."

Can someone explain to me what this means?

+6
source share
5 answers

When the documentation speaks of a "keyed collection", it does not mean the same thing as a Dictionary. Note that there is actually a KeyedCollection base class: http://msdn.microsoft.com/en-us/library/ms132438%28v=vs.110%29.aspx

The key point is the following:

Unlike dictionaries, an element of KeyedCollection<TKey, TItem> is not a key/value pair; instead, the entire element is the value and the key is embedded within the value. For example, an element of a collection derived from KeyedCollection<String,String> (KeyedCollection(Of String, String) in Visual Basic) might be "John Doe Jr." where the value is "John Doe Jr." and the key is "Doe"; or a collection of employee records containing integer keys could be derived from KeyedCollection<int,Employee>. The abstract GetKeyForItem method extracts the key from the element.

Thus, a keyed collection is a collection of objects together with a way to extract a key from each one. Conceptually, it resembles a database table where you define a primary key that is a subset of the whole record.

With that in mind, the answer becomes relatively clear. As others have said, equal hash codes do not imply equal objects. But the keys in a keyed collection, like primary keys in a database table, must uniquely identify exactly one object. The possibility of hash collisions therefore makes hash codes unsuitable for this purpose.

In addition, even with a Dictionary there is an important difference between using objects as keys and using the hash codes of those same objects as keys. If two objects have colliding hash codes but do not compare as equal, Dictionary will still store them as two separate keys. That is why overriding GetHashCode to simply return 1 always works (although it is obviously bad for performance). As a demonstration:

    var dict = new Dictionary<MyClass, string>();
    var hashDict = new Dictionary<int, string>();

    dict[myObj1] = "One";
    hashDict[myObj1.GetHashCode()] = "One";
    dict[myObj2] = "Two";
    hashDict[myObj2.GetHashCode()] = "Two";

    Console.Out.WriteLine(dict[myObj1]);                    // Outputs "One"
    Console.Out.WriteLine(hashDict[myObj1.GetHashCode()]);  // Outputs "Two"

(myObj1 and myObj2 are instances of MyClass that have the same hash code but do not compare as equal.)
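To try the snippet above, MyClass needs a definition. The following is a minimal sketch, assuming a hypothetical MyClass whose equality is based on a Name property and whose GetHashCode always returns 1; neither detail comes from the original answer, they are only there to force a collision.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class for illustration: every instance reports the same
// hash code, but equality is decided by Name.
public sealed class MyClass
{
    public string Name { get; }

    public MyClass(string name) { Name = name; }

    public override bool Equals(object obj) =>
        obj is MyClass other && other.Name == Name;

    // Legal but slow: all instances end up in the same bucket.
    public override int GetHashCode() => 1;
}

public static class CollisionDemo
{
    public static void Main()
    {
        var myObj1 = new MyClass("first");
        var myObj2 = new MyClass("second");

        var dict = new Dictionary<MyClass, string>();
        var hashDict = new Dictionary<int, string>();

        dict[myObj1] = "One";
        hashDict[myObj1.GetHashCode()] = "One";
        dict[myObj2] = "Two";
        hashDict[myObj2.GetHashCode()] = "Two"; // same int key, overwrites "One"

        Console.WriteLine(dict.Count);                      // 2
        Console.WriteLine(hashDict.Count);                  // 1
        Console.WriteLine(dict[myObj1]);                    // One
        Console.WriteLine(hashDict[myObj1.GetHashCode()]);  // Two
    }
}
```

The Dictionary keyed by MyClass falls back to Equals when hash codes collide, so both entries survive. The Dictionary keyed by int only ever sees the number 1, so the second write replaces the first.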

+5
source

They may be talking about KeyedCollection.
In that case, there is no point in using a hash as the key.
The key is expected to be a real value that the class actually uses.


As in this example:

    public class SimpleOrder : KeyedCollection<int, OrderItem>
    {
        // The parameterless constructor of the base class creates a
        // KeyedCollection with an internal dictionary. For this code
        // example, no other constructors are exposed.
        //
        public SimpleOrder() : base() {}

        // This is the only method that absolutely must be overridden,
        // because without it the KeyedCollection cannot extract the
        // keys from the items. The input parameter type is the
        // second generic type argument, in this case OrderItem, and
        // the return value type is the first generic type argument,
        // in this case int.
        //
        protected override int GetKeyForItem(OrderItem item)
        {
            // In this example, the key is the part number.
            return item.PartNumber;
        }
    }

PartNumber is an int property of OrderItem.
You should never return the hash code of the OrderItem from GetKeyForItem.
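A short usage sketch may make the point concrete. It assumes a hypothetical OrderItem with only a PartNumber and a Description (the original does not show the class), and the part numbers are made up. It shows lookup by the real key, and that a keyed collection rejects a second item with the same key, which is what would happen to two distinct items whose hash codes collided if the hash code were the key.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical item type; only PartNumber is taken from the text above.
public class OrderItem
{
    public int PartNumber { get; }
    public string Description { get; }

    public OrderItem(int partNumber, string description)
    {
        PartNumber = partNumber;
        Description = description;
    }
}

public class SimpleOrder : KeyedCollection<int, OrderItem>
{
    // The key is a real, unique value of the item, not its hash code.
    protected override int GetKeyForItem(OrderItem item) => item.PartNumber;
}

public static class KeyedCollectionDemo
{
    public static void Main()
    {
        var order = new SimpleOrder();
        order.Add(new OrderItem(1001, "Widget"));
        order.Add(new OrderItem(1002, "Sprocket"));

        // Contains(TKey) checks whether an item with this key is present.
        Console.WriteLine(order.Contains(1001));        // True
        // With an int key, the indexer looks items up by key.
        Console.WriteLine(order[1002].Description);     // Sprocket

        try
        {
            // Same key as an existing item: the collection refuses it.
            order.Add(new OrderItem(1001, "Another widget"));
        }
        catch (ArgumentException)
        {
            Console.WriteLine("Duplicate key rejected");
        }
    }
}
```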

+3
source

I think this particular point is about not using the hash code as the key. For example, do not create a Dictionary<int, MyObject> where the integer key is the hash code.

The main reason for this is that two different elements can have the same hash code.

The safe way to use hash codes ... is not to use them directly. That is, you should very rarely write code that calls GetHashCode. If your code never calls GetHashCode, it cannot store the values, and so it cannot end up depending on something it should not depend on.

+2
source

The documentation means that the hash code is not guaranteed (or even likely) to be the same between successive runs of your program. Therefore, if you try to use it as a key into an external data source, such as a database or key store, it will not be reliable. However, using it as the basis for an index into an in-memory bucket table (as in a Dictionary) is exactly what it is intended for.

+1
source

This explains it:

The .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value this method returns may differ between .NET Framework versions and platforms, such as 32-bit and 64-bit platforms.

If you run your program repeatedly in the same environment, you may get the same hash codes every time, but if you run the same program on a different platform or on a different version of the .NET Framework, there is no guarantee that the hash codes will be the same.

+1
source

Source: https://habr.com/ru/post/970279/

