In Python, why is a tuple hashable but not a list?

Below, when I try to hash a list, it gives me an error, but it works with a tuple. I believe that this has something to do with immutability. Can someone explain this in detail?

List

x = [1,2,3] y = {x: 9} Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unhashable type: 'list' 

Tuple

 z = (5,6) y = {z: 89} print(y) {(5, 6): 89} 
+5
source share
6 answers

Dikts and other objects use hashes to quickly store and retrieve items. The mechanics of this all happen "under the covers" - you, as a programmer, do not need to do anything, and Python processes it all internally. The basic idea is that when creating a dictionary with {key: value} Python should be able to hash everything that you used for key so that it can quickly store and quickly find the value.

Immutable objects or objects that cannot be changed are hashed. They have one unique value that never changes, so python can evaluate the hash and use it to efficiently search for dictionary values. Objects in this category include strings, tuples, integers, etc. You might think: “But I can change the line! I just go mystr = mystr + 'foo' , but actually what it does is create a new instance of the line and assign it to mystr , it does not change the existing instance. Never change, so you can always be sure that when you create a hash for an immutable object, searching for the object by its hash will always return the same object from which you started, and not with the modified version.

You can try this for yourself: hash("mystring") , hash(('foo', 'bar')) , hash(1)

Variable objects or objects that can be modified are not hashed. The list can be changed in place: mylist.append('bar') or mylist.pop(0) . You cannot safely hash a mutable object, because you cannot guarantee that the object has not changed since it was last viewed. You will find that list , set and other mutable types do not have the __hash__() method. Because of this, you cannot use mutable objects as dictionary keys.

Edit: Eric Duminil's answer is a great example of unexpected behavior that arises from using mutable objects as dictionary keys

+11
source

Here are examples of why it might not be a good idea to allow mutable types as keys. This behavior may be useful in some cases, but it can also lead to surprise results or errors.

Python

You can use a numerical list as a key by defining __hash__ in a subclass of list :

 class MyList(list): def __hash__(self): return sum(self) my_list = MyList([1, 2, 3]) my_dict = {my_list: 'a'} print my_dict.get(my_list) # 'a' my_list[2] = 4 # __hash__() becomes 7 print my_dict.keys()[0] # [1, 2, 4] print my_dict.get(my_list) # None print my_dict.get(MyList([1,2,3])) # None my_list[0] = 0 # __hash_() is 6 again, but for different elements print my_dict.keys()[0] # [0, 2, 4] print my_dict.get(my_list) # 'a' 

ruby

In Ruby, he allowed the use of a list as a key. The Ruby list is called Array and dict is Hash , but the syntax is very similar to Python:

 my_list = [1] my_hash = { my_list => 'a'} puts my_hash[my_list] #=> 'a' 

But if this list is changed, dict no longer finds the corresponding value, even if the key is still in the dict:

 my_list << 2 puts my_list #=> [1,2] puts my_hash.keys.first #=> [1,2] puts my_hash[my_list] #=> nil 

You can force dict to compute the key hashes again:

 my_hash.rehash puts my_hash[my_list] #=> 'a' 
+6
source

The hashset calculates the hash of the object and is based on this hash, saves the object in the structure for quick search. As a result, by contract, when an object is added to the dictionary, the hash cannot be changed . Most good hash functions will depend on the number of elements and the elements themselves.

The tuple is unchanged, therefore, after construction, the values ​​cannot change, and therefore the hash cannot change (or, at least, a good implementation should not allow the hash to be changed).

The list on the other hand has been changed: later you can add / remove / change items. As a result, a hash can change a breach of contract.

Thus, all objects that cannot guarantee a hash function that remains stable after adding an object violate the contract and, therefore, are not good candidates. Because to search, the dictionary will first calculate the hash of the key and determine the correct bucket. If the key is changed, this can lead to false negatives: the object is in the dictionary, but it can no longer be restored, because the hash is different from the other, therefore, a search is made for another bucket, except where the object was originally added to.

+3
source

Since the list is changed, but the tuple is not. When you store a hash of a value in, for example, dict, if the object changes, the stored hash value will not be detected, so it will remain the same. The next time you start looking for an object, the dictionary will try to find it using the old hash value, which no longer matters.

To prevent this, python does not allow you to have mutable elements.

0
source

I would like to add the following aspect, as it is not yet covered by other answers.

There is nothing wrong with making mutable objects hashed; it is simply not unique, and therefore it must be defined and implemented sequentially by the programmer (and not the programming language).

Please note that you can implement the __hash__ method for any custom class that allows you to store its instances in contexts where types of hashed ones are required (for example, keys or key sets).

Hash values ​​are commonly used to decide if two objects represent the same thing. Therefore, consider the following example. You have a list with two items: l = [1, 2] . Now you add the item to the list: l.append(3) . And now you have to answer the following question: are they all the same? Both yes and no are valid answers. Yes, this is the same list and no, it no longer has content.

Thus, the answer to this question is up to you as a programmer, and therefore you will have to manually implement hash methods for your mutable types.

-1
source

Based on Python Glossary

A hashable object if it has a hash value that never changes during its life cycle (it needs the __hash __ () method) and can be compared with other objects (it needs the __eq __ () method). Hashable objects that are compared equal must have the same hash value.

All unused Pythons built-in objects are hashed; mutable containers (e.g. lists or dictionaries) are not.

-1
source

Source: https://habr.com/ru/post/1264160/


All Articles