I have a class called GraphEdge that I would like to uniquely define in a set (built-in set type) by its tail and head members, which are set via __init__ .
If I do not define __hash__ , I see the following behavior:
>>> E = GraphEdge('A', 'B') >>> H = GraphEdge('A', 'B') >>> hash(E) 139731804758160 >>> hash(H) 139731804760784 >>> S = set() >>> S.add(E) >>> S.add(H) >>> S set([('A', 'B'), ('A', 'B')])
A lot of people don’t know that E and H match by my definition, since they have different hashes (this is what the set type uses to determine uniqueness, as far as I know), so it adds as separate elements. Therefore, I define a rather naive hash function for GraphEdge as follows:
def __hash__( self ): return hash( self.tail ) ^ hash( self.head )
Now the above works as expected:
>>> E = GraphEdge('A', 'B') >>> H = GraphEdge('A', 'B') >>> hash(E) 409150083 >>> hash(H) 409150083 >>> S = set() >>> S.add(E) >>> S.add(H) >>> S set([('A', 'B')])
But it is clear that ('A', 'B') and ('B', 'A') will return the same hash in this case, so I would expect that I could not add ('B', 'A') to a set already containing ('A', 'B') . But this is not what happens:
>>> E = GraphEdge('A', 'B') >>> H = GraphEdge('B', 'A') >>> hash(E) 409150083 >>> hash(H) 409150083 >>> S = set() >>> S.add(E) >>> S.add(H) >>> S set([('A', 'B'), ('B', 'A')])
So, is a hash type set or not? If so, how is the last scenario possible? If not, why doesn't the first script work (no __hash__ )? Did I miss something?
Edit: For reference to future readers, I have already defined __eq__ (also based on tail and head ).