Unordered collection for unhashable objects?

Question

I have dicts where some of the values are not hashable. I need some way to compare two unordered groups of these dicts to check that they contain equal elements. I can't use lists because list equality takes order into account, and sets won't work because dicts are not hashable. I looked through the Python docs, and the only thing that looks useful is a dict view, which is hashable in some circumstances. That doesn't help here either, because one of the values is an object that itself contains lists, which means the dict view won't be hashable.

Is there a standard container for situations like this, or should I just use lists, loop through every item in both lists, and check that an equal item exists somewhere in the other list?
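To make the problem concrete, here is a minimal sketch. The dicts `d1` and `d2` and their contents are hypothetical and not taken from the original post; the snippet assumes Python 3.7+ so that dicts keep insertion order.

```python
# Two hypothetical dicts whose values include unhashable objects (lists, dicts).
d1 = {"a": [1, 2], "b": {"x": [3]}}
d2 = {"p": {"x": [3]}, "q": [1, 2]}

values_1 = list(d1.values())
values_2 = list(d2.values())

# List equality is order-sensitive, so the same items in a different order
# compare as unequal.
lists_equal = values_1 == values_2

# Building a set fails because lists and dicts are unhashable.
try:
    set(values_1)
    set_error = None
except TypeError as exc:
    set_error = exc
```

Here `lists_equal` is `False` even though both groups hold equal items, and `set_error` holds the `TypeError` raised by `set()`.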

+6

python collections

Macha Nov 30 '11 at 20:46

2 answers

I think the easiest way is to use lists.

    # list() is needed on Python 3, where values() returns a view, not a list.
    group_1 = list(my_dict_1.values())
    group_2 = list(my_dict_2.values())

The comparison won't be as simple as it would be if order mattered or if the values were hashable, but the following should work:

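    def contain_the_same(group_1, group_2):
        # Work on a copy so the caller's second group is not mutated.
        remaining = list(group_2)
        for item in group_1:
            if item not in remaining:
                return False
            # Remove the matched item so duplicates are counted correctly.
            remaining.remove(item)
        # Anything left over means group_2 had extra items.
        return not remaining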

This handles unhashable objects without any trouble:

    >>> contain_the_same([1,2,3], [1,2,3])
    True
    >>> contain_the_same([1,2,3], [1,2,3,4])
    False
    >>> contain_the_same([1,2,[3,2,1]], [1,2,[3,2,1]])
    True
    >>> contain_the_same([5,1,2,[3,2,1,[1]]], [1,[3,2,1,[1]],2,5])
    True

Caution: this returns False if one list contains duplicates that the other does not. It will need some modification if you want that case to count as a match.
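One possible modification, if duplicates should be ignored, is to check membership in both directions instead of removing matched items. This is only a sketch; the name `contain_the_same_ignoring_duplicates` is made up for illustration, and it relies on nothing but `==` between items.

```python
def contain_the_same_ignoring_duplicates(group_1, group_2):
    """Return True if every item in each group equals some item in the other."""
    return (all(item in group_2 for item in group_1)
            and all(item in group_1 for item in group_2))


# Duplicates on one side no longer cause a mismatch.
with_duplicates = contain_the_same_ignoring_duplicates([1, 1, [2]], [[2], 1])
# A genuinely different item still does.
different = contain_the_same_ignoring_duplicates([1, 2], [1, 3])
```

Both directions are needed: checking only `group_1` against `group_2` would miss items that appear in `group_2` alone.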

Edit: Even easier:

    sorted(my_dict_1.values()) == sorted(my_dict_2.values())

It looks like it's twice as fast as my contain_the_same function:

    >>> timeit("contain_the_same([5,1,2,[3,2,1,[1]]], [1,[3,2,1,[1]],2,5])", "from __main__ import contain_the_same", number=10000)/10000
    8.868489032757054e-06
    >>> timeit("sorted([5,1,2,[3,2,1,[1]]]) == sorted([1,[3,2,1,[1]],2,5])", number=10000)/10000
    4.928951884845034e-06

It would not be as easy to extend to the case where duplicates are allowed, though.
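One thing to keep in mind is that the sorted comparison only works when the items can be ordered against each other. On Python 3, ints and lists (or dicts) cannot be compared with `<`, so the mixed list used in the timing above raises `TypeError` there. A small sketch, with `sorted_equal` as a hypothetical helper name:

```python
def sorted_equal(a, b):
    """Compare two groups by sorting; return None if the items can't be ordered."""
    try:
        return sorted(a) == sorted(b)
    except TypeError:
        # Raised on Python 3 when items are not mutually orderable,
        # for example an int compared with a list.
        return None


# Lists of lists can be ordered, so the sorted comparison works.
nested = sorted_equal([[3, 2, 1], [1]], [[1], [3, 2, 1]])
# Ints mixed with lists cannot be ordered on Python 3.
mixed = sorted_equal([5, 1, 2, [3, 2, 1, [1]]], [1, [3, 2, 1, [1]], 2, 5])
```

When `None` comes back, the equality-based `contain_the_same` function is still usable, since it relies only on `==`.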

+2

Wilduck Nov 30 '11 at 21:20

Raymond Hettinger · Accepted Answer · 2011-11-30T23:13:46+0000

If there are no duplicate entries, the usual choices are:

If the elements are hashable: set(a) == set(b)
If the elements are orderable: sorted(a) == sorted(b)
If all you have is equality: len(a) == len(b) and all(x in b for x in a)
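The no-duplicates assumption matters most for the equality-only option. The sketch below wraps that expression in a hypothetical function, `same_elements_no_duplicates`, and shows both a valid use and what happens when the assumption is violated.

```python
def same_elements_no_duplicates(a, b):
    # Equality-only check; assumes neither input contains duplicates.
    return len(a) == len(b) and all(x in b for x in a)


# Works for unhashable, unorderable items as long as there are no duplicates.
ok = same_elements_no_duplicates([[1], {"k": 2}], [{"k": 2}, [1]])

# With duplicates the check can be fooled: the lengths match and every item
# of the first list appears in the second, yet the contents differ.
misleading = same_elements_no_duplicates([1, 1, 2], [1, 2, 2])
```

Both calls return `True`, and the second result is wrong as a multiset comparison, which is why inputs with duplicates need a count-based check.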

If you have duplicates and their multiplicity matters, the choices are:

If the elements are hashable: Counter(a) == Counter(b)
If the elements are orderable: sorted(a) == sorted(b)
If all you have is equality: len(a) == len(b) and all(a.count(x) == b.count(x) for x in a)
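When the kind of element isn't known in advance, the three choices can be tried in order from fastest to slowest. This is a sketch rather than part of the answer: `same_multiset` is a made-up name, and it assumes that a `TypeError` from `Counter` or `sorted` means the items are unhashable or unorderable.

```python
from collections import Counter


def same_multiset(a, b):
    """Return True if a and b hold equal items with equal multiplicities."""
    a, b = list(a), list(b)
    if len(a) != len(b):
        return False
    try:
        # Fastest path: requires hashable items.
        return Counter(a) == Counter(b)
    except TypeError:
        pass
    try:
        # Next: requires items that can be ordered against each other.
        return sorted(a) == sorted(b)
    except TypeError:
        pass
    # Last resort: needs only ==, at quadratic cost.
    return all(a.count(x) == b.count(x) for x in a)


hashable = same_multiset([1, 1, 2], [1, 2, 2])
orderable = same_multiset([[1], [1], [2]], [[2], [1], [1]])
equality_only = same_multiset([{"k": [1]}, 5, {"k": [1]}],
                              [5, {"k": [1]}, {"k": [1]}])
```

The first call goes through `Counter` and returns `False`; the second falls back to `sorted` and the third to `count`, and both return `True`.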
