Unordered collection for unhashable objects?

I have a dict where some of the values are not hashable. I need some way to compare two unordered groups of these values to make sure they contain equal elements. I cannot use lists, because list equality takes order into account, and sets will not work, because dicts are not hashable. I looked through the Python docs and the only thing that looks useful is a dict view, which is hashable in some circumstances. That does not help here either, because one of the values is an object that itself contains lists, which means the dict view will not be hashable.

Is there a standard container for situations like this, or should I just use lists, loop through every item in both lists, and check that an equal item exists somewhere in the other list?

+6
2 answers

When there are no duplicate entries, the usual options are:

  • If the elements are hashable: set(a) == set(b)

  • If the elements are orderable: sorted(a) == sorted(b)

  • If all you have is equality: len(a) == len(b) and all(x in b for x in a)
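As a rough illustration of the three options above, here is a minimal sketch using made-up sample data. The helper name same_elements_no_dups is hypothetical (it is not part of the answer), and the code assumes neither input contains duplicates.

```python
def same_elements_no_dups(a, b):
    """Order-insensitive comparison using only ==; assumes no duplicates."""
    return len(a) == len(b) and all(x in b for x in a)

# Hashable elements: sets work.
hashable_ok = set([1, 2, 3]) == set([3, 2, 1])

# Orderable elements: sorting works.
orderable_ok = sorted(["b", "a"]) == sorted(["a", "b"])

# Unhashable, unorderable elements (dicts holding lists): equality only.
a = [{"x": [1]}, {"y": [2]}]
b = [{"y": [2]}, {"x": [1]}]
c = [{"x": [1]}, {"z": [3]}]
equality_ok = same_elements_no_dups(a, b)
equality_mismatch = same_elements_no_dups(a, c)
```

The equality-only version does a linear search for every element, so it is quadratic in the list length; that is the price of needing neither hashing nor ordering.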

If you have duplicates and their multiplicity matters, the options are:

  • If the elements are hashable: Counter(a) == Counter(b)

  • If the elements are orderable: sorted(a) == sorted(b)

  • If all you have is equality: len(a) == len(b) and all(a.count(x) == b.count(x) for x in a)
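The multiplicity-aware options can be sketched the same way. The helper name same_multiset and the sample lists below are assumptions made for illustration only; Counter is collections.Counter from the standard library.

```python
from collections import Counter

def same_multiset(a, b):
    """Order-insensitive comparison that respects multiplicity, using only ==."""
    return len(a) == len(b) and all(a.count(x) == b.count(x) for x in a)

# Hashable elements: Counter compares multiplicities directly.
counter_equal = Counter("aab") == Counter("aba")
counter_differs = Counter("aab") != Counter("abb")

# Unhashable elements: count-based fallback (quadratic in the list length).
a = [[1], [1], {"k": 2}]
b = [{"k": 2}, [1], [1]]   # same elements, same multiplicities
c = [{"k": 2}, {"k": 2}, [1]]  # same length, different multiplicities
```

Here same_multiset(a, b) is True, while same_multiset(a, c) is False because [1] appears twice in a but only once in c.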

+11

I think the easiest way is to use lists.

    # list() is needed in Python 3, where values() returns a view, not a list
    group_1 = list(my_dict_1.values())
    group_2 = list(my_dict_2.values())

The comparison will not be as simple as it would be if order mattered or if the values were hashable, but the following should work:

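    def contain_the_same(group_1, group_2):
        # Work on a copy so the caller's second group is not modified.
        remaining = list(group_2)
        for item in group_1:
            if item not in remaining:
                return False
            # Remove one matching element so duplicates are counted correctly.
            remaining.remove(item)
        # Anything left over has no counterpart in group_1.
        return not remaining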

This should handle unhashable objects without any trouble:

    >>> contain_the_same([1,2,3], [1,2,3])
    True
    >>> contain_the_same([1,2,3], [1,2,3,4])
    False
    >>> contain_the_same([1,2,[3,2,1]], [1,2,[3,2,1]])
    True
    >>> contain_the_same([5,1,2,[3,2,1,[1]]], [1,[3,2,1,[1]],2,5])
    True

Caution: this returns False if an element is duplicated in one list but not in the other. It would need some modification if you want that case to count as a match.
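If duplicates should be ignored, one possible modification is to check containment in both directions and drop the bookkeeping altogether. This is only a sketch; the function name contain_the_same_ignoring_duplicates is hypothetical and not part of the original answer.

```python
def contain_the_same_ignoring_duplicates(group_1, group_2):
    """Set-like comparison using only ==; multiplicity is ignored."""
    # Every element of each group must have an equal element in the other.
    return (all(item in group_2 for item in group_1)
            and all(item in group_1 for item in group_2))

with_duplicates = contain_the_same_ignoring_duplicates([1, [2], [2]], [[2], 1])
extra_element = contain_the_same_ignoring_duplicates([1, 2], [1, 2, 3])
```

With this variant, [1, [2], [2]] and [[2], 1] compare as equal, while [1, 2] and [1, 2, 3] still do not.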

Edit: Even easier:

    sorted(my_dict_1.values()) == sorted(my_dict_2.values())

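This appears to be roughly twice as fast as my contain_the_same function, although it requires the values to be mutually orderable (in Python 3, sorting a list that mixes ints and lists, as in the timing below, raises TypeError):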

    >>> from timeit import timeit
    >>> timeit("contain_the_same([5,1,2,[3,2,1,[1]]], [1,[3,2,1,[1]],2,5])", "from __main__ import contain_the_same", number=10000)/10000
    8.868489032757054e-06
    >>> timeit("sorted([5,1,2,[3,2,1,[1]]]) == sorted([1,[3,2,1,[1]],2,5])", number=10000)/10000
    4.928951884845034e-06

It would not be as easy to extend to the case where duplicates are allowed, though.

+2

Source: https://habr.com/ru/post/902790/

