Removing duplicates in Python list by id

I build large lists of high-level objects while parsing a tree. After this step I have to remove duplicates from the list, and I found this step very slow in Python 2 (in Python 3 it was acceptable, but still a little slow). However, I know that distinct objects have distinct ids. For this reason, I got much faster code by following these steps:

  • add all objects to the list during parsing;
  • sort the list using the option key=id;
  • iterate over the sorted list and delete the item if the previous one has the same identifier.
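The steps above can be sketched roughly as follows. This is only an illustration of the sort-and-scan idea, not the asker's actual code; the function name `dedupe_by_id_sorted` and the use of plain `object()` instances are assumptions made for the example.

```python
def dedupe_by_id_sorted(items):
    """Remove identity duplicates; the original order is not preserved."""
    # Hypothetical helper: sorting by id places repeated references
    # to the same object next to each other.
    ordered = sorted(items, key=id)
    result = []
    for obj in ordered:
        # Keep the object only if it is not the one we just kept.
        if not result or result[-1] is not obj:
            result.append(obj)
    return result


x = object()
y = object()
unique = dedupe_by_id_sorted([x, y, x, y, x])
print(len(unique))  # 2
```

The sort makes this O(n log n), which is the cost a more direct approach would try to avoid.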

Thus, I have working code that now runs smoothly, but I wonder if I can accomplish this task more directly in Python.

Example. Let two objects be built with the same value but with different ids (I'll use fractions.Fraction so that the example relies only on the standard library):

from fractions import Fraction
a = Fraction(1,3)
b = Fraction(1,3)

Now, if I try to do what I want with the pythonic list(set(...)), I get the wrong result, because {a, b} keeps only one of the two values (which are equal but have different ids).
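The behaviour described here can be checked directly. A minimal sketch, reusing the names a and b from the example:

```python
from fractions import Fraction

a = Fraction(1, 3)
b = Fraction(1, 3)

print(a == b)            # True: the values are equal
print(a is b)            # False: they are two distinct objects
print(len(set([a, b])))  # 1: the set keeps only one of them
```

A set deduplicates by hash and equality, so two objects that are equal by value collapse into a single element regardless of their ids.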

Now my question is: what is the most pythonic, reliable, short and fast way to remove duplicates by id rather than by value? The order of the list does not matter if it has to be changed.

+4
2 answers

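Override the __eq__ method so that it compares objects by id rather than by value. Note that the objects must also be hashable, so you should define a suitable __hash__ method as well.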

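class My_obj:
    def __init__(self, val):
        self.val = val

    def __hash__(self):
        # Objects with equal values share a hash; __eq__ still tells them apart.
        return hash(self.val)

    def __eq__(self, arg):
        # Two objects are equal only if they are the very same object.
        return self is arg

    def __repr__(self):
        return str(self.val)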

Demo:

a = My_obj(5)
b = My_obj(5)

print({a, b})

Output:

{5, 5}
+3

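Note that removing duplicates by id may not behave as you expect, because Python can reuse a single object for equal immutable values, for example: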

a = "foo"
b = "foo"
print(a is b)

True

Still, if you need this (for whatever reason), you can use a dict keyed by id.

Example:

from fractions import Fraction
a = Fraction(1,3)
b = Fraction(1,3)

d = dict()

d[id(a)] = a
d[id(b)] = b

print(d.values())

Output:

dict_values([Fraction(1, 3), Fraction(1, 3)])
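The same idea can be wrapped in a small helper. This is a sketch under stated assumptions, not part of the original answer: the name `unique_by_id` is hypothetical, and the input list is assumed to keep every object alive, so no id is reused while the dict is built.

```python
from fractions import Fraction


def unique_by_id(items):
    """Return the distinct objects in items, compared by identity."""
    # Hypothetical helper. A repeated reference maps to the same key and
    # simply overwrites itself, so nothing is lost.
    return list({id(obj): obj for obj in items}.values())


a = Fraction(1, 3)
b = Fraction(1, 3)
result = unique_by_id([a, b, a, b])
print(len(result))  # 2
```

This runs in a single pass without sorting, and since dicts preserve insertion order in Python 3.7+, the result keeps the first-seen order of the objects.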
+2

Source: https://habr.com/ru/post/1661274/

