numpy.unique behaves oddly with numpy.array objects

This question is related to (but not the same as) "numpy.unique generates a list unique in what respect?"

Setup:

    import numpy as np
    from functools import total_ordering

    @total_ordering
    class UniqueObject(object):
        def __init__(self, a):
            self.a = a

        def __eq__(self, other):
            return self.a == other.a

        def __lt__(self, other):
            return self.a < other.a

        def __hash__(self):
            return hash(self.a)

        def __str__(self):
            return "UniqueObject({})".format(self.a)

        def __repr__(self):
            return self.__str__()

Expected behavior of np.unique:

    >>> np.unique([1, 1, 2, 2])
    array([1, 2])
    >>> np.unique(np.array([1, 1, 2, 2]))
    array([1, 2])
    >>> np.unique(map(UniqueObject, [1, 1, 2, 2]))
    array([UniqueObject(1), UniqueObject(2)], dtype=object)

All of the above works. But this does not behave as expected:

    >>> np.unique(np.array(map(UniqueObject, [1, 1, 2, 2])))
    array([UniqueObject(1), UniqueObject(1), UniqueObject(2), UniqueObject(2)], dtype=object)

Why is an np.array with dtype=object handled differently from a Python list of objects?

I.e.:

    objs = map(UniqueObject, [1, 1, 2, 2])
    np.unique(objs) != np.unique(np.array(objs))  # ?

I am running numpy 1.8.0.dev-74b08b3 and Python 2.7.3.

1 answer

Following the source of np.unique, it seems that the branch actually taken is

    else:
        ar.sort()
        flag = np.concatenate(([True], ar[1:] != ar[:-1]))
        return ar[flag]

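which simply sorts the items and then keeps those that are not equal to the previous one. But shouldn't that work? ...Oops, this one is on me. Your original code defined __ne__, and I accidentally removed it when I removed the comparisons that total_ordering now generates. In Python 2, != is not derived from __eq__, and total_ordering does not supply __ne__ either, so without it != falls back to the default comparison, which is based on object identity:

    >>> UniqueObject(1) == UniqueObject(1)
    True
    >>> UniqueObject(1) != UniqueObject(1)
    True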
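To see why the missing __ne__ matters, here is a rough pure-Python analogue of the branch quoted above (not NumPy's actual code). The names unique_by_sort and IdentityNe are made up for illustration. IdentityNe defines __ne__ as an identity check to mimic what Python 2 does when __ne__ is left out, so the sketch behaves the same way on Python 3.

```python
def unique_by_sort(items):
    """Sort, then keep each element that compares != to its predecessor."""
    ar = sorted(items)
    # The first element is always kept; the rest depend entirely on !=.
    flags = [True] + [cur != prev for prev, cur in zip(ar, ar[1:])]
    return [x for x, keep in zip(ar, flags) if keep]


class IdentityNe(object):
    """Hypothetical class: __eq__ compares values, but __ne__ compares
    identity, imitating Python 2 when __ne__ is not defined."""

    def __init__(self, a):
        self.a = a

    def __eq__(self, other):
        return self.a == other.a

    def __ne__(self, other):
        return self is not other

    def __lt__(self, other):
        return self.a < other.a

    def __hash__(self):
        return hash(self.a)


objs = [IdentityNe(v) for v in [1, 1, 2, 2]]
deduped = unique_by_sort(objs)
values = [o.a for o in deduped]  # no duplicates are dropped here
```

Plain integers are deduplicated as expected by unique_by_sort, while the IdentityNe objects all survive, because every neighbouring pair reports != as True even when == is also True.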

Putting __ne__ back:

    >>> UniqueObject(1) != UniqueObject(1)
    False
    >>> np.array(map(UniqueObject, [1,1,2,2]))
    array([UniqueObject(1), UniqueObject(1), UniqueObject(2), UniqueObject(2)], dtype=object)
    >>> np.unique(np.array(map(UniqueObject, [1,1,2,2])))
    array([UniqueObject(1), UniqueObject(2)], dtype=object)
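The restored method itself is not shown above, so the following is only an assumed version of the class with __ne__ back in place, written as the negation of __eq__. The original definition may have been worded differently.

```python
from functools import total_ordering


@total_ordering
class UniqueObject(object):
    def __init__(self, a):
        self.a = a

    def __eq__(self, other):
        return self.a == other.a

    def __ne__(self, other):
        # Assumed definition: Python 2 does not derive != from __eq__,
        # so it is spelled out explicitly.
        return not self.__eq__(other)

    def __lt__(self, other):
        return self.a < other.a

    def __hash__(self):
        return hash(self.a)

    def __repr__(self):
        return "UniqueObject({})".format(self.a)
```

With this in place, == and != agree with each other, which is what the sort-and-compare branch relies on. On Python 3 the explicit __ne__ is redundant, since != is derived from __eq__ by default, but it does no harm.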

Source: https://habr.com/ru/post/1479369/