NumPy: how to remove rows common to 2 matrices

The problem is simple: I have two 2D np.array objects, and I want a third array that contains only the rows of the first that do not appear in the second.

For example:

X = np.array([[0,1],[1,2],[4,5],[5,6],[8,9],[9,10]])
Y = np.array([[5,6],[9,10]])

Z = function(X, Y)
# expected result:
# array([[0, 1],
#        [1, 2],
#        [4, 5],
#        [8, 9]])

I tried np.delete(X,Y,axis=0), but it does not work ...
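
For context, np.delete treats its second argument as the indices to remove along the given axis, not as values to match, which is why passing Y directly does not give the wanted result. A minimal sketch of how it could still be used, assuming the matching row indices are computed first by broadcasting (the names `matches` and `common_idx` are only illustrative):

```python
import numpy as np

X = np.array([[0, 1], [1, 2], [4, 5], [5, 6], [8, 9], [9, 10]])
Y = np.array([[5, 6], [9, 10]])

# Compare every row of X with every row of Y: shape (len(X), len(Y), 2).
# A row of X matches a row of Y only if all of its columns are equal.
matches = (X[:, None, :] == Y[None, :, :]).all(axis=2)

# Indices of the rows of X that appear somewhere in Y.
common_idx = np.flatnonzero(matches.any(axis=1))

# np.delete removes rows by index along axis 0.
Z = np.delete(X, common_idx, axis=0)
print(Z)
```

The intermediate comparison array has len(X) * len(Y) * ncols elements, so this is probably best suited to small or medium inputs.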

4 answers
# Keep the rows of X that do not match any full row of Y
Z = np.vstack([row for row in X if not (Y == row).all(axis=1).any()])
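
A note on row-by-row filters like the one above: for NumPy arrays, `row in Y` is evaluated as `(Y == row).any()`, so it is true as soon as a single element lines up, not only when a whole row matches. The sketch below shows the difference on a made-up row and wraps the stricter test in a hypothetical helper, `remove_rows`, which keeps the original order and any duplicate rows of X:

```python
import numpy as np

X = np.array([[0, 1], [1, 2], [4, 5], [5, 6], [8, 9], [9, 10]])
Y = np.array([[5, 6], [9, 10]])

# Hypothetical row that is NOT a row of Y but shares one element with it.
row = np.array([5, 0])

loose = row in Y                               # element-wise test, then any()
strict = bool((Y == row).all(axis=1).any())    # whole-row test

def remove_rows(X, Y):
    """Return the rows of X that are not equal to any full row of Y."""
    keep = [not (Y == r).all(axis=1).any() for r in X]
    return X[np.array(keep, dtype=bool)]

Z = remove_rows(X, Y)
print(loose, strict)
print(Z)
```

Here `loose` comes out true while `strict` is false, which is the case where a plain `row not in Y` filter would drop a row it should keep.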

The numpy_indexed package (disclaimer: I am its author) extends NumPy's standard set operations to multidimensional use cases such as this one, with good efficiency:

import numpy_indexed as npi
Z = npi.difference(X, Y)

Using views:

# Based on http://stackoverflow.com/a/41417343/3293881 by @Eric
def setdiff2d(a, b):
    # check that casting to void will create equal size elements
    assert a.shape[1:] == b.shape[1:]
    assert a.dtype == b.dtype

    # compute dtypes
    void_dt = np.dtype((np.void, a.dtype.itemsize * np.prod(a.shape[1:])))
    orig_dt = np.dtype((a.dtype, a.shape[1:]))

    # convert to 1d void arrays
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    a_void = a.reshape(a.shape[0], -1).view(void_dt)
    b_void = b.reshape(b.shape[0], -1).view(void_dt)

    # Return the unique rows of a that are not in b
    return np.setdiff1d(a_void, b_void).view(orig_dt)

Sample run:

In [81]: X
Out[81]: 
array([[ 0,  1],
       [ 1,  2],
       [ 4,  5],
       [ 5,  6],
       [ 8,  9],
       [ 9, 10]])

In [82]: Y
Out[82]: 
array([[ 5,  6],
       [ 9, 10]])

In [83]: setdiff2d(X,Y)
Out[83]: 
array([[0, 1],
       [1, 2],
       [4, 5],
       [8, 9]])
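
One property worth knowing about this approach: it relies on np.setdiff1d, which returns the sorted, unique values of its first argument that are not in the second. The result can therefore lose the original row order and collapse duplicate rows. A small 1D illustration with made-up values:

```python
import numpy as np

a = np.array([9, 3, 3, 7, 1])   # unsorted, with a duplicate
b = np.array([7])

# setdiff1d drops 7, removes the duplicate 3, and sorts what is left.
result = np.setdiff1d(a, b)
print(result)  # [1 3 9]
```

If the order or the multiplicity of rows matters, a mask-based filter on X is likely the safer choice.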
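
Another option is to convert the rows to tuples and take a Python set difference (the result is sorted and duplicate rows are dropped):

Z = np.array(sorted(set(map(tuple, X)) - set(map(tuple, Y))))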

Source: https://habr.com/ru/post/1016711/

