Multiple axis boolean masking with numpy

I want to apply boolean masking to both rows and columns.

With

X = np.array([[1,2,3],[4,5,6]])
mask1 = np.array([True, True])
mask2 = np.array([True, True, False])
X[mask1, mask2]

I expect the result to be

array([[1,2],[4,5]])

instead of

array([1,5])

I know that

X[:, mask2]

works here, but this is not a solution for the general case.

I would like to know how this works under the hood and why the result in this case is array([1,5]).

+4
3 answers

X[mask1, mask2] is described in the Boolean Array Indexing documentation as equivalent to

In [249]: X[mask1.nonzero()[0], mask2.nonzero()[0]]
Out[249]: array([1, 5])
In [250]: X[[0,1], [0,1]]
Out[250]: array([1, 5])

Essentially, it gives you X[0,0] and X[1,1] (pairing up the 0s and the 1s).
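The pairing can be spelled out by hand. This is only a sketch of the behaviour described above, not NumPy's actual implementation; the names rows, cols and paired are illustrative.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])
mask1 = np.array([True, True])
mask2 = np.array([True, True, False])

# Each boolean mask is first turned into the integer positions of its True entries.
rows = mask1.nonzero()[0]
cols = mask2.nonzero()[0]

# The two index arrays are then walked in lockstep: one element per (row, col) pair.
paired = np.array([X[r, c] for r, c in zip(rows, cols)])

# Same values as the fancy-indexing expression from the question.
assert np.array_equal(paired, X[mask1, mask2])
```

With masks that select, say, two rows and three columns, the index arrays cannot be paired up this way and NumPy raises an IndexError.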

Instead, you want:

In [251]: X[[[0],[1]], [0,1]]
Out[251]: 
array([[1, 2],
       [4, 5]])

np.ix_ is a handy tool for creating the right combination of dimensions:

In [258]: np.ix_([0,1],[0,1])
Out[258]: 
(array([[0],
        [1]]), array([[0, 1]]))
In [259]: X[np.ix_([0,1],[0,1])]
Out[259]: 
array([[1, 2],
       [4, 5]])
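To see why the (2,1) and (1,2) shapes matter, the broadcasting step can be made explicit. A minimal sketch; grid_rows and grid_cols are illustrative names, and np.broadcast_arrays is used here only to show the expanded index grids.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])
rows, cols = np.ix_([0, 1], [0, 1])

# rows is a column vector, cols is a row vector.
assert rows.shape == (2, 1) and cols.shape == (1, 2)

# Broadcasting them against each other yields the full 2x2 grid of index pairs:
# grid_rows is [[0, 0], [1, 1]] and grid_cols is [[0, 1], [0, 1]].
grid_rows, grid_cols = np.broadcast_arrays(rows, cols)

# Indexing with the expanded grids picks one element per (row, col) combination.
sub = X[grid_rows, grid_cols]
```

Here sub has the same shape as the broadcast index arrays, which is what turns the paired lookup into a rectangular block.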

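This is effectively a column vector for the first axis and a row vector for the second, which together define the desired rectangle of values.

Broadcasting the boolean masks themselves in the same way does not work: X[mask1[:,None], mask2] raises an IndexError.

The same section of the documentation recommends thinking of multiple boolean index arrays in terms of obj.nonzero(), and notes that ix_ also supports boolean arrays: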
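A quick check of what happens when a reshaped boolean mask is combined with a second mask, and of the integer-index workaround. This is a sketch; the names raised and sub are illustrative.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])
mask1 = np.array([True, True])
mask2 = np.array([True, True, False])

# mask1[:, None] is a 2-D boolean array, so it already accounts for two axes of X;
# adding mask2 on top of it leaves NumPy with more indices than X has axes.
try:
    X[mask1[:, None], mask2]
    raised = False
except IndexError:
    raised = True

# Converting the masks to integer positions first makes broadcasting possible.
sub = X[mask1.nonzero()[0][:, None], mask2.nonzero()[0]]
```

The second expression is roughly what np.ix_ does when it is handed boolean arrays.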

In [260]: X[np.ix_(mask1, mask2)]
Out[260]: 
array([[1, 2],
       [4, 5]])
In [261]: np.ix_(mask1, mask2)
Out[261]: 
(array([[0],
        [1]], dtype=int32), array([[0, 1]], dtype=int32))

The part of ix_ that handles boolean arrays:

    if issubdtype(new.dtype, _nx.bool_):
        new, = new.nonzero()

This means it also works with a mix of boolean and integer indices, such as X[np.ix_(mask1, [0,2])].
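For instance, mixing a boolean row mask with explicit column positions might look like this (a small sketch using the arrays from the question):

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])
mask1 = np.array([True, True])

# Boolean mask on the rows, explicit integer positions on the columns.
sub = X[np.ix_(mask1, [0, 2])]
```

Both rows are kept, and only the first and third columns are selected.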

+3

Alternatively, you can index the rows and then the columns with np.where:

>>> X[np.where(mask1)[0]][:, np.where(mask2)[0]]
array([[1, 2],
       [4, 5]])

As @user2357112 suggested, you can also use np.ix_. For example:

>>> X[np.ix_(np.where(mask1)[0], np.where(mask2)[0])]
array([[1, 2],
       [4, 5]])

Another option, which needs one extra reshape, is to combine the two masks into a single 2-D mask:

>>> X[np.where(mask1[:, None] * mask2)]
array([1, 2, 4, 5])

>>> X[np.where(mask1[:, None] * mask2)].reshape(2, 2)
array([[1, 2],
       [4, 5]])
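For the general case, the reshape does not have to be hard-coded: the block shape follows from the number of True entries in each mask. A minimal sketch, where select_rows_cols is a hypothetical helper name and not part of NumPy:

```python
import numpy as np

def select_rows_cols(a, row_mask, col_mask):
    """Hypothetical helper: keep rows where row_mask is True and columns where col_mask is True."""
    row_mask = np.asarray(row_mask, dtype=bool)
    col_mask = np.asarray(col_mask, dtype=bool)
    # Outer AND of the two masks gives a 2-D mask with the same shape as `a`.
    mask2d = row_mask[:, None] & col_mask
    # Boolean indexing returns a flat array in row-major order, so restore
    # the block shape from the number of selected rows and columns.
    return a[mask2d].reshape(row_mask.sum(), col_mask.sum())

X = np.array([[1, 2, 3], [4, 5, 6]])
out = select_rows_cols(X, [False, True], [True, False, True])
```

This should give the same result as X[np.ix_(row_mask, col_mask)]; np.ix_ is usually the shorter way to write it.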
+1

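You can also use the numpy.ma module, which hides the unwanted rows and columns instead of removing them. mask_rowcols is not needed for this (it masks every row and every column that contains a masked value); assigning ma.masked is enough:

import numpy as np
import numpy.ma as ma

X = np.array([[1,2,3],[4,5,6]])
linesmask = np.array([True, True])
colsmask = np.array([True, True, False])

X = ma.masked_array(X)
# Assigning ma.masked hides elements, so select the rows/columns to drop.
X[~linesmask, :] = ma.masked
X[:, ~colsmask] = ma.masked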
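A masked array keeps the original (2, 3) shape, so if a plain sub-array is needed afterwards it has to be extracted. A sketch, assuming the mask is built from the outer AND of the two selection masks; the names hidden, Xm and sub are illustrative.

```python
import numpy as np
import numpy.ma as ma

X = np.array([[1, 2, 3], [4, 5, 6]])
linesmask = np.array([True, True])
colsmask = np.array([True, True, False])

# numpy.ma uses True for "hidden", the opposite of a selection mask.
hidden = ~(linesmask[:, None] & colsmask)
Xm = ma.masked_array(X, mask=hidden)

# Reductions skip the hidden elements.
visible_total = Xm.sum()

# compressed() returns the visible values as a flat array; reshape restores the block.
sub = Xm.compressed().reshape(linesmask.sum(), colsmask.sum())
```

If only the sub-array is wanted, np.ix_ gets there directly; the masked-array route is mainly useful when the original shape should be preserved.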
-2

Source: https://habr.com/ru/post/1670128/

