Masking an Array Based on Index Values

Question

How can I mask an array based on the actual index values?

That is, I have a 10 x 10 x 30 matrix, and I want to mask the array wherever the first and second indices are equal to each other.

For example, [1, 1, :] must be masked because 1 and 1 are equal, but [1, 2, :] should not be, because 1 and 2 are not.

I only include the third dimension because it resembles my current problem and may complicate things. But my main question is: how do I mask arrays based on index values?

+6

python arrays numpy

aleph4 Sep 17 '13 at 10:03

2 answers

In your special case, which requires masking the diagonal, you can use the np.identity() function, which returns an array with ones on the diagonal. Since you have a third dimension, we must add this third dimension to the identity matrix:

 m.mask = np.identity(10)[...,None]*np.ones((1,1,30))

There may be a better way to build this array, but it basically stacks 30 copies of the np.identity(10) array. For example, this is equivalent to:

 np.dstack((np.identity(10),)*30)

but np.dstack is slower:

 In [30]: timeit np.identity(10)[...,None]*np.ones((1,1,30))
 10000 loops, best of 3: 40.7 µs per loop

 In [31]: timeit np.dstack((np.identity(10),)*30)
 1000 loops, best of 3: 219 µs per loop

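And @Ophion's suggestions (note that np.repeat as timed here has no axis argument, so it returns a flattened array; pass axis=2 to get the 10 x 10 x 30 shape):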

 In [33]: timeit np.tile(np.identity(10)[...,None], 30)
 10000 loops, best of 3: 63.2 µs per loop

 In [71]: timeit np.repeat(np.identity(10)[...,None], 30)
 10000 loops, best of 3: 45.3 µs per loop
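To check that these constructions really produce the same mask, here is a minimal sketch. The `data` array and the variable names are placeholders that are not part of the original answer, and `np.repeat` is given `axis=2` so that it keeps the three-dimensional shape.

```python
import numpy as np

# Hypothetical 10 x 10 x 30 data array standing in for the asker's matrix.
data = np.arange(10 * 10 * 30, dtype=float).reshape(10, 10, 30)

# Identity matrix with a trailing axis of length 1: shape (10, 10, 1).
eye = np.identity(10)[..., None]

# Four ways to stack 30 copies of the identity along the third axis.
mask_mul = eye * np.ones((1, 1, 30))
mask_dstack = np.dstack((np.identity(10),) * 30)
mask_tile = np.tile(eye, 30)
mask_repeat = np.repeat(eye, 30, axis=2)  # axis=2 keeps the 3-D shape

# Build the masked array; nonzero entries of the mask hide the data.
m = np.ma.array(data, mask=mask_mul.astype(bool))
```

With this mask, `m[1, 1, :]` is fully masked while `m[1, 2, :]` is left untouched.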

0

askewchan Sep 17 '13 at 22:12

askewchan · Accepted Answer · 2013-09-17T22:28:42+0000

In general, you can use np.meshgrid to get arrays holding the index values:

 i, j, k = np.meshgrid(*map(np.arange, m.shape), indexing='ij')
 m.mask = (i == j)

The advantage of this method is that it works for arbitrary Boolean functions of i, j and k. It is a bit slower than the special case using identity.

 In [56]: %%timeit
    ....: i, j, k = np.meshgrid(*map(np.arange, m.shape), indexing='ij')
    ....: i == j
 10000 loops, best of 3: 96.8 µs per loop
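As an illustration of the "arbitrary Boolean function" point, the sketch below builds the diagonal mask from the question and a second, made-up condition. The zero-filled `data` array and `custom_mask` are assumptions chosen only for demonstration.

```python
import numpy as np

# Placeholder data with the shape from the question.
data = np.zeros((10, 10, 30))
m = np.ma.array(data)

# Full index grids; each has the same shape as the data.
i, j, k = np.meshgrid(*map(np.arange, m.shape), indexing='ij')

# Diagonal mask from the question: first index equals second index.
diag_mask = (i == j)

# Hypothetical second condition: below the diagonal and even third index.
custom_mask = (i > j) & (k % 2 == 0)

m.mask = diag_mask
```

Any expression that combines `i`, `j` and `k` elementwise and yields a Boolean array of the same shape can be assigned to `m.mask` in the same way.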

As @Jaime points out, meshgrid supports a sparse parameter, which avoids much of the duplication but in some cases requires a bit more care, since the returned arrays are not broadcast to the full shape. This saves memory and speeds things up a little. For instance,

 In [77]: x = np.arange(5)

 In [78]: np.meshgrid(x, x)
 Out[78]:
 [array([[0, 1, 2, 3, 4],
         [0, 1, 2, 3, 4],
         [0, 1, 2, 3, 4],
         [0, 1, 2, 3, 4],
         [0, 1, 2, 3, 4]]),
  array([[0, 0, 0, 0, 0],
         [1, 1, 1, 1, 1],
         [2, 2, 2, 2, 2],
         [3, 3, 3, 3, 3],
         [4, 4, 4, 4, 4]])]

 In [79]: np.meshgrid(x, x, sparse=True)
 Out[79]:
 [array([[0, 1, 2, 3, 4]]),
  array([[0],
         [1],
         [2],
         [3],
         [4]])]

So, you can use the sparse version as he suggests, but you must force the broadcasting, like so:

 i, j, k = np.meshgrid(*map(np.arange, m.shape), indexing='ij', sparse=True)
 m.mask = np.repeat(i==j, k.size, axis=2)

And the speedup:

 In [84]: %%timeit
    ....: i, j, k = np.meshgrid(*map(np.arange, m.shape), indexing='ij', sparse=True)
    ....: np.repeat(i==j, k.size, axis=2)
 10000 loops, best of 3: 73.9 µs per loop
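For a closer look at what the sparse grids contain, here is a small sketch under assumed names (`shape`, `data`, `diag`). It swaps in `np.broadcast_to` as one possible alternative to `np.repeat` for expanding the comparison to the full shape; that substitution is not from the original answer.

```python
import numpy as np

shape = (10, 10, 30)
data = np.zeros(shape)  # placeholder data

# Sparse grids: each array varies along one axis only.
# Shapes are (10, 1, 1), (1, 10, 1) and (1, 1, 30).
i, j, k = np.meshgrid(*map(np.arange, shape), indexing='ij', sparse=True)

# Broadcasting the comparison gives shape (10, 10, 1), not the full shape.
diag = (i == j)

# Expand to the full shape; copy() turns the read-only view into a
# regular writable array before it is used as a mask.
mask = np.broadcast_to(diag, shape).copy()

m = np.ma.array(data, mask=mask)
```

The result matches `np.repeat(i == j, k.size, axis=2)`; which variant is faster probably depends on the array size and NumPy version, so it is worth timing both.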
