Replace numpy matrix elements with submatrices

Given that I have a square index matrix, such as:

idxs = np.array([[1, 1], [0, 1]]) 

and an array of square matrices of the same size as one other (not necessarily the same size as idxs ):

 mats = array([[[ 0. , 0. ], [ 0. , 0.5]], [[ 1. , 0.3], [ 1. , 1. ]]]) 

I would like to replace each index in idxs with the corresponding matrix in mats to get:

 array([[ 1. , 0.3, 1. , 0.3], [ 1. , 1. , 1. , 1. ], [ 0. , 0. , 1. , 0.3], [ 0. , 0.5, 1. , 1. ]]) 

mats[idxs] gives me a nested version:

 array([[[[ 1. , 0.3], [ 1. , 1. ]], [[ 1. , 0.3], [ 1. , 1. ]]], [[[ 0. , 0. ], [ 0. , 0.5]], [[ 1. , 0.3], [ 1. , 1. ]]]]) 

and so I tried using reshape , but "in vain! mats[idxs].reshape(4,4) returns:

 array([[ 1. , 0.3, 1. , 1. ], [ 1. , 0.3, 1. , 1. ], [ 0. , 0. , 0. , 0.5], [ 1. , 0.3, 1. , 1. ]]) 

If this helps, I found that skimage.util.view_as_blocks is the exact inverse of what I need (it can convert my desired result into a nested form mats[idxs] ).

Is there a (hopefully, very) quick way to do this? For the application, my mats will still have only a few small matrices, but my idxs will be a square matrix of order 2 ^ 15, in which case I will replace more than a million indexes to create a new matrix of order 2 ^ 16.

Thank you very much for your help!

+2
source share
1 answer

We index on the first axis of the input array with these indices. To get 2D output, we just need to rearrange the axes and then change them. So the approach would be with np.transpose / np.swapaxes and np.reshape , so -

 mats[idxs].swapaxes(1,2).reshape(-1,mats.shape[-1]*idxs.shape[-1]) 

Run Example -

 In [83]: mats Out[83]: array([[[1, 1], [7, 1]], [[6, 6], [5, 8]], [[7, 1], [6, 0]], [[2, 7], [0, 4]]]) In [84]: idxs Out[84]: array([[2, 3], [0, 3], [1, 2]]) In [85]: mats[idxs].swapaxes(1,2).reshape(-1,mats.shape[-1]*idxs.shape[-1]) Out[85]: array([[7, 1, 2, 7], [6, 0, 0, 4], [1, 1, 2, 7], [7, 1, 0, 4], [6, 6, 7, 1], [5, 8, 6, 0]]) 

Performance improvement with np.take for repeated indexes

With repeating metrics, for performance, we better use np.take by indexing on axis=0 . Let both of these approaches and the times when idxs have many duplicate indexes be listed.

Function Definitions -

 def simply_indexing_based(mats, idxs): ncols = mats.shape[-1]*idxs.shape[-1] return mats[idxs].swapaxes(1,2).reshape(-1,ncols) def take_based(mats, idxs):np.take(mats,idxs,axis=0) ncols = mats.shape[-1]*idxs.shape[-1] return np.take(mats,idxs,axis=0).swapaxes(1,2).reshape(-1,ncols) 

Runtime Test -

 In [156]: mats = np.random.randint(0,9,(10,2,2)) In [157]: idxs = np.random.randint(0,10,(1000,1000)) # This ensures many repeated indices In [158]: out1 = simply_indexing_based(mats, idxs) In [159]: out2 = take_based(mats, idxs) In [160]: np.allclose(out1, out2) Out[160]: True In [161]: %timeit simply_indexing_based(mats, idxs) 10 loops, best of 3: 41.2 ms per loop In [162]: %timeit take_based(mats, idxs) 10 loops, best of 3: 27.3 ms per loop 

Thus, we see a general improvement of 1.5x+ .

To get an idea of ​​the improvement with np.take , let time be only part of the indexing -

 In [168]: %timeit mats[idxs] 10 loops, best of 3: 22.8 ms per loop In [169]: %timeit np.take(mats,idxs,axis=0) 100 loops, best of 3: 8.88 ms per loop 

For this data its 2.5x+ . Not bad!

+3
source

Source: https://habr.com/ru/post/1014439/


All Articles