Numpy repeat for 2d array

Given two arrays, let's say

 arr = array([10, 24, 24, 24, 1, 21, 1, 21, 0, 0], dtype=int32)
 rep = array([3, 2, 2, 0, 0, 0, 0, 0, 0, 0], dtype=int32)

np.repeat(arr, rep) returns

 array([10, 10, 10, 24, 24, 24, 24], dtype=int32) 

Is there a way to replicate this functionality for a set of 2D arrays?

For example, given

 arr = array([[10, 24, 24, 24, 1, 21, 1, 21, 0, 0],
              [10, 24, 24, 1, 21, 1, 21, 32, 0, 0]], dtype=int32)
 rep = array([[3, 2, 2, 0, 0, 0, 0, 0, 0, 0],
              [2, 2, 2, 0, 0, 0, 0, 0, 0, 0]], dtype=int32)

Is it possible to write a vectorized function that does this?

PS: The total number of repetitions does not have to be the same in each row. I pad each row of the result so that all rows have the same size.

 import numpy as np

 def repeat2d(arr, rep):
     # Find the maximum total number of repetitions over all rows.
     max_len = rep.sum(axis=-1).max()
     # Create a common array to hold all results. Since the repeated rows
     # have different lengths, the shorter ones are padded with zeros.
     ret_val = np.zeros((arr.shape[0], max_len))
     for i in range(arr.shape[0]):
         # The repeated row may have fewer columns than ret_val.
         temp = np.repeat(arr[i], rep[i])
         ret_val[i, :temp.size] = temp
     return ret_val

I know about np.vectorize, and I know that it does not give any performance advantage over the plain loop version.

2 answers

So you have a different repetition array for each row, but the total number of repetitions per row is the same?

Just call repeat on the flattened arrays and reshape the result back to the correct number of rows.

 In [529]: np.repeat(arr,rep.flat)
 Out[529]: array([10, 10, 10, 24, 24, 24, 24, 10, 10, 24, 24, 24, 24, 1])

 In [530]: np.repeat(arr,rep.flat).reshape(2,-1)
 Out[530]:
 array([[10, 10, 10, 24, 24, 24, 24],
        [10, 10, 24, 24, 24, 24, 1]])
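The reshape trick above only gives correct rows when every row of rep has the same total. If the totals differ but their sum still divides evenly by the number of rows, the reshape succeeds and silently mixes values between rows. A minimal sketch of a guarded version follows. The function name `repeat2d_equal` is hypothetical, it assumes at least one row, and the second row of `rep` is an assumption chosen so that both rows total 7, which is what the printed output above implies.

```python
import numpy as np

def repeat2d_equal(arr, rep):
    """Repeat each row of arr by the matching row of rep (hypothetical helper).

    Only valid when every row of rep has the same sum.
    """
    lens = rep.sum(axis=1)
    if not (lens == lens[0]).all():
        raise ValueError("row totals differ; reshape would mix values between rows")
    # Repeat over the flattened arrays, then split back into rows.
    return np.repeat(arr.ravel(), rep.ravel()).reshape(arr.shape[0], -1)

arr = np.array([[10, 24, 24, 24, 1, 21, 1, 21, 0, 0],
                [10, 24, 24, 1, 21, 1, 21, 32, 0, 0]], dtype=np.int32)
# Assumed rep: both rows sum to 7.
rep = np.array([[3, 2, 2, 0, 0, 0, 0, 0, 0, 0],
                [2, 2, 2, 1, 0, 0, 0, 0, 0, 0]], dtype=np.int32)

out = repeat2d_equal(arr, rep)
print(out)
```

Raising an error is one possible design choice here; falling back to a padded result is another.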

If the number of repetitions varies from row to row, we face the problem of padding rows of variable length. This has come up in other questions. I don't remember all the details, but I think the solution goes along these lines:

Modify rep so the numbers are different:

 In [547]: rep
 Out[547]:
 array([[3, 2, 2, 0, 0, 0, 0, 0, 0, 0],
        [2, 2, 2, 1, 0, 2, 0, 0, 0, 0]])

 In [548]: lens=rep.sum(axis=1)

 In [549]: lens
 Out[549]: array([7, 9])

 In [550]: m=np.max(lens)

 In [551]: m
 Out[551]: 9

Create the target array:

 In [552]: res = np.zeros((arr.shape[0],m),arr.dtype) 

Create an indexing array (the details still need to be worked out for the general case):

 In [553]: idx=np.r_[0:7,m:m+9]

 In [554]: idx
 Out[554]: array([ 0, 1, 2, 3, 4, 5, 6, 9, 10, 11, 12, 13, 14, 15, 16, 17])

flat indexed assignment:

 In [555]: res.flat[idx]=np.repeat(arr,rep.flat)

 In [556]: res
 Out[556]:
 array([[10, 10, 10, 24, 24, 24, 24, 0, 0],
        [10, 10, 24, 24, 24, 24, 1, 1, 1]])
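The indexing array in this walk-through was written by hand for two rows. One way it could be built for any number of rows is sketched below. The helper name `padded_flat_index` is hypothetical (it is not a NumPy function), and the sketch assumes rep is a 2D array of non-negative integers.

```python
import numpy as np

def padded_flat_index(lens, width):
    """Flat indices of the first lens[i] slots of each row in an (n, width) array."""
    lens = np.asarray(lens)
    row_starts = np.arange(lens.size) * width   # where each row starts in the padded array
    packed_starts = np.cumsum(lens) - lens      # where each row starts in the packed values
    # Shift each packed position by the gap between its padded and packed row start.
    return np.repeat(row_starts - packed_starts, lens) + np.arange(lens.sum())

arr = np.array([[10, 24, 24, 24, 1, 21, 1, 21, 0, 0],
                [10, 24, 24, 1, 21, 1, 21, 32, 0, 0]], dtype=np.int32)
rep = np.array([[3, 2, 2, 0, 0, 0, 0, 0, 0, 0],
                [2, 2, 2, 1, 0, 2, 0, 0, 0, 0]], dtype=np.int32)

lens = rep.sum(axis=1)
m = lens.max()
idx = padded_flat_index(lens, m)

res = np.zeros((arr.shape[0], m), arr.dtype)
res.flat[idx] = np.repeat(arr.ravel(), rep.ravel())
print(idx)
print(res)
```

For the two-row example this yields the same indices as np.r_[0:7,m:m+9].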

Another solution similar to @hpaulj's solution:

 def repeat2dvect(arr, rep):
     lens = rep.sum(axis=-1)
     maxlen = lens.max()
     ret_val = np.zeros((arr.shape[0], maxlen))
     mask = (lens[:,None]>np.arange(maxlen))
     ret_val[mask] = np.repeat(arr.ravel(), rep.ravel())
     return ret_val

Instead of storing indexes, I create a bool mask and use the mask to set values.
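The mask approach works because boolean-mask assignment fills the selected positions in row-major order, which is the same order in which np.repeat emits values from the raveled arrays. A small sketch that checks the masked version against the row-by-row loop on random input follows. The function names `repeat2d_loop` and `repeat2d_mask` are illustrative, and unlike the answer's code both use arr.dtype for the result so the two outputs can be compared directly.

```python
import numpy as np

def repeat2d_loop(arr, rep):
    # Reference version: repeat row by row, zero-padding on the right.
    lens = rep.sum(axis=-1)
    out = np.zeros((arr.shape[0], lens.max()), dtype=arr.dtype)
    for i in range(arr.shape[0]):
        row = np.repeat(arr[i], rep[i])
        out[i, :row.size] = row
    return out

def repeat2d_mask(arr, rep):
    # Vectorized version: True marks the slots that receive repeated values.
    lens = rep.sum(axis=-1)
    out = np.zeros((arr.shape[0], lens.max()), dtype=arr.dtype)
    mask = lens[:, None] > np.arange(lens.max())
    out[mask] = np.repeat(arr.ravel(), rep.ravel())
    return out

rng = np.random.default_rng(0)
arr = rng.integers(0, 50, size=(5, 10))
rep = rng.integers(0, 4, size=(5, 10))

same = np.array_equal(repeat2d_loop(arr, rep), repeat2d_mask(arr, rep))
print(same)
```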


Source: https://habr.com/ru/post/1258271/

