Numpy array size versus concatenation speed

I concatenate the data into a numpy array as follows:

xdata_test = np.concatenate((xdata_test,additional_X))

This is done about a thousand times. The arrays have dtype float32, and their shapes are as follows:

xdata_test.shape   :  (x1,40,24,24)        (x1 : [500~10500])   
additional_X.shape :  (x2,40,24,24)        (x2 : [0 ~ 500])

The problem is that once x1 grows beyond roughly 2000-3000, each concatenation takes noticeably longer.

The graph below shows the concatenation time depending on size x2:

[plot: x2 vs. time usage]

Is this a memory issue or a basic numpy feature?
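
For reference, a minimal sketch of the loop described above (random float32 data standing in for the real xdata_test / additional_X, and smaller counts so it finishes quickly) is enough to reproduce the growing per-call cost:

import time
import numpy as np

# hypothetical stand-in data; the real chunks come from elsewhere
xdata_test = np.zeros((500, 40, 24, 24), dtype=np.float32)

for i in range(30):
    additional_X = np.random.rand(100, 40, 24, 24).astype(np.float32)
    t0 = time.perf_counter()
    xdata_test = np.concatenate((xdata_test, additional_X))
    print(f"x1={xdata_test.shape[0]:5d}  concat: {time.perf_counter() - t0:.4f} s")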

+4
2 answers

numpy's stack and concatenate cannot grow an array in place: a numpy array is a single contiguous block of memory, so every call allocates a brand-new result array and copies both inputs into it (see the numpy documentation on ndarray memory layout).
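
A quick check (nothing specific to this question) shows that the result of a concatenation shares no memory with either input, i.e. both were copied in full:

import numpy as np

a = np.zeros((1000, 40, 24, 24), dtype=np.float32)
b = np.zeros((10, 40, 24, 24), dtype=np.float32)

c = np.concatenate((a, b))
print(np.shares_memory(c, a), np.shares_memory(c, b))  # False False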

As xdata_test grows, each of those copies gets more expensive, which is exactly the slowdown you are seeing. Instead of concatenating on every iteration, collect the chunks in a Python list and concatenate once at the end:

l = []
for additional_X in ...:
    l.append(additional_X)
xdata_test = np.concatenate(l)

That way the data is copied only once, when the final array is built.

NB: keeping all the chunks in the list means they must fit in memory at the same time, in addition to the final concatenated array.
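
Put together for the shapes in the question (the chunk source here is a random stand-in; substitute however additional_X is actually produced):

import numpy as np

chunks = []
for _ in range(200):                      # ~1000 iterations in the real code
    additional_X = np.random.rand(10, 40, 24, 24).astype(np.float32)
    chunks.append(additional_X)           # O(1): only a reference is stored

xdata_test = np.concatenate(chunks)       # one allocation, one copy
print(xdata_test.shape)                   # (2000, 40, 24, 24)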

+6

The repeated copying is the cause: every concatenation allocates a new array and copies everything accumulated so far, so the cost grows with the size of xdata_test. If you know (or can compute) the final size in advance, you can avoid this entirely by preallocating the result and filling it in place.

  • First, work out the total number of rows across all chunks:

    max_x = 0
    for arr in list_of_arrays:
        max_x += arr.shape[0]
    
  • Then preallocate an uninitialized array of the final shape, once:

    final_data = np.empty((max_x,) + xdata_test.shape[1:], dtype=xdata_test.dtype)
    

    This gives you an array of shape (max_x, 40, 24, 24) without initializing or copying any data.

  • Finally, copy each chunk into its slot of the preallocated array using numpy slicing:

    curr_x = 0
    for arr in list_of_arrays:
        final_data[curr_x:curr_x+arr.shape[0]] = arr
        curr_x += arr.shape[0]
    

This way, every element is written exactly once; there is no repeated allocation or copying.

With repeated concatenation, the data you already have is copied again on every iteration, so after N iterations the earliest chunks have been copied on the order of N times; with preallocation each chunk is copied exactly once.
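
Combined into one self-contained sketch (list_of_arrays is a hypothetical stand-in for wherever the chunks come from; shapes follow the question):

import numpy as np

# hypothetical chunk source: variable-length chunks, as in the question
list_of_arrays = [np.random.rand(np.random.randint(1, 50), 40, 24, 24).astype(np.float32)
                  for _ in range(200)]

# 1) total number of rows
max_x = sum(arr.shape[0] for arr in list_of_arrays)

# 2) allocate the destination once, uninitialized
final_data = np.empty((max_x, 40, 24, 24), dtype=np.float32)

# 3) copy each chunk into its slot exactly once
curr_x = 0
for arr in list_of_arrays:
    final_data[curr_x:curr_x + arr.shape[0]] = arr
    curr_x += arr.shape[0]

assert curr_x == max_x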

+5

Source: https://habr.com/ru/post/1623952/

