Numpy array size versus concatenation speed

I concatenate the data into a numpy array as follows:

xdata_test = np.concatenate((xdata_test,additional_X))

This is done about a thousand times. The arrays have dtype float32, and their shapes are as follows:

xdata_test.shape   :  (x1,40,24,24)        (x1 : [500~10500])   
additional_X.shape :  (x2,40,24,24)        (x2 : [0 ~ 500])

The problem is that once x1 grows beyond roughly 2000-3000, each concatenation takes noticeably longer.

The graph below shows the concatenation time depending on size x2:

[plot: x2 vs. time usage]

Is this a memory issue or a basic numpy feature?
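
For reference, a minimal sketch of the loop described above (random float32 data standing in for the real xdata_test / additional_X, and smaller counts so it finishes quickly) is enough to reproduce the growing per-call cost:

import time
import numpy as np

# hypothetical stand-in data; the real chunks come from elsewhere
xdata_test = np.zeros((500, 40, 24, 24), dtype=np.float32)

for i in range(30):
    additional_X = np.random.rand(100, 40, 24, 24).astype(np.float32)
    t0 = time.perf_counter()
    xdata_test = np.concatenate((xdata_test, additional_X))
    print(f"x1={xdata_test.shape[0]:5d}  concat: {time.perf_counter() - t0:.4f} s")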

+4
2 answers

numpy's stack and concatenate cannot grow an array in place: a numpy array is a single contiguous block of memory, so every call allocates a brand-new result array and copies both inputs into it (see the numpy documentation on ndarray memory layout).
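
A quick check (nothing specific to this question) shows that the result of a concatenation shares no memory with either input, i.e. both were copied in full:

import numpy as np

a = np.zeros((1000, 40, 24, 24), dtype=np.float32)
b = np.zeros((10, 40, 24, 24), dtype=np.float32)

c = np.concatenate((a, b))
print(np.shares_memory(c, a), np.shares_memory(c, b))  # False False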

As xdata_test grows, each of those copies gets more expensive, which is exactly the slowdown you are seeing. Instead of concatenating on every iteration, collect the chunks in a Python list and concatenate once at the end:

l = []
for additional_X in ...:
    l.append(additional_X)
xdata_test = np.concatenate(l)

That way the data is copied only once, when the final array is built.

NB: keeping all the chunks in the list means they must fit in memory at the same time, in addition to the final concatenated array.
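
Put together for the shapes in the question (the chunk source here is a random stand-in; substitute however additional_X is actually produced):

import numpy as np

chunks = []
for _ in range(200):                      # ~1000 iterations in the real code
    additional_X = np.random.rand(10, 40, 24, 24).astype(np.float32)
    chunks.append(additional_X)           # O(1): only a reference is stored

xdata_test = np.concatenate(chunks)       # one allocation, one copy
print(xdata_test.shape)                   # (2000, 40, 24, 24)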

+6

The repeated copying is the cause: every concatenation allocates a new array and copies everything accumulated so far, so the cost grows with the size of xdata_test. If you know (or can compute) the final size in advance, you can avoid this entirely by preallocating the result and filling it in place.

  • First, work out the total number of rows across all chunks:

    max_x = 0
    for arr in list_of_arrays:
        max_x += arr.shape[0]
    
  • Then preallocate an uninitialized array of the final shape, once:

    final_data = np.empty((max_x,) + xdata_test.shape[1:], dtype=xdata_test.dtype)
    

    This gives you an array of shape (max_x, 40, 24, 24) without initializing or copying any data.

  • Finally, copy each chunk into its slot of the preallocated array using numpy slicing:

    curr_x = 0
    for arr in list_of_arrays:
        final_data[curr_x:curr_x+arr.shape[0]] = arr
        curr_x += arr.shape[0]
    

This way, every element is written exactly once; there is no repeated allocation or copying.

With repeated concatenation, the data you already have is copied again on every iteration, so after N iterations the earliest chunks have been copied on the order of N times; with preallocation each chunk is copied exactly once.
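
Combined into one self-contained sketch (list_of_arrays is a hypothetical stand-in for wherever the chunks come from; shapes follow the question):

import numpy as np

# hypothetical chunk source: variable-length chunks, as in the question
list_of_arrays = [np.random.rand(np.random.randint(1, 50), 40, 24, 24).astype(np.float32)
                  for _ in range(200)]

# 1) total number of rows
max_x = sum(arr.shape[0] for arr in list_of_arrays)

# 2) allocate the destination once, uninitialized
final_data = np.empty((max_x, 40, 24, 24), dtype=np.float32)

# 3) copy each chunk into its slot exactly once
curr_x = 0
for arr in list_of_arrays:
    final_data[curr_x:curr_x + arr.shape[0]] = arr
    curr_x += arr.shape[0]

assert curr_x == max_x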

+5

Source: https://habr.com/ru/post/1623952/

