Changing the numpy function output array in place

Question

I am trying to write a function that performs a mathematical operation on an array and returns a result. A simplified example might be:

    def original_func(A):
        return A[1:] + A[:-1]

To speed things up and avoid allocating a new output array for each function call, I would like to have the output array as an argument and change it in place:

    def inplace_func(A, out):
        out[:] = A[1:] + A[:-1]

However, when calling these two functions as follows:

    import numpy

    A = numpy.random.rand(1000, 1000)
    out = numpy.empty((999, 1000))
    C = original_func(A)
    inplace_func(A, out)

the original function turns out to be about twice as fast as the in-place function. How can this be explained? Shouldn't the in-place function be faster, since it does not need to allocate memory?

+6

function python arrays numpy in-place

halvorlu Sep 23 '11 at 13:34

3 answers

If you want to perform the operation in place, do:

    import numpy as np

    def inplace_func(A, out):
        np.add(A[1:], A[:-1], out)

This does not create any temporary arrays (such as the result of A[1:] + A[:-1]).

All NumPy binary operations have corresponding functions; see the list here: http://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs
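As a minimal sketch of the point above, the following checks that np.add with an output argument writes into the preallocated array and returns that same array instead of a new one. The names `expected` and `result` are introduced here for illustration only; the shapes are taken from the question.

```python
import numpy as np

def inplace_func(A, out):
    # Write A[1:] + A[:-1] directly into `out`; the ufunc does not
    # allocate a separate result array.
    np.add(A[1:], A[:-1], out=out)

A = np.random.rand(1000, 1000)
out = np.empty((999, 1000))

expected = A[1:] + A[:-1]                # reference result (allocates a new array)
result = np.add(A[1:], A[:-1], out=out)  # same computation, written into `out`

print(result is out)                  # the ufunc returns the array passed as `out`
print(np.array_equal(out, expected))  # both approaches give the same values
```

Passing the array positionally, as in the answer, or with the `out=` keyword should behave the same way; the keyword form is used here only to make the intent explicit.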

+12

pv. Sep 23 '11 at 20:28

I agree with Olivier's explanation. If you want to perform the operation in place, you need to iterate over the array manually. In pure Python that will be much slower, but if you need speed you can resort to Cython, which will give you the speed of a pure C implementation.
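A minimal sketch of what such manual iteration might look like in pure Python. The function name `inplace_loop` and the deliberately small array size are assumptions made for illustration; a Cython version is not shown here.

```python
import numpy as np

def inplace_loop(A, out):
    # Explicit element-by-element loop: each sum is written straight into
    # `out`, so no intermediate array is created. Every iteration runs in
    # the Python interpreter, which is what makes this approach slow.
    n_rows, n_cols = out.shape
    for i in range(n_rows):
        for j in range(n_cols):
            out[i, j] = A[i + 1, j] + A[i, j]

# A small array keeps the pure-Python loop quick to run.
A = np.random.rand(50, 40)
out = np.empty((49, 40))
inplace_loop(A, out)

# Compare against the vectorized expression from the question.
print(np.allclose(out, A[1:] + A[:-1]))
```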

-1

rocksportrocker Sep 23 '11 at 15:34

Olivier verdier · Accepted Answer · 2011-09-23T13:58:09+0000

I think the answer is as follows:

In both cases you compute A[1:] + A[:-1], and in both cases this actually creates an intermediate array.

However, in the second case you then explicitly copy that entire newly allocated array into the preallocated memory. Copying such an array takes about as long as the original operation, so you effectively double the time.

To summarize, in the first case you do:

    compute A[1:] + A[:-1] (~10ms)

In the second case, you do:

    compute A[1:] + A[:-1] (~10ms)
    copy the result into out (~10ms)
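One way to check this breakdown on your own machine is a small timing sketch like the one below. The helper `best_ms`, the function names `copy_func` and `ufunc_out_func`, and the repeat counts are assumptions chosen for illustration. Actual timings depend on hardware and NumPy version, so the ~10ms figures above should be read as rough estimates.

```python
import timeit
import numpy as np

def original_func(A):
    # Allocates and returns a new result array.
    return A[1:] + A[:-1]

def copy_func(A, out):
    # Allocates a temporary for the sum, then copies it into `out`.
    out[:] = A[1:] + A[:-1]

def ufunc_out_func(A, out):
    # Writes the sum directly into `out`, with no temporary result array.
    np.add(A[1:], A[:-1], out=out)

A = np.random.rand(1000, 1000)
out = np.empty((999, 1000))

def best_ms(func, *args, number=10, repeat=3):
    # Best observed time per call, in milliseconds.
    times = timeit.repeat(lambda: func(*args), number=number, repeat=repeat)
    return min(times) / number * 1e3

timings = {
    "new array": best_ms(original_func, A),
    "temporary + copy": best_ms(copy_func, A, out),
    "np.add with out": best_ms(ufunc_out_func, A, out),
}

for name, ms in timings.items():
    print(f"{name}: {ms:.2f} ms")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; it does not remove it entirely.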
