Make numpy.sum() return the sum of the matrices instead of a single number

I am doing a rather complicated summation using a matrix with numpy. The matrix has shape matrix.shape = (500, 500), and the array has shape arr.shape = (25,). The operation is as follows:

 totalsum = np.sum([i * matrix for i in arr]) 

Here is what I do not understand:

np.sum() is very slow and returns a single float64. Performing the same operation with Python's built-in sum(), i.e.

 totalsum2 = sum([i*matrix for i in arr]) 

preserves the shape of the matrix. That is, the result has totalsum2.shape == (500, 500).

It also seems strange to me that np.sum() takes longer than sum(), especially when we work with numpy ndarrays.

What exactly is going on here? How does np.sum() sum the above values compared to sum()?

I would like np.sum() to keep the shape of the matrix. How can I call np.sum() so that it preserves the shape of the matrix and does not return a single float?

3 answers

You must call np.sum with the additional axis parameter set to 0 (summation along axis 0, i.e. the axis created by your list comprehension):

 totalsum = np.sum([i * matrix for i in arr], 0) 
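A minimal sketch of the difference, using a small 2x2 matrix and a three-element arr as stand-ins for the (500, 500) and (25,) arrays in the question (the values are illustrative assumptions, not the asker's data):

```python
import numpy as np

# Small stand-ins for the question's (500, 500) matrix and (25,) array.
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])
arr = np.array([2.0, 4.0, 8.0])

stacked = [i * matrix for i in arr]     # list of three (2, 2) arrays

total_all = np.sum(stacked)             # axis=None: sums every element
total_axis0 = np.sum(stacked, axis=0)   # sums across the list only
total_builtin = sum(stacked)            # built-in sum adds the arrays element-wise

print(np.shape(total_all))                          # ()
print(total_axis0.shape)                            # (2, 2)
print(np.array_equal(total_axis0, total_builtin))   # True
```

With axis=0 the result matches what the built-in sum returns, while the default call collapses everything to one scalar.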

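Alternatively, you can drop the square brackets so that np.sum receives a generator. This only works because np.sum falls back to Python's built-in sum for generators. NumPy has deprecated that behaviour (since version 1.15) and emits a DeprecationWarning, so the axis form above is the safer choice.

 totalsum = np.sum(i * matrix for i in arr) 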

Python's built-in sum() takes each item in the list and adds them together. When arrays of the same shape are added together, they are simply added element-wise. For instance:

 import numpy as np
 test1 = np.array([[4,3],[2,1]])
 test2 = np.array([[8,9],[1,1]])
 print(test1 + test2)  # element-wise addition

Returns

 [[12 12]
  [ 3  2]]

With np.sum, on the other hand, you add along an axis or axes. If you want to keep everything in a single array and use np.sum, you will want to project your operation (multiplying by each i in the array) into a third dimension, and then call np.sum with axis=2.

This can be done using:

 np.sum(matrix[:,:,np.newaxis] * arr[np.newaxis,np.newaxis,:], axis=2) 
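As a rough check of the broadcasting approach, the sketch below compares it with the built-in sum on small random inputs. The (5, 5) and (4,) shapes and the seed are assumptions chosen only to keep the example small.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed for the example
matrix = rng.random((5, 5))      # stand-in for the (500, 500) matrix
arr = rng.random(4)              # stand-in for the (25,) array

# (5, 5, 1) * (1, 1, 4) broadcasts to (5, 5, 4); summing axis 2 gives (5, 5).
by_broadcast = np.sum(matrix[:, :, np.newaxis] * arr[np.newaxis, np.newaxis, :], axis=2)

# Reference result from the built-in sum over a list of scaled matrices.
by_builtin = sum([i * matrix for i in arr])

# Every term is a scalar multiple of the same matrix, so the scalars can be summed first.
by_factoring = arr.sum() * matrix

print(by_broadcast.shape)                        # (5, 5)
print(np.allclose(by_broadcast, by_builtin))     # True
print(np.allclose(by_broadcast, by_factoring))   # True
```

For this particular operation, arr.sum() * matrix avoids building the intermediate three-dimensional array at all.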
The expression from the question builds a list of matrices:

 [i*matrix for i in arr] # list of matrices 

When the built-in sum is applied to this list, it adds the arrays element-wise.

 In [6]: matrix = np.array([[1,2],[3,4]])

 In [7]: matrix
 Out[7]:
 array([[1, 2],
        [3, 4]])

 In [9]: [i * matrix for i in (2,4,8)]
 Out[9]:
 [array([[2, 4],
         [6, 8]]),
  array([[ 4,  8],
         [12, 16]]),
  array([[ 8, 16],
         [24, 32]])]

Please check the help for np.sum:

 File: /home/ale/.virtualenvs/ml/local/lib/python2.7/site-packages/numpy/core/fromnumeric.py
 Definition: np.sum(a, axis=None, dtype=None, out=None, keepdims=False)
 Docstring:
 Sum of array elements over a given axis.

 Parameters
 ----------
 a : array_like
     Elements to sum.
 axis : None or int or tuple of ints, optional
     Axis or axes along which a sum is performed.
     The default (`axis` = `None`) is perform a sum over all
     the dimensions of the input array. `axis` may be negative, in
     which case it counts from the last to the first axis.

     .. versionadded:: 1.7.0

The docstring says that if you do not specify an axis, the sum is taken over all dimensions. Example:

 In [4]: np.sum(np.array([[1,2],[3,4]]))  # 1 + 2 + 3 + 4
 Out[4]: 10
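To make the axis parameter from the docstring concrete, here is a short sketch on the same 2x2 array (the variable names are chosen just for this example):

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

total = np.sum(a)            # no axis: all elements
cols = np.sum(a, axis=0)     # collapse rows, one sum per column
rows = np.sum(a, axis=1)     # collapse columns, one sum per row
last = np.sum(a, axis=-1)    # negative axis counts from the end (same as axis=1 here)

print(total)  # 10
print(cols)   # [4 6]
print(rows)   # [3 7]
print(last)   # [3 7]
```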

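Why does np.sum take longer? In the expression [i*matrix for i in arr] you create a new array for each i. np.sum then has to pack the whole list into one array (here of shape (25, 500, 500)) before it can sum over it, whereas the built-in sum just adds the matrices one after another.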

There may be other reasons, but I guess that is the main one.
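If you want to measure the difference yourself, a sketch along these lines could be used. The shapes match the question, but the fill values, the function names and the repeat count are assumptions, and the timings will vary by machine and NumPy version, so no particular result is claimed.

```python
import timeit
import numpy as np

matrix = np.ones((500, 500))        # same shape as in the question
arr = np.arange(25, dtype=float)    # same shape as in the question

def with_np_sum():
    # np.sum first converts the list into one (25, 500, 500) array.
    return np.sum([i * matrix for i in arr], axis=0)

def with_builtin_sum():
    # The built-in sum adds the (500, 500) arrays one after another.
    return sum([i * matrix for i in arr])

t_np = timeit.timeit(with_np_sum, number=3)
t_builtin = timeit.timeit(with_builtin_sum, number=3)

print("np.sum(axis=0): %.4f s" % t_np)
print("built-in sum:   %.4f s" % t_builtin)
```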


Source: https://habr.com/ru/post/986509/

