The `out` argument of `numpy.einsum` does not work as expected

I have two snippets. The first one is:

    import numpy as np

    A = np.arange(3*4*3).reshape(3, 4, 3)
    P = np.arange(1, 4)
    A[:, 1:, :] = np.einsum('j, ijk->ijk', P, A[:, 1:, :])

and the resulting `A` is:

    array([[[  0,   1,   2],
            [  6,   8,  10],
            [ 18,  21,  24],
            [ 36,  40,  44]],

           [[ 12,  13,  14],
            [ 30,  32,  34],
            [ 54,  57,  60],
            [ 84,  88,  92]],

           [[ 24,  25,  26],
            [ 54,  56,  58],
            [ 90,  93,  96],
            [132, 136, 140]]])

The second one is:

    A = np.arange(3*4*3).reshape(3, 4, 3)
    P = np.arange(1, 4)
    np.einsum('j, ijk->ijk', P, A[:, 1:, :], out=A[:,1:,:])

and the resulting `A` is:

    array([[[ 0,  1,  2],
            [ 0,  0,  0],
            [ 0,  0,  0],
            [ 0,  0,  0]],

           [[12, 13, 14],
            [ 0,  0,  0],
            [ 0,  0,  0],
            [ 0,  0,  0]],

           [[24, 25, 26],
            [ 0,  0,  0],
            [ 0,  0,  0],
            [ 0,  0,  0]]])

So the results differ. I want to use `out` here to save memory. Is this a bug in `numpy.einsum`, or am I missing something?

By the way, my numpy version is 1.13.3.

2 answers

I have not used the new `out` parameter before, but I have worked with `einsum` in the past and have a general idea of how it works (or at least how it is used).

It looks to me as if it initializes the `out` array to zero before the iteration begins. That would account for all the 0s in the `A[:,1:,:]` block. If I instead pass a separate `out` array, the desired values are inserted:

    In [471]: B = np.ones((3,4,3),int)
    In [472]: np.einsum('j, ijk->ijk', P, A[:, 1:, :], out=B[:,1:,:])
    Out[472]:
    array([[[  3,   4,   5],
            [ 12,  14,  16],
            [ 27,  30,  33]],

           [[ 15,  16,  17],
            [ 36,  38,  40],
            [ 63,  66,  69]],

           [[ 27,  28,  29],
            [ 60,  62,  64],
            [ 99, 102, 105]]])
    In [473]: B
    Out[473]:
    array([[[  1,   1,   1],
            [  3,   4,   5],
            [ 12,  14,  16],
            [ 27,  30,  33]],

           [[  1,   1,   1],
            [ 15,  16,  17],
            [ 36,  38,  40],
            [ 63,  66,  69]],

           [[  1,   1,   1],
            [ 27,  28,  29],
            [ 60,  62,  64],
            [ 99, 102, 105]]])

The Python part of `einsum` doesn't tell me much, except how it decides to pass the `out` array on to the C part (as one of the `tmp_operands` list):

    c_einsum(einsum_str, *tmp_operands, **einsum_kwargs)

I know that it sets up the C-API equivalent of `np.nditer`, using the subscripts string to define the axes and iterations.

It iterates in a way similar to this section of the iteration tutorial:

https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.nditer.html#reduction-iteration

Pay particular attention to the steps just before the loop, where the output operand is set to 0 and the iterator is reset with `it.reset()`. The `out` buffer is therefore 0 before iteration starts. The iterator then walks over the elements of the input arrays and the output array, writing the computed values to the output element. Since it performs a sum of products (for example, `out[:] += ...`), it has to start from a clean slate.
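To make the zero-then-accumulate pattern concrete, here is a minimal sketch modelled on the reduction example in that tutorial. It uses the unbuffered form, so no `it.reset()` call is needed here (the tutorial adds it for the buffered variant). The array `a` and the name `result` are only illustrative, and this is the Python-level `np.nditer`, not the C code that `einsum` actually runs.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# Let the iterator allocate the output operand; op_axes maps the output
# onto the first two axes of `a`, so the last axis is the one reduced.
it = np.nditer(
    [a, None],
    flags=["reduce_ok"],
    op_flags=[["readonly"], ["readwrite", "allocate"]],
    op_axes=[None, [0, 1, -1]],
)

# The allocated output holds arbitrary values, so it must be zeroed
# before any `+=` accumulation takes place.
it.operands[1][...] = 0

for x, y in it:
    y[...] += x

result = it.operands[1]  # sums of `a` along its last axis
```

If the output operand were also one of the inputs, that zeroing step would wipe out the input data before it is ever read, which matches the zeros seen in the question.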

I am guessing somewhat at what actually happens, but it seems logical to me that it should zero the output buffer. If that array is also one of the inputs, this will end up ruining the calculation.

So I do not think this approach will work and save you memory. A clean buffer is needed to accumulate the results. Once that is done, either `einsum` or you can write the values back to `A`. But given the dot-product-like nature of the calculation, you cannot use the same array for input and output.

    In [476]: A[:,1:,:] = np.einsum('j, ijk->ijk', P, A[:, 1:, :])
    In [477]: A
    Out[477]:
    array([[[ 0,  1,  2],
            [ 3,  4,  5],
            [12, 14, 16],
            [27, 30, 33]],
           ....)
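If the goal is to avoid a fresh allocation on every call, one option is to keep a separate scratch buffer around and copy the finished result back. This is only a sketch; the names `view` and `scratch` are made up for illustration, and it still needs one extra buffer the size of the block being updated.

```python
import numpy as np

A = np.arange(3 * 4 * 3).reshape(3, 4, 3)
P = np.arange(1, 4)

view = A[:, 1:, :]             # the block to update (a view into A)
scratch = np.empty_like(view)  # separate buffer that does not alias A

# einsum may zero `scratch` freely, since it is not one of the inputs.
np.einsum('j,ijk->ijk', P, view, out=scratch)

# Only after the whole result exists is it copied back into A.
view[...] = scratch
```

The same `scratch` array can be reused for later calls on blocks of the same shape and dtype, so the allocation cost is paid only once.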

In the C source code for `einsum`, there is a section that takes the array specified by `out` and zeroes it.

But in the Python source code, for example, there are execution paths that call `tensordot` before they ever get down to preparing the arguments for the call to `c_einsum`.

This means that some operations may be precomputed with `tensordot` (thus modifying your array `A` during some contractions) before any subarray is set to zero by the zero-setter inside the C code for `einsum`.

Put another way: on each pass, when performing the next contraction, NumPy has many options. Use `tensordot` directly without going into the C-level `einsum` code yet? Or prepare the arguments and pass them to the C level (which will involve overwriting some subview of the output array with all zeros)? Or reorder the operations and repeat the check?

Depending on the order it chooses for these optimizations, you may end up with unexpected subarrays of zeros.

The best bet is not to try to be this clever and reuse an input array as the output. You say you are doing it because you want to save memory. Yes, in some special cases an `einsum` operation can be done in place. But currently `einsum` does not detect whether that is the case, nor does it try to avoid the zeroing.
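The question's expression happens to be one of those special cases. `'j,ijk->ijk'` has no summed index, so each output element is just `P[j] * A[i, j, k]`, and that can be written as an ordinary broadcast multiply. A minimal sketch, assuming the same `A` and `P` as in the question:

```python
import numpy as np

A = np.arange(3 * 4 * 3).reshape(3, 4, 3)
P = np.arange(1, 4)

# P[:, np.newaxis] has shape (3, 1); it broadcasts against the (3, 3, 3)
# block so that P lines up with the middle (j) axis.
# The in-place multiply writes each element from its own old value only,
# so no zeroed accumulator and no full-size temporary result are needed.
A[:, 1:, :] *= P[:, np.newaxis]
```

This only works because nothing is summed. For a real contraction, where an index disappears from the output, a separate output buffer is still required.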

And in a huge number of cases, overwriting one of the input arrays in the middle of the overall operation would cause many problems, much like appending to a list while you are iterating over it.
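Since `einsum` does not check for this itself, the check can be done by the caller. Below is a sketch of a hypothetical wrapper, `einsum_checked` (not part of NumPy), that uses `np.shares_memory` to detect overlap between `out` and the operands and falls back to a temporary result in that case. It assumes every operand is already an `ndarray`.

```python
import numpy as np

def einsum_checked(subscripts, *operands, out):
    """Hypothetical helper: einsum that tolerates `out` aliasing an input."""
    if any(np.shares_memory(out, op) for op in operands):
        # Overlap detected: compute into a temporary, then copy it over.
        out[...] = np.einsum(subscripts, *operands)
    else:
        # No overlap: let einsum write straight into `out`.
        np.einsum(subscripts, *operands, out=out)
    return out

A = np.arange(3 * 4 * 3).reshape(3, 4, 3)
P = np.arange(1, 4)

# `out` is the same block of A that is used as an input, so the
# fallback branch is taken and A is not corrupted.
einsum_checked('j,ijk->ijk', P, A[:, 1:, :], out=A[:, 1:, :])
```

The fallback gives up the memory saving in exactly the cases where reusing the array would be unsafe, which seems like the right trade. Note that `np.shares_memory` performs an exact overlap check, which can be slow for some unusual stride patterns.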


Source: https://habr.com/ru/post/1273654/

