I haven't used this new out parameter before, but I have worked with einsum in the past and have a general idea of how it works (or at least how it used to).
It looks to me like it initializes the out array to zero before the iteration starts. That would account for all the 0s in the A[:,1:,:] block. If I instead create a separate out array, the desired values are inserted:
    In [471]: B = np.ones((3,4,3),int)
    In [472]: np.einsum('j, ijk->ijk', P, A[:, 1:, :], out=B[:,1:,:])
    Out[472]:
    array([[[  3,   4,   5],
            [ 12,  14,  16],
            [ 27,  30,  33]],

           [[ 15,  16,  17],
            [ 36,  38,  40],
            [ 63,  66,  69]],

           [[ 27,  28,  29],
            [ 60,  62,  64],
            [ 99, 102, 105]]])
    In [473]: B
    Out[473]:
    array([[[  1,   1,   1],
            [  3,   4,   5],
            [ 12,  14,  16],
            [ 27,  30,  33]],

           [[  1,   1,   1],
            [ 15,  16,  17],
            [ 36,  38,  40],
            [ 63,  66,  69]],

           [[  1,   1,   1],
            [ 27,  28,  29],
            [ 60,  62,  64],
            [ 99, 102, 105]]])
The Python portion of einsum doesn't tell me much, except how it decides to pass the out array on to the C portion (as one of the tmp_operands list):
    c_einsum(einsum_str, *tmp_operands, **einsum_kwargs)
I know that it sets up a C-API equivalent of np.nditer, using the subscripts string to define the axes and iterations.
It iterates in a way similar to this section of the iteration tutorial:
https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.nditer.html#reduction-iteration
Notice in particular the initialization around the it.reset() step: the out buffer is set to 0 before iterating. The iterator then walks over the elements of the input arrays and the output array, writing the computed values to the output element. Since it is doing a sum of products (e.g. out[:] += ...), it has to start with a clean slate.
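For reference, here is a small Python-level sketch of that kind of reduction, modelled on the buffered example in the nditer tutorial. It is not einsum's actual C code; the array a and the choice of summing over the last axis are just illustrative assumptions. The point is the order of operations: zero the output operand, reset the iterator, then accumulate.

```python
import numpy as np

# Illustrative input (an assumption, not taken from the question).
a = np.arange(24).reshape(2, 3, 4)

# The second operand is allocated by the iterator; op_axes maps the output
# onto axes 0 and 1 only, so the last axis of `a` is reduced (summed).
it = np.nditer(
    [a, None],
    flags=['reduce_ok', 'buffered', 'delay_bufalloc'],
    op_flags=[['readonly'], ['readwrite', 'allocate']],
    op_axes=[None, [0, 1, -1]],
)
with it:
    it.operands[1][...] = 0   # the accumulator has to start at zero
    it.reset()                # buffers are only filled after this reset
    for x, y in it:
        y[...] += x           # accumulate into the output element
    result = it.operands[1]

print(result)
```

If the output operand were one of the inputs, the zeroing step would wipe the data before it was ever read, which is the failure mode described here.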
I'm guessing a bit as to what is actually going on, but it seems logical to me that it should zero out the output buffer to start with. If that array is the same as one of the inputs, that will end up messing up the calculation.
So I don't think this approach will work and save you memory. einsum needs a clean buffer to accumulate the results. Once that is done, it (or you) can write the values back into A. But given the dot-product-like nature of the operation, you can't use the same array for both input and output.
    In [476]: A[:,1:,:] = np.einsum('j, ijk->ijk', P, A[:, 1:, :])
    In [477]: A
    Out[477]:
    array([[[  0,   1,   2],
            [  3,   4,   5],
            [ 12,  14,  16],
            [ 27,  30,  33]],
           ....)
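As a self-contained version of the same idea: the definitions of A and P below are my assumption (they are not shown in this answer, but they reproduce the numbers above). The sketch lets einsum allocate its own result and copies it back, and also shows a plain broadcasting multiply, which happens to be enough here because this particular subscripts string has no summation.

```python
import numpy as np

# Assumed setup, chosen to match the outputs shown above.
A = np.arange(36).reshape(3, 4, 3)
P = np.array([1, 2, 3])

# Reference result computed with ordinary broadcasting.
expected = P[:, None] * A[:, 1:, :]

# Option 1: einsum writes to a fresh temporary, which is then copied into A1.
A1 = A.copy()
A1[:, 1:, :] = np.einsum('j,ijk->ijk', P, A1[:, 1:, :])

# Option 2: 'j,ijk->ijk' sums over nothing, so an in-place multiply
# with P broadcast along the j axis gives the same values.
A2 = A.copy()
A2[:, 1:, :] *= P[:, None]

print(np.array_equal(A1, A2))
```

Option 1 still needs the temporary array; Option 2 avoids it, but only applies when the einsum expression is a pure elementwise product.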