Multiply and add inside an oversized array

I have an array A of shape (M, N), and I would like to perform the operation

R = (A[:,newaxis,:] * A[newaxis,:,:]).sum(2)

which should give an (MxM) array. The problem is that A is quite large, and I get a memory error because the intermediate MxMxN array will not fit into memory.

What would be the best strategy for this? C? map()? Or is there a special function for this?

Thank you, David

1 answer

I'm not sure how big your arrays are, but the following is equivalent:

 R = np.einsum('ij,kj',A,A) 

It can also be considerably faster and use significantly less memory:

 In [7]: A = np.random.random(size=(500,400))

 In [8]: %timeit R = (A[:,np.newaxis,:] * A[np.newaxis,:,:]).sum(2)
 1 loops, best of 3: 1.21 s per loop

 In [9]: %timeit R = np.einsum('ij,kj',A,A)
 10 loops, best of 3: 54 ms per loop

If I increase the size of A to (500,4000), np.einsum runs the calculation in about 2 seconds, while the original formulation grinds my machine to a halt because of the size of the temporary array it has to create.
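To put a rough number on that temporary array, here is a back-of-the-envelope sketch. It assumes float64 elements (8 bytes each, which is what np.random.random returns); the helper names temp_array_bytes and result_bytes are made up for illustration.

```python
import numpy as np

def temp_array_bytes(m, n, dtype=np.float64):
    """Bytes needed for the (m, m, n) temporary created by the broadcast product."""
    return m * m * n * np.dtype(dtype).itemsize

def result_bytes(m, dtype=np.float64):
    """Bytes needed for the final (m, m) result."""
    return m * m * np.dtype(dtype).itemsize

small = temp_array_bytes(500, 400)    # temporary for the (500, 400) case
large = temp_array_bytes(500, 4000)   # temporary for the (500, 4000) case
final = result_bytes(500)             # the (500, 500) result itself

print(small / 1e9, "GB temporary for (500, 400)")
print(large / 1e9, "GB temporary for (500, 4000)")
print(final / 1e6, "MB for the result")
```

Under that assumption the temporary grows from 0.8 GB to 8 GB, while the result itself needs only 2 MB, which is why avoiding the intermediate array matters so much here.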

Update

As @Jaime noted in the comments, np.dot(A, A.T) is also an equivalent formulation of the problem and can be even faster than the np.einsum solution. Full credit to him for pointing this out, but in case he doesn't post it as an answer of his own, I wanted to pull it into the main answer.
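As a quick sanity check, the three formulations can be compared on a small array. This is only a sketch: the (50, 40) shape and the seed are arbitrary choices, picked so that the broadcast version still fits comfortably in memory.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
A = rng.random((50, 40))         # small illustrative array of shape (M, N)

# Original formulation: builds an (M, M, N) temporary before summing.
R_broadcast = (A[:, np.newaxis, :] * A[np.newaxis, :, :]).sum(2)

# einsum: sums over the shared index j, producing an (M, M) result.
R_einsum = np.einsum('ij,kj', A, A)

# Matrix product of A with its transpose.
R_dot = np.dot(A, A.T)

print(R_dot.shape)
print(np.allclose(R_broadcast, R_einsum), np.allclose(R_broadcast, R_dot))
```

All three compute R[i, k] = sum over j of A[i, j] * A[k, j]; only the first one materializes the full MxMxN intermediate.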


Source: https://habr.com/ru/post/946783/

