Computing a sum of outer products on sparse matrices

I am trying to implement the following equation using the scipy sparse package:

W = x[:,1] * y[:,1].T + x[:,2] * y[:,2].T + ... 

where x and y are n×m csc_matrix objects. Basically, I am trying to multiply each column of x by the corresponding column of y and sum the resulting n×n matrices together. Then I want to set all nonzero elements to 1.

This is my current implementation:

  c = sparse.csc_matrix((n, n))
  for i in xrange(0,m):
      tmp = bam.id2sym_thal[:,i] * bam.id2sym_cort[:,i].T
      minimum(tmp.data,ones_like(tmp.data),tmp.data)
      maximum(tmp.data,ones_like(tmp.data),tmp.data)
      c = c + tmp

This implementation has the following problems:

  • Memory usage seems to explode. As far as I understand, memory should only grow as c becomes less sparse, but the loop starts consuming more than 20 GB of memory with n = 10,000 and m = 100,000 (each row of x and y has only about 60 nonzero elements).

  • I am using a Python loop, which is not very efficient.

My question is: is there a better way to do this? Controlling memory usage is my first concern, but it would be great to make it faster too!

Thanks!

2 answers

Note that a sum of outer products in the form you describe is simply the same as multiplying the two matrices together. In other words,

 sum_i X[:,i]*Y[:,i].T == X*Y.T
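
One way to see this is to compare entries. The notation here is introduced only for illustration: x_i and y_i stand for the i-th columns of X and Y, and j, k index rows of the n×n result.

```latex
\left(\sum_{i=1}^{m} x_i\, y_i^{\mathsf T}\right)_{jk}
  = \sum_{i=1}^{m} X_{ji}\, Y_{ki}
  = \sum_{i=1}^{m} X_{ji}\, \left(Y^{\mathsf T}\right)_{ik}
  = \left(X\, Y^{\mathsf T}\right)_{jk}
```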

So, just multiply the matrices together.

 Z = X*Y.T

For n = 10000 and m = 100000, with one nonzero element per column in both X and Y, this computes almost instantly on my laptop.
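
As a minimal sketch of the whole task, the following combines the single product with the "set all nonzero elements to 1" step from the question and checks it against the explicit loop on a small example. The function name `sum_of_outer_products`, the matrix sizes, and the density are assumptions chosen for illustration, and `@` is used in place of `*` for the matrix product.

```python
import numpy as np
from scipy import sparse


def sum_of_outer_products(x, y):
    """Return the 0/1 pattern of sum_i x[:, i] * y[:, i].T as a CSC matrix."""
    # A single sparse product replaces the explicit loop over columns.
    w = (x @ y.T).tocsc()
    # Entries that cancelled to exactly zero may still be stored; drop them.
    w.eliminate_zeros()
    # Set every remaining stored entry to 1.
    w.data[:] = 1
    return w


# Small random inputs (sizes and density are illustrative only).
n, m = 50, 200
x = sparse.random(n, m, density=0.05, format="csc", random_state=0)
y = sparse.random(n, m, density=0.05, format="csc", random_state=1)

w = sum_of_outer_products(x, y)

# Reference: the column-by-column sum of outer products, on dense arrays.
xd, yd = x.toarray(), y.toarray()
ref = np.zeros((n, n))
for i in range(m):
    ref += np.outer(xd[:, i], yd[:, i])
expected = (ref != 0).astype(float)
```

Here `w.toarray()` should match `expected`. Binarizing once at the end, rather than inside a loop, avoids building m intermediate n×n matrices.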


In terms of memory and performance, this may be a prime candidate for using Cython.

There is a section in the following paper that describes its use with scipy sparse matrices:

http://folk.uio.no/dagss/cython_cise.pdf


Source: https://habr.com/ru/post/894314/

