How to get a scipy CSR sparse matrix as a normal dense matrix without todense()?

I have a problem with sparse matrices in scipy. I want to use them like a normal matrix, but without calling todense(). I am new to this area, and I don't know how to get the same result as multiplying a sparse matrix without actually having a sparse matrix. As I understand it, a sparse matrix is only used to make calculations faster, so it should be possible to do this without one:

sparse_matrix * 5 == sparse_matrix.todense() * 5 == no_sparse_matrix * 5

data = np.ones(5178)
indices = [34,12,545,23...,25,18,29]  # Shape: 5178L
indptr = np.arange(5178 + 1)
sparse_matrix = sp.csr_matrix((data, indices, indptr), shape = (5178, 3800))

Is this right? sparse_matrix * 5 == sparse_matrix.todense() * 5 == data * 5 ?

My goal is to get the same result as multiplying a sparse matrix, but without using a sparse matrix. Is that possible? How can I do this?


Edit, about my intention: I want to port Python code to Java, and my Java library for linear algebra does not provide sparse matrix operations.

So I have to do the same in Java without sparse matrices. I was not sure whether I could use the data array instead of a sparse matrix.

In the source code, a sparse matrix is multiplied by another matrix. To port this to Java, I would just multiply the sparse matrix's data array by the other matrix. Is that correct?

1 answer

It is not entirely clear what you are asking for, but here is what I am thinking.

Let’s just experiment with a simple array:

Start with 3 arrays (I took them from another sparse matrix, but that doesn't matter):

 In [165]: data
 Out[165]: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11], dtype=int32)
 In [166]: indices
 Out[166]: array([1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3], dtype=int32)
 In [167]: indptr
 Out[167]: array([ 0,  3,  7, 11], dtype=int32)
 In [168]: M = sparse.csr_matrix((data, indices, indptr), shape=(3, 4))

These arrays were assigned to 3 attributes of the new matrix.

 In [169]: M.data
 Out[169]: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11], dtype=int32)
 In [170]: M.indices
 Out[170]: array([1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3], dtype=int32)
 In [171]: M.indptr
 Out[171]: array([ 0,  3,  7, 11], dtype=int32)
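To make the roles of the three arrays concrete, here is a minimal sketch that expands them into a dense array with plain loops, the kind of code that could be carried over to a language without sparse support. The helper name csr_to_dense is hypothetical (it is not part of scipy); the arrays are the ones from the session above.

```python
import numpy as np
from scipy import sparse

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], dtype=np.int32)
indices = np.array([1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3], dtype=np.int32)
indptr = np.array([0, 3, 7, 11], dtype=np.int32)
shape = (3, 4)


def csr_to_dense(data, indices, indptr, shape):
    """Hypothetical helper: expand CSR arrays into a dense 2-D array."""
    dense = np.zeros(shape, dtype=data.dtype)
    for row in range(shape[0]):
        # The entries of this row are data[indptr[row]:indptr[row + 1]];
        # indices[k] gives the column of data[k].
        for k in range(indptr[row], indptr[row + 1]):
            dense[row, indices[k]] += data[k]  # += so duplicate entries are summed
    return dense


dense = csr_to_dense(data, indices, indptr, shape)
M = sparse.csr_matrix((data, indices, indptr), shape=shape)
matches_scipy = np.array_equal(dense, M.toarray())
print(dense)
print(matches_scipy)
```

So indptr only says where each row starts and ends inside data and indices; the data array on its own carries no position information.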

Now try multiplying the .data attribute:

 In [172]: M.data *= 3 

Lo and behold, we have multiplied the 'whole' array:

 In [173]: M.A
 Out[173]:
 array([[ 0,  3,  6,  9],
        [12, 15, 18, 21],
        [24, 27, 30, 33]], dtype=int32)

Of course, we can also multiply the matrix directly. That is, multiplication by a constant is defined for sparse csr matrices:

 In [174]: M *= 2
 In [175]: M.A
 Out[175]:
 array([[ 0,  6, 12, 18],
        [24, 30, 36, 42],
        [48, 54, 60, 66]], dtype=int32)
 In [176]: M.data
 Out[176]: array([ 6, 12, 18, 24, 30, 36, 42, 48, 54, 60, 66], dtype=int32)

Out of curiosity, look at the source array. It has changed too. So M.data points to the same array: change one and you change the other.

 In [177]: data Out[177]: array([ 6, 12, 18, 24, 30, 36, 42, 48, 54, 60, 66], dtype=int32) 

So, when a matrix is created in this way, it can be multiplied by a scalar in several ways.
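As a small check of the chain of equalities asked about in the question, the sketch below compares three routes to the scaled matrix. It reuses the example arrays from this answer; the variable names via_sparse, via_dense and via_data are only for illustration.

```python
import numpy as np
from scipy import sparse

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], dtype=np.int32)
indices = np.array([1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3], dtype=np.int32)
indptr = np.array([0, 3, 7, 11], dtype=np.int32)
M = sparse.csr_matrix((data, indices, indptr), shape=(3, 4))

# Route 1: scale the sparse matrix, then densify.
via_sparse = (M * 5).toarray()
# Route 2: densify, then scale.
via_dense = M.toarray() * 5
# Route 3: scale only the stored values and reuse the same index arrays.
scaled_data = M.data * 5
via_data = sparse.csr_matrix((scaled_data, M.indices, M.indptr), shape=M.shape).toarray()

all_equal = np.array_equal(via_sparse, via_dense) and np.array_equal(via_dense, via_data)
print(all_equal)
print(scaled_data.shape)  # 1-D, one entry per stored value, not a matrix
```

Note that data * 5 by itself is only a flat array of the stored values. It equals the scaled matrix only once it is combined again with indices and indptr.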

Which is best? Multiplying the .data attribute directly can be faster than multiplying the matrix. But you must be aware of the differences between manipulating .data directly and using the mathematical operations defined for the whole matrix. For example, M*N performs matrix multiplication. You really need to understand the data structure of the matrix before you try to change its internals directly.
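For the matrix product, multiplying the flat data array by another matrix is not equivalent; the column indices and row pointers have to be used as well. The following is a rough sketch of a CSR-times-dense product written with plain loops, so that it could be translated to a library without sparse types. The helper csr_matmul_dense and the test matrix N are made up for this illustration.

```python
import numpy as np
from scipy import sparse


def csr_matmul_dense(data, indices, indptr, n_rows, N):
    """Hypothetical helper: compute (CSR matrix) @ N for a dense 2-D array N."""
    out = np.zeros((n_rows, N.shape[1]), dtype=np.result_type(data, N))
    for row in range(n_rows):
        for k in range(indptr[row], indptr[row + 1]):
            # The stored value data[k] sits at column indices[k],
            # so it scales row indices[k] of N.
            out[row, :] += data[k] * N[indices[k], :]
    return out


data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], dtype=np.int32)
indices = np.array([1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3], dtype=np.int32)
indptr = np.array([0, 3, 7, 11], dtype=np.int32)
N = np.arange(8).reshape(4, 2)  # arbitrary dense right-hand matrix

M = sparse.csr_matrix((data, indices, indptr), shape=(3, 4))
product_ok = np.array_equal(csr_matmul_dense(data, indices, indptr, 3, N),
                            M.toarray() @ N)
print(product_ok)

# Special case resembling the question: all-ones data and
# indptr = arange(n + 1), i.e. exactly one 1 per row.
ones = np.ones(3)
sel = np.array([2, 0, 3])
ptr = np.arange(3 + 1)
selection_ok = np.array_equal(csr_matmul_dense(ones, sel, ptr, 3, N), N[sel])
print(selection_ok)
```

In that special case the product reduces to picking rows of N in the order given by indices, which may be the simplest thing to port.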

The ability to modify data, the original array, depends on creating the matrix in this way and keeping that reference. If you had defined it using coo (or coo-style inputs), the link to data would not be preserved. And M1 = M*2 is not going to pass this link on to M1.

First get the code working with the normal sparse math operations. Later, if you still need more speed, you can dig into the internals and optimize selected operations.
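One way to see when the link exists is np.shares_memory. The sketch below uses a small made-up matrix and assumes the constructor's default copy=False; the behaviour of the first check is what the session above showed, but it may depend on dtypes and on the scipy version.

```python
import numpy as np
from scipy import sparse

data = np.array([1, 2, 3, 4], dtype=np.int32)
indices = np.array([0, 2, 1, 2], dtype=np.int32)
indptr = np.array([0, 2, 3, 4], dtype=np.int32)

# CSR-style inputs: the matrix can keep a reference to the arrays it was given.
M = sparse.csr_matrix((data, indices, indptr), shape=(3, 3))
shared_csr = np.shares_memory(M.data, data)

# coo-style inputs (values plus row/column coordinates): the CSR arrays are
# built during conversion, so M_coo.data is a new array.
rows = np.array([0, 0, 1, 2], dtype=np.int32)
M_coo = sparse.csr_matrix((data, (rows, indices)), shape=(3, 3))
shared_coo = np.shares_memory(M_coo.data, data)

# A scalar product returns a new matrix with its own data array.
M1 = M * 2
shared_product = np.shares_memory(M1.data, data)

print(shared_csr, shared_coo, shared_product)
```

If the aliasing is unwanted, passing copy=True to the constructor (or passing data.copy()) keeps the matrix independent of the source array.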


Source: https://habr.com/ru/post/989692/

