Setting elements of the .data attribute to zero causes unpleasant behavior in scipy.sparse

I get unpleasant behavior when I set values in the .data of a csr_matrix to zero. Here is an example:

    from scipy import sparse
    a = sparse.csr_matrix([[0,0,2,0], [1,1,0,0],[0,3,0,0]])

Output:

    >>> a.toarray()
    array([[0, 0, 2, 0],
           [1, 1, 0, 0],
           [0, 3, 0, 0]])
    >>> a.data
    array([2, 1, 1, 3])
    >>> a.data[3] = 0     # setting one element to zero
    >>> a.toarray()
    array([[0, 0, 2, 0],
           [1, 1, 0, 0],
           [0, 0, 0, 0]])
    >>> a.data
    array([2, 1, 1, 0])   # however, this zero is still considered part of data
                          # what I would like to see is:
                          # array([2, 1, 1])
    >>> a.nnz             # also `nnz` tells me that there are 4 non-zero elements
                          # which is incorrect, I would like 3 as an output
    4
    >>> a.nonzero()       # nonzero method does follow the behavior I expected
    (array([0, 1, 1], dtype=int32), array([2, 0, 1], dtype=int32))

What is the best practice in the above situation? Should I avoid setting .data elements to zero? Is .nnz the wrong way to find the number of non-zero elements?

1 answer

Sparse matrices in scipy (at least CSC and CSR) have a .eliminate_zeros() method to handle these situations. Run

 a.eliminate_zeros() 

every time you modify a.data, and it should take care of that.
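As a minimal sketch of that advice, reusing the matrix from the question: `.nnz` reports the number of stored entries, so it still counts a zero written directly into `.data`, while `count_nonzero()` looks at the actual values. The variable names `stored_before`, `true_nonzeros`, `stored_after` and `data_after` are only illustrative and do not come from the original post.

```python
from scipy import sparse

a = sparse.csr_matrix([[0, 0, 2, 0], [1, 1, 0, 0], [0, 3, 0, 0]])
a.data[3] = 0  # leaves an explicitly stored zero behind

stored_before = a.nnz              # number of stored entries, explicit zero included
true_nonzeros = a.count_nonzero()  # number of entries whose value is really non-zero

a.eliminate_zeros()                # in place: drops explicitly stored zeros

stored_after = a.nnz               # now matches the true non-zero count
data_after = a.data.tolist()       # the zeroed entry is gone from .data
```

If you only need the count and do not want to modify the matrix, `a.count_nonzero()` alone should be enough; `eliminate_zeros()` is the one to use when `.data`, `.indices` and `.indptr` themselves have to be cleaned up.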


Source: https://habr.com/ru/post/1262424/

