Zeroing elements of a sparse matrix based on another matrix (Matrix Package)

I have W, a sparse binary matrix with 4 million rows, and I use the Matrix package. I would like to compute the following:

    W2 = W %*% W            # W2 becomes a dgCMatrix
    W2@x[W2@x > 1] = 1
    W2 = W2 - W
    W2@x[W2@x < 0] = 0

Unfortunately, the third line of this operation completely overwhelms my computer. I can compute lines (1) and (2) just fine, but when I try to compute line (3), R requires far more RAM than I have available. I am sure that W2 - W is sparser than W2.

Is there a vectorized algorithm that zeroes out the positions of W2 where W equals 1? Is there an efficient way to implement this in R?

1 answer

I assume the matrix is 4,000,000 x 4,000,000; otherwise line 1 would fail with the error "Internal dimensions A and B must match."

I'm having difficulty replicating your issues. See below.

    > library(Matrix)
    > W<-rsparsematrix(nrow=4000000,ncol=4000000,density = .0000001)
    > W<-W>0
    > str(W)
    Formal class 'lgCMatrix' [package "Matrix"] with 6 slots
      ..@ i       : int [1:1600000] 623428 717198 3216269 3398149 3888958 3970651 3106201 61257 370389 3031066 ...
      ..@ p       : int [1:4000001] 0 2 3 3 4 5 6 6 6 7 ...
      ..@ Dim     : int [1:2] 4000000 4000000
      ..@ Dimnames:List of 2
      .. ..$ : NULL
      .. ..$ : NULL
      ..@ x       : logi [1:1600000] TRUE FALSE TRUE TRUE FALSE TRUE ...
      ..@ factors : list()
    > W2 <- W %*% W
    > str(W2)
    Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
      ..@ i       : int [1:638322] 908991 1031349 2979756 1924552 3421130 992757 1375889 2872056 3161609 3389210 ...
      ..@ p       : int [1:4000001] 0 0 0 0 0 0 0 0 0 0 ...
      ..@ Dim     : int [1:2] 4000000 4000000
      ..@ Dimnames:List of 2
      .. ..$ : NULL
      .. ..$ : NULL
      ..@ x       : num [1:638322] 1 0 0 0 0 1 1 1 1 0 ...
      ..@ factors : list()
    > W2@x [ W2@x > 1 ] = 1
    > W2 = W2 - W
    > W2@x [ W2@x < 0 ] = 0
    > str(W2)
    Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
      ..@ i       : int [1:2238320] 623428 717198 3216269 3398149 3888958 3970651 3106201 61257 370389 908991 ...
      ..@ p       : int [1:4000001] 0 2 3 3 4 5 6 6 6 7 ...
      ..@ Dim     : int [1:2] 4000000 4000000
      ..@ Dimnames:List of 2
      .. ..$ : NULL
      .. ..$ : NULL
      ..@ x       : num [1:2238320] 0 0 0 0 0 0 0 0 0 1 ...
      ..@ factors : list()

Notably, your line 2 does nothing in my example, because W %*% W returns only 1s and 0s.
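In the session above, the final W2 stores 2,238,320 entries while W2 held only 638,322 before the subtraction. This appears to be because the sparse subtraction stores every position that is nonzero in either operand, and assigning 0 into `W2@x` leaves those zeros stored explicitly. If the subtraction itself is the bottleneck, one possible way around it is to filter the (row, column) triplets of W2 directly. The following is only a minimal sketch, not a tested solution for the 4-million-row case: `keep_outside` is a hypothetical helper name, and it assumes both matrices are numeric sparse matrices (such as dgCMatrix) with identical dimensions.

```r
library(Matrix)

# Hypothetical helper (not part of the Matrix package): return a 0/1 sparse
# matrix that is 1 where A is nonzero and B is zero, without forming A - B.
keep_outside <- function(A, B) {
  stopifnot(all(dim(A) == dim(B)))
  a <- summary(drop0(A))      # triplets (i, j, x), 1-based, stored zeros dropped
  b <- summary(drop0(B))
  n <- as.numeric(nrow(A))    # double, so the keys below cannot overflow integers
  key_a <- (a$j - 1) * n + a$i  # one unique number per (row, column) position
  key_b <- (b$j - 1) * n + b$i
  keep <- !(key_a %in% key_b)   # positions of A that are not set in B
  sparseMatrix(i = a$i[keep], j = a$j[keep],
               x = rep(1, sum(keep)), dims = dim(A))
}

# Small demonstration on a binary 200 x 200 matrix (sizes chosen arbitrarily).
set.seed(42)
W   <- rsparsematrix(200, 200, density = 0.02, rand.x = function(n) rep(1, n))
W2  <- W %*% W
res <- keep_outside(W2, W)
```

The result stores only the surviving positions, so it never holds more entries than W2 does. If you prefer to keep the original three lines, calling `drop0()` on the final matrix removes the explicitly stored zeros afterwards, although it does not reduce the peak memory used during the subtraction.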


Source: https://habr.com/ru/post/1383952/

