Creating a sparse matrix from a list of sparse vectors

I have a list of sparse vectors (in R) and need to convert it to a sparse matrix. Filling the matrix row by row in a for loop takes a lot of time:

    sm <- spMatrix(length(tc2), n.col)
    for (i in 1:length(tc2)) {
      sm[i, ] <- (tc2[i])[[1]]
    }

Is there a better way?

3 answers

Here is a two-step solution:

  • Use lapply() and as(..., "sparseMatrix") to convert the list of sparseVectors to a list of one-column sparseMatrices.

  • Use do.call() and cBind() to combine the one-column sparseMatrices into a single sparseMatrix.


    require(Matrix)

    # Create a list of sparseVectors
    ss <- as(c(0, 0, 3, 3.2, 0, 0, 0, -3), "sparseVector")
    l <- replicate(3, ss)

    # Combine the sparseVectors into a single sparseMatrix
    l <- lapply(l, as, "sparseMatrix")
    do.call(cBind, l)
    # 8 x 3 sparse Matrix of class "dgCMatrix"
    #
    # [1,]  .    .    .
    # [2,]  .    .    .
    # [3,]  3.0  3.0  3.0
    # [4,]  3.2  3.2  3.2
    # [5,]  .    .    .
    # [6,]  .    .    .
    # [7,]  .    .    .
    # [8,] -3.0 -3.0 -3.0
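
In more recent releases of the Matrix package, cBind() is, as far as I know, deprecated in favour of plain cbind(), which dispatches to the Matrix methods on R 3.2.0 or later. The following is a minimal sketch of the same two steps with cbind(), plus a transpose for the case in the question where each vector should become a row. The names vecs, cols, m and m_rows are only illustrative:

```r
library(Matrix)

# Illustrative input: three identical sparse vectors of length 8
ss <- as(c(0, 0, 3, 3.2, 0, 0, 0, -3), "sparseVector")
vecs <- rep(list(ss), 3)

# Step 1: coerce each sparseVector to a one-column sparse matrix
cols <- lapply(vecs, as, "sparseMatrix")

# Step 2: bind the columns; plain cbind() dispatches to the Matrix methods
m <- do.call(cbind, cols)   # 8 x 3

# The question stores each vector as a row, so transpose the result
m_rows <- t(m)              # 3 x 8
```

If the transpose turns out to be expensive for very large inputs, building the matrix directly from row and column indices avoids it.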

Thanks to Josh O'Brien for suggesting this solution: build three vectors (row indices, column indices and values) and then create the matrix from them with sparseMatrix(). I include the code here:

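    # Note: the use of the @j slot and the "+ 1" offset implies that each
    # element of vectorList is a one-row sparse matrix in triplet form
    # (0-based @i and @j slots), not a Matrix sparseVector.
    vectorList2Matrix <- function(vectorList) {
      # Total number of non-zero entries across all elements
      nz <- sum(vapply(vectorList, function(x) length(x@j), integer(1)))
      r <- vector(mode = "integer", length = nz)  # row indices
      c <- vector(mode = "integer", length = nz)  # column indices
      v <- vector(mode = "numeric", length = nz)  # values
      ind <- 1
      for (i in seq_along(vectorList)) {
        ln <- length(vectorList[[i]]@i)
        if (ln > 0) {
          r[ind:(ind + ln - 1)] <- i
          c[ind:(ind + ln - 1)] <- vectorList[[i]]@j + 1
          v[ind:(ind + ln - 1)] <- vectorList[[i]]@x
          ind <- ind + ln
        }
      }
      # Dimensions are inferred from the largest row and column index
      return(sparseMatrix(i = r, j = c, x = v))
    }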
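
For a list of actual Matrix sparseVector objects, the same triplet idea can be written without an explicit loop. This is only a sketch under the assumption that every element is a numeric sparseVector (with @i and @x slots) of the same length. The helper name sv_rbind is hypothetical and not part of the Matrix package:

```r
library(Matrix)

# Hypothetical helper: bind a list of numeric sparseVectors as the ROWS
# of one sparse matrix, using (row, column, value) triplets.
sv_rbind <- function(vec_list) {
  # All vectors must share one length, which becomes the column count
  n_col <- unique(vapply(vec_list, function(v) length(v), numeric(1)))
  stopifnot(length(n_col) == 1)
  # Number of stored (non-zero) entries in each vector
  nnz <- vapply(vec_list, function(v) length(v@i), integer(1))
  sparseMatrix(
    i = rep(seq_along(vec_list), times = nnz),  # row = position in the list
    j = unlist(lapply(vec_list, slot, "i")),    # column = 1-based index in the vector
    x = unlist(lapply(vec_list, slot, "x")),    # stored values
    dims = c(length(vec_list), n_col)           # explicit, so empty trailing columns survive
  )
}

ss <- as(c(0, 0, 3, 3.2, 0, 0, 0, -3), "sparseVector")
m <- sv_rbind(list(ss, ss, ss))   # 3 x 8 sparse matrix
```

Passing dims explicitly keeps trailing all-zero rows and columns, which would otherwise be dropped when the dimensions are inferred from the indices.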

This scenario, cbind-ing a bunch of vectors, is perfectly suited to writing the data directly into a sparse, column-oriented matrix (dgCMatrix class).

Here is a function that does this:

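    sv.cbind <- function (...) {
      # Coerce every argument to a numeric sparse vector
      input <- lapply(list(...), as, "dsparseVector")
      # All vectors must have the same length
      thelength <- unique(sapply(input, length))
      stopifnot(length(thelength) == 1)
      # Concatenate values and row indices; p holds the column pointers
      return(sparseMatrix(
        x = unlist(lapply(input, slot, "x")),
        i = unlist(lapply(input, slot, "i")),
        p = c(0, cumsum(sapply(input, function(x) { length(x@x) }))),
        dims = c(thelength, length(input))
      ))
    }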
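
The p argument holds the 0-based column pointers of the compressed column format: the difference between consecutive entries is the number of stored values in that column. A small sketch with made-up data (the names nnz_per_col, p, i, x and m are illustrative only) shows how the cumulative sum produces them:

```r
library(Matrix)

# Made-up example: number of stored entries in each column of a 4 x 3 matrix
nnz_per_col <- c(2, 0, 1)

# Column pointers: start at 0, then the running total of stored entries
p <- c(0, cumsum(nnz_per_col))   # 0 2 2 3

# Row indices (1-based) and values, listed column by column
i <- c(1, 4, 2)
x <- c(10, 20, 30)

m <- sparseMatrix(i = i, p = p, x = x, dims = c(4, 3))
```

Here column 1 receives the first two entries, column 2 stays empty because its pointer does not advance, and column 3 receives the last entry.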

From a quick test, this looks about 10 times faster than coercing each vector and then calling cBind():

    require(microbenchmark)
    xx <- lapply(1:10, function (k) {
      sparseVector(x = rep(1, 100), i = sample.int(1e4, 100), length = 1e4)
    })
    microbenchmark(
      do.call(sv.cbind, xx),
      do.call(cBind, lapply(xx, as, "sparseMatrix"))
    )
    # Unit: milliseconds
    #                                            expr       min        lq      mean   median       uq       max neval cld
    #                           do.call(sv.cbind, xx)  1.398565  1.464517  1.540172  1.49487  1.55911  3.455421   100  a
    #  do.call(cBind, lapply(xx, as, "sparseMatrix")) 16.037890 16.356268 16.956326 16.59854 17.49956 20.256253   100   b

Source: https://habr.com/ru/post/1390737/

