R: matrix for indices

Question

I have a matrix like

         [,1] [,2]
    [1,]    1    3
    [2,]    4    6
    [3,]   11   12
    [4,]   13   14

I want to convert this matrix to a vector like this:

    # indices 1-6, 11-14 = 1, gap indices 7-10 = 0
    xx <- c(1,1,1,1,1,1,0,0,0,0,1,1,1,1)

The idea: the matrix holds values from 1 to 14, and the output vector also has length 14. Treating the first column as the start and the second column as the end of a range, the ranges present in the matrix are 1-3, 4-6, 11-12 and 13-14 (or equivalently 1-6 and 11-14). I want the output vector to be 1 at those indices and 0 at the gap indices 7-10. (Thanks for editing.)

However, sometimes the last range in the matrix does not reach the end of the vector. I always know the length of the result, though; say it is 20 in this case. Then the resulting vector should look like this:

    # indices 1-6, 11-14 = 1, gap indices 7-10 = 0, indices 15-20 = 0
    xx <- c(1,1,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,0,0,0)

How can I do this without a loop? My matrix is quite long, and the loop I tried was slow.

+1

matrix r

user1938809 Jun 15 '13 at 7:21


3 answers

@Arun's answer seems better.

Now that I understand the problem (or do I?), here is a solution in base R. It relies on the idea that only contiguous runs of zeros need to be preserved.

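    find.ones <- function (mat) {
      ones <- rep(0, max(mat))                 # start with all zeros
      ones[c(mat)] <- 1                        # mark every start and end index
      ones <- paste0(ones, collapse="")        # collapse to one string, e.g. "10110100001111"
      # Fill the single 0 between a start and its end.
      # Caveat: this only covers ranges with exactly one interior index
      # (such as 1-3); a wider range such as 14-18 becomes "10001" and is
      # left unfilled.
      ones <- gsub("101", "111", ones)
      ones <- as.numeric(strsplit(ones, "")[[1]])  # back to a numeric vector
      ones
    }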

On the OP's original example:

    m <- matrix(c(1, 3, 4, 6, 11, 12, 13, 14), ncol=2, byrow=TRUE)
    find.ones(m)
    # [1] 1 1 1 1 1 1 0 0 0 0 1 1 1 1

To benchmark the solution, make the matrix large enough:

    set.seed(10)
    m <- sample.int(n=1e6, size=5e5)
    m <- matrix(sort(m), ncol=2, byrow=TRUE)
    head(m)
         [,1] [,2]
    [1,]    1    3
    [2,]    4    5
    [3,]    9   10
    [4,]   11   13
    [5,]   14   18
    [6,]   22   23
    system.time(ones <- find.ones(m))
       user  system elapsed
      1.167   0.000   1.167

+1

asb Jun 15 '13 at 7:24


Throwing this in here: it uses base R and should be fairly fast, since the inevitable loop is handled by rep:

    # length of the zero-run before each range, and of each one-run
    zero.lengths <- m[,1] - c(0, head(m[,2], -1)) - 1
    one.lengths  <- m[,2] - m[,1] + 1
    # alternate 0/1 values, each repeated by its run length
    rep(rep(c(0, 1), nrow(m)), as.vector(rbind(zero.lengths, one.lengths)))
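The rep approach stops at the last range end, so it does not by itself cover the question's second case, where the final length is known to be 20. A minimal sketch of one way to pad it, assuming the ranges are sorted and non-overlapping and that the known length n is at least the last end value (the names n, values, times and padded are illustrative, not from the answer):

```r
m <- matrix(c(1, 3, 4, 6, 11, 12, 13, 14), ncol = 2, byrow = TRUE)
n <- 20  # known final length, taken from the question's second example

# Run lengths as in the answer above.
zero.lengths <- m[, 1] - c(0, head(m[, 2], -1)) - 1
one.lengths  <- m[, 2] - m[, 1] + 1

# Interleave zero-runs and one-runs, then append one trailing zero-run
# that covers everything between the last range end and n.
values <- c(rep(c(0, 1), nrow(m)), 0)
times  <- c(as.vector(rbind(zero.lengths, one.lengths)), n - m[nrow(m), 2])
padded <- rep(values, times)
```

With n equal to the last end value, the trailing run has length zero and the result matches the unpadded version.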

Or another solution using sequence:

    out <- integer(m[length(m)])  # or `integer(20)` following OP edit.
    one.starts <- m[,1]
    one.lengths <- m[,2] - m[,1] + 1
    # every index covered by a range: start, start+1, ..., end
    one.idx <- sequence(one.lengths) + rep(one.starts, one.lengths) - 1L
    out[one.idx] <- 1L
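For comparison, here is a different base R sketch that is not taken from any of the answers: mark +1 at each range start and -1 just past each range end, then take a running sum. The helper name ranges.to.indicator and its n argument are hypothetical, and the sketch assumes all start and end values are positive integers.

```r
# Hypothetical helper, not part of the original answers.
ranges.to.indicator <- function(m, n = max(m)) {
  # +1 where a range starts, -1 at the position just after it ends.
  # One extra bin (n + 1) holds the -1 for a range ending exactly at n.
  delta <- tabulate(m[, 1], nbins = n + 1L) -
           tabulate(m[, 2] + 1L, nbins = n + 1L)
  # The running sum is positive exactly at indices inside some range.
  as.integer(cumsum(delta)[seq_len(n)] > 0L)
}

m <- matrix(c(1, 3, 4, 6, 11, 12, 13, 14), ncol = 2, byrow = TRUE)
short <- ranges.to.indicator(m)          # length 14, ends at the last range
long  <- ranges.to.indicator(m, n = 20)  # padded with zeros up to length 20
```

Because the running sum counts how many ranges are open at each index, this version should also tolerate overlapping ranges, which the run-length approach does not.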

+1

flodel Jun 15 '13 at 11:49


Arun · Accepted Answer · 2013-06-15T07:47:43+0000

Here is the answer using the IRanges package:

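    require(IRanges)
    # here xx is the two-column input matrix of start/end positions
    xx.ir <- IRanges(start = xx[,1], end = xx[,2])
    as.vector(coverage(xx.ir))
    # [1] 1 1 1 1 1 1 0 0 0 0 1 1 1 1

If you want to specify the min and max values that bound the full length of your vector, then:

    max.val <- 20
    min.val <- 1
    # pad with zeros before min.val (none here) and after the last range end
    c(rep(0, min.val-1), as.vector(coverage(xx.ir)), rep(0, max.val-max(xx)))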
