Sum of the lengths of the intervals in an integer vector

Let's say I have this integer vector:

 > int.vec
 [1]  1  2  3  5  6  7 10 11 12 13

(created with int.vec <- c(1:3, 5:7, 10:13))

I am looking for a function that will return the sum of the lengths of all the intervals in this vector.

So basically for int.vec this function will return:

 3+3+4 = 10 
3 answers

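We can create a grouping variable by taking the difference of adjacent elements, checking whether it is not equal to 1 (which marks the start of a new interval), and taking the cumsum of the result. Then we use tapply to get the length of each group and sum those lengths.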

 sum(tapply(int.vec, cumsum(c(TRUE, diff(int.vec) != 1)), FUN = length))
 #[1] 10
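To make the grouping step concrete, here is a minimal sketch of the intermediate values for the example vector. The names d, starts, grp and total are introduced here only for illustration.

```r
int.vec <- c(1:3, 5:7, 10:13)

# Differences between adjacent elements
d <- diff(int.vec)             # 1 1 2 1 1 3 1 1 1

# TRUE marks the first element of each interval
starts <- c(TRUE, d != 1)

# Running count of interval starts = interval id for every element
grp <- cumsum(starts)          # 1 1 1 2 2 2 3 3 3 3

# Count the elements in each group, then add the counts up
total <- sum(tapply(int.vec, grp, FUN = length))
total                          # 10
```

Each jump larger than 1 in d bumps the group id by one, so every interval ends up with its own id.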

Or use table and sum:

 sum(table(int.vec, cumsum(c(TRUE, diff(int.vec) != 1))))
 #[1] 10

Or we split "int.vec" by the grouping variable obtained from cumsum (split is very fast) and get the length of each list element using lengths, another quick option (contributed by @Frank):

 sum(lengths(split(int.vec, cumsum(c(0,diff(int.vec)>1))))) 

NOTE: No packages are used. These approaches are also useful for getting the length of each individual interval (if necessary), simply by removing the sum wrapper.
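As a small sketch of what removing the sum wrapper gives, using the split/lengths variant (grp and per.interval are names made up for this example):

```r
int.vec <- c(1:3, 5:7, 10:13)

# Interval id for every element
grp <- cumsum(c(TRUE, diff(int.vec) != 1))

# Without sum(): one length per interval, named by interval id
per.interval <- lengths(split(int.vec, grp))
per.interval
# 1 2 3
# 3 3 4

# Adding the sum back gives the total
sum(per.interval)   # 10
```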


Based on the additional insight from @Symbolix's solution, the output the OP expects is simply the length of the vector:

 NROW(int.vec)
 #[1] 10

This will also work with a data.frame. But, as I mentioned above, it seems that the OP needs to identify both the length of each interval and the overall length. This solution provides both.


The cgwtools package has a seqle function that may be useful here.

 library(cgwtools)
 int.vec <- c(1:3,5:7,10:13)
 seqle(int.vec)
 # Run Length Encoding
 #   lengths: int [1:3] 3 3 4
 #   values : int [1:3] 1 5 10

The result is a list, so you can simply extract and sum the values of "lengths" with:

 sum(seqle(int.vec)$lengths)
 # [1] 10

 length(int.vec)
 # [1] 10

Your intervals are sequences of numbers, x1:xn, x1:xm, x1:xp, where the length of each vector (or the interval in this case) is n, m and p respectively.

The length of the entire vector is length(x1:xn) + length(x1:xm) + length(x1:xp), which is the same as n + m + p.
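A quick check on the example vector; n, m and p are just the three interval lengths, computed here for illustration:

```r
int.vec <- c(1:3, 5:7, 10:13)

# Lengths of the three intervals that make up int.vec
n <- length(1:3)
m <- length(5:7)
p <- length(10:13)

n + m + p                       # 10
n + m + p == length(int.vec)    # TRUE
```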


Now, if we are really interested in the length of each individual sequence vector, we can do

 int.vec <- c(1:3,5:7,10:13)
 ## use run-length encoding (rle) to find runs where the difference == 1
 r <- rle(diff(int.vec) == 1)
 ## keep only the TRUE runs; a run of k differences spans k + 1 elements
 r$lengths[r$values] + 1
 # [1] 3 3 4
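Running rle on diff() does not report isolated values as intervals of length 1. If those should be counted too, one possible variant (a sketch, assuming the vector is strictly increasing with no duplicates) is to apply rle to int.vec - seq_along(int.vec), which stays constant within each run of consecutive integers. The vector x and the names run.x and run.int are made up for this example.

```r
# Made-up example: 1 and 8 are isolated values, 3:4 is a run of two
x <- c(1, 3, 4, 8)
run.x <- rle(x - seq_along(x))$lengths
run.x       # 1 2 1

# Same idea on the vector from the question
int.vec <- c(1:3, 5:7, 10:13)
run.int <- rle(int.vec - seq_along(int.vec))$lengths
run.int     # 3 3 4
```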

And as @AHandcartAndMohair pointed out, if you are working with a list, you can use lengths:

 int.list <- list(c(1:3), c(5:7), c(10:13))
 lengths(int.list)
 # [1] 3 3 4

Source: https://habr.com/ru/post/1245450/

