Collapse the data table.

I have a data table:

> (mydt <- data.table(id=c(1,1,1,1,2,2), time=1:6, v1=letters[1:6], v2=LETTERS[1:6], key=c("id","time"))) id time v1 v2 1: 1 1 a A 2: 1 2 b B 3: 1 3 c C 4: 1 4 d D 5: 2 5 e E 6: 2 6 f F 

I want to “collapse this” (is this the correct term here?), To the “change” table: object 1 changed 3 times (from timestamps from 1 to 2, from 2 to 3 and from 3 to 4), object 2 changed once ( time 5-6); I'm interested in initial v1 and final v2 . So, the result should be:

 > (res <- data.table(beg.time=c(1,2,3,5), end.time=c(2,3,4,6), v1=c('a','b','c','e'), v2=c('B','C','D','F'), key=c("beg.time","end.time"))) beg.time end.time v1 v2 1: 1 2 a B 2: 2 3 b C 3: 3 4 c D 4: 5 6 e F 
+6
source share
1 answer

Thanks for the reproducible example! Here is a shot at him.

First, note that you can use the following head-tail idiom to place vector entries that are spaced apart from each other:

 x <- letters[1:5] cbind(head(x, -1), tail(x, -1)) # [,1] [,2] # [1,] "a" "b" # [2,] "b" "c" # [3,] "c" "d" # [4,] "d" "e" cbind(head(x, -2), tail(x, -2)) # [,1] [,2] # [1,] "a" "c" # [2,] "b" "d" # [3,] "c" "e" 

Then we can use the by data.table functionality to perform this operation on a group.

 mydt[,{ ## if there just one row in the group of ID's, return nothing if (.N == 1) return(NULL) else { list( ## head and tail take the first and last parts of a vector ## this will place an element next to its subsequent element beg.time = head(time, -1), end.time = tail(time, -1), v1 = head(v1, -1), v2 = tail(v2, -1) ## group by ID )}}, by = id] # id beg.time end.time v1 v2 # 1: 1 1 2 a B # 2: 1 2 3 b C # 3: 1 3 4 c D # 4: 2 5 6 e F 
+8
source

Source: https://habr.com/ru/post/954001/


All Articles