How can I merge setdiff using data.table?

Let's say I have two tables:

library(data.table) set.seed(1) tab1 <- data.table( let = rep(letters[1:2], each = 3), num = rep(1:3, 2), val = rnorm(6), key = c("let", "num") ) tab2 <- data.table( let = rep(letters[1:2], each = 2), num = rep(1:2, 2), val = rnorm(4), key = c("let", "num") ) 

Table 1:

 > tab1 let num val 1: a 1 -0.6264538 2: a 2 0.1836433 3: a 3 -0.8356286 4: b 1 1.5952808 5: b 2 0.3295078 6: b 3 -0.8204684 

Table 2:

 > tab2 let num 1: a 1 2: a 2 3: b 1 4: b 2 

Is there a way to “merge” these tables to get all the results in tab1 that are not in tab2 ?

  let num val 1: a 3 -0.8356286 2: b 3 -0.8204684 
+5
source share
2 answers

In this case, it is equivalent to an anti-compound:

 tab1[!tab2, on=c("let", "num")] 

But setdiff() will only be the first line for each let,num . This is noted for v1.9.8, FR # 547 .

+11
source

One solution would be to merge and delete rows where there are values ​​from tab2

 d<-as.data.frame(merge(tab1,tab2,all=T)) t<-is.na(d[,4]) d[t,][,-4] let num val.x 3 a 3 -0.8356286 6 b 3 -0.8204684 

Using data.table :

 merge(tab1,tab2,all=T)[is.na(val.y),1:3,with=F] let num val.x 1: a 3 -0.8356286 2: b 3 -0.8204684 
0
source

Source: https://habr.com/ru/post/1235760/


All Articles