R data.table J behavior

I'm still puzzled by the behavior of data.table's J.

    > DT = data.table(A=7:3, B=letters[5:1])
    > DT
       A B
    1: 7 e
    2: 6 d
    3: 5 c
    4: 4 b
    5: 3 a
    > setkey(DT, A, B)
    > DT[J(7,"e")]
       A B
    1: 7 e
    > DT[J(7,"f")]
       A B
    1: 7 f    # <- there is no such row in DT

But DT contains no such row. Why do we get this result?

2 answers

J(7, 'f') is literally a single-row data.table that you are joining to your own data.table. When you call x[i], data.table looks at each row in i and finds all matches for it in x. By default, NA is used for rows in i that don't match anything, which is easier to see after adding another column to DT:

    DT <- data.table(A=7:3, B=letters[5:1], C=letters[1:5])
    setkey(DT, A, B)
    DT[J(7,"f")]
    #    A B  C
    # 1: 7 f NA

What you see is the single row of J, which doesn't match anything in DT. To stop data.table from returning unmatched rows, use nomatch=0:

    DT[J(7,"f"), nomatch=0]
    # Empty data.table (0 rows) of 3 cols: A,B,C
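The row-per-row-of-i behavior is easier to see when i has both a matching and a non-matching row. The following is a minimal sketch under that assumption; the table name `lookup` and the result names are made up for illustration and are not part of the original question.

```r
library(data.table)

DT <- data.table(A = 7:3, B = letters[5:1], C = letters[1:5])
setkey(DT, A, B)

# Hypothetical lookup table: the first row exists in DT, the second does not.
lookup <- data.table(A = c(7L, 7L), B = c("e", "f"))

# Default (nomatch = NA): one result row per row of `lookup`,
# with NA in C for the row that has no match in DT.
res_all <- DT[lookup]

# nomatch = 0L: rows of `lookup` without a match are dropped.
res_matched <- DT[lookup, nomatch = 0L]

print(res_all)
print(res_matched)
```

Passing a plain data.table as i should behave the same way as J(...) here, since J(...) only builds that table inline.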

Perhaps adding an extra column sheds light on what happens.

    DT[, C:=paste0(A, B)]
    DT[J(7,"e")]
    ###    A B  C
    ### 1: 7 e 7e
    DT[J(7,"f")]
    ###    A B  C
    ### 1: 7 f NA

This is the same behavior as without J:

    setkey(DT, B)
    DT["a"]
    ###    B A  C
    ### 1: a 3 3a
    DT["A"]
    ###    B  A  C
    ### 1: A NA NA

You can use the nomatch argument to change this behavior.

    setkey(DT, A, B)   # restore the two-column key used with J(7,"f")
    DT[J(7,"f"), nomatch=0L]
    ### Empty data.table (0 rows) of 3 cols: A,B,C
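If the goal is to find out which keys actually exist in DT rather than to hide the unmatched rows, counting matches per row of i is one option. This is a sketch only, assuming that `.N` grouped with `by = .EACHI` reports zero for rows of i that have no match; `lookup` and `counts` are illustrative names.

```r
library(data.table)

DT <- data.table(A = 7:3, B = letters[5:1])
setkey(DT, A, B)

# Hypothetical keys to check: (7, "e") is in DT, (7, "f") is not.
lookup <- data.table(A = c(7L, 7L), B = c("e", "f"))

# One output row per row of `lookup`; N is the number of matching rows in DT.
counts <- DT[lookup, .N, by = .EACHI]
print(counts)
```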

Source: https://habr.com/ru/post/969562/

