R data.table: key set by dplyr::group_by?

I use data.table and dplyr together. I recently noticed that dplyr::group_by also sets a key on a data.table object.

# R version 3.1.0    
library(data.table) # 1.9.2
library(dplyr) # 0.1.3

dt <- data.table(A=rep(c("a", "b"), times=c(2, 3)), B = rep(1, 5))
tables()
#      NAME NROW MB COLS KEY
# [1,] dt      5  1 A,B
# Total: 1MB

group_by(dt, A)
tables()
#      NAME NROW MB COLS KEY
# [1,] dt      5  1 A,B  A
# Total: 1MB

I wonder why this happens. Is it intended? As far as I know, Hadley is trying to make dplyr compatible with data.table.

(If possible, I would also like to know how the key is implemented in data.table. I am curious how setkey can change the table in place.)

Thanks.


At the request of G. Grothendieck:

library(data.table)
dt <- data.table(A = rep(c("a", "b"), times=c(2, 3)),
                 B = rep(1, 5))
dplyr::group_by(dt, A)
# Source: local data table [5 x 2]
# Groups: A
#
# Error in if (is.na(rows) || rows > getOption("dplyr.print_max")) { :
#   missing value where TRUE/FALSE needed

tables()
#      NAME NROW MB COLS KEY
# [1,] dt      5  1 A,B  A
# Total: 1MB

I use these two packages together quite often, so I would like to understand the details in order to avoid errors.
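
For reference, here is a minimal sketch of what "in place" seems to mean for setkey, using only data.table (no dplyr). It assumes a reasonably recent data.table release, and the names dt_ref, dt_orig and dt_copy are made up for illustration. As far as I understand, the key is recorded in the "sorted" attribute and the rows are reordered by reference, so no assignment is needed and the object keeps its address; working on a copy() leaves the original untouched.

```r
library(data.table)

# Hypothetical example table, deliberately not sorted by A
dt_ref <- data.table(A = c("b", "a", "b", "a", "b"), B = 1:5)
addr_before <- address(dt_ref)  # memory address of the object
key(dt_ref)                     # NULL: no key yet

# setkey() sorts the rows and marks the key by reference (no `<-` needed)
setkey(dt_ref, A)
key(dt_ref)                     # "A"
attr(dt_ref, "sorted")          # "A": the key is stored as an attribute
addr_after <- address(dt_ref)
identical(addr_before, addr_after)  # TRUE: same object, modified in place

# Keying a copy leaves the original table unkeyed and unsorted
dt_orig <- data.table(A = c("b", "a"), B = 1:2)
dt_copy <- copy(dt_orig)        # deep copy; plain `<-` would share the data
setkey(dt_copy, A)
key(dt_orig)                    # NULL
key(dt_copy)                    # "A"
```

If group_by really does call setkey internally, passing copy(dt) instead of dt should keep the original table unkeyed. That is an assumption on my part, not something confirmed for dplyr 0.1.3.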


Source: https://habr.com/ru/post/1538189/