Calling unique on a keyed data.table returns one row per group. When a group has duplicate rows, the first one is kept. When I need the last one instead (typically the most recent transaction), I use .SD[.N]:
    library(data.table)
    library(microbenchmark)

    dt <- data.table(id=sample(letters, 10000, T), var=rnorm(10000), key="id")

    microbenchmark(unique(dt), dt[, .SD[.N], by=id])

    Unit: microseconds
                       expr      min        lq    median       uq        max neval
                 unique(dt)  570.882  586.1155  595.8975  608.406   3209.122   100
     dt[, .SD[.N], by = id] 6532.739 6637.7745 6694.3820 6776.968 208264.433   100
Do you know a faster way to do the same?