A bit late to the party, but with dplyr the situation may be a little different. Borrowing crayola's data.table solution:
library(ggplot2)        # for the diamonds dataset
library(data.table)
library(dplyr)          # tbl_dt() has since moved to the dtplyr package in newer dplyr versions
library(microbenchmark)

dat1 <- microbenchmark(
  dtbl <- data.table(diamonds)[, list(depth = mean(depth), table = mean(table)),
                               by = color][order(-depth)],
  dplyr_dtbl <- arrange(summarise(group_by(tbl_dt(diamonds), color),
                                  depth = mean(depth), table = mean(table)),
                        -depth),
  dplyr_dtfr <- arrange(summarise(group_by(tbl_df(diamonds), color),
                                  depth = mean(depth), table = mean(table)),
                        -depth),
  times = 20,
  unit = "ms"
)
The results show that dplyr on a tbl_dt is noticeably slower than the plain data.table call. However, dplyr on a data.frame is the fastest of the three:
Unit: milliseconds
             expr       min        lq    median        uq       max neval
       data.table  9.606571 10.968881 11.958644 12.675205 14.334525    20
 dplyr_data.table 13.553307 15.721261 17.494500 19.544840 79.771768    20
 dplyr_data.frame  4.643799  5.148327  5.887468  6.537321  7.043286    20
Note: I renamed the expressions so that the microbenchmark results are more readable.
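For reference, one way to get those readable names is to pass named expressions to microbenchmark(); the names then appear in the expr column of the printed results. A minimal sketch of the same benchmark with named expressions:

library(ggplot2)
library(data.table)
library(dplyr)
library(microbenchmark)

dat1 <- microbenchmark(
  data.table       = data.table(diamonds)[, list(depth = mean(depth), table = mean(table)),
                                          by = color][order(-depth)],
  dplyr_data.table = arrange(summarise(group_by(tbl_dt(diamonds), color),
                                       depth = mean(depth), table = mean(table)),
                             -depth),
  dplyr_data.frame = arrange(summarise(group_by(tbl_df(diamonds), color),
                                       depth = mean(depth), table = mean(table)),
                             -depth),
  times = 20, unit = "ms"
)
print(dat1)  # expr column now shows the chosen names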