I would like to select the row with the maximum value in each group with dplyr.
First, I generate some random data to illustrate my question:
set.seed(1)
df <- expand.grid(list(A = 1:5, B = 1:5, C = 1:5))
df$value <- runif(nrow(df))
In plyr, I can use a custom function to select this row:
library(plyr)
ddply(df, .(A, B), function(x) x[which.max(x$value),])
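To make the target output concrete, here is a minimal base R sketch of the same per-group selection. It is only meant to show the result I am after (one full row per combination of A and B, 25 rows in total); the names `pieces` and `best` are made up for illustration.

```r
# Recreate the example data from above
set.seed(1)
df <- expand.grid(list(A = 1:5, B = 1:5, C = 1:5))
df$value <- runif(nrow(df))

# Split the rows into one data frame per (A, B) combination
pieces <- split(df, list(df$A, df$B), drop = TRUE)

# From each piece keep the single row with the largest value
# (which.max returns the first maximum if there are ties)
best <- do.call(rbind, lapply(pieces, function(x) x[which.max(x$value), ]))
rownames(best) <- NULL

head(best)
```

Every row of `best` still carries column C, which is the part I cannot reproduce with summarise().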
In dplyr, the following code returns the maximum value for each group, but not the full row that contains that maximum (in this case, column C is lost):
library(dplyr)
df %>%
  group_by(A, B) %>%
  summarise(max = max(value))
How could I achieve this? Thanks for any suggestions.
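One direction that might work, as a rough sketch only: filtering within each group instead of summarising. This assumes that filter() evaluates max(value) per group on a grouped data frame, and `res` is just an illustrative name. Note that, unlike which.max(), this would keep every tied row if a group had more than one maximum.

```r
library(dplyr)

# Recreate the example data from above
set.seed(1)
df <- expand.grid(list(A = 1:5, B = 1:5, C = 1:5))
df$value <- runif(nrow(df))

# Keep, within each (A, B) group, the rows whose value equals the group maximum
res <- df %>%
  group_by(A, B) %>%
  filter(value == max(value)) %>%
  ungroup()

res
```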
sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
[5] LC_TIME=English_Australia.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] dplyr_0.2  plyr_1.8.1

loaded via a namespace (and not attached):
[1] assertthat_0.1.0.99 parallel_3.1.0      Rcpp_0.11.1
[4] tools_3.1.0
r greatest-n-per-group dplyr plyr
Bangyou, Jun 16 '14 at 6:00