I have a data frame that looks like this:
    > df <- data_frame(g = c('A', 'A', 'B', 'B', 'B', 'C'), x = c(7, 3, 5, 9, 2, 4))
    > df
    Source: local data frame [6 x 2]

      g x
    1 A 7
    2 A 3
    3 B 5
    4 B 9
    5 B 2
    6 C 4
I know how to add a column with the maximum x value for each g group:
    > df %>% group_by(g) %>% mutate(x_max = max(x))
    Source: local data frame [6 x 3]
    Groups: g

      g x x_max
    1 A 7     7
    2 A 3     7
    3 B 5     9
    4 B 9     9
    5 B 2     9
    6 C 4     4
But what I need is, for each row, the maximum x value within its g group excluding that row's own x value.
In this example, the desired result will look like this:
    Source: local data frame [6 x 4]
    Groups: g

      g x x_max x_max_exclude
    1 A 7     7             3
    2 A 3     7             7
    3 B 5     9             9
    4 B 9     9             5
    5 B 2     9             9
    6 C 4     4            NA
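To make the requirement concrete, here is a rough base-R sketch of the computation I am after. It is only meant to pin down the expected values, not to be the dplyr solution I am looking for. The helper name `max_excluding` is made up for this illustration, and I use a plain `data.frame` so the snippet runs without any packages.

```r
# Same data as above, as a plain base-R data.frame
df <- data.frame(g = c('A', 'A', 'B', 'B', 'B', 'C'),
                 x = c(7, 3, 5, 9, 2, 4))

# Hypothetical helper: for each position i, take the max of all
# other elements of x; return NA when there are no other elements
max_excluding <- function(x) {
  sapply(seq_along(x), function(i) {
    others <- x[-i]
    if (length(others) == 0) NA_real_ else max(others)
  })
}

# ave() applies the helper within each level of g and puts the
# results back in the original row order
df$x_max_exclude <- ave(df$x, df$g, FUN = max_excluding)
df
```

This loops over every element of every group, so it is neither concise nor efficient, which is why I am hoping for something better.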
I thought I could use row_number() to drop the current row's element and take the max of the remaining ones, but I get warning messages and the incorrect output -Inf:
    > df %>% group_by(g) %>% mutate(x_max = max(x), r = row_number(), x_max_exclude = max(x[-r]))
    Source: local data frame [6 x 5]
    Groups: g

      g x x_max r x_max_exclude
    1 A 7     7 1          -Inf
    2 A 3     7 2          -Inf
    3 B 5     9 1          -Inf
    4 B 9     9 2          -Inf
    5 B 2     9 3          -Inf
    6 C 4     4 1          -Inf
    Warning messages:
    1: In max(c(4, 9, 2)[-1:3]) :
      no non-missing arguments to max; returning -Inf
    2: In max(c(4, 9, 2)[-1:3]) :
      no non-missing arguments to max; returning -Inf
    3: In max(c(4, 9, 2)[-1:3]) :
      no non-missing arguments to max; returning -Inf
What is the most readable, concise, efficient way to get this output in dplyr? Any insight into why my attempt using row_number() does not work is also much appreciated. Thanks for the help.