Problem
Using dplyr::summarize_at() (or an equivalent), I would like to get a summary table in which the columns are sorted first by the grouping variables (G), then by the summarized variables (V), and finally by the order in which the functions were applied (F). By default, the columns are ordered first by G, then by F, and finally by V.
Example
The code:
library(purrr)
library(dplyr)
q025 <- partial(quantile, probs = 0.025, na.rm = TRUE)
q975 <- partial(quantile, probs = 0.975, na.rm = TRUE)
vars_to_summarize <- c("height", "mass")
my_summary <- starwars %>%
  filter(skin_color %in% c("gold", "green")) %>%
  group_by(skin_color) %>%
  summarise_at(vars_to_summarize, funs(q025, mean, q975))
Results in:
my_summary
## A tibble: 2 x 7
##   skin_color height_q025 mass_q025 height_mean mass_mean height_q975 mass_q975
##   <chr>            <dbl>     <dbl>       <dbl>     <dbl>       <dbl>     <dbl>
## 1 gold           167.000      75.0         167        75      167.00      75.0
## 2 green           79.375      22.7         169        NA      204.75     110.4
And the desired order of variables should be:
skin_color, height_q025, height_mean, height_q975, mass_q025, mass_mean, mass_q975
I would like to use something like this (naively simple) code:
my_summary %>%
  select(everything(), starts_with(vars_to_summarize))
But that will not work. Even this code does not work as I expect (although this is not the general solution I'm looking for):
my_summary %>%
  select(everything(),
         starts_with(vars_to_summarize[1]),
         starts_with(vars_to_summarize[2]))
Most likely, everything() should always be the last argument in select().
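For context, select() keeps each column at the position where it is first mentioned, so a leading everything() already fixes the original order and the helpers after it change nothing. The sketch below is only an illustration for this one example, not the general solution: it assumes a hand-built data frame, toy_summary (a hypothetical name), with the same columns and values as the printed output, and names the grouping column first, followed by one starts_with() per variable.

```r
library(dplyr)

# Toy stand-in for my_summary (hypothetical name); values copied from the printed output
toy_summary <- data.frame(
  skin_color  = c("gold", "green"),
  height_q025 = c(167, 79.375),
  mass_q025   = c(75, 22.7),
  height_mean = c(167, 169),
  mass_mean   = c(75, NA),
  height_q975 = c(167, 204.75),
  mass_q975   = c(75, 110.4)
)

# Grouping column first, then all "height" columns, then all "mass" columns.
# Within each prefix the columns keep their existing (function) order.
reordered <- toy_summary %>%
  select(skin_color, starts_with("height"), starts_with("mass"))

names(reordered)
```

This still hard-codes the grouping column and the variable prefixes, so it does not scale to arbitrary N, L and M.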
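In general, suppose there are:
- N grouping variables ("gr_"), i.e. those passed to group_by(),
- L variables to summarize ("var_"),
- M functions to apply ("fun_").
The desired order of columns is then: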
gr_1, gr_2, ..., gr_N,
var_1_fun_1, var_1_fun_2, ..., var_1_fun_M,
var_2_fun_1, var_2_fun_2, ..., var_2_fun_M,
...,
var_L_fun_1, var_L_fun_2, ..., var_L_fun_M
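To make the target pattern concrete, the expected column names can be generated in base R. This is a minimal sketch with placeholder names (group_vars, vars, fun_names and the "gr_", "var_", "fun_" labels are illustrative only), and it assumes the summary columns are named "<variable>_<function>", as in the example output.

```r
# Placeholder names: N = 2 grouping variables, L = 3 variables, M = 2 functions
group_vars <- c("gr_1", "gr_2")
vars       <- c("var_1", "var_2", "var_3")
fun_names  <- c("fun_1", "fun_2")

# outer() builds an L x M matrix of "var_fun" names (rows = variables, columns = functions)
name_grid <- outer(vars, fun_names, paste, sep = "_")

# Transposing and flattening reads the matrix row by row:
# all functions for var_1, then all functions for var_2, and so on
desired_order <- c(group_vars, as.vector(t(name_grid)))

print(desired_order)
```

A character vector like desired_order could then be used to index the summary table by column name, provided every name actually exists in it.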