Background
I do most of my reporting within a dplyr workflow, so to speed up the generation of grouped reports across multiple tables I developed a simple function that generates the required metrics:
library(dplyr)

generate_summary_tbl <- function(dataset, group_column, summary_column) {
    # Capture the bare column names so they can be unquoted with !! below
    group_column <- enquo(group_column)
    summary_column <- enquo(summary_column)
    dataset %>%
        group_by(!!group_column) %>%
        summarise(
            mean = mean(!!summary_column),
            sum = sum(!!summary_column)
        ) %>%
        ungroup -> smryDta
    return(smryDta)
}
Example
The function works as desired:
>> mtcars %>%
... generate_summary_tbl(group_column = am, summary_column = mpg)
am mean sum
<dbl> <dbl> <dbl>
1 0 17.14737 325.8
2 1 24.39231 317.1
Problem
I would like to conditionally include the name of the column that was passed via summary_column = mpg in the results.
Desired results with useColName = TRUE
When called with useColName = TRUE, the results must match:
>> mtcars %>%
... generate_summary_tbl(group_column = am, summary_column = mpg,
useColName = TRUE)
am mean_am sum_am
<dbl> <dbl> <dbl>
1 0 17.14737 325.8
2 1 24.39231 317.1
The difference is the presence of the _am suffix in the variable names: mean_am, sum_am, etc.
Ugly solution
A partial, ugly solution I have uses setNames:
generate_summary_tbl <-
    function(dataset,
             group_column,
             summary_column,
             useColName = TRUE) {
        group_column <- enquo(group_column)
        summary_column <- enquo(summary_column)
        dataset %>%
            group_by(!!group_column) %>%
            summarise(mean = mean(!!summary_column),
                      sum = sum(!!summary_column)) %>%
            ungroup -> smryDta
        if (useColName) {
            setNames(smryDta,
                     c(deparse(substitute(
                         group_column
                     )),
                     paste(
                         names(smryDta)[2:length(smryDta)], paste0("_", deparse(substitute(
                             group_column
                         )))
                     ))) -> smryDta
        }
        return(smryDta)
    }
Example
The column names come out mangled, with a stray tilde and space:
mtcars %>%
generate_summary_tbl(group_column = am, summary_column = mpg, useColName = TRUE)
`~am` `mean _~am` `sum _~am`
<dbl> <dbl> <dbl>
1 0 17.14737 325.8
2 1 24.39231 317.1
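The tilde in these names most likely comes from calling deparse(substitute()) on an argument that enquo() has already turned into a quosure, and the space comes from paste() using its default separator. As a minimal sketch of one way to get the bare name instead, rlang's as_label() can be applied to the quosure. The helper name column_label is hypothetical and only used for illustration here.

```r
library(rlang)

# Hypothetical helper (not part of the original function): returns the
# name of the captured column as a plain character string.
column_label <- function(column) {
    column <- enquo(column)  # capture the unevaluated argument as a quosure
    as_label(column)         # turn the quosure into a string without the tilde
}

column_label(am)
```

Combining that string with paste0() rather than paste() avoids the extra space in names such as `mean _~am`.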
Is there a cleaner way to do this, for example with quo or lazyeval?
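One possible direction, sketched under a few assumptions, is to build the output names as strings first and then unquote them on the left-hand side of := inside summarise(). The function name generate_summary_tbl_named is hypothetical, chosen so it does not clash with the versions above. The sketch takes the suffix from summary_column, as the problem statement describes; the example output above shows an _am suffix, so if the grouping column is what is wanted, as_label(group_column) would be used instead.

```r
library(dplyr)
library(rlang)

# Hypothetical variant of the function above; a sketch, not a definitive fix.
generate_summary_tbl_named <- function(dataset,
                                       group_column,
                                       summary_column,
                                       useColName = TRUE) {
    group_column <- enquo(group_column)
    summary_column <- enquo(summary_column)

    # Assumption: the suffix is the summary column name (e.g. "_mpg").
    suffix <- if (useColName) paste0("_", as_label(summary_column)) else ""
    mean_name <- paste0("mean", suffix)
    sum_name <- paste0("sum", suffix)

    dataset %>%
        group_by(!!group_column) %>%
        summarise(
            # !!name := value lets the column name come from a string
            !!mean_name := mean(!!summary_column),
            !!sum_name := sum(!!summary_column)
        ) %>%
        ungroup()
}

mtcars %>%
    generate_summary_tbl_named(group_column = am, summary_column = mpg,
                               useColName = TRUE)
```

With useColName = FALSE the same call falls back to the plain mean and sum column names, so no renaming step is needed after the summary.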