I am trying to get my head around the non-standard evaluation used by dplyr, but without success. I need a short function that returns summary statistics (N, mean, sd, median, IQR, min, max) for a given set of variables.
A simplified version of my function ...
my_summarise <- function(df = temp,
                         to.sum = 'eg1',
                         ...){
  results <- summarise_(df,
                        n = ~n(),
                        mean = mean(~to.sum, na.rm = TRUE))
  return(results)
}
And by running it with some dummy data ...
library(dplyr)  # needed for %>% and the summarise_() call above
set.seed(43290)
temp <- cbind(rnorm(n = 100, mean = 2, sd = 4),
              rnorm(n = 100, mean = 3, sd = 6)) %>% as.data.frame()
names(temp) <- c('eg1', 'eg2')
mean(temp$eg1)
[1] 1.881721
mean(temp$eg2)
[1] 3.575819
my_summarise(df = temp, to.sum = 'eg1')
    n mean
1 100   NA
n is calculated, but the mean is not, and I cannot understand why.
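A minimal check outside the function, offered as a sketch rather than a diagnosis of dplyr itself: it calls mean() directly on a formula, which is what `mean(~to.sum, na.rm = TRUE)` appears to do before summarise_() ever sees it, and that may be where the NA comes from. The dplyr part at the end is hypothetical: it assumes a dplyr version that supports the `.data` pronoun and is skipped if dplyr is not installed.

```r
# Dummy data in the same shape as above (base R only).
set.seed(43290)
temp <- data.frame(eg1 = rnorm(100, mean = 2, sd = 4),
                   eg2 = rnorm(100, mean = 3, sd = 6))
to.sum <- "eg1"

# Here mean() is applied to a formula object, not to the column;
# mean.default() returns NA (with a warning) for non-numeric input.
na_result <- suppressWarnings(mean(~to.sum, na.rm = TRUE))
print(is.na(na_result))

# Looking the column up by its name gives a numeric mean.
col_mean <- mean(temp[[to.sum]], na.rm = TRUE)
print(col_mean)

# Assumption: a dplyr version that provides the `.data` pronoun.
if (requireNamespace("dplyr", quietly = TRUE)) {
  res <- dplyr::summarise(temp,
                          n = dplyr::n(),
                          mean = mean(.data[[to.sum]], na.rm = TRUE))
  print(res)
}
```

If this reading is right, the column name has to be turned into a reference to the column before mean() is evaluated, rather than being wrapped in a formula inside the call.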
Ultimately, I would like my function to be more general, along these lines ...
my_summarise <- function(df = temp,
                         group.by = 'group',
                         to.sum = c('eg1', 'eg2'),
                         ...){
  results <- list()
  df <- dplyr::select_(df, .dots = c(group.by, to.sum))
  results$all <- summarise_each(df,
                                funs(n = ~n(),
                                     mean = mean(~to.sum, na.rm = TRUE)))
  results$by.group <- group_by_(df, ~to.group) %>%
    summarise_each(df,
                   funs(n = ~n(),
                        mean = mean(~to.sum, na.rm = TRUE)))
  return(results)
}
... but before moving on to this more complex version (for which I used this example as guidance), I need to get the evaluation working in the simple version first, since that is the stumbling block. The call to dplyr::select() works fine.