Using dplyr within a function, non-standard evaluation

I'm trying to get my head around the non-standard evaluation (NSE) used by dplyr, but without success. I need a short function that returns summary statistics (N, mean, sd, median, IQR, min, max) for a specified set of variables.

A simplified version of my function ...

my_summarise <- function(df = temp,
                         to.sum = 'eg1',
                         ...){
    ## Summarise
    results <- summarise_(df,
                          n = ~n(),
                          mean = mean(~to.sum, na.rm = TRUE))
    return(results)
}

Running it with some dummy data ...

library(dplyr)
set.seed(43290)
temp <- cbind(rnorm(n = 100, mean = 2, sd = 4),
              rnorm(n = 100, mean = 3, sd = 6)) %>% as.data.frame()
names(temp) <- c('eg1', 'eg2')
mean(temp$eg1)
  [1] 1.881721
mean(temp$eg2)
  [1] 3.575819
my_summarise(df = temp, to.sum = 'eg1')
    n mean
1 100   NA

N is calculated, but the mean is not, and I can't work out why.

Ultimately, I would like my function to be more general, along the lines of ...

my_summarise <- function(df = temp,
                         group.by = 'group',
                         to.sum = c('eg1', 'eg2'),
                         ...){
    results <- list()
    ## Select columns
    df <- dplyr::select_(df, .dots = c(group.by, to.sum))
    ## Summarise overall
    results$all <- summarise_each(df,
                                  funs(n = ~n(),
                                       mean = mean(~to.sum, na.rm = TRUE)))
    ## Summarise by specified group
    results$by.group <- group_by_(df, ~to.group) %>%
                        summarise_each(df,
                                       funs(n = ~n(),
                                       mean = mean(~to.sum, na.rm = TRUE)))        
    return(results)
}

... but before moving on to this more complex version (for which I used this example as guidance), I need to get the evaluation working in the simple version first, since that is the stumbling block. The call to dplyr::select() works fine.


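The basic idea is that you have to build the appropriate call yourself, which is most easily done with the lazyeval package.

In this case you want to programmatically create a call that looks like ~mean(eg1, na.rm = TRUE). This is how: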

my_summarise <- function(df = temp,
                         to.sum = 'eg1',
                         ...){
  ## Summarise
  results <- summarise_(df,
                        n = ~n(),
                        mean = lazyeval::interp(~mean(x, na.rm = TRUE),
                                                x = as.name(to.sum)))
  return(results)
}

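Here is what I do when I struggle to get things working:

  • Remember that, just like the ~n() you already supply, the call has to start with a ~.
  • Write the correct call with the actual variable name and check that it works (~mean(eg1, na.rm = TRUE)).
  • Use lazyeval::interp to recreate that call, and check it by running only the interp, so you can see what it produces.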
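The three steps above can be sketched roughly as follows. This is only an illustration, assuming lazyeval is installed and that the installed dplyr still exports the deprecated summarise_() (recent versions emit a deprecation warning). The names by.hand, built and by.interp are made up for this example.

```r
library(dplyr)
library(lazyeval)

# Same dummy data as in the question, built directly as a data frame
set.seed(43290)
temp <- data.frame(eg1 = rnorm(n = 100, mean = 2, sd = 4),
                   eg2 = rnorm(n = 100, mean = 3, sd = 6))

# Steps 1 and 2: write the call by hand, starting with ~, and check it works
by.hand <- summarise_(temp,
                      n = ~n(),
                      mean = ~mean(eg1, na.rm = TRUE))

# Step 3: recreate the same call with interp and look at it on its own
built <- interp(~mean(x, na.rm = TRUE), x = as.name("eg1"))
print(built)

# The interpolated call should give the same result as the handwritten one
by.interp <- summarise_(temp, n = ~n(), mean = built)
stopifnot(isTRUE(all.equal(by.hand, by.interp)))
```

Printing the result of interp before passing it to summarise_() is the useful habit here: if the printed formula is not what you would have typed by hand, the summary will not be either.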

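In this case, my first attempt would often be interp(~mean(x, na.rm = TRUE), x = to.sum). But running that gives ~mean("eg1", na.rm = TRUE), which treats eg1 as a character string rather than a variable name. So we use as.name, as explained in vignette("nse").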
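The difference between the two interp calls can be inspected without dplyr at all. The snippet below is a small sketch, assuming only that lazyeval is installed; as.string and as.symbol are hypothetical names used for illustration.

```r
library(lazyeval)

to.sum <- "eg1"

# Passing the string directly: the value is spliced in as a character constant
as.string <- interp(~mean(x, na.rm = TRUE), x = to.sum)

# Passing a symbol: the value is spliced in as a variable name
as.symbol <- interp(~mean(x, na.rm = TRUE), x = as.name(to.sum))

print(as.string)
print(as.symbol)

# all.vars() lists the variables a formula refers to:
# none for the string version, "eg1" for the symbol version
all.vars(as.string)
all.vars(as.symbol)
```

Only the symbol version refers to a column that summarise_() can look up in the data frame. The string version asks for the mean of the literal text "eg1", which is why the result comes back as NA.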


Source: https://habr.com/ru/post/1657593/

