Programmatically calling group_by() on a varying variable

Using dplyr, I would like to summarise by a variable that I can change (for example, in a loop or apply-style command).

Typing the names in directly works:

    library(dplyr)

    ChickWeight %>%
      group_by( Chick, Diet ) %>%
      summarise( mw = mean( weight ) )

But group_by was not written to take a character vector, so passing in the grouping variables programmatically is harder.

    v <- "Diet"
    ChickWeight %>%
      group_by( c( "Chick", v ) ) %>%
      summarise( mw = mean( weight ) )
    ## Error

I will post one solution, but I am curious to see how others have solved this.

2 answers

The dplyr underscore functions can be useful for this:

 ChickWeight %>% group_by_( "Chick", v ) %>% summarise( mw = mean( weight ) ) 

From the new features in dplyr 0.3:

You can now program with dplyr - every function that uses non-standard evaluation (NSE) also has a standard evaluation (SE) twin that ends in _. For example, the SE version of filter() is called filter_(). The SE version of each function has similar arguments, but they must be explicitly "quoted".
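The underscore verbs were deprecated in later dplyr releases in favour of tidy evaluation. As a rough sketch only, assuming dplyr 1.0.0 or newer is installed, the same grouping can be written in either of the two ways below. The names group_vars, by_across and by_syms are illustrative and not part of the original answer.

```r
library(dplyr)

# Grouping columns held as strings; v mirrors the variable from the question.
v <- "Diet"
group_vars <- c("Chick", v)

# across(all_of(...)) selects the grouping columns by name
# (assumes dplyr >= 1.0.0, where across() and .groups are available).
by_across <- ChickWeight %>%
  group_by(across(all_of(group_vars))) %>%
  summarise(mw = mean(weight), .groups = "drop")

# Alternative: convert the strings to symbols and splice them in with !!!.
by_syms <- ChickWeight %>%
  group_by(!!!rlang::syms(group_vars)) %>%
  summarise(mw = mean(weight), .groups = "drop")
```

Both forms take a plain character vector, so they fit naturally inside a loop or an apply-style call.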


Here is one solution and how I came to it.

What does group_by expect?

    > group_by
    function (x, ..., add = FALSE)
    {
        new_groups <- named_dots(...)

Down the rabbit hole:

    > dplyr:::named_dots
    function (...)
    {
        auto_name(dots(...))
    }
    <environment: namespace:dplyr>
    > dplyr:::auto_name
    function (x)
    {
        names(x) <- auto_names(x)
        x
    }
    <environment: namespace:dplyr>
    > dplyr:::auto_names
    function (x)
    {
        nms <- names2(x)
        missing <- nms == ""
        if (all(!missing))
            return(nms)
        deparse2 <- function(x) paste(deparse(x, 500L), collapse = "")
        defaults <- vapply(x[missing], deparse2, character(1), USE.NAMES = FALSE)
        nms[missing] <- defaults
        nms
    }
    <environment: namespace:dplyr>
    > dplyr:::names2
    function (x)
    {
        names(x) %||% rep("", length(x))
    }
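In short, these helpers capture the arguments unevaluated and name each unnamed one after its own deparsed text. Here is a rough base-R sketch of that behaviour. The function capture_dots is a hypothetical stand-in for dplyr's internal dots(), whose definition is not shown above, so treat it as an assumption rather than the package's actual code.

```r
# Hypothetical stand-in for dplyr's internal dots(); assumed, not copied from dplyr.
capture_dots <- function(...) eval(substitute(alist(...)))

# Same helper as the one defined inside dplyr:::auto_names above.
deparse2 <- function(x) paste(deparse(x, 500L), collapse = "")

# Capture two bare column names without evaluating them.
args <- capture_dots(Chick, Diet)

# Name each captured argument after its own source text.
arg_names <- vapply(args, deparse2, character(1), USE.NAMES = FALSE)

all(vapply(args, is.name, logical(1)))  # TRUE: the arguments are still symbols
arg_names                               # "Chick" "Diet"
```

So group_by expects unevaluated symbols, not strings and not already-evaluated columns.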

Using this information, how can we construct a solution?

    # Naive solution fails:
    ChickWeight %>%
      do.call( group_by, list( Chick, Diet ) ) %>%
      summarise( mw = mean( weight ) )

    # Slightly cleverer:
    do.call( group_by, list( x = ChickWeight, Chick, Diet, add = FALSE ) ) %>%
      summarise( mw = mean( weight ) )

    ## But still fails with,
    ## Error in do.call(group_by, list(x = ChickWeight, Chick, Diet, add = FALSE)) :
    ##   object 'Chick' not found

The solution is to quote the arguments, so that their evaluation is delayed until they are in an environment that contains the columns of the x tbl:

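    do.call( group_by, list( x = ChickWeight, quote(Chick), quote(Diet), add = FALSE ) ) %>%
      summarise( mw = mean( weight ) )
    ## Bingo!

    v <- "Diet"
    # as.name() turns the string in v into a symbol; substituting the bare
    # string would group by the constant "Diet" rather than the Diet column
    do.call( group_by, list( x = ChickWeight, quote(Chick), substitute( a, list( a = as.name( v ) ) ), add = FALSE ) ) %>%
      summarise( mw = mean( weight ) )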
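When the column name arrives as a string, it has to become a symbol before it can stand in for quote(Diet). This small base-R sketch shows the difference. The variable names are illustrative only.

```r
v <- "Diet"

# quote() captures a symbol without evaluating it.
sym_quoted <- quote(Diet)

# as.name() builds the same symbol from a string held in a variable.
sym_from_string <- as.name(v)

# substitute() with a bare string inserts the string itself, not a symbol.
still_a_string <- substitute(a, list(a = v))

is.name(sym_quoted)                     # TRUE
identical(sym_quoted, sym_from_string)  # TRUE
is.character(still_a_string)            # TRUE
```

Only the first two values refer to the Diet column when passed to group_by through do.call.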

Source: https://habr.com/ru/post/982156/
