Extract column name in mutate_if call

I would like to get the column name inside a mutate_if call. With it, I want to look up a value in a different table and use that value to fill in the missing values. I tried the quosure syntax, but it does not work. Is it possible to get the column name correctly?

Data examples

    library(dplyr)

    df <- structure(
      list(
        x = 1:10,
        y = c(1L, 2L, 3L, NA, 1L, 2L, 3L, NA, 1L, 2L),
        z = c(NA, 2L, 3L, NA, NA, 2L, 3L, NA, NA, 2L),
        a = c("a", "b", "c", "d", "e", "a", "b", "c", "d", "e")
      ),
      .Names = c("x", "y", "z", "a"),
      row.names = c(NA, -10L),
      class = c("tbl_df", "tbl", "data.frame")
    )

    df_lookup <- tibble(x = 0L, y = 5L, z = 8L)

Does not work

Extracting the name directly does not work.

    df %>%
      mutate_if(is.numeric, funs({
        x <- .
        x <- enquo(x)
        lookup_value <- df_lookup %>% pull(quo_name(x))
        x <- ifelse(is.na(x), lookup_value, x)
        return(x)
      }))

With an extra function, I can extract the name, but then the replacement no longer works.

    custom_mutate <- function(v) {
      v <- enquo(v)
      lookup_value <- df_lookup %>% pull(quo_name(v))
      # ifelse(is.na((!!v)), lookup_value, (!!v))
    }

    df %>% mutate_if(is.numeric, funs(custom_mutate(v = .)))

Works

If I add df as an additional argument to my custom function, it works, but is there a way to do this without it? It feels wrong, and not how dplyr is meant to be used... Correct me if I am wrong ;)
In addition, I have to use UQE instead of !!, and, as stated in Programming with dplyr:

UQE() is used for expert use only.

    custom_mutate2 <- function(v, df) {
      v <- enquo(v)
      lookup_value <- df_lookup %>% pull(quo_name(v))
      df %>%
        mutate(UQE(v) := ifelse(is.na((!!v)), lookup_value, (!!v))) %>%
        pull(!!v)
    }

    df %>% mutate_if(is.numeric, funs(custom_mutate2(v = ., df = df)))

Expected Result

    # A tibble: 10 x 4
    #        x     y     z a
    #    <int> <int> <int> <chr>
    #  1     1     1     8 a
    #  2     2     2     2 b
    #  3     3     3     3 c
    #  4     4     5     8 d
    #  5     5     1     8 e
    #  6     6     2     2 a
    #  7     7     3     3 b
    #  8     8     5     8 c
    #  9     9     1     8 d
    # 10    10     2     2 e
2 answers

You should use quo instead of enquo:

    # enquo(.):
    # <quosure: empty>
    # ~function (expr) { enexpr(expr) } ...

    # quo(.):
    # <quosure: frame>
    # ~x
    # <quosure: frame>
    # ~y
    # <quosure: frame>
    # ~z

In your example:

    mutate_if(df, is.numeric, funs({
      lookup_value <- df_lookup %>% pull(quo_name(quo(.)))
      ifelse(is.na(.), lookup_value, .)
    }))
    # A tibble: 10 x 4
    #        x     y     z a
    #    <int> <int> <int> <chr>
    #  1     1     1     8 a
    #  2     2     2     2 b
    #  3     3     3     3 c
    #  4     4     5     8 d
    #  5     5     1     8 e
    #  6     6     2     2 a
    #  7     7     3     3 b
    #  8     8     5     8 c
    #  9     9     1     8 d
    # 10    10     2     2 e
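If the goal is only the NA replacement, the column name is also available without any quosures by looping over the names directly. Here is a minimal base R sketch, assuming the same data as in the question (rebuilt with data.frame() so the block has no package dependencies). The names filled and is_missing are introduced here for illustration, and the loop selects columns by matching names in df_lookup rather than by is.numeric:

```r
# Same data as in the question, rebuilt as plain data frames.
df <- data.frame(
  x = 1:10,
  y = c(1L, 2L, 3L, NA, 1L, 2L, 3L, NA, 1L, 2L),
  z = c(NA, 2L, 3L, NA, NA, 2L, 3L, NA, NA, 2L),
  a = rep(c("a", "b", "c", "d", "e"), 2),
  stringsAsFactors = FALSE
)
df_lookup <- data.frame(x = 0L, y = 5L, z = 8L)

# Work on a copy so the original df stays untouched.
filled <- df

# Only columns that exist in both tables are considered.
for (col in intersect(names(filled), names(df_lookup))) {
  is_missing <- is.na(filled[[col]])
  # Replace the NA entries of this column with its lookup value.
  filled[[col]][is_missing] <- df_lookup[[col]]
}

filled
```

This trades the dplyr pipeline for an explicit loop, but the column name is an ordinary string in col, so nothing has to be captured or unquoted.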

Julien Nvarre's answer is absolutely correct (you need to use quo), but since my first thought was also to use enquo, I looked into why you should use quo instead.

If we look at the source for mutate_if, we can see how it is built:

    dplyr:::mutate_if
    #> function (.tbl, .predicate, .funs, ...)
    #> {
    #>     funs <- manip_if(.tbl, .predicate, .funs, enquo(.funs), caller_env(),
    #>         ...)
    #>     mutate(.tbl, !(!(!funs)))
    #> }
    #> <environment: namespace:dplyr>

By overriding mutate_if with a slightly modified copy, I can insert a print() call to examine the funs object that would be passed to mutate:

    library(rlang)  # for caller_env()

    # Same as dplyr's mutate_if(), but prints the quosures instead of calling mutate().
    # manip_if() is a dplyr internal, so this may not work in other dplyr versions.
    mutate_if <- function(.tbl, .predicate, .funs, ...) {
      funs <- dplyr:::manip_if(.tbl, .predicate, .funs, enquo(.funs), caller_env(), ...)
      print(funs)
    }

Running your code now uses this modified mutate_if function:

    df <- structure(
      list(
        x = 1:10,
        y = c(1L, 2L, 3L, NA, 1L, 2L, 3L, NA, 1L, 2L),
        z = c(NA, 2L, 3L, NA, NA, 2L, 3L, NA, NA, 2L),
        a = c("a", "b", "c", "d", "e", "a", "b", "c", "d", "e")
      ),
      .Names = c("x", "y", "z", "a"),
      row.names = c(NA, -10L),
      class = c("tbl_df", "tbl", "data.frame")
    )

    df_lookup <- tibble(x = 0L, y = 5L, z = 8L)

    df %>%
      mutate_if(is.numeric, funs({
        x <- .
        x <- enquo(x)
        lookup_value <- df_lookup %>% pull(quo_name(x))
        x <- ifelse(is.na(x), lookup_value, x)
        return(x)
      }))
    #> $x
    #> <quosure>
    #>   expr: ^{
    #>           x <- x
    #>           x <- enquo(x)
    #>           lookup_value <- df_lookup %>% pull(quo_name(x))
    #>           x <- ifelse(is.na(x), lookup_value, x)
    #>           return(x)
    #>         }
    #>   env:  0000000007FBBFA0
    #>
    #> $y
    #> <quosure>
    #>   expr: ^{
    #>           x <- y
    #>           x <- enquo(x)
    #>           lookup_value <- df_lookup %>% pull(quo_name(x))
    #>           x <- ifelse(is.na(x), lookup_value, x)
    #>           return(x)
    #>         }
    #>   env:  0000000007FBBFA0
    #>
    #> $z
    #> <quosure>
    #>   expr: ^{
    #>           x <- z
    #>           x <- enquo(x)
    #>           lookup_value <- df_lookup %>% pull(quo_name(x))
    #>           x <- ifelse(is.na(x), lookup_value, x)
    #>           return(x)
    #>         }
    #>   env:  0000000007FBBFA0

Now we can see that, in the list of functions passed to mutate, the column name has already been substituted for the . placeholder. This means that inside each expression there is a variable named x, y, or z, whose value comes from df.

Consider a simple case:

    library(rlang)

    x <- 1:10

    quo(x)
    #> <quosure>
    #>   expr: ^x
    #>   env:  0000000007615318

    enquo(x)
    #> <quosure>
    #>   expr: ^<int: 1L, 2L, 3L, 4L, 5L, ...>
    #>   env:  empty

From this, I hope you can see why you want quo rather than enquo. You are after the column name, which is the name of the variable, and that is what quo gives you.
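To make the distinction concrete, here is a small sketch with two helper functions, f_quo and f_enquo, which are hypothetical names introduced only for this illustration. quo() captures the expression exactly as it is written at that spot, while enquo() looks through a function argument to the expression the caller supplied:

```r
library(rlang)

# quo() quotes what is written here, i.e. the symbol `arg` itself.
f_quo <- function(arg) quo(arg)

# enquo() quotes what the caller passed in for `arg`.
f_enquo <- function(arg) enquo(arg)

col <- 1:3

quo_name(f_quo(col))
#> [1] "arg"

quo_name(f_enquo(col))
#> [1] "col"
```

Inside funs(), the . has already been replaced by the column symbol before the expression runs, so there is no function argument to look through. Quoting the symbol as written, which is what quo does, is enough to recover the name.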

So, using quo instead of enquo, and without assigning . to a variable first:

    mutate_if(df, is.numeric, funs({
      lookup_value <- df_lookup %>% pull(quo_name(quo(.)))
      ifelse(is.na(.), lookup_value, .)
    }))
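For readers on dplyr 1.0.0 or later, where funs() is deprecated and across() supersedes mutate_if(), the column name is exposed directly through cur_column(), so no quosure handling is needed. The following is a sketch of that alternative, not the approach used above. It assumes dplyr >= 1.0.0 and rebuilds the question's data with tibble(); result is a name introduced here:

```r
library(dplyr)  # assumes dplyr >= 1.0.0 for across() and cur_column()

df <- tibble(
  x = 1:10,
  y = c(1L, 2L, 3L, NA, 1L, 2L, 3L, NA, 1L, 2L),
  z = c(NA, 2L, 3L, NA, NA, 2L, 3L, NA, NA, 2L),
  a = rep(c("a", "b", "c", "d", "e"), 2)
)
df_lookup <- tibble(x = 0L, y = 5L, z = 8L)

result <- df %>%
  mutate(across(
    where(is.numeric),
    # cur_column() returns the name of the column currently being processed,
    # which is used to pick the matching replacement from df_lookup.
    ~ ifelse(is.na(.x), df_lookup[[cur_column()]], .x)
  ))

result
```

Every numeric column of df must also exist in df_lookup for this to work, which holds for the example data.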

Source: https://habr.com/ru/post/1275456/

