mutate_at evaluation error when using group_by

mutate_at() throws an evaluation error when it is used with group_by() and a numeric vector of column positions is passed as the first (.vars) argument.

  • The problem occurs when using R 3.4.2 and dplyr version 0.7.4
  • Works great when using R 3.3.2 and dplyr 0.5.0
  • Works well if .vars is a character vector (column name)

Example:

 library(dplyr)

 # Create example dataframe
 Id <- c('10_1', '10_2', '11_1', '11_2', '11_3', '12_1')
 Month <- c(2, 3, 4, 6, 7, 8)
 RWA <- c(0, 0, 0, 1.579, NA, 0.379)
 dftest = data.frame(Id, Month, RWA)

 # Define column to fill NAs
 nacol = c('RWA')

 # Fill NAs with 0
 dftest_2 <- dftest %>%
   group_by(Id) %>%
   mutate_at(which(names(dftest) %in% nacol), funs(ifelse(is.na(.), 0, .)))

 Error in mutate_impl(.data, dots) : Evaluation error: object 'NA' not found. 

A more convincing example demonstrating the problem:

 library(dplyr)
 library(zoo)  # provides na.locf()

 # Create example dataframe
 Id <- c('10_1', '10_2', '11_1', '11_3', '11_3', '12_1')
 Month <- c(2, 3, 4, 6, 7, 8)
 RWA <- c(0, 0, 0, 1.579, NA, 0.379)
 dftest = data.frame(Id, Month, RWA)

 # Define column to fill NAs
 nacol = c('RWA')

 # Fill NAs with last period
 dftest_2 <- dftest %>%
   group_by(Id) %>%
   mutate_at(which(names(dftest) %in% nacol), funs(na.locf(., na.rm = FALSE)))
1 answer

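The reason we get the 'NA' error is that which() returns 3, the position of the column in the ungrouped data frame. Once the data is grouped by "Id", the grouping column is no longer among the columns that can be selected, so only 2 columns remain and position 3 does not exist. Subtracting 1 from the position works around this: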
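The position shift can be illustrated with plain character vectors, without dplyr. This is a minimal sketch that assumes mutate_at() resolves positions against the non-grouping columns only, as described above; all_cols, group_cols and non_group_cols are illustrative names and do not appear in the question's code.

```r
# Column names of the question's data frame
all_cols <- c("Id", "Month", "RWA")
nacol <- c("RWA")
group_cols <- c("Id")

# Position computed against the full, ungrouped data frame
pos_full <- which(all_cols %in% nacol)

# Columns left for selection once the grouping column is set aside
# (assumption: this mirrors what mutate_at() sees after group_by(Id))
non_group_cols <- setdiff(all_cols, group_cols)
pos_grouped <- which(non_group_cols %in% nacol)

pos_full                  # 3
pos_grouped               # 2
non_group_cols[pos_full]  # NA, because position 3 is past the end
```

Under that assumption, the out-of-range lookup yields NA instead of a column name, which is consistent with the "object 'NA' not found" message in the question.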

 dftest %>%
   group_by(Id) %>%
   mutate_at(which(names(dftest) %in% nacol) - 1, funs(ifelse(is.na(.), 0, .)))
 # A tibble: 6 x 3
 # Groups:   Id [6]
 #       Id Month   RWA
 #   <fctr> <dbl> <dbl>
 #1   10_1     2 0.000
 #2   10_2     3 0.000
 #3   11_1     4 0.000
 #4   11_2     6 1.579
 #5   11_3     7 0.000
 #6   12_1     8 0.379

group_by is not needed here, since replacing the NA values in the other columns with 0 does not depend on the grouping:

 dftest %>% mutate_at(which(names(dftest) %in% nacol), funs(ifelse(is.na(.),0,.))) 

This may be a bug, and a position-based approach is sometimes risky in any case. A better option is to select the columns by name:

 dftest %>%
   group_by(Id) %>%
   mutate_at(intersect(names(.), nacol), funs(replace(., is.na(.), 0)))

NOTE: In all of these cases, group_by is not required.
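Since no grouping is needed, the same name-based replacement can also be sketched in base R. This is only an illustration of selecting by name rather than by position, not the method used in the answer; target_cols is a hypothetical helper name.

```r
# Example data from the question
dftest <- data.frame(
  Id    = c("10_1", "10_2", "11_1", "11_2", "11_3", "12_1"),
  Month = c(2, 3, 4, 6, 7, 8),
  RWA   = c(0, 0, 0, 1.579, NA, 0.379)
)
nacol <- c("RWA")

# Keep only the requested names that actually exist in the data frame
target_cols <- intersect(names(dftest), nacol)

# Replace NA with 0 in each selected column, addressed by name
dftest[target_cols] <- lapply(
  dftest[target_cols],
  function(x) replace(x, is.na(x), 0)
)

dftest$RWA
```

Because the columns are addressed by name, the result does not change if columns are reordered or if a grouping column is dropped from the selectable set.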


Another option is replace_na from tidyr

 dftest %>% tidyr::replace_na(as.list(setNames(0, nacol))) 

Source: https://habr.com/ru/post/976304/

