Avoid error when using rename in dplyr and column does not exist

Is there a smart way to use the rename function in dplyr if in some cases the renamed column does not exist?

For example, I would like the following not to lead to an error:

    mtcars %>% rename(miles_per_gallon = mpg, missing_varible = foo)

(result: Error: Unknown variables: foo.)

but rather to return a data frame with every rename that is possible applied.

At the moment I explicitly check that each column exists before renaming it.

thanks

Yane

4 answers

Sometimes it's OK not to do everything in dplyr. This may be one of those occasions. I would set up a vector that works as a key:

    namekey <- c(mpg="miles_per_gallon", cyl="cylinders", disp="displacement",
                 hp="horse_power", drat="rear_axle_ratio", wt="weight",
                 qsec="quarter_mile_time", vs="v_s", am="transmission",
                 gear="number_of_gears", carb="number_of_carburetors",
                 foo="missing_variable")

    mtcars1 <- mtcars[,1:2]
    mtcars1$foo <- rnorm(nrow(mtcars1))

    names(mtcars1) <- namekey[names(mtcars1)]

    head(mtcars1)
    #                   miles_per_gallon cylinders missing_variable
    # Mazda RX4                     21.0         6       -0.9901081
    # Mazda RX4 Wag                 21.0         6        0.2338014
    # Datsun 710                    22.8         4       -0.3077473
    # Hornet 4 Drive                21.4         6        1.1200518
    # Hornet Sportabout             18.7         8        0.7482842
    # Valiant                       18.1         6        0.4206614

Once you have the key, the renaming itself is a single, easy-to-understand line of code.
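
One caveat with the key lookup: a column whose name is absent from the key is looked up as NA. Here is a minimal sketch of a guarded variant, assuming you want unmatched columns to keep their original names. The objects `df`, `old` and `hit` are illustrative and not part of the answer above.

```r
# Key maps old names to new names; "foo" is not a column of the data.
namekey <- c(mpg = "miles_per_gallon", cyl = "cylinders", foo = "missing_variable")

df <- mtcars[, 1:3]                 # columns mpg, cyl, disp ("disp" is not in the key)

old <- names(df)
hit <- old %in% names(namekey)      # TRUE where the key has an entry for the column

# Rename only the matched columns; all other names are left untouched.
names(df)[hit] <- unname(namekey[old[hit]])

names(df)
```

Key entries with no matching column (here `foo`) are never looked up, so they cause no error.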


The plyr package has rename() with the warn_missing parameter.

    plyr::rename(
      mtcars,
      replace = c(mpg="miles_per_gallon", foo="missing_varible"),
      warn_missing = FALSE
    )

If you use it, consider requireNamespace() rather than library(), so that plyr's function names do not mask dplyr's.
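
A rough sketch of that pattern, assuming plyr may or may not be installed. The `renamed` variable is only for illustration.

```r
# Call plyr::rename() by its qualified name without attaching plyr,
# so an attached dplyr keeps ownership of the unqualified rename().
renamed <- NULL

if (requireNamespace("plyr", quietly = TRUE)) {
  renamed <- plyr::rename(
    mtcars,
    replace = c(mpg = "miles_per_gallon", foo = "missing_varible"),
    warn_missing = FALSE   # stay silent about "foo", which is not a column
  )
  print(names(renamed)[1])
}
```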


This may not be what the designers intended, but you can combine the scoped verb rename_all with dplyr's recode function, which takes one or more key-value pairs such as old_name = "New Name". Keys that match no column name are simply ignored.

    library(dplyr)

    rename_all(iris, recode, Sepal.Length = "sepal_length", cyl = "cylinder")
    #   sepal_length Sepal.Width Petal.Length Petal.Width Species
    # 1          5.1         3.5          1.4         0.2  setosa
    # 2          4.9         3.0          1.4         0.2  setosa
    # 3          4.7         3.2          1.3         0.2  setosa
    # 4          4.6         3.1          1.5         0.2  setosa
    # 5          5.0         3.6          1.4         0.2  setosa
    # 6          5.4         3.9          1.7         0.4  setosa
    # 7          4.6         3.4          1.4         0.3  setosa
    # 8          5.0         3.4          1.5         0.2  setosa
    # 9          4.4         2.9          1.4         0.2  setosa
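
In dplyr 1.0.0 and later, rename_all is superseded by rename_with. A sketch of the same idea in that style, assuming such a version is installed (`iris2` is just an illustrative name):

```r
library(dplyr)

# Assumes dplyr >= 1.0.0, where rename_with() supersedes rename_all().
# recode() leaves names without a matching key unchanged, so the
# "cyl" entry, which matches no column of iris, is ignored.
iris2 <- rename_with(
  iris,
  ~ recode(.x, Sepal.Length = "sepal_length", cyl = "cylinder")
)

names(iris2)
```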

Another solution that works safely inside a dplyr pipeline, without throwing an error, is conditional evaluation inside braces {}. This applies the rename if "foo" exists, and otherwise continues with the original data frame unchanged.

    library(dplyr)

    mtcars %>%
      {if("foo" %in% names(.)) rename(., missing_varible=foo) else .} %>%
      rename(miles_per_gallon=mpg)
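
Writing one such check per column gets verbose when many columns are optional. As an alternative sketch, assuming dplyr 1.0.0 or later, rename() accepts the tidyselect helper any_of() with a named character vector and skips entries that match no column. The `lookup` and `renamed` names are illustrative.

```r
library(dplyr)

# Assumes dplyr >= 1.0.0. The lookup is written new_name = "old_name".
lookup <- c(miles_per_gallon = "mpg", missing_varible = "foo")

# any_of() selects only the columns that exist, so "foo" is skipped
# instead of raising an error.
renamed <- mtcars %>% rename(any_of(lookup))

names(renamed)[1]
```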

Source: https://habr.com/ru/post/1238188/

