Convert all values in all character variables of a data frame from lowercase to uppercase

I have a data frame with a mix of character and numeric variables.

 city,hs_cd,sl_no,col_01,col_02,col_03
 Austin,1,2,,46,Female
 Austin,1,3,,32,Male
 Austin,1,4,,27,Male
 Austin,1,5,,20,Female
 Austin,2,2,,42,Female
 Austin,2,1,,52,Male
 Austin,2,3,,25,Male
 Austin,2,4,,22,Female
 Austin,3,3,,30,Female
 Austin,3,1,,65,Female

I want to convert all lowercase characters in the data frame to uppercase. Is there a way to do this in one shot, without repeating the conversion for every character variable?

+66
string r uppercase
May 13 '13 at 7:12
6 answers

Starting from the following data:

 df <- data.frame(v1=letters[1:5],v2=1:5,v3=letters[10:14],stringsAsFactors=FALSE)
 df
   v1 v2 v3
 1  a  1  j
 2  b  2  k
 3  c  3  l
 4  d  4  m
 5  e  5  n

You can use:

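 # Upper-case character columns, leave every other column untouched.
 # stringsAsFactors=FALSE keeps the result as character in R versions before 4.0.
 data.frame(lapply(df, function(v) {
   if (is.character(v)) return(toupper(v))
   else return(v)
 }), stringsAsFactors=FALSE)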

Which gives:

   v1 v2 v3
 1  A  1  J
 2  B  2  K
 3  C  3  L
 4  D  4  M
 5  E  5  N
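
As a variation on the same idea, the character columns can also be replaced in place instead of rebuilding the data frame. This is only a sketch in base R; the name `is_chr` is introduced here for illustration and is not part of the answer above.

```r
df <- data.frame(v1 = letters[1:5], v2 = 1:5, v3 = letters[10:14],
                 stringsAsFactors = FALSE)

# Logical vector flagging which columns are character.
is_chr <- vapply(df, is.character, logical(1))

# Overwrite only those columns; df stays a data.frame and v2 stays integer.
df[is_chr] <- lapply(df[is_chr], toupper)

df
```

Because only the flagged columns are assigned, the other columns keep their original types without any special handling.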
+73
May 13 '13 at 7:22

In the dplyr package, you can also use the mutate_all() function in combination with toupper(). This affects both character and factor columns. Note that it applies toupper() to every column, so numeric columns are converted to character as well.

 library(dplyr)
 df <- mutate_all(df, .funs=toupper)  # the argument is named .funs, not funs
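
To make the effect on non-character columns visible, here is a small sketch comparing mutate_all() with a call restricted to character columns. It assumes dplyr 1.0.0 or later for across() and where(); the names `all_cols` and `chr_only` are made up for this example.

```r
library(dplyr)

df <- data.frame(v1 = letters[1:3], v2 = 1:3, stringsAsFactors = FALSE)

# toupper() coerces its input to character, so applying it to every
# column also turns the integer column v2 into character.
all_cols <- mutate_all(df, toupper)
sapply(all_cols, class)

# Restricting the call to character columns leaves v2 as integer.
# across() and where() are assumed to be available (dplyr >= 1.0.0).
chr_only <- mutate(df, across(where(is.character), toupper))
sapply(chr_only, class)
```

If the numeric columns are needed for arithmetic later, the restricted form is probably the safer choice.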
+42
May 20 '15 at 18:31

A supplementary comment for anyone using any of these answers. Juba's answer is great, as it is very selective if your variables are either numeric or character strings. If, however, you have a combination (e.g. a1, b1, a2, b2), it will not convert the characters correctly.

As @Trenton Hoffman notes,

 library(dplyr)
 df <- mutate_each(df, funs(toupper))

affects both character and factor columns and works for "mixed variables"; for example, if your variable contains both a character and a numeric value (for example, a1), both will be converted to a factor. In general, this is not too worrying, but if you end up wanting to match data between data.frames, for example

 df3 <- df1[df1$v1 %in% df2$v1,] 

where df1 has been converted and df2 is an unconverted data.frame or similar, this may cause some problems. The workaround is to run

 df2 <- df2 %>% mutate_each(funs(toupper), v1)
 # or
 df2 <- df2 %>% mutate_each(funs(toupper))
 # and then
 df3 <- df1[df1$v1 %in% df2$v1,]

If you work with genomic data, knowing this can come in handy.
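
The matching problem described above can be reproduced with a small base R sketch. The two data frames and their values are invented for illustration; only the `%in%` pattern comes from the answer.

```r
# df1 has already been upper-cased, df2 has not (hypothetical data).
df1 <- data.frame(v1 = c("A1", "B1", "A2"), stringsAsFactors = FALSE)
df2 <- data.frame(v1 = c("a1", "b2", "a2"), stringsAsFactors = FALSE)

# %in% compares strings case-sensitively, so no rows are matched here.
unmatched <- df1[df1$v1 %in% df2$v1, , drop = FALSE]
nrow(unmatched)

# Upper-casing both sides at comparison time finds the shared keys.
matched <- df1[toupper(df1$v1) %in% toupper(df2$v1), , drop = FALSE]
matched
```

Normalising the case inside the comparison avoids having to remember which of the two data frames was converted earlier.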

+6
Jun 11 '15 at 2:09

This is simple with the apply function in R

 f <- apply(f,2,toupper) 

No need to check if the column is character or any other type.
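
One thing worth checking before relying on this: apply() first converts the data frame to a matrix, so the result is a character matrix rather than a data frame. The sketch below uses a made-up two-column data frame `f` to show this.

```r
f <- data.frame(v1 = letters[1:3], v2 = 1:3, stringsAsFactors = FALSE)

# apply() coerces f to a matrix; with mixed columns that is a character matrix.
m <- apply(f, 2, toupper)
is.matrix(m)
is.character(m)

# Converting back gives a data.frame, but the numeric column is now character.
f2 <- as.data.frame(m, stringsAsFactors = FALSE)
sapply(f2, class)
```

So this approach is convenient when every column is text, but numeric columns would need to be converted back afterwards.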

+6
Nov 14 '17 at 10:32

If you need to deal with data.frames that include factors, you can use:

 df = data.frame(v1=letters[1:5],v2=1:5,v3=letters[10:14],v4=as.factor(letters[1:5]),v5=runif(5),stringsAsFactors=FALSE)
 df
   v1 v2 v3 v4        v5
 1  a  1  j  a 0.1774909
 2  b  2  k  b 0.4405019
 3  c  3  l  c 0.7042878
 4  d  4  m  d 0.8829965
 5  e  5  n  e 0.9702505

 sapply(df,class)
          v1          v2          v3          v4          v5
 "character"   "integer" "character"    "factor"   "numeric"

Use mutate_each_ to convert factors to character, and then convert all character columns to uppercase:

 # convert factor to character, then uppercase
 upper_it = function(X){
   X %>%
     mutate_each_( funs(as.character(.)), names( .[sapply(., is.factor)] )) %>%
     mutate_each_( funs(toupper), names( .[sapply(., is.character)] ))
 }

Gives

 upper_it(df)
   v1 v2 v3 v4
 1  A  1  J  A
 2  B  2  K  B
 3  C  3  L  C
 4  D  4  M  D
 5  E  5  N  E

While

 sapply( upper_it(df),class)
          v1          v2          v3          v4          v5
 "character"   "integer" "character" "character"   "numeric"
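
If the factor columns should stay factors instead of becoming character, one possible base R alternative is to upper-case the factor levels. This is a sketch, not the approach used above; `upper_keep_factors` is a hypothetical helper name and the data is a shortened version of the example.

```r
df <- data.frame(v1 = letters[1:3], v2 = 1:3,
                 v4 = factor(c("a", "b", "a")),
                 stringsAsFactors = FALSE)

# Hypothetical helper: upper-case factor levels and character values,
# return any other column type unchanged.
upper_keep_factors <- function(x) {
  if (is.factor(x)) {
    levels(x) <- toupper(levels(x))
    x
  } else if (is.character(x)) {
    toupper(x)
  } else {
    x
  }
}

# df[] <- ... keeps df a data.frame while replacing every column.
df[] <- lapply(df, upper_keep_factors)
sapply(df, class)
```

Changing only the levels keeps the factor coding intact, which may matter if the column is used for grouping or modelling later.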
+1
Sep 19 '16 at 19:59

Another alternative is to combine the mutate_if() and str_to_upper() functions, from the dplyr and stringr packages of the tidyverse:

 df %>% mutate_if(is.character, str_to_upper) -> df 

This converts all string variables in the data frame to uppercase. str_to_lower() does the opposite.

0
May 26 '19 at 20:28


