Replace non-ascii characters with a specific string list without a loop in R

I want to replace non-ascii characters (for now, only Spanish) with their equivalent ascii. If I have "á", I want to replace it with "a" and so on.

I built this function, which works fine, but I would like to avoid loops (including implicit ones such as sapply).

    latin2ascii <- function(x) {
      if (!is.character(x)) stop("input must be a character object")
      require(stringr)
      mapL <- c("á","é","í","ó","ú","Á","É","Í","Ó","Ú","ñ","Ñ","ü","Ü")
      mapA <- c("a","e","i","o","u","A","E","I","O","U","n","N","u","U")
      for (y in 1:length(mapL)) {
        x <- str_replace_all(x, mapL[y], mapA[y])
      }
      x
    }

Is there an elegant way to solve it? Any help, suggestions, or modifications are appreciated.

2 answers

gsubfn(), from the package of the same name, is really nice for this kind of thing:

    library(gsubfn)

    # Create a named list, in which:
    #   - the names are the strings to be looked up
    #   - the values are the replacement strings
    mapL <- c("á","é","í","ó","ú","Á","É","Í","Ó","Ú","ñ","Ñ","ü","Ü")
    mapA <- c("a","e","i","o","u","A","E","I","O","U","n","N","u","U")
    # ll <- setNames(as.list(mapA), mapL)  # An alternative to the 2 lines below
    ll <- as.list(mapA)
    names(ll) <- mapL

    # Try it out
    string <- "ÍÓáÚ"
    gsubfn("[áéíóúÁÉÍÓÚñÑüÜ]", ll, string)
    # [1] "IOaU"

Edit:

G. Grothendieck points out that base R also has a function for this:

    # mapA and mapL are the vectors defined above
    A <- paste(mapA, collapse="")
    L <- paste(mapL, collapse="")
    chartr(L, A, "ÍÓáÚ")
    # [1] "IOaU"
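
As a rough sketch of how the chartr() approach could replace the loop in the question's latin2ascii(), assuming the same 14-character mapping: the function name and the type check are taken from the question, and the `old`/`new` variable names are only illustrative. Because chartr() is vectorized over its last argument, no explicit loop is needed.

```r
latin2ascii <- function(x) {
  if (!is.character(x)) stop("input must be a character object")
  # chartr() maps the i-th character of `old` to the i-th character of `new`
  old <- "áéíóúÁÉÍÓÚñÑüÜ"
  new <- "aeiouAEIOUnNuU"
  chartr(old, new, x)
}

latin2ascii(c("ÍÓáÚ", "mañana", "pingüino"))
# [1] "IOaU"     "manana"   "pinguino"
```

Note that chartr() only handles one-to-one character replacements, so it would not cover a mapping such as "ß" to "ss".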

I like Josh's version, but I thought I would add another "vectorized" solution. It returns a vector of unaccented strings and depends only on base functions.

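    x <- c('íÁuÚ','uíÚÁ')
    mapL <- c("á","é","í","ó","ú","Á","É","Í","Ó","Ú","ñ","Ñ","ü","Ü")
    mapA <- c("a","e","i","o","u","A","E","I","O","U","n","N","u","U")

    # Split each string into single characters
    split <- strsplit(x, split='')
    # For each character, find its position in mapL (NA if it is not there)
    m <- lapply(split, match, mapL)
    # Keep unmatched characters, replace matched ones, and glue the pieces back together
    mapply(function(split, m) paste(ifelse(is.na(m), split, mapA[m]), collapse=''),
           split, m)
    # "iAuU" "uiUA"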
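
The same split-and-look-up idea can also be sketched with a named lookup vector, which may be a little easier to read. This is only an illustration: `lookup` and `strip_accents` are hypothetical names that do not appear in the answers, and vapply() is still an implicit loop over the input strings, just like the lapply()/mapply() calls above.

```r
mapL <- c("á","é","í","ó","ú","Á","É","Í","Ó","Ú","ñ","Ñ","ü","Ü")
mapA <- c("a","e","i","o","u","A","E","I","O","U","n","N","u","U")

# Named vector: names are the accented characters, values their replacements
lookup <- setNames(mapA, mapL)

# Hypothetical helper, not part of the original answers
strip_accents <- function(x) {
  chars <- strsplit(x, split = "")
  vapply(chars, function(ch) {
    hit <- ch %in% names(lookup)          # which characters have a replacement
    ch[hit] <- unname(lookup[ch[hit]])    # replace only those characters
    paste(ch, collapse = "")
  }, character(1))
}

strip_accents(c("íÁuÚ", "uíÚÁ"))
# [1] "iAuU" "uiUA"
```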

Source: https://habr.com/ru/post/916361/

