I have a vector of (human) names, all in capitals:
names <- c("FRIEDRICH SCHILLER", "FRANK O'HARA", "HANS-CHRISTIAN ANDERSEN")
To capitalize only the first letter of each word, I used
simpleDecap <- function(x) {
  s <- strsplit(x, " ")[[1]]
  paste0(substring(s, 1, 1), tolower(substring(s, 2)), collapse = " ")
}
sapply(names, simpleDecap, USE.NAMES = FALSE)
# [1] "Friedrich Schiller" "Frank O'hara" "Hans-christian Andersen"
But I also want to account for ' and -. Splitting with s <- strsplit(x, " |\\'|\\-")[[1]] finds the right letters, of course, but then the ' and - are lost when the pieces are collapsed back together. Therefore, I tried
simpleDecap2 <- function(x) {
  for (char in c(" ", "\\-", "\\'")) {
    s <- strsplit(x, char)[[1]]
    x <- paste0(substring(s, 1, 1), tolower(substring(s, 2)), collapse = char)
  }
  return(x)
}
but this is even worse, of course, since the string is split on one delimiter at a time, so each pass lowercases what the previous pass had capitalized:
sapply(names, simpleDecap2, USE.NAMES = FALSE)
# [1] "Friedrich schiller" "Frank o'Hara" "Hans-christian andersen"
I think the right approach is to split with s <- strsplit(x, " |\\'|\\-")[[1]], but then the problem is the collapse= argument of paste0().
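One possible way around the collapse problem, offered only as a sketch: skip the split entirely, lowercase the whole string, and then uppercase any letter that sits at the start of the string or directly after a space, ' or -. Because nothing is split, the delimiters never have to be pasted back. The function name decapNames is made up for illustration, and the sketch assumes plain ASCII letters (accented names would need a wider character class).

```r
names <- c("FRIEDRICH SCHILLER", "FRANK O'HARA", "HANS-CHRISTIAN ANDERSEN")

# Hypothetical helper: lowercase everything, then uppercase each letter that
# follows the start of the string, a space, an apostrophe, or a hyphen.
decapNames <- function(x) {
  # \\1 keeps the delimiter (or the empty start-of-string match) untouched;
  # \\U uppercases the rest of the replacement (needs perl = TRUE).
  gsub("(^|[ '-])([a-z])", "\\1\\U\\2", tolower(x), perl = TRUE)
}

decapNames(names)
# [1] "Friedrich Schiller"      "Frank O'Hara"            "Hans-Christian Andersen"
```

gsub() is already vectorized, so no sapply() is needed here. Any other separator that should trigger a capital would have to be added to the character class [ '-].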