I tried to collapse runs of multiple (2 or more) whitespace characters in the elements of a vector into a single space using gsub(), for example:
x1 <- c(" abc", "abc ", "abc")
gsub("\\s{2,}", " ", x1)
[1] " abc" "abc " "abc"
But as soon as the vector contains NA, the replacement fails:
x2 <- c(NA, " abc", "abc ", "abc")
gsub("\\s{2,}", " ", x2)
[1] NA " " " " " "
However, it works fine if you use Perl-compatible regular expressions:
gsub("\\s{2,}", " ", x2, perl = TRUE)
[1] NA " abc" "abc " "abc"
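For now I am using something like the following as a workaround. It is only a minimal sketch: collapse_ws is a hypothetical helper name, not part of base R, and it relies on perl = TRUE because that is the only variant shown above to work. It also skips NA elements defensively; I have not verified whether skipping NA alone would be enough with the default engine.

```r
# Hypothetical helper (not part of base R): collapse runs of 2+ whitespace
# characters into a single space, leaving NA elements untouched.
collapse_ws <- function(x) {
  out <- x
  ok <- !is.na(x)  # positions of the non-missing elements
  # perl = TRUE selects the PCRE engine, which gave the expected result above
  out[ok] <- gsub("\\s{2,}", " ", x[ok], perl = TRUE)
  out
}

# Example input with two-space runs (an assumed test vector)
x2 <- c(NA, "  abc", "abc  ", "abc")
collapse_ws(x2)
# [1] NA     " abc" "abc " "abc"
```

The NA positions are simply copied through, so the result has the same length and order as the input.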
Does anyone know why R's default regular expression engine behaves this way? I am using R 3.1.1 on Linux x86-64, in case that helps.