gsubfn
Try the gsubfn solution:
    > library(gsubfn)
    > strapply(x, ".*\\d(\\w*)|$", ~ if (nchar(z)) z else NA, simplify = TRUE)
     [1] NA            "Alabama"     "Alaska"      "Arizona"     "Arkansas"
     [6] "California"  "Colorado"    "Connecticut" "Delaware"    "Florida"
    [11] "Georgia"
The regex matches everything up to the last digit followed by word characters and captures those word characters; if that fails, it matches the end of the string instead (to make sure that something always matches). If the first alternative matched, the captured word is returned; otherwise the backreference is empty, so NA is returned.
Note that the formula is just a short way to write function(z) if (nchar(z)) z else NA, and that function can be used in place of the formula at the cost of a few more keystrokes.
gsub
A similar strategy also works using only plain gsub, but it requires two lines and a slightly more complex regex. Here the second alternative consumes any string that the first alternative fails to match, so that such strings are replaced with the empty string:
    > s <- gsub(".*\\d(\\w*)|.*", "\\1", x)
    > ifelse(nchar(s), s, NA)
     [1] NA            "Alabama"     "Alaska"      "Arizona"     "Arkansas"
     [6] "California"  "Colorado"    "Connecticut" "Delaware"    "Florida"
    [11] "Georgia"
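For a quick check without the original x, the gsub approach can be run on a small made-up vector. This is only a sketch: the input strings below are assumptions chosen for illustration, not the data from the question.

```r
# Hypothetical input (not the original x): one string without a digit,
# two strings where a word follows the last digit.
x <- c("no digits here", "id 12Alabama", "7 9Alaska")

# First alternative: keep only the word characters after the last digit.
# Second alternative (.*): swallow strings with no digit, leaving "".
s <- gsub(".*\\d(\\w*)|.*", "\\1", x)

# Convert the empty strings into NA.
result <- ifelse(nchar(s) > 0, s, NA)
print(result)
```

Writing nchar(s) > 0 instead of relying on nchar(s) being coerced to logical does the same thing and makes the intent a bit more explicit.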
EDIT: minor improvements