You need to use regular expressions to identify the unwanted characters. For the most readable code, use str_replace_all from the stringr package, although gsub from base R works just as well.
The exact regex depends on what you are trying to do. You could delete just the specific characters you listed in the question, but it’s much easier to remove all punctuation characters.
library(stringr)
x <- "a1~!@#$%^&*(){}_+:\"<>?,./;'[]-=" # or whatever
str_replace_all(x, "[[:punct:]]", " ")
(The base R equivalent is gsub("[[:punct:]]", " ", x).)
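If you really do want to delete only a handful of specific characters, a custom character class is one way to do it. The sketch below uses base R only; the set of characters in `unwanted` is a hypothetical choice for illustration, not the list from the question.

```r
# Same example string as in the answer
x <- "a1~!@#$%^&*(){}_+:\"<>?,./;'[]-="

# Hypothetical choice for illustration: drop only ~, !, @ and #
unwanted <- "[~!@#]"

# Replace each match with the empty string, i.e. delete it
cleaned <- gsub(unwanted, "", x)
print(cleaned)
```

Characters such as ], ^, - and \ have a special meaning inside square brackets, so they need care with placement or escaping if you add them to the class.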
An alternative is to replace all non-alphanumeric characters.
str_replace_all(x, "[^[:alnum:]]", " ")
Note that the definition of what counts as a letter, a number, or a punctuation mark varies slightly depending on your locale, so you may need to experiment a bit to get exactly what you want.
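If classes such as [[:alnum:]] give surprising results under your locale settings, one thing to try is spelling out the characters you want to keep. This is a minimal sketch, assuming you only care about ASCII letters and digits; it passes perl = TRUE so that the ranges are handled by the PCRE engine, which, as far as I know, compares ranges by character code rather than by locale collation order.

```r
# Same example string as in the answer
x <- "a1~!@#$%^&*(){}_+:\"<>?,./;'[]-="

# Keep only ASCII letters and digits; everything else becomes a space.
# perl = TRUE selects the PCRE engine instead of the default one.
ascii_only <- gsub("[^A-Za-z0-9]", " ", x, perl = TRUE)
print(ascii_only)

# Show which locale is currently used for character classification
print(Sys.getlocale("LC_CTYPE"))
```

Comparing this result with the [^[:alnum:]] version on your own data is a quick way to see whether the locale is making a difference.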
Richie Cotton Apr 24 '12 at 9:01