I want to remove from a string all characters that are not digits, minus signs, or decimal points.
I imported data from Excel using read.xls, and it includes some weird characters. I need to convert the values to numeric. I'm not very familiar with regular expressions, so I'm looking for a simpler way to do the following:
excel_coords <- c(" 19.53380ݰ", " 20.02591°", "-155.91059°", "-155.8154°")
unwanted <- unique(unlist(strsplit(gsub("[0-9]|\\.|-", "", excel_coords), "")))
clean_coords <- gsub(do.call("paste", args = c(as.list(unwanted), sep="|")),
replacement = "", x = excel_coords)
> clean_coords
[1] "19.53380" "20.02591" "-155.91059" "-155.8154"
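For comparison, one shorter route might be a single negated character class, which drops everything except digits, the decimal point, and the minus sign in one gsub call. This is only a sketch under the assumption that the values contain no other meaningful characters (for example, no exponent notation such as 1e-5); the names clean_simple and coords_num are made up for illustration.

```r
# Same sample data as above
excel_coords <- c(" 19.53380ݰ", " 20.02591°", "-155.91059°", "-155.8154°")

# [^0-9.-] matches any character that is NOT a digit, a dot, or a minus sign.
# Inside a bracket expression the dot is literal, and a hyphen placed last
# is treated as a literal hyphen rather than a range operator.
clean_simple <- gsub("[^0-9.-]", "", excel_coords)

# Convert the cleaned strings to numeric
coords_num <- as.numeric(clean_simple)
```

If this behaves as expected, clean_simple should hold the same strings as clean_coords above, and coords_num should be a plain numeric vector.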
Bonus points if someone can tell me why these symbols appeared in some of my data (the degree signs are part of the original Excel worksheet, but the other characters are not).