Here's the problem: your vector is of character mode, so of course it is "not a number". The last element was interpreted as the string "NaN". Using is.nan only makes sense if the vector is numeric. If you want a value to be missing in a character vector (so that it is handled correctly by regression functions), use NA_character_ (without quotes).
    > tester1 <- c("2", "2", "3", "4", "2", "3", NA_character_)
    > tester1
    [1] "2" "2" "3" "4" "2" "3" NA
    > is.na(tester1)
    [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
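To see why is.nan is not useful on a character vector, here is a minimal sketch. The names x_chr and x_num are made up for illustration, and the as.numeric() conversion is just one possible way to get back to a numeric vector, assuming every element really is a number or the text "NaN":

```r
# Mixing strings with NaN coerces everything to character,
# so NaN becomes the ordinary string "NaN".
x_chr <- c("2", "3", NaN)
mode(x_chr)        # "character"
is.nan(x_chr)      # FALSE for every element of a character vector

# Converting back to numeric restores a real NaN that is.nan can detect.
x_num <- as.numeric(x_chr)
is.nan(x_num)
```

If the column is supposed to hold numbers, converting it once up front is probably simpler than patching individual strings.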
Neither "NA" nor "NaN" is truly missing in a character vector. If for some reason there were "NaN" values in a factor variable, you could just use logical indexing:
    tester1[tester1 == "NaN"] = "NA"
    # but that would not really be a missing value either
    # and it might screw up a factor variable anyway.

    tester1[tester1=="NaN"] <- "NA"
    Warning message:
    In `[<-.factor`(`*tmp*`, tester1 == "NaN", value = "NA") :
      invalid factor level, NAs generated
    ##########
    tester1 <- factor(c("2", "2", "3", "4", "2", "3", NaN))

    > tester1[tester1 =="NaN"] <- NA_character_
    > tester1
    [1] 2    2    3    4    2    3    <NA>
    Levels: 2 3 4 NaN
This last result might be unexpected. There is a remaining "NaN" level, but none of the elements is "NaN". Instead, the element that was "NaN" is now a true missing value, which prints as <NA>.
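If the leftover "NaN" level gets in the way (for example in tables or model output), one option, not part of the original code above, is to drop unused levels with base R's droplevels(). A minimal sketch, where cleaned is a made-up name:

```r
# Rebuild the factor from the example and replace "NaN" with a real NA.
tester1 <- factor(c("2", "2", "3", "4", "2", "3", NaN))
tester1[tester1 == "NaN"] <- NA_character_
levels(tester1)                 # the unused "NaN" level is still listed

# droplevels() removes levels that no element uses any more.
cleaned <- droplevels(tester1)
levels(cleaned)
sum(is.na(cleaned))             # the missing element is still counted as NA
```

Whether to drop the level depends on what you do next; leaving an unused level in place is harmless for many functions.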