grep and Filter functions behave strangely

Take the following code to select only alphanumeric strings from a list of strings:

isValid = function(string){
  return(grep("^[A-z0-9]+$", string))
}

strings = c("aaa", "test@test.com", "", "valid")

print(Filter(isValid, strings))

The output is [1] "aaa" "test@test.com".

Why is "valid" not displayed, and why is "test@test.com" returned?

+4
2 answers

Filter expects the predicate to return a logical value, but grep returns numeric match indices. Use grepl instead:

isValid = function(string){
  return(grepl("^[A-z0-9]+$", string))
}

strings = c("aaa", "test@test.com", "", "valid")

print(Filter(isValid, strings))
[1] "aaa"   "valid"

Why didn't grep work? It comes down to how R coerces numeric values to logical, combined with a quirk in how Filter is implemented.

Here is what happened. Called on the whole vector, grep("^[A-z0-9]+$", strings) correctly returns 1 4, the indices of the matching first and fourth elements.
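To make the difference between the two return types concrete, here is a small sketch using the strings and pattern from the question. The variable names pattern, idx and hit are made up for illustration.

```r
strings <- c("aaa", "test@test.com", "", "valid")
pattern <- "^[A-z0-9]+$"

# grep returns the positions of the elements that match
idx <- grep(pattern, strings)
print(idx)    # [1] 1 4

# grepl returns one logical value per element
hit <- grepl(pattern, strings)
print(hit)    # [1]  TRUE FALSE FALSE  TRUE
```

The grep result has length 2 while the grepl result has length 4, which is why only grepl lines up one-to-one with the input.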

However, that is not how Filter works. It applies the predicate to each element separately, using as.logical(unlist(lapply(x, f))).

So it ran isValid(strings[1]), then isValid(strings[2]), and so on, which produced this list:

[[1]]
[1] 1

[[2]]
integer(0)

[[3]]
integer(0)

[[4]]
[1] 1

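unlist was then called on that list, giving 1 1 (the empty integer(0) results are dropped), and as.logical turned that into TRUE TRUE. So in the end you got:

strings[which(c(TRUE, TRUE))]

which is the same as

strings[c(1,2)]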
[1] "aaa"           "test@test.com"
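The chain can be reproduced by hand. This is only a sketch: it assumes Filter amounts to as.logical(unlist(lapply(x, f))) followed by which, as described in this answer, and the names per_element, flat, ind and result are made up for illustration.

```r
strings <- c("aaa", "test@test.com", "", "valid")
isValid <- function(string) grep("^[A-z0-9]+$", string)

# Step 1: apply the predicate to each element on its own
per_element <- lapply(strings, isValid)   # 1, integer(0), integer(0), 1

# Step 2: flatten the list; the empty results vanish, so only two values remain
flat <- unlist(per_element)               # 1 1

# Step 3: coerce to logical and subset by position (assumed Filter behaviour)
ind <- as.logical(flat)                   # TRUE TRUE
result <- strings[which(ind)]
print(result)                             # [1] "aaa"           "test@test.com"
```

Once the two empty results are dropped, the surviving TRUE values no longer line up with the elements that produced them, so positions 1 and 2 are selected instead of 1 and 4.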

Admittedly, Filter is a bit strange :)

+5

Another option, without Filter:

isValid <- function(string){
  # drop every element that contains a punctuation character
  v1 <- string[!string %in% grep('[[:punct:]]', string, value = TRUE)]
  # drop empty strings
  return(v1[v1 != ''])
}
isValid(strings)
#[1] "aaa"   "valid"
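Since grepl is vectorised, the same selection can also be sketched as a direct logical subset, with no Filter and no helper function. The name alnum is made up for illustration, and the pattern spells the letter range out as A-Za-z, because in ASCII order A-z also spans a few punctuation characters such as _.

```r
strings <- c("aaa", "test@test.com", "", "valid")

# grepl yields one TRUE/FALSE per element, so it can index the vector directly
alnum <- strings[grepl("^[A-Za-z0-9]+$", strings)]
print(alnum)   # [1] "aaa"   "valid"
```

Unlike the punctuation-based filter, this keeps only strings made up entirely of ASCII letters and digits, so a string containing a space would also be excluded.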
+2

Source: https://habr.com/ru/post/1651681/
