Find which patterns from a list match, using grepl

I used grepl to check whether a string contains any pattern from a set of patterns (I used '|' to separate the patterns). Searching in the reverse direction, with the arguments swapped, did not help. How can I determine which patterns from the set actually matched?

Additional information: this can be solved by writing a loop, but that is very time-consuming, since my set has more than 100,000 lines. Is it possible to optimize it?

For example, let the string be:

a <- "Hello"

pattern <- c("ll", "lo", "hl")

pattern1 <- paste(pattern, collapse="|") # "ll|lo|hl"

grepl(a, pattern=pattern1) # returns TRUE

grepl(pattern, pattern=a) # returns FALSE 'n' times - n is 3 here
2 answers

You are looking for str_detect from the stringr package:

library(stringr)

str_detect(a, pattern)
#[1]  TRUE  TRUE FALSE

If you have multiple strings, for example a = c('hello','hola','plouf'), you can do:

lapply(a, function(u) pattern[str_detect(u, pattern)])
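
The same result can be sketched in base R without stringr. Because grepl is vectorized over the strings it searches, the loop can run over the patterns instead of over the strings, which may help if the strings are the large set (an assumption; the question does not say which side has the 100,000 lines). The names hits and matched are only illustrative, and fixed = TRUE assumes the patterns are literal substrings rather than regular expressions.

```r
a <- c("hello", "hola", "plouf")
pattern <- c("ll", "lo", "hl")

# One grepl call per pattern; each call scans all strings at once.
# cbind keeps the result a matrix even when 'a' has a single element.
hits <- do.call(cbind, lapply(pattern, function(p) grepl(p, a, fixed = TRUE)))
dimnames(hits) <- list(a, pattern)
hits
#          ll    lo    hl
# hello  TRUE  TRUE FALSE
# hola  FALSE FALSE FALSE
# plouf FALSE  TRUE FALSE

# For each string (row), keep the patterns whose column is TRUE.
matched <- lapply(seq_len(nrow(hits)), function(i) pattern[hits[i, ]])
names(matched) <- a
```

Here matched is a named list with one character vector per string, holding the same values that the lapply/str_detect call above returns.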

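In base R, you can use a lookahead, (?=), so that overlapping matches are allowed, together with gregexpr to get the positions at which each pattern matched.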

## changed your string so the second pattern matches twice
a <- "Hellolo"
pattern <- c("ll", "lo", "hl")
pattern1 <- sprintf("(?=(%s))", paste(pattern, collapse=")|(")) #  "(?=(ll)|(lo)|(hl))"

attr(gregexpr(pattern1, a, perl=TRUE)[[1]], "capture.start")
# [1,] 3 0 0
# [2,] 0 4 0
# [3,] 0 6 0

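Each row of the matrix is one match and each column is one pattern: pattern 1 ("ll") matched at position 3, pattern 2 ("lo") matched at positions 4 and 6, and pattern 3 ("hl") did not match.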
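
The capture.start matrix can be turned into a small table of which pattern matched where. The following is a minimal sketch, not part of the original answer; the names m, starts, idx and found are illustrative. It relies on the fact that each lookahead match has zero length, so the capture group that took part in a match starts exactly at the match position.

```r
a <- "Hellolo"
pattern <- c("ll", "lo", "hl")
pattern1 <- sprintf("(?=(%s))", paste(pattern, collapse = ")|("))

m <- gregexpr(pattern1, a, perl = TRUE)[[1]]
starts <- attr(m, "capture.start")

if (m[1] != -1) {
  # Row i of 'starts' belongs to match i; the group that matched has
  # the same start as the match itself, the other groups do not.
  idx <- which(starts == as.vector(m), arr.ind = TRUE)
  found <- data.frame(pattern = pattern[idx[, "col"]],
                      position = starts[idx],
                      stringsAsFactors = FALSE)
} else {
  # gregexpr returns -1 when nothing matched at all
  found <- data.frame(pattern = character(0), position = integer(0),
                      stringsAsFactors = FALSE)
}
found
#   pattern position
# 1      ll        3
# 2      lo        4
# 3      lo        6
```

One caveat with the alternation approach: at any given position only the first alternative that matches is captured, so if two patterns can start at the same position (for example "ll" and "llo"), only the earlier one in the list is reported there.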


Source: https://habr.com/ru/post/1599101/

