Regular expression to keep matches, delete others

I am having problems with this regex. Consider the following vector.

> vec <- c("new jersey", "south dakota", "virginia:chincoteague",
           "washington:whidbey island", "new york:main")

Of the elements that contain :, I would like to keep only those with main after the :, so that the result is

[1] "new jersey" "south dakota" "new york:main"

So far, I have managed to get there with this ugly nested nightmare, which is of course far from optimal.

> g1 <- grep(":", vec)
> vec[ -g1[grep("main", grep(":", vec, value = TRUE), invert = TRUE)] ]
# [1] "new jersey"    "south dakota"  "new york:main"

How can I write a single regex that keeps :main but drops the other elements containing :?

2 answers

Use | (alternation) to select the elements that either contain :main or do not contain : at all:

> vec <- c("new jersey", "south dakota", "virginia:chincoteague",
+            "washington:whidbey island", "new york:main")
> grep(":main|^[^:]*$", vec)
[1] 1 2 5
> vec[grep(":main|^[^:]*$", vec)]
[1] "new jersey"    "south dakota"  "new york:main"
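
To see why the alternation works, the two branches can be checked separately. This is only a minimal sketch; the names has_main, no_colon and kept are introduced here for illustration and are not part of the answer.

```r
vec <- c("new jersey", "south dakota", "virginia:chincoteague",
         "washington:whidbey island", "new york:main")

# Branch 1: the element contains the literal text ":main"
has_main <- grepl(":main", vec, fixed = TRUE)

# Branch 2: the whole element consists only of characters other than ":"
no_colon <- grepl("^[^:]*$", vec)

# Keeping an element when either branch matches should give the same
# result as the single combined pattern
kept <- vec[has_main | no_colon]
identical(kept, vec[grepl(":main|^[^:]*$", vec)])
# [1] TRUE
```

Note that :main is not anchored, so it would also keep a value such as "x:mainland"; if only an exact main suffix is wanted, :main$ is one way to tighten the first branch.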

Another option is this pattern:

^[^:]+(?::main.*)?$

In R, with Perl-compatible matching enabled:

grepl("^[^:]+(?::main.*)?$", subject, perl=TRUE);

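  • ^ matches the start of the string,
  • [^:]+ matches one or more characters other than :,
  • the non-capturing group (?::main.*)? optionally matches :main followed by anything up to the end,
  • $ matches the end of the string.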
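
Applied to the vector from the question, the pattern can be used for logical indexing. This is a sketch under the assumption that subject in the call above stands for vec; the names pattern, keep and result are illustrative only.

```r
vec <- c("new jersey", "south dakota", "virginia:chincoteague",
         "washington:whidbey island", "new york:main")

pattern <- "^[^:]+(?::main.*)?$"

# perl = TRUE selects the PCRE engine, which supports the (?:...) non-capturing group
keep <- grepl(pattern, vec, perl = TRUE)

# Logical indexing keeps exactly the elements that matched
result <- vec[keep]
print(result)
# [1] "new jersey"    "south dakota"  "new york:main"
```

Because [^:]+ requires at least one character before the optional group, this pattern rejects an empty string and a bare ":main", both of which the alternation-based pattern from the first answer would keep.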

Source: https://habr.com/ru/post/1545933/

