
R Regex: delete only the next character after >

I have the following string in R:

string1 = "A((..A>B)A"

I would like to delete all punctuation, plus the letter immediately after >, that is, >B.

Here is the result I want:

output = "AAA"

I tried using gsub() as follows:

output = gsub("[[:punct:]]","", string1)

But it gives AABA, which keeps the character immediately after >.

+4
3 answers

You say

delete all punctuation and the letter immediately after >

Punctuation is matched by [[:punct:]] and a letter by [[:alpha:]], so you can use a TRE regular expression with gsub:

string1 = "A((..A>B)A"
gsub(">[[:alpha:]]|[[:punct:]]", "", string1)
# => [1] "AAA"

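The pattern matches either a > followed by a letter, or any punctuation character.

Pattern details:

  • >[[:alpha:]] - a > followed by any letter
  • | - or
  • [[:punct:]] - any punctuation character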
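
To see which pieces the pattern actually removes, here is a minimal sketch that lists the matches with base R's regmatches() and gregexpr(). The variable names pattern, matches, and result are only illustrative and do not come from the answer.

```r
string1 <- "A((..A>B)A"
pattern <- ">[[:alpha:]]|[[:punct:]]"

# List every substring the pattern matches, in order of appearance
matches <- regmatches(string1, gregexpr(pattern, string1))[[1]]
print(matches)
# [1] "("  "("  "."  "."  ">B" ")"

# Removing those matches leaves only the wanted letters
result <- gsub(pattern, "", string1)
print(result)
# [1] "AAA"
```

Note that ">B" is consumed as a single match, which is why B does not survive as it did with [[:punct:]] alone.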
+1

Alternatively, use a lookbehind to match the character right after >, together with any punctuation:

gsub('(?<=>).|[[:punct:]]', '', "A((..A>B)A", perl=TRUE)
## [1] "AAA"
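
As a rough illustration of how the two alternatives in this pattern divide the work, the sketch below applies the lookbehind on its own first. The names x, after_gt, only_lookbehind, and full are assumptions made for this example.

```r
x <- "A((..A>B)A"

# (?<=>). matches one character only when the character before it is ">";
# the ">" itself is not part of the match
after_gt <- regmatches(x, gregexpr("(?<=>).", x, perl = TRUE))[[1]]
print(after_gt)
# [1] "B"

# Removing just that match keeps ">" in place
only_lookbehind <- gsub("(?<=>).", "", x, perl = TRUE)
print(only_lookbehind)
# [1] "A((..A>)A"

# Adding the punctuation alternative then removes ">" and the other symbols
full <- gsub("(?<=>).|[[:punct:]]", "", x, perl = TRUE)
print(full)
# [1] "AAA"
```

The lookbehind syntax (?<=...) is a PCRE feature, so perl = TRUE is needed here.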
+7

This also works with a slightly simpler regular expression, without perl = TRUE:

gsub("[[:punct:]]|>(.)", "", "A((..A>B)A")
## [1] "AAA"
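
One difference worth keeping in mind: . removes whatever character follows >, while [[:alpha:]] removes it only if it is a letter. The parentheses around . are not needed, since the captured group is never used. A small sketch with a hypothetical input that has a digit after > (not taken from the question):

```r
x <- "A>1B"  # hypothetical input: a digit, not a letter, follows ">"

# ">." removes ">" together with any single character after it
any_char <- gsub(">.|[[:punct:]]", "", x)
print(any_char)
# [1] "AB"

# ">[[:alpha:]]" does not match ">1", so only ">" is removed as punctuation
letters_only <- gsub(">[[:alpha:]]|[[:punct:]]", "", x)
print(letters_only)
# [1] "A1B"
```

Which behaviour is right depends on whether only letters after > should be dropped, as the question states, or any character.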
+2

Source: https://habr.com/ru/post/1681634/

