R: How does gsub handle spaces?

I have a character string "ab b cde", i.e. "ab[space]b[space]cde". I want to replace space-b and space-c with spaces, so that the output string is "ab[space][space][space][space]de". I cannot figure out how to get rid of the second "b" without also deleting the first. I tried:

 gsub("[\\sb,\\sc]", " ", "ab b cde", perl=T) 

but it gives me "a[spaces]de". Any pointers? Thanks.

Edit: consider a more complex task: I want to convert the string "akui i ii", i.e. "akui[space]i[space]ii", to "akui[spaces]" by removing "space-i" and "space-ii".

+4
3 answers

You can use lookbehind matching as follows:

 gsub("(?<=\\s)i+", " ", "akui i ii", perl=TRUE) 

Edit: lookbehind is still the way to go, as shown here with the other example from your original post. Hope this helps.
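
To make the effect of the lookbehind concrete, here is a minimal sketch comparing it with a pattern that consumes the whitespace. The variable names `consumed` and `kept` are only illustrative and not from the original post.

```r
s <- "akui i ii"

# Consuming the whitespace: each " i" / " ii" collapses to a single space.
consumed <- gsub("\\si+", " ", s, perl = TRUE)

# Lookbehind: the whitespace is only asserted, not consumed, so it stays
# in place and only the run of i's is replaced by a space.
kept <- gsub("(?<=\\s)i+", " ", s, perl = TRUE)

nchar(consumed)  # 6: "akui" followed by two spaces
nchar(kept)      # 8: "akui" followed by four spaces
```

Which of the two you want depends on whether the original spaces should survive in the output.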

+2

[\sb,\sc] is a character class, so it means "any one character from the set: whitespace, b, comma, c" (the second \s is redundant). You probably want something like (\sb|\sc), which means "whitespace followed by b, or whitespace followed by c", or \s[bc], which means "whitespace followed by b or c".

 s <- "ab b cde"
 gsub("(\\sb|\\sc)", " ", s, perl=TRUE)
 gsub("\\s[bc]", " ", s, perl=TRUE)
 gsub("[[:space:]][bc]", " ", s, perl=TRUE)  # No backslashes
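
If it helps to see the difference, the following sketch lists what each pattern actually matches in the original string, using `gregexpr` with `regmatches`. The names `class_hits` and `pair_hits` are made up for this illustration.

```r
s <- "ab b cde"

# Character class: every single character that is whitespace, "b", "," or "c".
class_hits <- regmatches(s, gregexpr("[\\sb,\\sc]", s, perl = TRUE))[[1]]

# Whitespace followed by "b" or "c": two-character matches only.
pair_hits <- regmatches(s, gregexpr("\\s[bc]", s, perl = TRUE))[[1]]

class_hits  # "b" " " "b" " " "c"
pair_hits   # " b" " c"
```

The character class hits the first "b" on its own, which is why the original attempt deleted it; the two-character pattern only fires where a whitespace character comes first.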

To remove a run of the same letter (as in the second example), add a + after the letter to be removed.

 s2 <- "akui i ii"
 gsub("\\si+", " ", s2)
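
One caveat, as a hedged sketch: `\\si+` also matches the leading "i" of a longer word that follows whitespace. Assuming only whole tokens made of i's should go, a negative lookahead `(?!\\S)` can guard the end of the match. The input `s3` and the names `loose` and `strict` are hypothetical and not from the question.

```r
s3 <- "akui i ii is"  # hypothetical input with a real word starting with "i"

# Plain pattern: also strips the "i" from "is".
loose <- gsub("\\si+", " ", s3, perl = TRUE)

# With the lookahead: the run of i's must not be followed by a
# non-whitespace character, so "is" is left alone.
strict <- gsub("\\si+(?!\\S)", " ", s3, perl = TRUE)

loose   # "akui   s"
strict  # "akui   is"
```

Lookaheads need `perl = TRUE`; they are not available in R's default regex engine.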
+6

This is a simple solution.

 gsub("\\s[bc]", " ", "ab b cde", perl=TRUE) 

This will give you what you want.

+5

Source: https://habr.com/ru/post/1396364/

