Matching a comma surrounded by non-whitespace with a regular expression

I am trying to replace commas that have a non-whitespace character on both sides with a space, while keeping the other commas intact (in R).

Imagine that I have:

j<-"Abc,Abc, and c" 

I want to get:

 "Abc Abc, and c" 

This almost works:

 gsub("[^ ],[^ ]"," " ,j) 

But it removes characters on each side of the commas to give:

 "Ab bc, and c" 
4 answers

You can use a PCRE regular expression with a negative lookbehind and a negative lookahead:

 j <- "Abc,Abc, and c"
 gsub("(?<!\\s),(?!\\s)", " ", j, perl = TRUE)
 ## => [1] "Abc Abc, and c"


More details

  • (?<!\\s) - negative lookbehind: there must be no whitespace character immediately before the comma
  • , - a literal comma
  • (?!\\s) - negative lookahead: there must be no whitespace character immediately after the comma
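
To illustrate the point of the list above, here is a small sketch that extracts what each pattern actually matches in the question's string. The variable names `consuming` and `zero_width` are made up for this illustration. The pattern from the question includes the neighbouring characters in the match, while the lookaround pattern matches the comma alone, which is why nothing else gets replaced.

```r
j <- "Abc,Abc, and c"

# Pattern from the question: the characters around the comma are part of the match
consuming <- regmatches(j, gregexpr("[^ ],[^ ]", j))[[1]]

# Lookaround pattern: the assertions are zero-width, so only the comma is matched
zero_width <- regmatches(j, gregexpr("(?<!\\s),(?!\\s)", j, perl = TRUE))[[1]]

print(consuming)   # [1] "c,A"
print(zero_width)  # [1] ","
```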

An alternative solution is to match a comma enclosed with word boundaries:

 j <- "Abc,Abc, and c"
 gsub("\\b,\\b", " ", j)
 ## => [1] "Abc Abc, and c"

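
The two patterns are not equivalent on every input. A word boundary requires a word character (letter, digit or underscore) on both sides of the comma, whereas the lookarounds only rule out whitespace. The sketch below uses a hypothetical input `x` that is not from the question, and passes `perl = TRUE` to both calls so that the same engine handles both patterns.

```r
# Hypothetical input: the first comma follows ")" which is not a word character
x <- "f(a),b, c"

# Word boundaries: no boundary between ")" and ",", so nothing is replaced
with_boundaries <- gsub("\\b,\\b", " ", x, perl = TRUE)

# Lookarounds: ")" is not whitespace, so the first comma is replaced
with_lookarounds <- gsub("(?<!\\s),(?!\\s)", " ", x, perl = TRUE)

print(with_boundaries)   # [1] "f(a),b, c"
print(with_lookarounds)  # [1] "f(a) b, c"
```

Which behaviour is wanted depends on whether punctuation next to a comma should count as "non-white space".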


You can use backreferences as follows:

 gsub("([^ ]),([^ ])", "\\1 \\2", j)
 ## [1] "Abc Abc, and c"

The parentheses () in the regular expression capture the characters adjacent to the comma. \\1 and \\2 in the replacement refer back to these captured values in the order in which they were captured, so the characters are put back around the space.
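
One caveat worth knowing, shown here on a hypothetical input `x` that is not from the question: because the capture groups consume the characters next to the comma, two commas separated by a single character cannot both match. After "a,b" is matched, the "b" is no longer available as the left-hand neighbour of the second comma. A lookaround pattern does not consume its neighbours, so it handles this case.

```r
# Hypothetical input with single-character tokens between commas
x <- "a,b,c"

# Capture groups consume "a,b", so the second comma is left untouched
with_groups <- gsub("([^ ]),([^ ])", "\\1 \\2", x)

# Lookarounds are zero-width, so both commas are replaced
with_lookarounds <- gsub("(?<!\\s),(?!\\s)", " ", x, perl = TRUE)

print(with_groups)       # [1] "a b,c"
print(with_lookarounds)  # [1] "a b c"
```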


We can try

 gsub(",(?=[^ ])", " ", j, perl = TRUE)
 #[1] "Abc Abc, and c"
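
Note that this pattern only looks at the character after the comma. It gives the desired result for the string in the question, but a comma preceded by a space and followed by a non-space would also be replaced. A minimal sketch with a hypothetical input `x` (not from the question):

```r
# Hypothetical input: the comma has a space before it and a letter after it
x <- "a ,b"

# Lookahead only: the comma is replaced, leaving two spaces in a row
lookahead_only <- gsub(",(?=[^ ])", " ", x, perl = TRUE)

# Lookbehind and lookahead: the comma is kept because a space precedes it
both_sides <- gsub("(?<!\\s),(?!\\s)", " ", x, perl = TRUE)

print(lookahead_only)  # [1] "a  b"
print(both_sides)      # [1] "a ,b"
```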

Perhaps this also works:

 library("stringr")
 j <- "Abc,Abc, and c"
 # str_replace() changes only the first match; str_replace_all() changes every match
 str_replace(j, "(\\w+),([\\w]+)", "\\1 \\2")

Source: https://habr.com/ru/post/1264883/

