Delete periods before the first comma in a line

How can I remove the periods that come before the first comma in each of these strings?

xx <- c("fefe.3. fregg, ff, 34.gr. trgw", "fefe3. fregg, ff, 34.gr. trgw", "fefe3 fregg, ff, 34.gr. tr.gw") 

Desired output:

  "fefe3 fregg, ff, 34.gr. trgw" "fefe3 fregg, ff, 34.gr. trgw" "fefe3 fregg, ff, 34.gr. tr.gw" 

I started with gsub("\\.", "", xx), which removes all periods. How can I change it to mean "only the periods up to the first comma"?

4 answers

This feels like a hack, but it works for this simple example:

    xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
            "fefe3. fregg, ff, 34.gr. trgw",
            "fefe3 fregg, ff, 34.gr. tr.gw")
    temp <- strsplit(xx, ",")
    sapply(seq_along(temp), function(x) {
      t1 <- gsub("\\.", "", temp[[x]][1])
      paste(t1, temp[[x]][2], temp[[x]][-c(1, 2)], sep = ",")
    })
    # [1] "fefe3 fregg, ff, 34.gr. trgw" "fefe3 fregg, ff, 34.gr. trgw"
    # [3] "fefe3 fregg, ff, 34.gr. tr.gw"

The main idea is that, since you only want to look for periods in the first fragment (the part before the first comma), you can split the string at the commas, run plain gsub on that first fragment, and then reassemble the parts. It is unlikely to be efficient.
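The split-and-reassemble code above passes exactly three pieces to paste, so it relies on every string containing two commas. A minimal sketch of the same idea for any number of commas follows; the function name is hypothetical, and it assumes non-empty strings. Note that strsplit drops a trailing empty field, so a string ending in a comma would lose that comma.

```r
xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
        "fefe3. fregg, ff, 34.gr. trgw",
        "fefe3 fregg, ff, 34.gr. tr.gw")

# Hypothetical helper: clean only the first comma-separated piece,
# then glue all pieces back together with commas.
remove_periods_in_first_field <- function(x) {
  pieces <- strsplit(x, ",", fixed = TRUE)
  vapply(pieces, function(p) {
    p[1] <- gsub(".", "", p[1], fixed = TRUE)  # literal period, no regex
    paste(p, collapse = ",")
  }, character(1))
}

remove_periods_in_first_field(xx)
```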


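Try the following:

    gsub("\\.(.*,.*)", "\\1", xx)
    # [1] "fefe3. fregg, ff, 34.gr. trgw" "fefe3 fregg, ff, 34.gr. trgw"
    # [3] "fefe3 fregg, ff, 34.gr. tr.gw"

The regular expression works as follows:

  • \\. matches a literal period
  • (.*,.*) matches the rest of the text, which must contain a comma, and captures it as a group
  • \\1 in the replacement refers to that first group, so the whole match is replaced by everything except the period

Because the group consumes the rest of the string, each string gets at most one substitution: only the first period that has a comma somewhere after it is removed. The first element therefore keeps the second of its two periods ("fefe3. fregg").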
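A single gsub call with that pattern makes at most one substitution per string, since the captured group runs to the end of the string. One way to strip every period before the first comma with plain regular expressions is to repeat an anchored sub until nothing changes. This is a minimal sketch, not the answer's own method, and the function name is hypothetical:

```r
xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
        "fefe3. fregg, ff, 34.gr. trgw",
        "fefe3 fregg, ff, 34.gr. tr.gw")

# Hypothetical helper: the pattern matches the first period only if no
# comma (and no other period) precedes it; loop until no string matches.
strip_periods_before_comma <- function(x) {
  pattern <- "^([^,.]*)\\."
  while (any(grepl(pattern, x))) {
    x <- sub(pattern, "\\1", x)
  }
  x
}

strip_periods_before_comma(xx)
```

A string with no comma at all loses all of its periods here, because the whole string counts as the part before the first comma.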

This uses gsubfn from the gsubfn package to match the longest substring that starts at the beginning of the string and contains no commas (this is the whole string if it has no commas), and then applies gsub to remove the periods inside that substring. To remove only the first period in the substring, change gsub to sub.

    library(gsubfn)
    gsubfn("^[^,]*", ~ gsub("\\.", "", x), xx)

Result:

    [1] "fefe3 fregg, ff, 34.gr. trgw"
    [2] "fefe3 fregg, ff, 34.gr. trgw"
    [3] "fefe3 fregg, ff, 34.gr. tr.gw"
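If installing gsubfn is not an option, the same "transform only the match of ^[^,]*" idea can be sketched in base R with regexpr and the replacement form of regmatches. The variable names below are illustrative only:

```r
xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
        "fefe3. fregg, ff, 34.gr. trgw",
        "fefe3 fregg, ff, 34.gr. tr.gw")

out <- xx
# Locate the leading run of non-comma characters in each string.
m <- regexpr("^[^,]*", out)
# Replace each located prefix with a copy that has its periods removed.
regmatches(out, m) <- gsub(".", "", regmatches(out, m), fixed = TRUE)
out
```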

I don't know how fast this is or how large your input is, but here is an approach using beg2char and char2end from the qdap package:

    ## xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
    ##         "fefe3. fregg, ff, 34.gr. trgw",
    ##         "fefe3 fregg, ff, 34.gr. tr.gw")

    library(qdap)
    paste0(gsub("\\.", "", beg2char(xx, ",")), char2end(xx, ",", include=TRUE))

    ## > paste0(gsub("\\.", "", beg2char(xx, ",")), char2end(xx, ",", include=TRUE))
    ## [1] "fefe3 fregg, ff, 34.gr. trgw" "fefe3 fregg, ff, 34.gr. trgw"
    ## [3] "fefe3 fregg, ff, 34.gr. tr.gw"
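The same prefix/suffix split can be approximated without qdap. The sketch below assumes, based on the output shown above, that beg2char(xx, ",") yields the text before the first comma and that char2end(xx, ",", include=TRUE) yields everything from that comma onward; the base R stand-ins are not the qdap implementations.

```r
xx <- c("fefe.3. fregg, ff, 34.gr. trgw",
        "fefe3. fregg, ff, 34.gr. trgw",
        "fefe3 fregg, ff, 34.gr. tr.gw")

# Text before the first comma (the whole string if there is no comma).
prefix <- sub(",.*$", "", xx)
# Everything from the first comma onward ("" if there is no comma).
rest <- substring(xx, nchar(prefix) + 1)

result <- paste0(gsub(".", "", prefix, fixed = TRUE), rest)
result
```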

Source: https://habr.com/ru/post/1488924/

