How to reference capture groups beyond \9 in R?

Is it possible to reference capture groups beyond 9 in a regular expression replacement in R?

sub("(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)", "\\1 & \\9",   
    "abc-02-03-04-05-06-07-08-09")

gives

[1] "abc & 09"

which is the expected result, but

sub("(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)", "\\1 & \\10",   
    "abc-02-03-04-05-06-07-08-09-10")

[1] "abc & abc0"

fails, because "\\10" is read as the backreference \\1 followed by a literal 0; the expected result would be

[1] "abc & 10"

I need this for a function like the following, which works great for up to 9 formats, but no more:

x <- as.Date(c("2005-09-02", "2012-04-08"))

fmt <- "dddd, d.m.yy"

fmt <- gsub(pattern = "dddd", replacement = "\\\\1", x = fmt)
fmt <- gsub(pattern = "ddd", replacement = "\\\\2", x = fmt)
fmt <- gsub(pattern = "dd", replacement = "\\\\3", x = fmt)
fmt <- gsub(pattern = "d", replacement = "\\\\4", x = fmt)
fmt <- gsub(pattern = "mmmm", replacement = "\\\\5", x = fmt)
fmt <- gsub(pattern = "mmm", replacement = "\\\\6", x = fmt)
fmt <- gsub(pattern = "mm", replacement = "\\\\7", x = fmt)
fmt <- gsub(pattern = "m", replacement = "\\\\8", x = fmt)
fmt <- gsub(pattern = "yyyy", replacement = "\\\\9", x = fmt)
fmt <- gsub(pattern = "yy", replacement = "\\\\10", x = fmt)
fmt <- gsub(pattern = "y", replacement = "\\\\11", x = fmt)
fmt

sub("(.+)-(.+)-(.+)-0?(.+)-(.+)-(.+)-(.+)-0?(.+)-(.+)-(.+)-0?(.+)", fmt, 
    format(x, "%A-%a-%d-%d-%B-%b-%m-%m-%Y-%y-%y"))
2 answers

It is important to note that the limit applies to backreferences (\\1 to \\9 in the replacement string); you can capture as many groups as you like. Using str_match from stringr (or, more clunkily, regmatches from base R), you can always restructure your code to avoid the need for backreferences.

library(stringr)
(matches <- str_match(
  "abc-02-03-04-05-06-07-08-09-10", 
  "(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)")
)
##      [,1]                             [,2]  [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
## [1,] "abc-02-03-04-05-06-07-08-09-10" "abc" "02" "03" "04" "05" "06" "07" "08" "09"  "10" 
paste(matches[, 2], matches[, 11], sep = " & ")
## [1] "abc & 10"
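
For comparison, here is a minimal sketch of the base R route mentioned above, assuming regexec is used to locate the groups and regmatches to extract them. The variable names (x, pattern, m, result) are illustrative only and not part of the original answer.

```r
# Base R sketch: regexec() returns the position of the full match and
# of every capture group; regmatches() extracts the matching substrings.
x <- "abc-02-03-04-05-06-07-08-09-10"
pattern <- "(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)-(.+)"

# m[1] is the full match, m[2] is group 1, ..., m[11] is group 10
m <- regmatches(x, regexec(pattern, x))[[1]]

result <- paste(m[2], m[11], sep = " & ")
result
## [1] "abc & 10"
```

As with str_match, group n ends up at index n + 1, so there is no ceiling at nine groups.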

Alternatively, split the string on the delimiter and pick out the elements you need.

elements <- c(1,10)
paste(strsplit("abc-02-03-04-05-06-07-08-09-10", '-')[[1]][elements], collapse=' & ')
## [1] "abc & 10"

For a vector of strings, use sapply:

sapply(strsplit("abc-02-03-04-05-06-07-08-09-10", '-'), function(x) paste(x[elements], collapse=' & '))
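
Applied to the date-formatting function in the question, the same idea (extract the pieces and paste them back together instead of using backreferences) might look like the sketch below. The helper name format_date, the token table and the variable names are assumptions made for illustration; the mapping of tokens to strptime codes and the stripping of leading zeros for d, m and y follow the question's code.

```r
# Hypothetical helper: formats dates using d/m/y tokens without backreferences.
format_date <- function(x, fmt) {
  # Token -> strftime code, as in the question
  tokens <- c(dddd = "%A", ddd = "%a", dd = "%d", d = "%d",
              mmmm = "%B", mmm = "%b", mm = "%m", m = "%m",
              yyyy = "%Y", yy = "%y", y = "%y")
  strip_zero <- c("d", "m", "y")  # tokens shown without a leading zero

  # Longest alternatives first, so "dddd" is not read as four "d" tokens
  pos <- gregexpr("dddd|ddd|dd|d|mmmm|mmm|mm|m|yyyy|yy|y", fmt)
  found <- regmatches(fmt, pos)[[1]]                  # tokens in order
  lits <- regmatches(fmt, pos, invert = TRUE)[[1]]    # literal text around them

  # Rebuild the output: literal, value, literal, value, ...
  out <- lits[1]
  for (k in seq_along(found)) {
    val <- format(x, tokens[[found[k]]])
    if (found[k] %in% strip_zero) val <- sub("^0", "", val)
    out <- paste0(out, val, lits[k + 1])
  }
  out
}

x <- as.Date(c("2005-09-02", "2012-04-08"))
format_date(x, "d.m.yy")
## [1] "2.9.05" "8.4.12"
```

Day and month names (dddd, ddd, mmmm, mmm) depend on the current locale. As in the question's version, any d, m or y in the literal part of the format is treated as a token.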

Source: https://habr.com/ru/post/1568500/

