std::regex_replace matches the end of the line twice

Consider the following program:

#include <iostream>
#include <regex>

int main(int argc, char* argv[]) {
  if (argc==4)
    std::cout << std::regex_replace(
        argv[1], std::regex(argv[2]), argv[3]
      ) << std::endl;
}

Running

./a.out a_a_a '[^_]+$' b

gives the expected result a_a_b. But running

./a.out a_a_a '[^_]*$' b

prints a_a_bb.

boost::regex_replace has the same behavior.

I do not understand why the empty string after the last a is matched again when I have already anchored the pattern with $.

+4
3 answers

It comes down to the difference between the quantifiers * and +. With *, the pattern matches the final letter a and also a zero-width (empty) string at the end of the input.

You can see it here:

[^_]*$

It matches not only the last a but also the zero-width position after it. Both matches are replaced, so the result is a_a_bb.
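To make the two matches visible, one could list every match instead of replacing it. The following is a minimal sketch, not code from the question: list_matches is a hypothetical helper name, and it assumes the default ECMAScript grammar of std::regex. It walks the input with std::sregex_iterator, which is the same kind of iteration a global replace performs, and records the offset and length of each match.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the question's program): returns the
// (offset, length) of every match of `pattern` in `text`.
std::vector<std::pair<std::size_t, std::size_t>>
list_matches(const std::string& text, const std::string& pattern) {
  const std::regex re(pattern);  // must outlive the iterator below
  std::vector<std::pair<std::size_t, std::size_t>> found;
  for (std::sregex_iterator it(text.begin(), text.end(), re), end;
       it != end; ++it) {
    const std::ssub_match& whole = (*it)[0];  // the full match
    // Compute the offset from the iterators rather than relying on position().
    found.emplace_back(static_cast<std::size_t>(whole.first - text.begin()),
                       static_cast<std::size_t>(whole.length()));
  }
  return found;
}
```

For a_a_a, this sketch should report two matches for [^_]*$, one of length 1 at offset 4 and one of length 0 at offset 5, but only the first of them for [^_]+$.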


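To see what is going on, drop the $ and use only:

[^_]*

Applied to a_a_a, it gives:

bb_bb_bb

Since [^_] matches a, and * means zero or more repetitions, the empty string (zero repetitions) is also a valid match. As a result, [^_]* matches in a_a_a 6 times: a, then the empty string after that a (just before the _), and so on.

a_a_a
^^^^^^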
+1

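(Some of these matches have length 0.)

The pattern abc$$$ matches abc, just as ^^^abc does, because anchors consume no characters. In the same way, $ is satisfied in both a$ and (empty)$.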
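The count of 6 and the bb_bb_bb result can be checked with a short sketch. It is an illustration only: count_matches and replace_all are hypothetical helper names, and the code assumes the default ECMAScript grammar of std::regex.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: counts the matches that a global replace would visit.
std::size_t count_matches(const std::string& text, const std::string& pattern) {
  const std::regex re(pattern);  // must outlive the iterators below
  std::sregex_iterator first(text.begin(), text.end(), re);
  std::sregex_iterator last;     // default-constructed end-of-sequence iterator
  return static_cast<std::size_t>(std::distance(first, last));
}

// Hypothetical helper: thin wrapper around std::regex_replace,
// which replaces every match by default.
std::string replace_all(const std::string& text, const std::string& pattern,
                        const std::string& with) {
  return std::regex_replace(text, std::regex(pattern), with);
}
```

With these helpers, count_matches("a_a_a", "[^_]*") should return 6 and replace_all("a_a_a", "[^_]*", "b") should return bb_bb_bb: each a and each empty match is replaced by one b, and the underscores are copied through unchanged.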

+1

I think it is because:

+ means one or more (at least one occurrence is needed for the match to succeed)
* means zero or more (the match succeeds whether or not the searched-for character is present)

So [^_]+$ matches only the final a, while [^_]*$ matches both the final a and the empty string after it, which is why b appears twice.
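If the * form of the pattern has to stay, one possible workaround, not mentioned in the answers here, is to replace only the first match by passing std::regex_constants::format_first_only to std::regex_replace. A minimal sketch, where replace_first_only is a hypothetical helper name:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replaces only the first match of `pattern` in `text`
// and leaves any later matches (including empty ones) untouched.
std::string replace_first_only(const std::string& text,
                               const std::string& pattern,
                               const std::string& with) {
  return std::regex_replace(text, std::regex(pattern), with,
                            std::regex_constants::format_first_only);
}
```

Under this assumption, replace_first_only("a_a_a", "[^_]*$", "b") should give a_a_b, the same result as the + pattern, because the empty match at the end is never visited. Switching to + remains the simpler fix when an empty final segment does not need to be replaced.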

0

Source: https://habr.com/ru/post/1685278/

