std::regex_match and lazy quantifier with strange behavior

I know that:
A lazy quantifier matches as few characters as possible (shortest match)

I also know that the constructor:

basic_regex( ...,
            flag_type f = std::regex_constants::ECMAScript );

And:
ECMAScript supports non-greedy matches,
and the ECMAScript regex "<tag[^>]*>.*?</tag>"
would match only until the first closing tag ... en.cppreference

And:
At most one grammar option must be chosen out of ECMAScript, basic, extended, awk, grep, egrep. If no grammar is chosen, ECMAScript is assumed to be selected ... en.cppreference

And:
Note that regex_match will only successfully match a regular expression to an entire character sequence, whereas std::regex_search will successfully match subsequences ... std::regex_match


Here is my code:

#include <iostream>
#include <string>
#include <regex>

int main(){

        std::string string( "s/one/two/three/four/five/six/g" );
        std::match_results< std::string::const_iterator > match;
        std::basic_regex< char > regex ( "s?/.+?/g?" );  // non-greedy
        bool test = false;

        using namespace std::regex_constants;

        // regex_search: the lazy operator .+? behaves as expected
        test = std::regex_search( string, match, regex );
        std::cout << test << '\n';
        std::cout << match.str() << '\n';
        // regex_match: the lazy operator .+? seems to be ignored
        test = std::regex_match( string, match, regex, match_not_bol | match_not_eol );
        std::cout << test << '\n';
        std::cout << match.str() << '\n';
} 

and the output:

1
s/one/
1
s/one/two/three/four/five/six/g

Process returned 0 (0x0)   execution time : 0.008 s
Press ENTER to continue.

I expected that std::regex_match should not match anything and should return 0 with the non-greedy quantifier .+?

In fact, here the non-greedy quantifier .+? has the same meaning as the greedy one, and both /.+?/ and /.+/ match the same string, even though they are different patterns. So the question is: why is the question mark ignored?

regex101

Quick test:

$ echo 's/one/two/three/four/five/six/g' | perl -lne '/s?\/.+?\/g?/ && print $&'
$ s/one/
$
$ echo 's/one/two/three/four/five/six/g' | perl -lne '/s?\/.+\/g?/ && print $&'
$ s/one/two/three/four/five/six/g


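This regex: std::basic_regex< char > regex ( "s?/.+?/g?" );  (non-greedy)
and this one: std::basic_regex< char > regex ( "s?/.+/g?" );  (greedy)
produce the same output with std::regex_match: both match the entire string!
With std::regex_search, however, the outputs differ.
Also, s? and g? make no difference, and with /.*?/ the entire string is still matched!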

g++ --version
g++ (Ubuntu 6.2.0-3ubuntu11~16.04) 6.2.0 20160901

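I don't see any inconsistency. regex_match tries to match the whole string, so s?/.+?/g? lazily expands until the whole string is covered.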

These "diagrams" (for regex_search) will hopefully help to get the idea of greediness:

Non-greedy:

a.*?a: ababa
a|.*?a: a|baba
a.*?|a: a|baba  # ok, let's try .*? == "" first
# can't go further, backtracking
a.*?|a: ab|aba  # lets try .*? == "b" now
a.*?a|: aba|ba
# If the regex were a.*?a$, there would be two extra backtracking
# steps such that .*? == "bab".

Greedy:

a.*a: ababa
a|.*a: a|baba
a.*|a: ababa|  # try .* == "baba" first
# backtrack
a.*|a: abab|a  # try .* == "bab" now
a.*a|: ababa|

regex_match( abc ) is like regex_search( ^abc$ ).


Source: https://habr.com/ru/post/1685294/

