Regexp lazy quantifier

I have a sentence like this:

 a something* q b c w 

and I have to match a and q together, for example:

 (id_1: a, id_2: q) 

b on its own, like:

 (id_1: b) 

and c and w together, like (id_1: c, id_2: w).

I tried using this regex:

 (?:\b(?P<id_1>a|b|c)\b(?:.*?)(?P<id_2>q|w)?\b) 

Because of the lazy quantifier .*?, the regex matches only the first part of each pair, giving only

 (id_1: a, id_1: b, id_1: c) 

Live example

If we use the greedy quantifier instead, the expression becomes:

 (?:\b(?P<id_1>a|b|c)\b(?:.*)(?P<id_2>q|w)?\b) 

Live example

It matches only

 (id_1: a) 

because the greedy .* consumes the rest of the sentence.

If the second group is mandatory (with the lazy .*?):

 (?:\b(?P<id_1>a|b|c)\b(?:.*?)(?P<id_2>q|w)\b) 

Live example

it produces matches like

 (id_1: a, id_2: q) ; (id_1: b, id_2: w) 

as expected.
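For reference, the three attempts above can be reproduced with a minimal Python sketch. It assumes Python's re module (suggested by the (?P<name>...) syntax) and assumes the sample sentence is `a something* q b c w` with the tokens separated by spaces. The `pairs` helper and the labels are made up for illustration.

```python
import re

# Assumed sample input: the tokens are taken to be space-separated.
SENTENCE = "a something* q b c w"

# The three patterns tried so far, keyed by a descriptive (hypothetical) label.
ATTEMPTS = {
    "lazy, optional id_2": r"(?:\b(?P<id_1>a|b|c)\b(?:.*?)(?P<id_2>q|w)?\b)",
    "greedy, optional id_2": r"(?:\b(?P<id_1>a|b|c)\b(?:.*)(?P<id_2>q|w)?\b)",
    "lazy, mandatory id_2": r"(?:\b(?P<id_1>a|b|c)\b(?:.*?)(?P<id_2>q|w)\b)",
}


def pairs(pattern, text):
    """Return (id_1, id_2) for every match; id_2 is None if the group did not match."""
    return [(m.group("id_1"), m.group("id_2")) for m in re.finditer(pattern, text)]


for label, pattern in ATTEMPTS.items():
    print(f"{label}: {pairs(pattern, SENTENCE)}")
```

None of the three gives the wanted (a, q) ; (b) ; (c, w): the lazy optional version never reaches q or w, the greedy version swallows the whole sentence, and the mandatory version pairs b with w.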

Is it possible to write a regular expression that "prefers" to match the entire sentence (including the optional part), and matches only the first part ONLY if the optional part is missing?

EDIT: Sorry, the regular expressions contained some errors.

Last regex:

 (?:\b(?P<id_1>a|b|c)\b(?:.*?)(?P<id_2>q|w)\b) 

requires both groups to be present. It matches "a something w", but it does not match "a something" or just "a". I need to match "a something w" as well as "a" and "a w", and get the corresponding groups respectively:

 (id_1: a, id_2: w) ; (id_1: a, id_2: none) ; (id_1: a, id_2: w) 

I think the required regular expression is:

 (?:\b(?P<id_1>a|b|c)\b(?:.*?)(?P<id_2>q|w)?\b) 

but for the sentence "a something w" it matches only "a" (because of the lazy .*? quantifier).
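A small Python sketch of the behaviour described in this edit. The three input strings (`a something w`, `a`, `a w`) are my reading of the cases meant above, and `first_pair` is a hypothetical helper, so treat this as an illustration rather than the exact live example.

```python
import re

# Second group mandatory vs. optional, both with the lazy .*? in between.
MANDATORY = re.compile(r"(?:\b(?P<id_1>a|b|c)\b(?:.*?)(?P<id_2>q|w)\b)")
OPTIONAL_LAZY = re.compile(r"(?:\b(?P<id_1>a|b|c)\b(?:.*?)(?P<id_2>q|w)?\b)")

# Assumed test inputs for the three cases in the edit.
CASES = ["a something w", "a", "a w"]


def first_pair(regex, text):
    """Return (id_1, id_2) of the first match, or None when nothing matches."""
    m = regex.search(text)
    return (m.group("id_1"), m.group("id_2")) if m else None


for text in CASES:
    print(repr(text), first_pair(MANDATORY, text), first_pair(OPTIONAL_LAZY, text))
```

The mandatory version fails on a lone "a", while the optional lazy version always stops right after "a" and leaves id_2 empty, even when w is present.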

I also updated all live examples.

1 answer

Lazy dot matching is the main cause of the problem, since it requires a trailing boundary.

If you need to match some text that is not a specific text, you can use one of two things: either a tempered greedy token or an unroll-the-loop regex.

If you have variables, you can use a tempered greedy token and make the second capturing group optional with the ? quantifier:

 \b(?P<id_1>a|b|c)\b(?:(?!\b(?:a|b|c|q|w)\b).)*(?P<id_2>q|w)?\b
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^             ^

See regex demo
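A minimal Python sketch to check this pattern, assuming Python's re module and assuming the question's inputs are `a something* q b c w`, `a something w`, `a` and `a w` (space-separated tokens). The `pairs` helper is hypothetical.

```python
import re

# The tempered greedy token (?:(?!\b(?:a|b|c|q|w)\b).)* consumes one character
# at a time, but only while no whole-word keyword starts at that position.
TEMPERED = re.compile(
    r"\b(?P<id_1>a|b|c)\b"
    r"(?:(?!\b(?:a|b|c|q|w)\b).)*"
    r"(?P<id_2>q|w)?\b"
)


def pairs(text):
    """Return (id_1, id_2) for every match; id_2 is None if the group did not match."""
    return [(m.group("id_1"), m.group("id_2")) for m in TEMPERED.finditer(text)]


print(pairs("a something* q b c w"))
for text in ("a something w", "a", "a w"):
    print(repr(text), pairs(text))
```

Because the token stops in front of any of a, b, c, q, w, the filler after b cannot run past c to reach w, so b is reported alone while a and c still pick up q and w.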


Source: https://habr.com/ru/post/1237559/
