Regular expression: ? inside vs. outside the parentheses

This regex:

(a)?b\1c

does not match bc, whereas this one:

(a?)b\1c

does match it. Why is that? I thought the two expressions were identical.

+6
3 answers

In your first example, (a)?b\1c, \1 refers to your group (a), which means the group must actually have captured an a for \1 to match:

  • abac will match
  • bac will not match
  • bc will not match

In your second example, (a?)b\1c, \1 refers to (a?), where the a is optional:

  • abac will match
  • bac will not match
  • bc will match

The backreference doesn't care about the ? outside the group (in the first example); it only cares about what is inside the parentheses.
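Cases like these are easy to check with a short script. This is only a sketch, assuming Python's re module, where a backreference to a group that took no part in the match fails; other engines may treat that case differently. The helper name matches is made up for illustration.

```python
import re

def matches(pattern, text):
    """Hypothetical helper: True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

outer = r'(a)?b\1c'   # ? outside the group: the group itself is optional
inner = r'(a?)b\1c'   # ? inside the group: the group always takes part

# Print one row per input: text, result for outer, result for inner.
for text in ('abac', 'bac', 'bc'):
    print(text, matches(outer, text), matches(inner, text))
```

For bc, the row should show False for the first pattern and True for the second, which is exactly the difference asked about.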

+6

This is a bit confusing, so let's go through it step by step. I'll start with the second regular expression:

 (a?)b\1c 

When the engine tries to match bc, it first tries (a?). Since there is no a at the start of bc, the group captures an empty string "". When we later refer to it with \1, \1 stands for that empty string, which can always be matched.

Now for the first regular expression:

 (a)?b\1c 

(a) tries to match a and fails, but since the whole group (a)? is optional, the engine carries on. It matches b, then reaches \1. However, (a)? did not match anything at all, not even an empty string, so the backreference fails and so does the match.

So the difference between the two regular expressions is this: in (a?) the capture group always takes part in the match and may capture an empty string, which \1 can then match successfully. (a)? is an optional capture group; when it is skipped it captures nothing, so a later \1 will always fail unless the group actually matched an a.
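The distinction between "captured an empty string" and "captured nothing" can be made visible by inspecting the group. A minimal sketch, assuming Python's re module, which reports a group that took no part in the match as None (the variable names are illustrative only):

```python
import re

# ? inside the group: the group takes part and captures the empty string.
inner_match = re.fullmatch(r'(a?)b\1c', 'bc')
print(repr(inner_match.group(1)))   # ''

# ? outside the group, backreference removed so the pattern can match:
# group 1 never took part, so Python reports it as None.
outer_match = re.fullmatch(r'(a)?bc', 'bc')
print(repr(outer_match.group(1)))   # None

# With the backreference to the unset group, the match fails.
with_backref = re.fullmatch(r'(a)?b\1c', 'bc')
print(with_backref)                 # None
```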

+3

In the first version, the parentheses capture a, so \1 refers to a.

In the second regular expression, the parentheses capture a?, that is, "0 or 1 a", so \1 refers to whatever a? matched: either a or an empty string.

Since a is optional inside the group in the second regular expression, the group captures an empty string for bc, and bc then matches the rest of the expression (b\1c).
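One detail worth checking here: a backreference repeats the text the group captured, it does not apply the subpattern a second time. A small sketch, assuming Python's re module (the variable names are illustrative only):

```python
import re

# With \1, once the group has captured 'a', \1 must match another 'a'.
# In 'abc' there is no second 'a', and capturing '' instead leaves the
# leading 'a' unmatched, so the whole-string match fails.
backref = re.fullmatch(r'(a?)b\1c', 'abc')
print(backref)                 # None

# Writing a? twice is different: each a? is matched independently.
repeated = re.fullmatch(r'(a?)ba?c', 'abc')
print(repeated is not None)    # True
```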

+2

Source: https://habr.com/ru/post/957285/

