Atomic group and non-capturing group

  • How should I understand the atomic group, written as (?>expr)? What is it used for?

    At http://www.regular-expressions.info/atomic.html , the only example has expr as an alternation: the regular expression a(?>bc|b)c matches abcc , but not abc . Are there examples where expr is not an alternation?

  • Are an atomic group and a non-capturing group, written as (?:expr), the same thing?

Please note that I am not limited to one specific regex flavor.

2 answers

1) When an atomic group is used, the regex engine will not backtrack into the group to try further permutations if the rest of the regular expression fails to match. With ordinary alternation, once an alternative matches, the engine immediately tries to match the rest of the expression, but it remembers the position where other alternatives are still possible. If the rest of the expression does not match, the engine returns to that remembered position and tries the other alternatives. With an atomic group, the engine discards those remembered positions as soon as the group has matched, so it simply gives up on that attempt. The example above does not really explain the purpose of atomic groups; it just clearly demonstrates the elimination of backtracking. Atomic groups are useful in scenarios where greedy quantifiers are used and further combinations are possible even without alternation.

2) Atomic groups and non-capturing groups are different. A non-capturing group simply does not save the text it matched. An atomic group disables backtracking into the group when the rest of the pattern would need a different combination.

For example, the regular expression a(?:bc|b)c matches both abcc and abc (without capturing anything), whereas a(?>bc|b)c matches only abcc . If the regular expression were a(?>b|bc)c , it would match only abc in full (in abcc it stops after the first three characters), whereas a(?:b|bc)c would still match both.

Atomic groups (and possessive quantifiers) are useful for avoiding catastrophic backtracking, which attackers can exploit to mount denial-of-service attacks by exhausting a server's resources.

Non-capturing groups are just that: non-capturing. The regex engine can backtrack into a non-capturing group, but not into an atomic group.

Source: https://habr.com/ru/post/891420/

