a.*?b has to check, after every character it consumes, whether the rest of the pattern matches (i.e. whether the next character is b). This is called backtracking.
For the string a12b, execution looks like this:
- Consume a
- Use the following 0 characters. Is the next a b? No.
- Use the following character (a1). Is the next a b? No.
- Use the following character (a12). Is the next a b? Yes!
- Consume b
- Match
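To make the repeated "is the next a b?" check concrete, here is a minimal sketch that replays those steps by hand. The function name trace_lazy is made up for this illustration; it only mimics what a backtracking engine does for a.*?b anchored at the start of the string, and is not how any real engine is implemented.

```python
def trace_lazy(text):
    """Replay, by hand, the steps a.*?b takes from the start of `text`.

    Illustration only: a hypothetical helper, not a real regex engine.
    """
    if not text.startswith("a"):
        return ["No match"]
    steps = ["Consume a"]
    pos = 1  # index of the character just after the leading a
    while pos < len(text):
        skipped = text[1:pos]  # what .*? has taken so far
        if text[pos] == "b":
            steps.append(f".*? holds {skipped!r}; is the next character b? Yes")
            steps.append("Consume b")
            steps.append("Match")
            return steps
        # The lazy quantifier must grow by one character and ask again.
        steps.append(f".*? holds {skipped!r}; is the next character b? No")
        pos += 1
    steps.append("No match")
    return steps


for step in trace_lazy("a12b"):
    print(step)
```

The number of "is the next character b?" checks grows with the length of the text between a and b, which is where the extra cost comes from.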
a[^b]*b consumes everything that is not b without asking any questions, which makes it much faster on longer strings.
For the string a12b, execution looks like this:
- Consume a
- Consume everything that follows that is not b. (a12)
- Consume b
- Match
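The same walk-through as a small hand-written sketch. trace_negated is a hypothetical helper written for this illustration, not part of any library; it shows that the character class is consumed in a single greedy run with no per-character question.

```python
def trace_negated(text):
    """Replay, by hand, the steps a[^b]*b takes from the start of `text`.

    Illustration only: a hypothetical helper, not a real regex engine.
    """
    if not text.startswith("a"):
        return ["No match"]
    steps = ["Consume a"]
    end = 1
    # [^b]* runs forward in one go until it hits a b or the end of the text.
    while end < len(text) and text[end] != "b":
        end += 1
    steps.append(f"Consume {text[1:end]!r} (everything that is not b)")
    if end < len(text):
        steps.append("Consume b")
        steps.append("Match")
    else:
        steps.append("No match")
    return steps


for step in trace_negated("a12b"):
    print(step)
```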
RegexHero has a benchmark feature that demonstrates this using the .NET regex engine.
Apart from the performance difference, both patterns match the same strings in your example.
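A quick way to check this yourself, sketched with Python's re module rather than .NET (the sample strings other than a12b are my own, chosen only for illustration):

```python
import re

lazy = re.compile(r"a.*?b")
negated = re.compile(r"a[^b]*b")

# Only "a12b" comes from the example; the other samples are assumptions.
samples = ["a12b", "xxa12bzz", "a12b34b", "no match here"]

# Collect every match each pattern finds in each sample.
results = {s: (lazy.findall(s), negated.findall(s)) for s in samples}

for s, (lazy_hits, negated_hits) in results.items():
    print(s, lazy_hits, negated_hits, lazy_hits == negated_hits)
```

On these inputs both patterns return identical match lists.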
However, there are situations where they differ. In the string aa111b111b, (?<=aa.*?)b matches both b characters, while (?<=aa[^b]*)b matches only the first.
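Python's built-in re module rejects variable-width lookbehind, so the sketch below only emulates it: for each b it tests whether the text before that b ends with the lookbehind pattern. The helper lookbehind_hits is made up for this illustration and is an approximation of what .NET does natively, not a replacement for it.

```python
import re


def lookbehind_hits(lookbehind, target, text):
    """Return start indexes of `target` whose preceding text ends with `lookbehind`.

    Hypothetical helper emulating (?<=lookbehind)target; it assumes
    non-overlapping matches of `target`, which is fine for a single character.
    """
    # \Z forces the lookbehind pattern to end exactly where the target starts.
    anchored = re.compile(r"(?:" + lookbehind + r")\Z")
    hits = []
    for m in re.finditer(target, text):
        if anchored.search(text[:m.start()]):
            hits.append(m.start())
    return hits


text = "aa111b111b"
lazy_hits = lookbehind_hits(r"aa.*?", "b", text)
negated_hits = lookbehind_hits(r"aa[^b]*", "b", text)

print(lazy_hits)     # [5, 9]
print(negated_hits)  # [5]
```

The second b is rejected by the [^b]* version because the first b sits between it and aa, and [^b]* cannot cross it.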