The regex engine scans from left to right and returns the first position where the pattern can match. So the first match is the empty string before the 7... If it does reach a 9, it will indeed match greedily and try to "eat" (which is the correct terminology) as many characters as possible.
If you ask for all matches:
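A minimal sketch of both effects, assuming a hypothetical input `"7abc9999xyz"` (the original `line` is not shown here; only the leading 7 and the run of 9s matter):

```python
import re

# Hypothetical stand-in for the question's input string.
line = "7abc9999xyz"

# Leftmost match wins: 9* can match zero characters at position 0,
# so the search stops there and never reaches the run of 9s.
first = re.search(r"9*", line)
first_span = first.span()
first_text = first.group()

# Greedy vs. lazy, starting the match attempt at index 4 (the first 9).
greedy_span = re.compile(r"9*").match(line, 4).span()   # takes all four 9s
lazy_span = re.compile(r"9*?").match(line, 4).span()    # takes as few as possible

print(first_span, repr(first_text), greedy_span, lazy_span)
```

The search result is the empty string at position 0, while the greedy attempt anchored at index 4 consumes the whole run.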
>>> print(re.findall(r'9*', line))
['', '', '', '', '9999', '', '', '', '']
It matches the empty string at every position between the other characters and, as you can see, 9999 is matched as well.
The main reason is probably performance: if you are searching for a pattern in a string of 10M+ characters, you are very happy if the pattern already matches within the first ten characters. You don't want to waste energy looking for a "better" match ...
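To see where each match sits, `re.finditer` can be used instead of `re.findall`. This is only a sketch on a hypothetical input; the exact number of empty matches after the run of 9s depends on the Python version, so the example only relies on the first match and on the non-empty ones:

```python
import re

# Hypothetical stand-in for the question's input string.
line = "7abc9999xyz"

# Collect (span, text) for every match of 9*, empty ones included.
all_matches = [(m.span(), m.group()) for m in re.finditer(r"9*", line)]

# Keep only the matches that actually consumed characters.
non_empty = [(span, text) for span, text in all_matches if text]

print(all_matches[0])   # the empty match at the very start
print(non_empty)        # the run of 9s
```

Filtering out the empty strings this way leaves just the run of 9s, which is often what is wanted when the pattern itself cannot be changed.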
EDIT
The * quantifier means "zero or more": the group (in this case 9) is repeated zero or more times. In an empty string, the character is repeated exactly 0 times. If you want to match patterns where the character is repeated one or more times, you should use
9+
This gives:
>>> print(re.search(r'9+', line))
<_sre.SRE_Match object; span=(4, 8), match='9999'>
Using re.search with a pattern that accepts the empty string is probably not very useful, since it always succeeds at position 0: unless the string happens to start with a 9, the result is the empty string at the very start of the string.
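A small sketch of that difference, using a hypothetical helper `first_match` and made-up inputs (none of these names come from the question):

```python
import re

def first_match(pattern, text):
    """Return (span, matched_text) of the leftmost match, or None if there is none."""
    m = re.search(pattern, text)
    return (m.span(), m.group()) if m else None

# 9* always succeeds at position 0; the match is non-empty
# only if the string itself starts with a 9.
star_leading = first_match(r"9*", "9999xyz")
star_no_nines = first_match(r"9*", "7abcxyz")

# 9+ needs at least one 9, so it skips ahead to the run or fails.
plus_found = first_match(r"9+", "7abc9999xyz")
plus_missing = first_match(r"9+", "7abcxyz")

print(star_leading, star_no_nines, plus_found, plus_missing)
```

With 9*, a successful result says nothing about whether the string contains a 9 at all; with 9+, a `None` result is a reliable "not found".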