Python regex speed

Regarding regular expressions (in particular Python's re module): if we ignore how the expression itself is written, is the length of the text the only factor in how long it takes to process a document? Or do other factors, such as how the text is structured, also play an important role?

+3
2 answers

Both the length of the text and its contents matter.

For example, the regular expression a+b fails quickly on a string containing a million b's, but much more slowly on a string containing a million a's. This is because the second case requires far more backtracking.

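import timeit

# Time 10 searches of the same pattern against two 10,000-character inputs.
stmt = "re.search('a+b', s)"
print(timeit.timeit(stmt, "import re; s = 'a' * 10000", number=10))  # all a's: heavy backtracking
print(timeit.timeit(stmt, "import re; s = 'b' * 10000", number=10))  # all b's: fails fast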

Results (in seconds, the all-a string first, then the all-b string):

6.85791902323
0.00795443275612
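
To check that this is about content and not only length, one can double the input size and watch how each case scales. The following is a minimal sketch, assuming Python 3; the helper name time_search and the chosen sizes are illustrative and not part of the original answer. On a backtracking engine the all-a timings would be expected to grow much faster than the all-b timings, but exact numbers depend on the interpreter version and the machine.

```python
import re
import timeit

PATTERN = re.compile(r"a+b")

def time_search(text, repeats=3):
    """Return the best wall-clock time (seconds) of one PATTERN.search(text)."""
    return min(timeit.repeat(lambda: PATTERN.search(text), number=1, repeat=repeats))

for n in (1000, 2000, 4000):
    slow = time_search("a" * n)  # no 'b' anywhere: a+ is retried from every start position
    fast = time_search("b" * n)  # no 'a' anywhere: a+ cannot even begin to match
    print(f"n={n:5d}  all-a: {slow:.6f}s  all-b: {fast:.6f}s")
```

Both inputs have the same length at every step, so any difference between the two columns comes from the content alone.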
+4

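Yes, other factors matter too. The worst case (catastrophic backtracking) comes from patterns with nested quantifiers such as (x+x+)+y.

On xxxxxxxxxxy the match succeeds in 7 steps. On xxxxxxxxxx (no y), the engine needs 2558 steps to report failure.

For xxxxxxxxxxxxxxy vs. xxxxxxxxxxxxxx it is 7 vs. 40958 steps, and so on.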
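
The same contrast can be observed with Python's re module by timing the pattern on inputs with and without the trailing y. This is a rough sketch, assuming Python 3; it measures wall-clock time, not engine steps, so it will not reproduce the step counts quoted above, and the helper name timed_search and the sizes are illustrative.

```python
import re
import time

NESTED = re.compile(r"(x+x+)+y")

def timed_search(text):
    """Return (match_or_None, elapsed_seconds) for NESTED.search(text)."""
    start = time.perf_counter()
    match = NESTED.search(text)
    return match, time.perf_counter() - start

for n in (10, 14, 18):
    hit, t_hit = timed_search("x" * n + "y")  # trailing y present: a match is found
    miss, t_miss = timed_search("x" * n)      # no y: many ways to split the x's are tried
    print(f"n={n:2d}  with y: {t_hit:.6f}s  without y: {t_miss:.6f}s")
```

The sizes are kept small on purpose: on a backtracking engine the failing case can grow very quickly with each added x, so large values of n may take a long time to finish.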


+6

Source: https://habr.com/ru/post/1766700/

