As you already know, re.match checks the pattern only at the beginning of the string, while re.search scans through the string until it finds a match.
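As a quick illustration of that basic difference, here is a minimal sketch (the sample string `s` is made up for the example):

```python
import re

s = 'abc toto'  # hypothetical sample string, 'toto' starts at index 4

# re.match only tries the pattern at position 0, so this fails.
print(re.match('toto', s))

# re.search tries successive positions until the pattern matches.
m = re.search('toto', s)
print(m.start())
```

Here re.match returns None because the string starts with 'abc', while re.search finds 'toto' further along.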
So is there a difference between re.match('toto', s) and re.search('^toto', s), and if so, what is it?
Let's do a little test:
    #!/usr/bin/python
    import time
    import re

    p1 = re.compile(r'toto')    # used with match()
    p2 = re.compile(r'^toto')   # used with search()

    ssize = 1000
    s1 = 'toto abcdefghijklmnopqrstuvwxyz012356789'*ssize   # pattern matches
    s2 = 'titi abcdefghijklmnopqrstuvwxyz012356789'*ssize   # pattern fails

    nb = 1000   # number of repetitions per measurement

    # Case 1: the pattern matches
    i = 0
    t0 = time.time()
    while i < nb:
        p1.match(s1)
        i += 1
    t1 = time.time()

    i = 0
    t2 = time.time()
    while i < nb:
        p2.search(s1)
        i += 1
    t3 = time.time()

    print("\nsucceed\nmatch:")
    print(t1-t0)
    print("search:")
    print(t3-t2)

    # Case 2: the pattern does not match
    i = 0
    t0 = time.time()
    while i < nb:
        p1.match(s2)
        i += 1
    t1 = time.time()

    i = 0
    t2 = time.time()
    while i < nb:
        p2.search(s2)
        i += 1
    t3 = time.time()

    print("\nfail\nmatch:")
    print(t1-t0)
    print("search:")
    print(t3-t2)
Both methods are tested against a string that matches (s1) and a string that does not (s2).
Results:

    succeed
    match:
    0.000469207763672
    search:
    0.000494003295898

    fail
    match:
    0.000430107116699
    search:
    0.46605682373
What can we conclude from these results:
1) The timings are similar when the pattern matches.
2) The timings are completely different when the pattern fails. This is the most important point: it means that re.search keeps trying every position in the string even though the pattern is anchored with ^, whereas re.match stops immediately. In this run the failing search took about 0.47 s, compared with about 0.0004 s for the failing match.
If you increase the size of the non-matching test string, you will see that re.match does not take any longer, while the time taken by re.search grows with the length of the string.
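To check how the failing case scales, here is a rough sketch using timeit. The helper name `time_failure`, the sizes and the repetition count are arbitrary choices for illustration, not part of the test above. Absolute numbers depend on the machine and the Python version, and newer interpreters may optimize the anchored search so that the gap is smaller than in the results shown here.

```python
import re
import timeit

p_match = re.compile(r'toto')     # used with match()
p_search = re.compile(r'^toto')   # used with search()

def time_failure(size, number=200):
    """Return (match_seconds, search_seconds) on a non-matching string.

    Hypothetical helper: `size` is the number of repetitions of the
    40-character chunk, `number` is how many calls are timed.
    """
    s = 'titi abcdefghijklmnopqrstuvwxyz012356789' * size
    # Sanity check: both calls must fail on this string.
    assert p_match.match(s) is None
    assert p_search.search(s) is None
    t_match = timeit.timeit(lambda: p_match.match(s), number=number)
    t_search = timeit.timeit(lambda: p_search.search(s), number=number)
    return t_match, t_search

results = {}
for size in (100, 1000, 10000):
    results[size] = time_failure(size)
    print(size, results[size])
```

If re.search really rescans the string on failure, its timing should grow roughly in proportion to `size`, while the re.match timing should stay flat.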