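One solution I can think of is to create a combined pattern that matches either an HTTP URL or your pattern, with the URL alternative first so that it consumes each URL whole, and then keep only the matches that came from your pattern's group: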
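    import re

    t = "http://www.egg1.com http://egg2.com egg3 egg4"

    # The URL alternative comes first, so it consumes each URL whole.
    p = re.compile(r'(http://\S+)|(egg\d)')
    for url, egg in p.findall(t):
        if egg:  # empty string when the match was a URL
            print(egg)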
prints:
egg3
egg4
UPDATE: To use this idiom with re.sub(), pass a replacement function that rewrites only the non-URL matches and returns URLs unchanged:
    p = re.compile(r'(http://\S+)|(egg(\d+))')

    def repl(match):
        if match.group(2):  # matched an egg outside a URL
            return 'spam{0}'.format(match.group(3))
        return match.group(0)  # matched a URL: leave it unchanged

    print(p.sub(repl, t))
prints:
http://www.egg1.com http://egg2.com spam3 spam4
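If the numbered groups get hard to follow, the same idea can be written with named groups. This is only a sketch: the helper name `replace_eggs_outside_urls` and the group names `url` and `num` are made up for illustration, and the URL pattern is the same simple `http://\S+` used above, not a full URL matcher.

```python
import re

# Hypothetical names: "url" and "num" are labels chosen for this example.
URL_OR_EGG = re.compile(r'(?P<url>http://\S+)|egg(?P<num>\d+)')

def replace_eggs_outside_urls(text):
    """Replace eggN with spamN, leaving anything inside a URL untouched."""
    def repl(match):
        if match.group('url'):
            # The URL alternative matched: return it unchanged.
            return match.group(0)
        # Otherwise the egg alternative matched: rewrite it.
        return 'spam' + match.group('num')
    return URL_OR_EGG.sub(repl, text)

result = replace_eggs_outside_urls("http://www.egg1.com http://egg2.com egg3 egg4")
print(result)
```

Because the URL alternative is listed first, the regex engine consumes a whole URL before it ever gets a chance to match `egg` inside it.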