Currently, the first part of your regex looks like this:
(?<=^\bHEADa|HEAD\b)
You have two alternatives: one matches five characters and the other matches four, and that is why you get the error. Some regex flavors will let you do this even though they say they don't allow variable-length lookbehinds, but Python is not one of them. You could break it up into two lookbehinds, for example:
(?:(?<=^HEADa\b)|(?<=\bHEAD\b))
...but you probably don't need a lookbehind at all. Try this instead:
(?:^HEADa|\bHEAD)\b
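To check the three variants above against Python's standard `re` module, here is a small sketch. The helper `compiles` and the dictionary labels are my own names for illustration, not anything from your code.

```python
import re

def compiles(pattern):
    """Return True if the standard re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

patterns = {
    "one lookbehind, two widths": r"(?<=^\bHEADa|HEAD\b)",
    "two fixed-width lookbehinds": r"(?:(?<=^HEADa\b)|(?<=\bHEAD\b))",
    "plain match, no lookbehind": r"(?:^HEADa|\bHEAD)\b",
}

# Only the first pattern should be rejected: its alternatives are 5 and 4 wide.
results = {label: compiles(p) for label, p in patterns.items()}
for label, ok in results.items():
    print(f"{label}: {'compiles' if ok else 'rejected'}")
```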
Whatever is matched by the (.*?) later on will still be available through group #1. If you really need all of the text between the delimiters, you can capture that in group #1 and the other group will become #2 (or you can use named groups and not have to keep track of the numbers).
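A rough sketch of both styles follows. I don't know the rest of your regex, so the closing delimiter `TAIL`, the sample string, and the group names `head` and `body` are all assumptions made up for this example.

```python
import re

# TAIL is a stand-in closing delimiter; substitute whatever your regex really uses.
numbered = re.compile(r"(?:^HEADa|\bHEAD)\b(.*?)\bTAIL\b")
named = re.compile(r"(?P<head>^HEADa|\bHEAD)\b(?P<body>.*?)\bTAIL\b")

sample = "HEAD some text TAIL"

# Numbered group: the opening delimiter is matched but not captured.
m1 = numbered.search(sample)
inner = m1.group(1)

# Named groups: no need to remember which number is which.
m2 = named.search(sample)
head, body = m2.group("head"), m2.group("body")

print(repr(inner), repr(head), repr(body))
```

The delimiters are consumed by the match but stay out of the captured text, which is the effect the lookbehind was meant to achieve.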
Generally speaking, lookbehind should never be your first resort. It may seem like the obvious tool for the job, but you're usually better off doing a straight match and extracting the part you want with a capturing group. And that's true of all flavors, not just Python; just because you can do more with lookbehinds in other flavors doesn't mean you should.
By the way, you may have noticed that I rearranged your word boundaries; I think this is what you really intended.
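If you want to see what the rearranged boundaries accept and reject, something like this should do. The sample strings are invented for the demonstration.

```python
import re

opener = re.compile(r"(?:^HEADa|\bHEAD)\b")

samples = [
    "HEAD x",       # HEAD as a whole word
    "HEADa x",      # HEADa at the very start of the string
    "see HEAD x",   # HEAD as a whole word, mid-string
    "see HEADa x",  # HEADa is only accepted at the start
    "HEADER x",     # HEAD followed by more word characters
    "AHEAD x",      # HEAD preceded by a word character
]

matched = {s: bool(opener.search(s)) for s in samples}
for s, ok in matched.items():
    print(f"{s!r}: {'match' if ok else 'no match'}")
```

Only the first three samples match; the trailing \b applies to both alternatives, so neither one can match partway into a longer word.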