Regular expression error - nothing to repeat

I get an error when I use this expression:

re.sub(r"([^\s\w])(\s*\1)+","\\1","...") 

I tested the regex on RegExr and it returns . as expected. But when I try it in Python, I get this error message:

 raise error, v # invalid expression
 sre_constants.error: nothing to repeat

Can someone explain?

+45
python regex
Sep 09 '10 at 9:00
4 answers

This seems to be a Python bug (it works fine in vim). The source of the problem is the (\s*...)+ bit. Basically, you can't do (\s*)+, which makes sense, because you are trying to repeat something that can be null.

 >>> re.compile(r"(\s*)+")
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/re.py", line 180, in compile
     return _compile(pattern, flags)
   File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/re.py", line 233, in _compile
     raise error, v # invalid expression
 sre_constants.error: nothing to repeat

However, (\s*\1) should never be null, but we only know that because we know what is in \1. Apparently Python doesn't ... that's odd.
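For reference, here is a minimal sketch of what the question's substitution is meant to do. It assumes an interpreter that accepts the pattern (for example, a current Python 3), and the second sample string is made up for illustration.

```python
import re

# Pattern from the question: a punctuation character, followed by one or
# more repeats of that same character, optionally separated by whitespace.
pattern = r"([^\s\w])(\s*\1)+"

# Assumption: this runs on an interpreter where the pattern compiles.
collapsed = re.sub(pattern, r"\1", "...")

# Hypothetical sample text, not taken from the original post.
sample = re.sub(pattern, r"\1", "Wait . . . what?!!")

print(collapsed)
print(sample)
```

Each run of a repeated punctuation character collapses to a single occurrence, while a lone "?" followed by a different character is left alone.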

+32
Sep 09 '10 at 9:42

This is a Python bug involving "*" and special characters.

Instead of:

 re.compile(r"\w*") 

Try:

 re.compile(r"[a-zA-Z0-9]*") 

It works, but it is not the same regular expression: \w also matches the underscore, which [a-zA-Z0-9] does not.

This bug seems to have been fixed between 2.7.5 and 2.7.6.
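To make the difference between the two patterns concrete, here is a small sketch. The sample text is hypothetical, and + is used instead of * only so that findall does not report empty matches.

```python
import re

text = "snake_case 42"  # hypothetical sample input

# \w also matches the underscore, so the identifier stays in one piece.
with_w = re.findall(r"\w+", text)

# The hand-written class has no underscore, so the identifier is split.
with_class = re.findall(r"[a-zA-Z0-9]+", text)

# Adding "_" to the class gives the same matches as \w for ASCII-only input.
with_underscore = re.findall(r"[a-zA-Z0-9_]+", text)

print(with_w)
print(with_class)
print(with_underscore)
```

If the workaround is needed, [a-zA-Z0-9_] is the closer substitute for ASCII text. In Python 3, \w on str patterns also matches non-ASCII word characters unless re.ASCII is passed.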

+9
Oct 24 '11 at 8:24

Apart from being a Python bug with *, this error can also happen when you pass an arbitrary string as part of the regular expression to be compiled, for example:

 import re
 input_line = "string from any input source"
 processed_line = "text to be edited with {}".format(input_line)
 target = "text to be searched"
 re.search(processed_line, target)

This will raise an error if the processed line contains something like "(+)", as you can find in chemical formulas or similar character strings. The solution is to escape the input, but when you do it on the fly, it can happen that you don't do it properly ...
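One way to do the escaping reliably is the standard library's re.escape, applied only to the untrusted part of the pattern. This is a minimal sketch; the "Fe(+)" input and the target text are made-up examples.

```python
import re

input_line = "Fe(+)"  # hypothetical input containing regex metacharacters
target = "text to be edited with Fe(+)"  # hypothetical text to search

# Unescaped: "(+)" is parsed as a group that starts with a quantifier.
try:
    re.search("text to be edited with {}".format(input_line), target)
    raw_error = None
except re.error as exc:
    raw_error = str(exc)

# Escaped: re.escape makes every metacharacter in the input literal.
safe_pattern = "text to be edited with {}".format(re.escape(input_line))
match = re.search(safe_pattern, target)

print(raw_error)
print(match.group(0))
```

Escaping only the interpolated value, rather than the whole pattern, keeps any intentional regex syntax in the surrounding template working.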

+2
Jun 20 '17 at 15:52

Apart from the bug that was found and fixed, I'll just note that the error message sre_constants.error: nothing to repeat is a bit confusing. I tried to use r'?.*' as a pattern and thought it was complaining, for some strange reason, about the *, but the problem is actually that ? is a way of saying "repeat zero or one times". So I needed to write r'\?.*' to match a literal ?.
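A short sketch of that case, assuming Python 3, where the exception is available as re.error. The sample string "?query=1" is made up for illustration.

```python
import re

# A leading "?" is a quantifier with nothing before it to repeat.
message = None
try:
    re.compile(r'?.*')
except re.error as exc:
    message = str(exc)

# Escaping the "?" makes it a literal question mark.
m = re.match(r'\?.*', "?query=1")

print(message)
print(m.group(0))
```

The error is about the ? at the start of the pattern, not about the * that follows it.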

0
05 Oct '17 at 19:22