Replacing characters in a regular expression

Using Python, I have the following lines:

['taxes.............................       .7        21.4    (6.2)','regulatory and other matters..................$   39.9        61.5        41.1','Producer contract reformation cost recoveries............................   DASH        26.3        28.3']

I need to replace each of the leader dots with a space, but leave the periods inside numbers alone. The result should look like this:

['taxes                                    .7        21.4    (6.2)','regulatory and other matters                  $   39.9        61.5        41.1','Producer contract reformation cost recoveries                               DASH        26.3        28.3']

I tried the following:

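import re

dots = re.compile(r'(\.{2,})(\s*?[\d\(\$]|\s*?DASH|\s*.)')
newlist = []
for each in lines:  # lines is the list shown above
    # r'\2' contains no periods, so .replace('.', ' ') changes nothing here
    newline = dots.sub(r'\2'.replace('.', ' '), each)
    newlist.append(newline)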

But this code does not preserve the spacing: the run of dots is dropped instead of being replaced by spaces. Thanks!

2 answers

Use negative lookarounds in re.sub:

>>> import re
>>> s = ['taxes.............................       .7        21.4    (6.2)','regulatory and other matters..................$   39.9        61.5        41.1','Producer contract reformation cost recoveries............................   DASH        26.3        28.3']
>>> [re.sub(r'(?<!\d)\.(?!\d)', ' ', i) for i in s]
['taxes                                    .7        21.4    (6.2)', 'regulatory and other matters                  $   39.9        61.5        41.1', 'Producer contract reformation cost recoveries                               DASH        26.3        28.3']
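
The same substitution as a small script, with the pattern precompiled and commented. This is only a sketch: the helper name blank_leader_dots and the shortened sample line are made up for illustration, while the pattern is the one used above.

```python
import re

# A period that is neither preceded nor followed by a digit.
# (?<!\d) is a negative lookbehind and (?!\d) a negative lookahead;
# both are zero-width, so only the period itself gets replaced.
LEADER_DOT = re.compile(r'(?<!\d)\.(?!\d)')

def blank_leader_dots(line):
    """Replace every non-numeric period with one space (hypothetical helper)."""
    return LEADER_DOT.sub(' ', line)

line = 'taxes.....   .7   21.4   (6.2)'
result = blank_leader_dots(line)
print(repr(result))
print(len(result) == len(line))  # one-for-one replacement keeps the width
```

Because each period is swapped for exactly one space, the columns stay aligned. Note that a period directly after a digit, as in '39. ', is left untouched by this pattern.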

If the input always looks like your sample, you can also use a non-word-boundary assertion.

Replace \.\B with a single space.

It only checks whether a word character follows the period: \B succeeds when the position after the period is not a word boundary. This way it matches the period in 0. but not the one in 0.0.
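
A minimal Python sketch of this second approach, since only the pattern is given. The function name blank_dots_nonword and the sample strings are assumptions for illustration, not part of the original answer.

```python
import re

# \. matches a literal period; \B then requires that the position right
# after it is NOT a word boundary, i.e. the next character is not a
# letter, digit, or underscore (or the string ends there).
NON_WORD_DOT = re.compile(r'\.\B')

def blank_dots_nonword(line):
    """Replace each period not followed by a word character with a space."""
    return NON_WORD_DOT.sub(' ', line)

samples = ['0.', '0.0', 'matters...$   39.9', 'recoveries...DASH']
for s in samples:
    print(repr(blank_dots_nonword(s)))
```

The last sample shows why this relies on the input looking like the question's data: a period immediately followed by a letter (the final one in '...DASH') is not replaced. Unlike the lookaround version, this pattern also replaces a period that follows a digit, as in '0.'.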


Source: https://habr.com/ru/post/1626233/

