Does Regex.sub unexpectedly modify the replacement string with some kind of encoding?

I have a path string "...\\JustStuff\\2017GrainHarvest_GQimagesTestStand\\..." that I am inserting into an existing text file in place of another string. I compile a regex pattern that finds the bounding text to locate the insertion point, and then use regex.sub to do the replacement. I'm doing something like this:

import re

with open(imextXML, 'r') as file:
    filedata = file.read()
# Match only the text between the <directoryPath> tags
redirpath = re.compile("(?<=<directoryPath>).*(?=</directoryPath>)", re.ASCII)
filedatatemp = redirpath.sub(newdir, filedata)

The inserted text comes out garbled, though: "\\201" turns into "\x81", and "\\" turns into "\" (a single backslash).

i.e. "...\\JustStuff\\2017GrainHarvest_GQimagesTestStand\\..." becomes "...\\JustStuff\x817GrainHarvest_GQimagesTestStand\..."

What simple thing am I missing to fix it?

Update:

To break it down even further, here is a snippet you can copy and paste to reproduce the problem:

import re

t2 = r'\JustStuff\2017GrainHarvest_GQimagesTestStand\te'
redirpath = re.compile("(?<=<directoryPath>).*(?=</directoryPath>)", re.ASCII)
temp = r"<directoryPath>aasdfgsdagewweags</directoryPath>"
redirpath.sub(t2, temp)

produces ...

>>'<directoryPath>\\JustStuff\x817GrainHarvest_GQimagesTestStand\te</directoryPath>'
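
The output above is consistent with how re.sub treats backslashes in the replacement string: it is parsed as a template, not inserted verbatim. The sketch below uses a made-up pattern ("X") and made-up templates purely for illustration. It shows "\\" collapsing to one backslash, "\201" being read as an octal escape, and "\t" becoming a tab.

```python
import re

# Hypothetical one-character pattern and input, chosen only for illustration.
pattern = re.compile("X")

# Raw literal: the template really contains  \\dir\2017
template = r"\\dir\2017"
result = pattern.sub(template, "X")

# "\\" in the template collapses to a single backslash,
# "\201" is read as an octal escape, i.e. chr(0o201) == "\x81",
# and the trailing "7" stays an ordinary character.
print(repr(result))

# "\t" in a template is turned into a tab character.
tab_result = pattern.sub(r"\te", "X")
print(repr(tab_result))
```

This would explain the "\x817" in place of "\2017" and the tab in place of "\t" at the end of the path.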
1 answer

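When you define the string that you want to insert, prefix it with r to mark it as a raw string literal, and keep the backslashes doubled. re.sub processes backslash escapes in the replacement string, so a sequence such as \201 is read as an octal escape (the character \x81) and \\ collapses to a single backslash. With the raw literal, both backslashes of each pair reach re.sub, which then emits one literal backslash.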

>>> import re
>>> rex = re.compile('a')
>>> s = 'path\\with\\2017'
>>> sr = r'path\\with\\2017'
>>> rex.sub(s, 'ab')  # output on Python 3.6 and earlier; 3.7+ raises re.error for the unknown escape \w
'path\\with\x817b'
>>> rex.sub(sr, 'ab')
'path\\with\\2017b'
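
If the path arrives in a variable rather than as a literal you can retype, two other approaches may help. This is a sketch, not necessarily what the original poster ended up using. It reuses the names from the question's snippet (redirpath, temp, t2); via_function, via_escaping and expected are names introduced here for illustration.

```python
import re

redirpath = re.compile("(?<=<directoryPath>).*(?=</directoryPath>)", re.ASCII)
temp = "<directoryPath>aasdfgsdagewweags</directoryPath>"
t2 = r"\JustStuff\2017GrainHarvest_GQimagesTestStand\te"

# Option 1: pass a function as the replacement. Its return value is
# inserted as-is, with no backslash-escape processing.
via_function = redirpath.sub(lambda match: t2, temp)

# Option 2: double every backslash first, so that template processing
# turns each pair back into one literal backslash.
via_escaping = redirpath.sub(t2.replace("\\", "\\\\"), temp)

expected = "<directoryPath>" + t2 + "</directoryPath>"
print(via_function == expected, via_escaping == expected)
```

Both options leave the path unchanged between the tags. Note that on Python 3.7 and later, passing t2 directly as the replacement raises re.error, because \J is an unknown escape.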

Source: https://habr.com/ru/post/1693497/

