I have a special case that I do not yet know how to handle. I want to parse a fixed-width string based on field_name / field_length pairs. To do this, I define the regex for each field as follows:
'(?P<%s>.{%d})' % (field_name, field_length)
This is repeated for every field, and the pieces are joined into one pattern.
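To make the setup concrete, here is a minimal sketch of how the combined pattern might be assembled. The field names, lengths, and sample row are hypothetical, chosen only for illustration; the real field list is not shown here.

```python
import re

# Hypothetical field layout (not from the real data): (field_name, field_length)
fields = [("name", 10), ("city", 8), ("zip", 5)]

# One fixed-width named group per field, concatenated into a single pattern.
pattern = "".join("(?P<%s>.{%d})" % (name, length) for name, length in fields)
compiled = re.compile(pattern)

# Build a sample row padded with spaces to the declared widths.
row = "John".ljust(10) + "Berlin".ljust(8) + "10115"

match = compiled.search(row)
parsed = match.groupdict()  # values still carry their trailing padding
```

At this point each value in parsed is still padded to its full field width, which is why the space-removal step is needed.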
I also have a regex to remove spaces to the right of each field:
self.re_remove_spaces = re.compile(' *$')
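As a quick illustration of what this pattern does on its own, the sketch below applies it to two hypothetical padded values. Only the run of spaces at the very end is removed; spaces inside the value are kept.

```python
import re

re_remove_spaces = re.compile(' *$')

# Hypothetical padded field values, for illustration only.
padded_city = "Berlin  "
padded_name = "New York  "

stripped_city = re_remove_spaces.sub('', padded_city)
stripped_name = re_remove_spaces.sub('', padded_name)  # inner space survives
```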
Thus, I can get each field as follows:
def dissect(self, str):
    data = {}
    m = self.compiled.search(str)
    for field_name in self.fields:
        # Look up the named group for this field
        value = m.group(field_name)
        # Strip the padding spaces on the right
        value = re.sub(self.re_remove_spaces, '', value)
        data[field_name] = value
    return data
I need to do this processing for millions of rows, so it should be efficient.
It bothers me that the dissection and the space removal are two separate steps. I would rather do both in one step, using compiled.sub instead of compiled.search, but I don't know how to do it.
In particular, my question is:
How can I perform a regular expression substitution in combination with named groups in Python?