I have text containing several links of the form "keyword1: serial number", which I need to change to "keyword2: serial number". I also need to save each "keyword2: number" link in a dict, keyed by the entry being processed at that time. I use a regex for the substitution, and I could then parse the substituted text a second time to pick up the new links, like this:
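import re
parser = re.compile(r"keyword1:(\d+?)\.")
parser2 = re.compile(r"(keyword2:\d+\W)")
db = {}
for entry in entries:
    # first pass: rewrite each keyword1 link as a keyword2 link
    new_entry = parser.sub(r"keyword2:\g<1>.", entry)
    # second pass: search the rewritten text again for a keyword2 link
    db[entry] = parser2.search(new_entry)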
But let's face it, this is inefficient: it uses two regular expressions and parses each entry twice. I wonder whether I can use a function that lists the matches (just the serial numbers), use a comprehension to put keyword2 in front of them, and then both save them and do the replacement.
I know that finditer() yields the match objects, but that alone does not do the substitution, unless I take the convoluted route of getting the positions and replacing them by hand.
The problem is basically that I want to avoid parsing twice. For a small text this is fine, but for a database with hundreds of thousands of records it is bad design to code it this way.
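One possible single-pass direction, as a rough sketch rather than a definitive answer: re.sub() accepts a function as the replacement, and that function can record each link while it builds the replacement text. The names LINK and convert, the sample entry, and the assumption that every link looks like "keyword1:<digits>." are made up for illustration.

```python
import re

# Assumed link shape, based on the patterns above: "keyword1:<digits>."
LINK = re.compile(r"keyword1:(\d+)\.")

def convert(entry):
    """Rewrite keyword1 links as keyword2 links in one pass.

    Returns the rewritten text and the list of keyword2 links found.
    """
    found = []

    def repl(match):
        # Build the new link from the captured serial number and remember it.
        new_link = "keyword2:" + match.group(1)
        found.append(new_link)
        # Put back the trailing dot that the pattern consumed.
        return new_link + "."

    return LINK.sub(repl, entry), found

entries = ["see keyword1:123. and keyword1:45. for details"]  # made-up sample data
db = {}
for entry in entries:
    new_entry, links = convert(entry)
    db[entry] = links  # keyed by the original entry, as in the code above
```

Here each entry is scanned once with a single compiled pattern, and db stores a list of link strings per entry instead of a match object. Whether this is actually faster on hundreds of thousands of records would need to be measured.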