What you need is a multi-pattern string search: you have several patterns (lines from the old version of foo) that you want to find in a text (the new version of foo). The Rabin-Karp algorithm is well suited to this kind of problem. I adapted it to your case by hashing whole lines:
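
    def linematcher(haystack, needles, lineNumbers):
        # Sort and de-duplicate so positions in `needles` line up with lineNumbers.
        lineNumbers = sorted(set(lineNumbers))
        # Read only the requested lines from the old file.
        with open(needles) as f:
            needles = [line.strip() for n, line in enumerate(f, 1) if n in lineNumbers]
        # Pre-compute the hashes of all patterns.
        hsubs = set(hash(s) for s in needles)
        with open(haystack) as f:
            for n, lineWithNewline in enumerate(f, 1):
                line = lineWithNewline.strip()
                hs = hash(line)
                # Cheap hash check first, then confirm with a real comparison.
                if hs in hsubs and line in needles:
                    print("{0} ===> {1}".format(lineNumbers[needles.index(line)], n))
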
Assuming your two files are called old_foo.txt and new_foo.txt, you call this function as follows:

    linematcher('new_foo.txt', 'old_foo.txt', [1, 3, 4])

When I tried to use your data, it printed:

    1 ===> 1
    3 ===> 4
    4 ===> 6
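
If you want to try the idea without touching any files, the same hash-then-verify lookup can be sketched on in-memory lists. This is only a minimal sketch, not the function above: `match_lines`, its return shape, and the sample lines are all made up for illustration, and it returns pairs instead of printing them.

```python
def match_lines(old_lines, new_lines, line_numbers):
    """Return (old_number, new_number) pairs for the selected old lines.

    Hypothetical helper: the name and return shape are assumptions.
    """
    # Map each selected old line's text to its 1-based line number.
    wanted = {}
    for n, line in enumerate(old_lines, 1):
        if n in line_numbers:
            wanted.setdefault(line.strip(), n)  # keep the first occurrence

    # A dict lookup hashes the line first and then confirms equality,
    # which is the same filter-then-verify step used above.
    pairs = []
    for n, line in enumerate(new_lines, 1):
        key = line.strip()
        if key in wanted:
            pairs.append((wanted[key], n))
    return pairs


# Made-up sample data, not the data from the question.
old = ["alpha", "beta", "gamma"]
new = ["zero", "alpha", "gamma", "beta"]
print(match_lines(old, new, [1, 3]))  # [(1, 2), (3, 3)]
```

Using a dict keyed by the line text also avoids the repeated `needles.index(line)` scan, which may matter if you track many lines.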