google-diff-match-patch produces a list of tuples.
The first element of each tuple indicates whether the operation is an insertion (1), a deletion (-1), or an equality (0). The second element is the affected text.
For instance:
diff_main("Good dog", "Bad dog") => [(-1, "Goo"), (1, "Ba"), (0, "d dog")]
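To make the tuple format concrete, here is a minimal sketch that rebuilds both texts from such a list. `rebuild_texts` is a hypothetical helper written for illustration, not part of the library, and the input is the hand-copied list from the example rather than live library output.

```python
def rebuild_texts(diffs):
    """Rebuild (old_text, new_text) from a list of (op, text) tuples."""
    # The old text consists of everything except insertions (op == 1).
    old_text = "".join(text for op, text in diffs if op != 1)
    # The new text consists of everything except deletions (op == -1).
    new_text = "".join(text for op, text in diffs if op != -1)
    return old_text, new_text


diffs = [(-1, "Goo"), (1, "Ba"), (0, "d dog")]
print(rebuild_texts(diffs))  # ('Good dog', 'Bad dog')
```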
So we just need to filter this list.
Python sample code:
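import re

# Chunks made up only of whitespace and punctuation count as minor differences.
Ignored_marks = re.compile(r'[ ,.;:!\'"?-]+$')

def unmark_minor_diffs(diffs):
    # diffs is the list of tuples produced by google-diff-match-patch
    cooked_diffs = []
    for (op, data) in diffs:
        if not Ignored_marks.match(data):
            cooked_diffs.append((op, data))
        else:
            # Punctuation-only chunk: keep equalities and deletions as
            # equalities (0), and drop insertions.
            if op in (0, -1):
                cooked_diffs.append((0, data))
    return cooked_diffs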
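As a quick check of how the filter behaves, the sketch below restates it so the snippet runs on its own and applies it to a hand-written diff list. The sample list is an assumption about what a diff of "Good dog." against "Bad dog!" might look like, not actual library output.

```python
import re

# Same punctuation/whitespace class as in the filter above.
Ignored_marks = re.compile(r'[ ,.;:!\'"?-]+$')


def unmark_minor_diffs(diffs):
    """Restatement of the filter above, repeated so this snippet is self-contained."""
    cooked_diffs = []
    for op, data in diffs:
        if not Ignored_marks.match(data):
            cooked_diffs.append((op, data))
        elif op in (0, -1):
            cooked_diffs.append((0, data))
    return cooked_diffs


# Hypothetical diff list in the (op, text) format.
sample = [(-1, "Goo"), (1, "Ba"), (0, "d dog"), (-1, "."), (1, "!")]
result = unmark_minor_diffs(sample)
print(result)
# [(-1, 'Goo'), (1, 'Ba'), (0, 'd dog'), (0, '.')]
```

The deleted "." is kept as an equality and the inserted "!" is dropped, so the punctuation change no longer shows up as a difference while the word change still does.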