I think you want the Levenshtein distance: it tells you how many single-character changes (insert, delete, or replace) are needed to convert one string into another.
For example, the distance between "abcde" and "abcdef" is 1, because you insert "f" after the last position in "abcde" to get "abcdef".
The distance between "abcde" and "abcdf" is also 1, since you replace "e" in the first string with "f" to get the second.
The distance between "abcde" and "abde" is 1, because you delete "c" from the first string to get the second.
Here is an implementation in Java.
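For reference, a minimal sketch of the usual dynamic-programming approach (often called Wagner-Fischer), assuming each insert, delete, or replace costs 1. The class name `Levenshtein` and method name `distance` are my own choices for illustration, not taken from any particular library.

```java
// Illustrative sketch; class and method names are assumptions, not a library API.
final class Levenshtein {

    // Returns the minimum number of single-character insertions, deletions,
    // or replacements needed to turn a into b (unit cost per operation).
    static int distance(String a, String b) {
        // prev[j] = distance between the first i-1 chars of a and the first j chars of b
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Turning an empty prefix of a into the first j chars of b takes j inserts.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            // Turning the first i chars of a into an empty string takes i deletes.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int replace = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), replace);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("abcde", "abcdef")); // prints 1 (insert "f")
        System.out.println(distance("abcde", "abcdf"));  // prints 1 (replace "e" with "f")
        System.out.println(distance("abcde", "abde"));   // prints 1 (delete "c")
    }
}
```

Keeping only two rows brings memory down to O(length of b); if you also need the actual sequence of edits, you would have to keep the full table and trace back through it.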