I have two texts of various lengths (at most 4000 characters each), and I need a similarity coefficient that accounts for (partial) rephrasing. Please note that the same passage may appear at different positions in each text, so Levenshtein distance is not a solution.
The comparison process should also:
- not grow exponentially in cost with text size
- be performance friendly. :)
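To make the requirements concrete, here is a minimal baseline sketch of the kind of order-independent coefficient I mean. It is in Python rather than PHP, and it is only an assumption about a reasonable starting point: it compares sets of overlapping word triples ("shingles") with the Jaccard coefficient. The names `shingles` and `shingle_similarity` and the window size `k=3` are made up for illustration. It only catches rephrasing that keeps some word runs intact, so it is not a full solution.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word n-grams) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Texts shorter than k words yield one shingle, or none if empty.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def shingle_similarity(a, b, k=3):
    """Jaccard coefficient of the two shingle sets, between 0.0 and 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical (a convention)
    return len(sa & sb) / len(sa | sb)


# The same two passages in swapped order still share most shingles.
a = "alpha beta gamma delta. one two three four"
b = "one two three four. alpha beta gamma delta"
print(shingle_similarity(a, b))
```

Because it only builds and intersects two hash sets, the cost should stay roughly linear in the text length, and moving a passage only changes the few shingles that straddle its boundaries.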
It seems that "adaptive local keyword alignment" is a possible solution.
Do you have an example implementation? PHP is the preferred language, but I can translate. :)
Do you have any other solution / idea / experience on this topic?
Thanks for your great help.