Damerau-Levenshtein distance for language-specific features

For Dutch speakers, the two characters “ij” count as a single letter, which is easily interchanged with “y”.

For the project I'm working on, I would like a variant of the Damerau-Levenshtein distance that calculates the distance between “ij” and “y” as 1 instead of the current value of 2.

I tried to implement this myself, but did not succeed. My problem is that I have no idea how to handle the fact that the two pieces of text have different lengths. Does anyone have a suggestion or a code snippet showing how to solve this?

Thanks.

+3
3 answers

. " " , "". , .

, , "", "gh" -f- . , "" , , . "" "ruf"? "", ""? o- "oe"?

-y- -ij-. , , , , -j- of -i-, -y-? , ?

-ij- , U00EC, .

?

+2

, D-L , - , .

( ), , .

,

, DL , , , ij y, .

+1

, - , , "ij" "gh" "th", - . Damerau-Levenshtein , , , , , ​​,

, , , "ij" "ij" , ( , ) ( ) .

Otherwise, you will need to look back at a few extra cells of the table. This complicates things, but it should not change the order of growth of the algorithm (I think), as long as you only look at a constant number of cells. The constant factors will be noticeably larger, however.
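To make the "constant number of extra cells" idea concrete, here is a minimal sketch in Python. It is not code from the question or the answers. It assumes the restricted (optimal string alignment) form of Damerau-Levenshtein, and the function name `dl_distance` and the `digraphs` parameter are made up for illustration. The only addition to the usual recurrence is one extra lookback per digraph: when one string ends in "ij" and the other ends in "y", the cell two rows (or columns) back is considered at cost 1. That extra lookback is also what deals with the two texts having different lengths.

```python
def dl_distance(a: str, b: str, digraphs=(("ij", "y"),)) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    `digraphs` is a hypothetical extension: each (pair, single) entry lets
    the multi-character `pair` be exchanged with the one-character `single`
    (in either direction) at cost 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            best = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                best = min(best, d[i - 2][j - 2] + 1)
            # Extra lookback: digraph in one string vs. single letter in the other.
            for pair, single in digraphs:
                n = len(pair)
                if i >= n and a[i - n:i] == pair and b[j - 1] == single:
                    best = min(best, d[i - n][j - 1] + 1)
                if j >= n and b[j - n:j] == pair and a[i - 1] == single:
                    best = min(best, d[i - 1][j - n] + 1)
            d[i][j] = best
    return d[-1][-1]


print(dl_distance("ij", "y"))               # digraph rule applies
print(dl_distance("ij", "y", digraphs=()))  # plain restricted D-L
```

Since the list of digraphs is fixed, each cell still does a constant amount of work, so the running time stays proportional to the product of the two string lengths.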

0

Source: https://habr.com/ru/post/1783396/

