adist: different Levenshtein alignments depending on how the strings are entered

When I use the function adist in R to calculate Levenshtein alignments between pairs of character strings, I get different results depending on whether I call the function once per pair or pass vectors to process several pairs at the same time. Why is this?

Example: transformations for the string pairs "knijpen" - "kneifen", "grijpen" - "greifen" and "lopen" - "laufen":

attr(adist("knijpen", "kneifen", counts = TRUE), "trafos")
#      [,1]      
# [1,] "MMIMSDMM"

attr(adist("grijpen", "greifen", counts = TRUE), "trafos")
#      [,1]      
# [1,] "MMIMSDMM"

attr(adist("lopen", "laufen", counts = TRUE), "trafos")
#      [,1]    
# [1,] "MSSIMM"

These agree with my own manual alignments. However, when I pass the strings as vectors, I get a slightly different result:

dutch <- c("knijpen", "grijpen", "lopen")
german <- c("kneifen", "greifen", "laufen")
attr(adist(dutch, german, counts = TRUE), "trafos")
#      [,1]       [,2]       [,3]      
# [1,] "MMIMSDMM" "SSIMSDMM" "SSSSDMMM"
# [2,] "SSIMSDMM" "MMIMSDMM" "SSSSDMMM"
# [3,] "SSSIIMMM" "SSSIIMMM" "MSSIMMM" 

Why does element [3,3] ("MSSIMMM") differ from the result of attr(adist("lopen", "laufen", counts = TRUE), "trafos") (i.e. "MSSIMM"), with an extra M at the end?
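
One way to narrow this down might be to put the per-pair results next to the diagonal of the matrix and check each transformation string against the lengths of its two input strings. The sketch below assumes the usual reading of the "trafos" codes (M = match, S = substitution, D = deletion, I = insertion); count_ops and consistent are hypothetical helper names introduced only for this illustration, not part of adist.

```r
dutch  <- c("knijpen", "grijpen", "lopen")
german <- c("kneifen", "greifen", "laufen")

# One adist() call per pair, so each alignment is computed in isolation.
pairwise <- mapply(
  function(x, y) attr(adist(x, y, counts = TRUE), "trafos")[1, 1],
  dutch, german, USE.NAMES = FALSE
)

# Diagonal of the all-against-all matrix returned by the vectorised call.
trafos <- attr(adist(dutch, german, counts = TRUE), "trafos")
idx <- seq_along(dutch)
vectorised <- trafos[cbind(idx, idx)]

# Hypothetical helper: how many operations of the given kinds a string contains.
count_ops <- function(trafo, ops) {
  sum(strsplit(trafo, "")[[1]] %in% ops)
}

# Hypothetical helper: M, S and D each consume one character of `from`;
# M, S and I each produce one character of `to`.
consistent <- function(trafo, from, to) {
  count_ops(trafo, c("M", "S", "D")) == nchar(from) &&
    count_ops(trafo, c("M", "S", "I")) == nchar(to)
}

data.frame(
  dutch, german, pairwise, vectorised,
  pairwise_ok   = mapply(consistent, pairwise, dutch, german, USE.NAMES = FALSE),
  vectorised_ok = mapply(consistent, vectorised, dutch, german, USE.NAMES = FALSE)
)
```

By this check, "MSSIMM" is consistent with "lopen" and "laufen", whereas "MSSIMMM" is not: it accounts for six source characters, but "lopen" has only five.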


Source: https://habr.com/ru/post/1675419/

