Levenshtein Algorithm with Custom Character Mapping

I want to use the Levenshtein algorithm to search a list of strings. I also want to implement a custom character mapping, so that the user can type Latin characters and match items written in Greek.

Example mapping:

  • a = α, ά
  • b = β
  • i = ι, ί, ΐ, ϊ
  • ... (etc)
  • u = ου, ού

So a search using abu in a list with

  • αbu
  • abού
  • αού (all Greek characters)

should return every item in the list. (The order of the results does not matter.)

How do I apply this mapping in the algorithm? (This is the point I am starting from.)

1 answer

I think the best way would be to first convert your strings to a single canonical form (for example, everything in Latin), and then use Levenshtein as you normally would.

In pseudo code:

    int func(String latinStr, String greekStr) {
        String mappedStr = convertToLatin(greekStr); // e.g. αβ becomes ab
        return Levenshtein(latinStr, mappedStr);
    }

In convertToLatin you can look up each character in a dictionary of replacement mappings and build a new string from the results.
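As a rough illustration of this idea, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than part of the original answer: the names `GREEK_TO_LATIN`, `strip_accents`, `convert_to_latin`, `levenshtein`, and `search`, the tiny mapping table (only the letters from the question), the choice to strip accents with Unicode normalization instead of listing every accented form, and the `max_distance` threshold.

```python
import unicodedata
from typing import List

# Hypothetical Greek-to-Latin table built from the question's examples.
# Only unaccented letters are listed; accents are stripped before lookup.
GREEK_TO_LATIN = {
    "α": "a",
    "β": "b",
    "ι": "i",
    "ου": "u",  # digraph: two Greek letters map to one Latin letter
}


def strip_accents(text: str) -> str:
    """Remove combining marks, so ά becomes α and ΐ becomes ι."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def convert_to_latin(text: str) -> str:
    """Map Greek characters to Latin, trying two-character keys first."""
    text = strip_accents(text.lower())
    out = []
    i = 0
    while i < len(text):
        for size in (2, 1):
            chunk = text[i:i + size]
            if chunk in GREEK_TO_LATIN:
                out.append(GREEK_TO_LATIN[chunk])
                i += len(chunk)
                break
        else:
            out.append(text[i])  # unmapped characters pass through unchanged
            i += 1
    return "".join(out)


def levenshtein(a: str, b: str) -> int:
    """Standard edit distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def search(query: str, items: List[str], max_distance: int = 1) -> List[str]:
    """Return the items whose Latin form is within max_distance of the query."""
    q = convert_to_latin(query)
    return [item for item in items
            if levenshtein(q, convert_to_latin(item)) <= max_distance]
```

With this sketch, `search("abu", ["αbu", "abού", "αού"])` returns all three items. The first two convert to exactly `abu`, while `αού` converts to `au` and matches only because the assumed threshold allows one edit. Converting both the query and the items means mixed Latin and Greek input is handled the same way on both sides.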


Source: https://habr.com/ru/post/1402865/

