How to compare the similarity of people's names using a metric?

I am working on a feature that has to tolerate typos and pseudonyms in a person's name. I did some research and found that there are a number of string metric algorithms and phonetic libraries.

I tried several of them, and Jaro-Winkler gives good results, as shown below.

compareStrings("elon musk", "elon musk")    --> 1.0
compareStrings("elonmusk", "elon musk")     --> 0.98
compareStrings("elon mush", "elon musk")    --> 0.99
compareStrings("eln msuk", "elon musk")     --> 0.94
compareStrings("elon", "elon musk")         --> 0.89
compareStrings("musk", "elon musk")         --> 0.0  // This is bad, but I can fix that.
compareStrings("mr elon musk", "elon musk") --> 0.81

The above implementation is from the Apache Commons library. I would like to know whether there is a better implementation for this purpose. Any help is appreciated.
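One possible way to handle the "musk" vs "elon musk" case is to split both names into tokens and score each token of the shorter name against its best-matching token in the longer one. The sketch below is a self-contained illustration of that idea and is not the Apache Commons implementation; the class `NameSimilarity` and the method `tokenSimilarity` are hypothetical names, and the 0.7 boost threshold and 0.1 prefix scale are the commonly used Jaro-Winkler defaults, assumed here.

```java
import java.util.Locale;

final class NameSimilarity {

    // Jaro similarity in [0, 1]: based on matching characters and transpositions.
    static double jaro(String a, String b) {
        if (a.equals(b)) return 1.0;
        int la = a.length(), lb = b.length();
        if (la == 0 || lb == 0) return 0.0;
        int window = Math.max(0, Math.max(la, lb) / 2 - 1);
        boolean[] matchedA = new boolean[la];
        boolean[] matchedB = new boolean[lb];
        int matches = 0;
        for (int i = 0; i < la; i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(lb - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!matchedB[j] && a.charAt(i) == b.charAt(j)) {
                    matchedA[i] = true;
                    matchedB[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0;
        // Count matched characters that appear in a different order.
        int outOfOrder = 0;
        int k = 0;
        for (int i = 0; i < la; i++) {
            if (!matchedA[i]) continue;
            while (!matchedB[k]) k++;
            if (a.charAt(i) != b.charAt(k)) outOfOrder++;
            k++;
        }
        double m = matches;
        return (m / la + m / lb + (m - outOfOrder / 2.0) / m) / 3.0;
    }

    // Jaro-Winkler: boosts the Jaro score for a shared prefix of up to 4 characters.
    static double jaroWinkler(String a, String b) {
        double j = jaro(a, b);
        if (j <= 0.7) return j; // assumed boost threshold
        int max = Math.min(4, Math.min(a.length(), b.length()));
        int prefix = 0;
        while (prefix < max && a.charAt(prefix) == b.charAt(prefix)) prefix++;
        return j + prefix * 0.1 * (1.0 - j);
    }

    // Averages, over the tokens of the shorter name, the best score against any token of the longer name.
    static double tokenSimilarity(String name1, String name2) {
        String[] t1 = tokens(name1), t2 = tokens(name2);
        if (t1.length == 0 || t2.length == 0) return 0.0;
        String[] shorter = t1.length <= t2.length ? t1 : t2;
        String[] longer = t1.length <= t2.length ? t2 : t1;
        double sum = 0.0;
        for (String s : shorter) {
            double best = 0.0;
            for (String l : longer) best = Math.max(best, jaroWinkler(s, l));
            sum += best;
        }
        return sum / shorter.length;
    }

    // Lower-cases and splits on whitespace; an empty name yields no tokens.
    static String[] tokens(String name) {
        String trimmed = name.trim().toLowerCase(Locale.ROOT);
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }
}
```

With this approach a partial name such as "musk" or a prefixed name such as "mr elon musk" scores as a full match against "elon musk", which may be too lenient; combining the token score with the whole-string score, or penalising unmatched tokens, would be a reasonable refinement.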

Edit: @newuserua_ext @Trasher Thank you, I appreciate your time. I went through all the StackExchange Q&A related to this before posting this question, which focuses specifically on people's names.

2 answers

Consider Double Metaphone. We use it successfully to find "sounds like" matches on names. You can find a Java implementation in Apache Commons Codec:

https://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/language/DoubleMetaphone.html
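To illustrate the general idea of matching names by a phonetic key, here is a minimal self-contained sketch. Note that it uses American Soundex as a much simpler stand-in, not Double Metaphone, so that it runs without any dependency; the class `PhoneticKey` and its methods are hypothetical names. With Commons Codec on the classpath, the key produced by `soundex` here would instead come from the `DoubleMetaphone` class linked above.

```java
import java.util.Locale;

final class PhoneticKey {

    // Soundex digit for each letter A..Z; '0' marks vowels and Y, '-' marks H and W.
    private static final String CODES = "0123012-02245501262301-202";

    // Returns a 4-character Soundex key (letter + 3 digits), or "" if the word has no ASCII letters.
    static String soundex(String word) {
        StringBuilder out = new StringBuilder();
        char lastCode = 0;
        for (char c : word.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c < 'A' || c > 'Z') continue; // ignore anything but ASCII letters
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c); // the first letter is kept as-is
                lastCode = code;
            } else if (code == '-') {
                // H and W are skipped and do not separate equal codes
            } else if (code == '0') {
                lastCode = code; // a vowel separates equal codes
            } else {
                if (code != lastCode) out.append(code);
                lastCode = code;
            }
            if (out.length() == 4) break;
        }
        while (out.length() > 0 && out.length() < 4) out.append('0');
        return out.toString();
    }

    // Two single-word names "sound alike" if their non-empty keys are equal.
    static boolean soundsAlike(String a, String b) {
        String ka = soundex(a);
        return !ka.isEmpty() && ka.equals(soundex(b));
    }
}
```

A phonetic key gives a yes/no answer rather than a score, so it tends to work best as a pre-filter or as an extra signal next to a string metric such as Jaro-Winkler.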


Levenshtein distance is another option.
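As a rough illustration of the Levenshtein suggestion, the sketch below computes the edit distance with the standard two-row dynamic programming table and turns it into a score between 0 and 1. The class `EditDistance` and the normalisation by the longer string's length are assumptions made for the example, not something taken from this answer.

```java
final class EditDistance {

    // Minimum number of single-character insertions, deletions and substitutions turning a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j; // distance from the empty prefix of a
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows so prev always holds the last completed row
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity in [0, 1]: 1.0 for identical strings, 0.0 when every character differs.
    static double similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) distance(a, b) / max;
    }
}
```

For example, `EditDistance.distance("elon mush", "elon musk")` is 1, while `EditDistance.distance("mr elon musk", "elon musk")` is 3, so extra words are penalised by their full length unless the names are tokenised first.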

Source: https://habr.com/ru/post/1663297/

