Python search technology: word similarity

Question

I want to get the percentage similarity of two words, for example:

abcd versus zzabcdzz == 50% similarity

It doesn't need to be very precise. Is there any way to do this? I'm using Python and would rather not rewrite this in another language.

+3

python search search-engine similarity

Bin cin Feb 12 '11 at 6:01

4 answers

Use Python's difflib:

>>> from difflib import SequenceMatcher
>>> s = SequenceMatcher(None, "abcd", "bcde")
>>> s.ratio()
0.75

+3

TigrisC Feb 12 '11 at 6:34

nltk:

http://www.opendocs.net/nltk/0.9.5/api/nltk.wordnet.similarity-module.html

+1

Asterisk Feb 12 '11 at 6:25

Use Python's difflib.

The difflib module has a SequenceMatcher class that can be used for this. For example:

import difflib

def text_compare(text1, text2, isjunk=None):
    # ratio() returns a float in [0, 1]; 1.0 means the texts are identical
    return difflib.SequenceMatcher(isjunk, text1, text2).ratio()
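
As a rough sketch of how this maps onto the question's example: ratio() is computed as 2*M/T, where M is the number of matching characters and T is the combined length of both strings. The helper name similarity_percent below is made up for illustration.

```python
import difflib

def similarity_percent(a, b):
    """Return the difflib similarity of two strings as a percentage."""
    # ratio() is 2*M/T: M = matched characters, T = len(a) + len(b)
    return 100.0 * difflib.SequenceMatcher(None, a, b).ratio()

print(round(similarity_percent("abcd", "bcde")))      # 75
print(round(similarity_percent("abcd", "zzabcdzz")))  # 67
```

Note that this gives about 67% for abcd versus zzabcdzz rather than 50%, because difflib normalizes by the combined length of both strings, not by the length of the longer one.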

0

tzot Feb 12 '11 at 12:03

Mark Byers · Accepted Answer · 2011-02-12T06:04:23+0000

Try using python-Levenshtein to calculate the edit distance.

The Levenshtein Python C extension module contains functions for fast computation of:

- Levenshtein (edit) distance, and edit operations
- string similarity
- approximate median strings, and generally string averaging
- string sequence and set similarity

For your example, the edit distance is 4 and the longer word has length 8, which gives 50%.
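
To make the arithmetic concrete, here is a minimal pure-Python sketch of the edit distance and a similarity derived from it. It does not use python-Levenshtein itself; the function names levenshtein and similarity, and the choice to normalize by the length of the longer string, are assumptions for illustration.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]

def similarity(a, b):
    """Similarity in [0, 1], normalized by the longer string (an assumption)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest

print(levenshtein("abcd", "zzabcdzz"))  # 4
print(similarity("abcd", "zzabcdzz"))   # 0.5
```

With this normalization the question's example comes out at exactly 50%. For large inputs, the distance function in python-Levenshtein should give the same number much faster, since it is implemented in C.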
