Paraphrase recognition using sentence-level similarity

I am new to NLP (natural language processing). As a first project, I am developing a paraphrase recognizer (a system that can recognize two similar sentences). For this recognizer, I am going to apply various measures at three levels: lexical, syntactic, and semantic. At the lexical level there are many similarity measures, such as cosine similarity, matching coefficient, Jaccard coefficient, etc. For these measures I use the SimMetrics package developed at the University of Sheffield. It's a wonderful package that contains many similarity measures. However, for Levenshtein distance and Jaro-Winkler distance, the code works at the character level only. I need code that works at the sentence level (i.e. treating a single word as the unit instead of a character). Also, code for the Manhattan distance is missing in SimMetrics. I would ask the experts to give me suggestions for developing the required code, or to provide code at the sentence level for the measures mentioned above.
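To make "a word as the unit instead of a character" concrete, here is a minimal Python sketch of what I mean. It is not SimMetrics code: the function names (`tokenize`, `word_levenshtein`, `manhattan_distance`) are hypothetical, and the whitespace tokenizer is a simplifying assumption that a real system would probably replace with a proper tokenizer.

```python
from collections import Counter


def tokenize(sentence):
    # Assumption: lowercase + whitespace split stands in for a real tokenizer.
    return sentence.lower().split()


def word_levenshtein(a, b):
    """Edit distance between two sentences, counted in whole words."""
    x, y = tokenize(a), tokenize(b)
    # prev[j] holds the distance between the words seen so far in x
    # and the first j words of y.
    prev = list(range(len(y) + 1))
    for i, wx in enumerate(x, start=1):
        curr = [i]
        for j, wy in enumerate(y, start=1):
            cost = 0 if wx == wy else 1
            curr.append(min(prev[j] + 1,          # delete a word
                            curr[j - 1] + 1,      # insert a word
                            prev[j - 1] + cost))  # substitute a word
        prev = curr
    return prev[-1]


def manhattan_distance(a, b):
    """L1 distance between the word-count vectors of two sentences."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    # Counter returns 0 for words that are absent from one sentence.
    return sum(abs(ca[w] - cb[w]) for w in set(ca) | set(cb))


if __name__ == "__main__":
    s1 = "the cat sat on the mat"
    s2 = "the cat lay on a mat"
    print(word_levenshtein(s1, s2))
    print(manhattan_distance(s1, s2))
```

The Levenshtein part is the usual dynamic-programming recurrence with the string of characters swapped for a list of tokens; the Manhattan part treats each sentence as a bag of words, so it ignores word order.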

Thank you in advance for your time and efforts to help me.

+3
2 answers

I have been working in the field of NLP for several years, and I completely agree with those who provided answers / comments. This is really a hard nut to crack! But let me provide some pointers:

(1) : , - , , , , . : . -, / . .

(2) : . PCFG ( TAG. TAG = , CFG).

(3) : , Wordnet, . . , , ( ) " ", .

+3

, . - ( ), , chunking.

Python NLTK - , , . , : , . "", / .

+2

Source: https://habr.com/ru/post/1784234/

