NLP algorithm for "filling" search queries

I am trying to write an algorithm (which, I believe, will rely on natural language processing methods) to "populate" the list of search terms. There is probably a name for this kind of thing that I don't know about. What is the name of this problem, and what algorithm will give me the following behavior?

Input:

docs = [ "I bought a ticket to the Dolphin Watching cruise", "I enjoyed the Dolphin Watching tour", "The Miami Dolphins lost again!", "It was good going to that Miami Dolphins game" ], search_term = "Dolphin" 

Conclusion:

 ["Dolphin Watching", "Miami Dolphins"] 

It should be understood in principle that if Dolphin appears at all, it is almost always either in the Dolphin Observation bigrams or the Miami Dolphins. Solutions in Python are preferable.

+6
source share
2 answers

It should be understood in principle that if a Dolphin appears at all, it is almost always either in the “View Dolphins” or “Miami Dolphins” bits.

It looks like you want to identify the collocations in which the dolphin is located. There are various methods for finding collocation, the most popular of which is calculating the exact mutual information (PMI) between the conditions in your case, then select the terms with the highest PMI for Dolphin. You may recall the PMI from the sentiment analysis algorithm that I suggested earlier.

The Python implementation of various collocation search methods is included in NLTK as nltk.collocations . This area is covered by some depth in Manning and Schütze FSNLP (1999, but is still relevant for this topic).

+6
source

I used the Natural Language Toolkit in my NLP class at university with decent success. I think it has some tags that can help you identify which nouns and help you parse it into a tree. I don’t remember much, but I started there.

0
source

Source: https://habr.com/ru/post/898328/


All Articles