I am trying to identify the most popular keywords for a particular class of documents in my collection. Assuming the domain is “computer science” (which, of course, includes networking, computer architecture, etc.), what is the best way to extract these domain-related keywords from text? I tried to use WordNet, but I do not quite understand how best to use it to extract this information.
Is there a known list of words that I can use as a whitelist, given that I do not know all the keywords for the domain in advance? Or are there good NLP / machine learning methods for identifying domain keywords?
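To make the goal concrete, here is a minimal sketch of the kind of baseline I have in mind, not a definitive method. It assumes I also have a general "background" collection of non-domain documents, and it ranks terms by how much more frequent they are in the domain documents than in the background. The names `tokenize` and `domain_keywords`, the add-one smoothing, and the `min_count` threshold are all my own illustrative choices, not part of any library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep word-like tokens (letters first, then letters,
    # digits, '+', '#', '-'), so that "c++" or "c#" would survive.
    return re.findall(r"[a-z][a-z0-9+#-]*", text.lower())


def domain_keywords(domain_docs, background_docs, top_n=5, min_count=2):
    """Rank terms by log-ratio of smoothed relative frequency
    (domain vs. background). Hypothetical helper for illustration."""
    domain_counts = Counter(t for d in domain_docs for t in tokenize(d))
    background_counts = Counter(t for d in background_docs for t in tokenize(d))

    domain_total = sum(domain_counts.values())
    background_total = sum(background_counts.values())
    vocab_size = len(set(domain_counts) | set(background_counts))

    scores = {}
    for term, count in domain_counts.items():
        if count < min_count:
            continue  # skip terms too rare to judge (assumed threshold)
        # Add-one smoothing so terms unseen in the background do not
        # cause a division by zero.
        p_domain = (count + 1) / (domain_total + vocab_size)
        p_background = (background_counts[term] + 1) / (background_total + vocab_size)
        scores[term] = math.log(p_domain / p_background)

    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
```

Calling `domain_keywords(cs_docs, general_docs, top_n=20)` on two lists of strings would return the 20 terms most characteristic of the first list. Common words such as "the" score low because they are about equally frequent in both collections, which is what would let this work without a predefined whitelist.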