Using the dependency parser in Stanford CoreNLP

I use Stanford CoreNLP (http://nlp.stanford.edu/software/corenlp.shtml) to parse sentences and extract dependencies between words.

I managed to create a dependency graph, as in the example on the linked page, but I do not know how to work with it. I can print the whole graph with the toString() method, but methods that look up specific words in the graph, such as getChildList, require an IndexedWord object as a parameter. I understand why, since the nodes of the graph are of type IndexedWord, but it is not clear to me how to create such an object in order to search for a specific node.

For example, I want to find the children of the node that represents the word “problem” in my sentence. How do I create an IndexedWord object that represents the word “problem” so that I can search for it in the graph?

1 answer

In general, you should not create your own IndexedWord objects. (They represent “word tokens,” that is, particular words in a text, not “word types.” Asking for the word “problem,” which is a word type, is therefore not really well-formed: in particular, a sentence can contain several tokens of that word type.)

There are several convenient methods that let you do what you want:

  • sg.getNodeByWordPattern(String pattern)
  • sg.getAllNodesByWordPattern(String pattern)

The first is a little dangerous, since it simply returns the first IndexedWord that matches the pattern, or null if there is none. But it is most directly what you asked for.
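
As a rough sketch of how the two lookup methods can be combined with getChildList: the example below assumes CoreNLP is on the classpath and builds a small hand-written graph with SemanticGraph.valueOf instead of running the parser, so the sentence, the class name WordPatternLookup, and the helper childWordsOfMatches are illustrative only. The pattern is assumed to be matched against the whole word as a regular expression.

```java
import edu.stanford.nlp.ling.IndexedWord;
import edu.stanford.nlp.semgraph.SemanticGraph;

import java.util.Set;
import java.util.TreeSet;

public class WordPatternLookup {

    // Hypothetical helper: collects the words of all children of every
    // token whose word matches the given pattern.
    static Set<String> childWordsOfMatches(SemanticGraph sg, String pattern) {
        Set<String> words = new TreeSet<>();
        // One word type ("problem") may correspond to several tokens,
        // so iterate over every matching node rather than just the first.
        for (IndexedWord node : sg.getAllNodesByWordPattern(pattern)) {
            for (IndexedWord child : sg.getChildList(node)) {
                words.add(child.word());
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Toy graph written by hand in the compact bracket format;
        // in real use the graph comes from the parser output.
        SemanticGraph sg = SemanticGraph.valueOf(
                "[causes nsubj>[problem det>the] obj>[problem det>another]]");

        // getNodeByWordPattern returns only the first match, or null if none.
        IndexedWord first = sg.getNodeByWordPattern("problem");
        if (first != null) {
            System.out.println("Children of first match: " + sg.getChildList(first));
        }

        System.out.println("Children of all matches: " + childWordsOfMatches(sg, "problem"));
    }
}
```

Because the sentence contains two “problem” tokens, the single-node lookup silently picks one of them, which is why the list-returning variant is usually the safer choice.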

Some other ways to get started:

  • sg.getFirstRoot() to find the (first, and usually only) root of the graph, and then navigate from there, for example with the sg.getChildren(root) method.
  • sg.vertexSet() to get all of the IndexedWord objects in the graph.
  • sg.getNodeByIndex(int), if you already know the input sentence and can therefore ask for words by their integer index.

Usually these methods leave you iterating through the nodes. Indeed, the two ...ByWordPattern methods above simply do the iteration for you.
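
A minimal sketch of such an iteration, assuming CoreNLP is on the classpath: it starts at getFirstRoot() and walks the outgoing edges depth-first. The class name GraphWalk, the helper describeEdges, and the hand-written toy graph are assumptions made for illustration, not part of the library.

```java
import edu.stanford.nlp.ling.IndexedWord;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphEdge;

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GraphWalk {

    // Depth-first walk that records one "governor -relation-> dependent"
    // string per edge reachable from the given node.
    static void walk(SemanticGraph sg, IndexedWord node,
                     Set<IndexedWord> visited, List<String> out) {
        if (!visited.add(node)) {
            return; // already expanded; guards against graphs with cycles
        }
        for (SemanticGraphEdge edge : sg.outgoingEdgeIterable(node)) {
            IndexedWord child = edge.getDependent();
            out.add(node.word() + " -" + edge.getRelation() + "-> " + child.word());
            walk(sg, child, visited, out);
        }
    }

    // Hypothetical convenience wrapper: describes every edge reachable
    // from the first root of the graph.
    static List<String> describeEdges(SemanticGraph sg) {
        List<String> out = new ArrayList<>();
        walk(sg, sg.getFirstRoot(), new HashSet<>(), out);
        return out;
    }

    public static void main(String[] args) {
        // Toy graph written by hand; real graphs come from the parser.
        SemanticGraph sg = SemanticGraph.valueOf(
                "[causes nsubj>[problem det>the] obj>[problem det>another]]");

        IndexedWord root = sg.getFirstRoot();
        System.out.println("Root: " + root.word());

        // Looking a node up by its index returns the same token.
        System.out.println("Same node by index: "
                + sg.getNodeByIndex(root.index()).equals(root));

        // vertexSet() exposes every token if a flat scan is enough.
        System.out.println("Number of nodes: " + sg.vertexSet().size());

        for (String line : describeEdges(sg)) {
            System.out.println(line);
        }
    }
}
```

Starting from the root keeps the traversal tied to tokens the graph already contains, so no IndexedWord ever has to be constructed by hand.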


Source: https://habr.com/ru/post/1381731/

