Why is the Stanford parser with nltk parsing a sentence incorrectly?

I am using the Stanford parser with NLTK in Python, and I followed the question "Stanford Parser and NLTK" to set up the Stanford NLP libraries.

    from nltk.parse.stanford import StanfordParser
    from nltk.parse.stanford import StanfordDependencyParser

    parser = StanfordParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
    dep_parser = StanfordDependencyParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")

    one = ("John sees Bill")

    # Constituency (phrase-structure) parse
    parsed_Sentence = parser.raw_parse(one)
    # GUI
    for line in parsed_Sentence:
        print(line)
        line.draw()

    # Dependency parse
    parsed_Sentence = [parse.tree() for parse in dep_parser.raw_parse(one)]
    print(parsed_Sentence)
    # GUI
    for line in parsed_Sentence:
        print(line)
        line.draw()

I get incorrect parse and dependency trees, as shown in the examples below: the parser treats "sees" as a noun instead of a verb.
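One way to confirm the mis-tagging without the GUI is to read the part-of-speech tags straight out of the bracketed tree. The following is a minimal sketch, assuming the tree is available as a Penn-style bracketed string. `leaf_tags` is a hypothetical helper, and the two strings are hand-written illustrations rather than actual parser output. With an NLTK `Tree` object, `tree.pos()` returns the same (word, tag) pairs.

```python
import re

# Matches a preterminal such as "(VBZ sees)": a tag followed by a single word.
PRETERMINAL = re.compile(r"\(([^\s()]+) ([^\s()]+)\)")


def leaf_tags(bracketed):
    """Return (word, tag) pairs from a Penn-style bracketed parse string."""
    return [(word, tag) for tag, word in PRETERMINAL.findall(bracketed)]


# Hand-written illustrations, NOT actual parser output.
noun_reading = "(ROOT (NP (NNP John) (NNS sees) (NNP Bill)))"
verb_reading = "(ROOT (S (NP (NNP John)) (VP (VBZ sees) (NP (NNP Bill)))))"

for label, tree in [("noun reading", noun_reading), ("verb reading", verb_reading)]:
    tags = dict(leaf_tags(tree))
    # Penn Treebank verb tags all start with "VB" (VB, VBZ, VBD, ...).
    print(label, tags["sees"], tags["sees"].startswith("VB"))
```

A check like this makes the problem easy to state precisely: the word "sees" carries a noun tag where a verb tag is expected.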

Parse tree example

Dependency tree example

What should I do? It works fine when I change the sentence, for example (one = "John sees Bill"). The correct output for this sentence can be seen here (the correct output of the parse tree).

An example of the correct output is also shown below:

Correct parse tree

Correct dependency tree

1 answer

Once again, no model is perfect (see "Python NLTK pos_tag not returning the correct part-of-speech tag") ;P

You can try a "more accurate" parser by using NLTK's StanfordNeuralDependencyParser.

First, set up the parser with the correct environment variables (see Stanford Parser and NLTK and https://gist.github.com/alvations/e1df0ba227e542955a8a ), then:

    >>> from nltk.internals import find_jars_within_path
    >>> from nltk.parse.stanford import StanfordNeuralDependencyParser
    >>> parser = StanfordNeuralDependencyParser(model_path="edu/stanford/nlp/models/parser/nndep/english_UD.gz")
    >>> stanford_dir = parser._classpath[0].rpartition('/')[0]
    >>> slf4j_jar = stanford_dir + '/slf4j-api.jar'
    >>> parser._classpath = list(parser._classpath) + [slf4j_jar]
    >>> parser.java_options = '-mx5000m'
    >>> sent = "John sees Bill"
    >>> [parse.tree() for parse in parser.raw_parse(sent)]
    [Tree('sees', ['John', 'Bill'])]

Note that StanfordNeuralDependencyParser generates dependency trees, not constituency (phrase-structure) trees:
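To read the `Tree('sees', ['John', 'Bill'])` result: the node label is the head word and its children are its dependents, so "sees" is the root verb governing "John" and "Bill". The following is a minimal sketch of that reading in plain Python. The nested tuple is a stand-in for the NLTK `Tree`, and `arcs` is a hypothetical helper, not part of NLTK.

```python
def arcs(node):
    """Yield (head, dependent) pairs from a nested (head, children) tuple.

    A child is either a bare string (a leaf word) or another
    (head, children) tuple (a word with its own dependents).
    """
    head, children = node
    for child in children:
        if isinstance(child, tuple):
            yield (head, child[0])
            yield from arcs(child)
        else:
            yield (head, child)


# Stand-in for Tree('sees', ['John', 'Bill']).
tree = ("sees", ["John", "Bill"])
print(list(arcs(tree)))  # [('sees', 'John'), ('sees', 'Bill')]
```

With "sees" as the head of both arcs, this output reflects the verb reading that the question was looking for.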


Source: https://habr.com/ru/post/1241327/

