CoreNLP is too slow on malformed input

CoreNLP is very slow when analyzing malformed input. It emits warnings like the ones below and takes a long time to parse.

For example, the input "The Lincolns' fourth son, Thomas "Tad" Lincoln, was born on April 4, 1853, and died of heart failure at the age of 18 on July 16, 1871."
produces these warnings:


    Jul 24, 2015 4:03:42 PM edu.stanford.nlp.dcoref.RuleBasedCorefMentionFinder funkyFindLeafWithApproximateSpan
    WARNING: RuleBasedCorefMentionFinder: Failed to find head token:
    Tree is: (ROOT (S (NP (NP (NP (DT The) (NNS Lincolns) (POS ')) (JJ fourth) (NN son)) (, ,) (NP (NNP Thomas) () (NNP Tad) ('' '') (NNP Lincoln)) (, ,)) (VP (VP (VBD was) (VP (VBN born) (PP (IN on) (NP (NP (NNP April) (CD 4)) (, ,) (NP (CD 1853)) (, ,))))) (CC and) (VP (VBD died) (PP (IN of) (NP (NN heart) (NN failure))) (PP (IN at) (NP (NP (DT the) (NN age)) (PP (IN of) (NP (CD 18))))) (PP (IN on) (NP (NNP July) (CD 16))) (, ,) (NP (CD 1871)))) (. .)))
    token = |NP|0|, approx=0
    Jul 24, 2015 4:03:42 PM edu.stanford.nlp.dcoref.RuleBasedCorefMentionFinder funkyFindLeafWithApproximateSpan
    WARNING: RuleBasedCorefMentionFinder: Last resort: returning as head: 1871
    Jul 24, 2015 4:03:42 PM edu.stanford.nlp.dcoref.RuleBasedCorefMentionFinder findHead
    WARNING: Invalid index for head 34=34-0: originalSpan=[The Lincolns '], head=1871-35
    Jul 24, 2015 4:03:42 PM edu.stanford.nlp.dcoref.RuleBasedCorefMentionFinder findHead
    WARNING: Setting head string to entire mention

Parsing the cleaned text of this document took 600.339 seconds: https://en.wikipedia.org/wiki/Abraham_Lincoln
Is there any way to speed this up? Is there an option in CoreNLP to skip bad sentences automatically, or a way to set a per-sentence time limit after which the parser skips the sentence?
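
To make the "time limit per sentence" idea concrete, here is a minimal sketch of a generic wrapper that runs any per-sentence analysis with a timeout and skips sentences that exceed it. It uses only the standard `java.util.concurrent` API. The class name `SentenceTimeout`, the method `processAll`, and the stand-in analyzer in `main` are hypothetical names made up for illustration, not part of CoreNLP; in real use the `analyze` function would wrap whatever CoreNLP call is slow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical helper, not a CoreNLP class.
final class SentenceTimeout {

    /**
     * Applies analyze to each sentence in order and collects the results.
     * A sentence whose analysis exceeds timeoutMillis, or throws, is skipped.
     */
    static <T> List<T> processAll(List<String> sentences,
                                  Function<String, T> analyze,
                                  long timeoutMillis) {
        List<T> results = new ArrayList<>();
        // Daemon threads, so a task that ignores interruption cannot keep the JVM alive.
        // A cached pool lets the next sentence start even if an old task is still stuck.
        ExecutorService pool = Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        try {
            for (String sentence : sentences) {
                Callable<T> task = () -> analyze.apply(sentence);
                Future<T> future = pool.submit(task);
                try {
                    results.add(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
                } catch (TimeoutException e) {
                    // Requests interruption; this only stops the work if the task checks for it.
                    future.cancel(true);
                    System.err.println("Skipped (timeout): " + sentence);
                } catch (ExecutionException e) {
                    System.err.println("Skipped (error): " + sentence);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        } finally {
            pool.shutdownNow();
        }
        return results;
    }

    public static void main(String[] args) {
        // Stand-in analyzer: sleeps on one "bad" sentence to simulate a slow parse.
        Function<String, Integer> fakeParse = s -> {
            if (s.contains("bad")) {
                try {
                    Thread.sleep(2_000);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
            return s.split("\\s+").length;
        };
        List<Integer> counts = processAll(
                List.of("A short sentence.", "A bad sentence.", "Another one."),
                fakeParse, 200);
        System.out.println(counts);
    }
}
```

Note that `future.cancel(true)` only interrupts the worker thread, so a computation that never checks for interruption keeps using CPU in the background until it finishes. CoreNLP's parse annotator also documents properties such as `parse.maxlen` and `parse.maxtime`; whether they are available, and whether they help when the time is spent in coreference rather than parsing, is worth checking against the documentation for the version in use.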

parsing nlp stanford-nlp

alienCoder Jul 24 '15 at 10:52

No one has answered this question yet.
