How do I NER and POS tag pre-tokenized text in Stanford CoreNLP?

I use the Named Entity Recognizer (NER) tagger and the Part of Speech (POS) tagger from Stanford CoreNLP in my application. The problem is that my code tokenizes the text beforehand, and I then need to NER and POS tag each token. So far I have only found out how to do this with the command-line options, not programmatically.

Can someone please tell me how I can programmatically NER and POS tag pre-tokenized text using Stanford CoreNLP?

Edit:

I am actually using the standalone NER and POS packages, so my code follows the tutorials that ship with Stanford NER and the Stanford POS tagger. CoreNLP is on my classpath as well, but I have only worked from the tutorials in the NER and POS packages.

Edit:

I just found that there are instructions on how to set properties for CoreNLP here http://nlp.stanford.edu/software/corenlp.shtml , but I wish there was a quick way to do what I want using the Stanford NER and POS taggers, so I don’t have to recode everything!

+4
2 answers

If you set the property:

tokenize.whitespace = true

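then CoreNLP will tokenize on whitespace instead of applying its default PTB tokenization. You may also want to set:

ssplit.eolonly = true

so that sentences are split only at line breaks.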
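For tokens that are already held in memory, one way to use a pipeline configured with tokenize.whitespace and ssplit.eolonly is to join the tokens back into text: single spaces between tokens, one sentence per line. The following is only a minimal sketch under that assumption. The class PreTokenizedInput and its methods are hypothetical helpers, not part of CoreNLP, and the annotator list is an assumed choice. It will not preserve tokens that themselves contain whitespace.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

// Hypothetical helper class; the name and methods are illustrative, not part of CoreNLP.
class PreTokenizedInput {

    // Builds the properties mentioned in the answer.
    // The annotator list is an assumption; adjust it to the annotations you need.
    static Properties buildProperties() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner");
        props.setProperty("tokenize.whitespace", "true"); // split tokens on whitespace only
        props.setProperty("ssplit.eolonly", "true");      // split sentences on newlines only
        return props;
    }

    // Joins tokens with single spaces and sentences with newlines, so that
    // whitespace tokenization and end-of-line sentence splitting recover them.
    static String toText(List<List<String>> sentences) {
        return sentences.stream()
                .map(tokens -> String.join(" ", tokens))
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        List<List<String>> sentences = Arrays.asList(
                Arrays.asList("John", "met", "Amy", "in", "Los", "Angeles"),
                Arrays.asList("They", "had", "lunch"));

        Properties props = buildProperties();
        String text = toText(sentences);
        System.out.println(props.getProperty("tokenize.whitespace"));
        System.out.println(text);

        // With CoreNLP on the classpath, these would then be handed to the pipeline:
        //   StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        //   Annotation document = new Annotation(text);
        //   pipeline.annotate(document);
    }
}
```

After annotation, each token's POS and NER labels can be read from the CoreLabel objects in every sentence of the annotated document.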

+4

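If you are calling the NER classifier directly rather than going through the CoreNLP pipeline, you can convert your token strings into CoreLabel objects with Sentence.toCoreLabelList and classify them: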

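// `classifier` is assumed to be an already loaded AbstractSequenceClassifier<CoreLabel>,
// for example one obtained from CRFClassifier.getClassifier(...) with a model path.
// Note: newer CoreNLP releases name this helper class SentenceUtils instead of Sentence.
String[] tokenStrs = {"John", "met", "Amy", "in", "Los", "Angeles"};
List<CoreLabel> tokens = edu.stanford.nlp.ling.Sentence.toCoreLabelList(tokenStrs);
// Classify the pre-tokenized sentence and print each token with its NER label.
for (CoreLabel cl : classifier.classifySentence(tokens)) {
  System.out.println(cl.toShorterString());
}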

Output:

[Value=John Text=John Position=0 Answer=PERSON Shape=Xxxx DistSim=463]
[Value=met Text=met Position=1 Answer=O Shape=xxxk DistSim=476]
[Value=Amy Text=Amy Position=2 Answer=PERSON Shape=Xxx DistSim=396]
[Value=in Text=in Position=3 Answer=O Shape=xxk DistSim=510]
[Value=Los Text=Los Position=4 Answer=LOCATION Shape=Xxx DistSim=449]
[Value=Angeles Text=Angeles Position=5 Answer=LOCATION Shape=Xxxxx DistSim=199]
0

Source: https://habr.com/ru/post/1584106/

