Split clause conjunctions using CoreNLP DocumentPreprocessor

I am trying to split a text into sentences using CoreNLP's DocumentPreprocessor class.

Below is the code I'm using.

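// Split inputText into sentences with CoreNLP's DocumentPreprocessor.
List<String> splitSentencesList = new ArrayList<>();
Reader reader = new StringReader(inputText);
DocumentPreprocessor dp = new DocumentPreprocessor(reader);
for (List<HasWord> sentence : dp) {
    // Join the tokens, lowercase them, and drop the tokenized period (" .").
    splitSentencesList.add(Sentence.listToString(sentence).toLowerCase().replace(" .", ""));
}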

This works for most cases, but how do we deal with conjunctions within a sentence?

For example:

I like coffee and donuts for my breakfast.

Ideally, this should be further processed as:

I like coffee for my breakfast.
I like donuts for my breakfast.

One option is to write a regular-expression rule for the further separation. Is there a built-in method to achieve this in CoreNLP?

Any pointers are appreciated.
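To make the regular-expression option concrete, here is a minimal sketch using only java.util.regex. The class name ConjunctionSplitter and its split method are hypothetical names made up for illustration; they are not part of CoreNLP. The rule assumes exactly one "X and Y" pair where X and Y are single words, so it would mishandle multi-word phrases or clause-level conjunctions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a CoreNLP class): naive regex-based conjunction splitting.
final class ConjunctionSplitter {

    // prefix, first word, " and ", second word, suffix
    private static final Pattern AND_PAIR =
            Pattern.compile("(.*?)\\b(\\w+) and (\\w+)\\b(.*)");

    // Returns two sentences if a single-word "X and Y" pair is found,
    // otherwise returns the input sentence unchanged.
    static List<String> split(String sentence) {
        Matcher m = AND_PAIR.matcher(sentence);
        if (!m.matches()) {
            return Collections.singletonList(sentence);
        }
        String prefix = m.group(1);
        String suffix = m.group(4);
        return Arrays.asList(
                prefix + m.group(2) + suffix,
                prefix + m.group(3) + suffix);
    }

    public static void main(String[] args) {
        for (String s : split("I like coffee and donuts for my breakfast.")) {
            System.out.println(s);
        }
    }
}
```

A rule like this only looks at surface words, which is why it breaks down as soon as the conjoined items are longer than one word each.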

+4
1 answer

: DocumentPreprocessor. . , (, , ), (, ).

. CoreNLP , .

Dependency analysis

, , , , .

+2

Source: https://habr.com/ru/post/1681687/

