Is there a way to use NLP or an existing library to add missing punctuation to poorly punctuated user content?
For example, this line:
Today is Tuesday I went to work on Monday Friday was off
will become:
Today is Tuesday. I went to work on Monday. Friday was off.
I think this problem falls under sentence boundary disambiguation (http://en.wikipedia.org/wiki/Sentence_boundary_disambiguation). I used OpenNLP for this and was pleased with the results.
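To make the idea of learning sentence boundaries from already-punctuated text concrete, here is a minimal Python sketch. It is not OpenNLP and does not use its API: it is a toy word-position heuristic that scores a boundary between two words by how often the first ends a sentence and the second starts one in a small training set. The corpus, the function names `train` and `restore_periods`, and the `threshold` value are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical toy training data: sentences that are already punctuated.
CORPUS = [
    "Yesterday was Tuesday.",
    "I went home on Friday.",
    "Friday was busy.",
    "She is off on Monday.",
    "I like to work.",
    "Today is sunny.",
]


def train(sentences):
    """Count how often each word starts a sentence, ends one, and occurs at all."""
    starts, ends, totals = Counter(), Counter(), Counter()
    for sentence in sentences:
        words = sentence.rstrip(".!?").lower().split()
        if not words:
            continue
        totals.update(words)
        starts[words[0]] += 1
        ends[words[-1]] += 1
    return starts, ends, totals


def restore_periods(text, model, threshold=0.25):
    """Insert periods where a boundary looks likely (threshold is an assumed value)."""
    starts, ends, totals = model
    words = text.split()
    out = []
    capitalize_next = True
    for i, word in enumerate(words):
        if capitalize_next:
            word = word[0].upper() + word[1:]
            capitalize_next = False
        if i == len(words) - 1:
            boundary = True  # always close the final sentence
        else:
            prev, nxt = word.lower(), words[i + 1].lower()
            # Fraction of occurrences where prev ended / nxt started a sentence.
            p_end = ends[prev] / totals[prev] if totals[prev] else 0.0
            p_start = starts[nxt] / totals[nxt] if totals[nxt] else 0.0
            boundary = p_end * p_start >= threshold
        if boundary:
            word += "."
            capitalize_next = True
        out.append(word)
    return " ".join(out)


MODEL = train(CORPUS)
print(restore_periods("Today is Tuesday I went to work on Monday Friday was off", MODEL))
# prints: Today is Tuesday. I went to work on Monday. Friday was off.
```

With only six training sentences this works on the example from the question and little else; a trained detector or a real language model would need far more data and would also have to handle commas, question marks, and unseen words.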
I have dabbled briefly with this problem (with only partial success).
, n-gram . LingPipe - . , ( ), , . : , 8-12 , ; , , .
Source: https://habr.com/ru/post/1536410/