I would like to convert the parse tree generated by the coreNLP R package into the format used by the data.tree R package. The parse tree is created with the following code:
options( java.parameters = "-Xmx2g" )
library(NLP)
library(coreNLP)
#initCoreNLP() # change this if downloaded to non-standard location
initCoreNLP(annotators = "tokenize,ssplit,pos,lemma,parse")
## Some text.
s <- c("A rare black squirrel has become a regular visitor to a suburban garden.")
s <- as.String(s)
anno <- annotateString(s)
parse_tree <- getParse(anno)
parse_tree
The output parse tree is as follows:
> parse_tree
[1] "(ROOT\r\n (S\r\n (NP (DT A) (JJ rare) (JJ black) (NN squirrel))\r\n (VP (VBZ has)\r\n (VP (VBN become)\r\n (NP (DT a) (JJ regular) (NN visitor))\r\n (PP (TO to)\r\n (NP (DT a) (JJ suburban) (NN garden)))))\r\n (. .)))\r\n\r\n"
I found the post "Visualize the structure of the parse tree". It converts the parse tree created by the openNLP package into a tree form. However, that parse tree differs from the one generated by coreNLP, and the solution does not convert it to the data.tree format that I want.
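To make the target structure concrete, here is a minimal base-R sketch that reads the bracketed string printed above into a nested list of `name` / `children` entries. The function names `tokenize_parse`, `parse_bracketed`, and `tree_leaves` are made up for this illustration and are not part of coreNLP or data.tree. The result is only a possible intermediate step: this sketch does not convert it to a data.tree object, and repeated sibling labels (such as the two `JJ` nodes) might need disambiguating before any such conversion.

```r
# Hypothetical helper: split a bracketed parse string into "(", ")" and labels.
tokenize_parse <- function(txt) {
  txt <- gsub("[\r\n]", " ", txt)          # drop the CR/LF line breaks
  txt <- gsub("([()])", " \\1 ", txt)      # isolate every parenthesis
  strsplit(trimws(txt), "[[:space:]]+")[[1]]
}

# Hypothetical helper: recursive-descent reader producing a nested list.
parse_bracketed <- function(txt) {
  tokens <- tokenize_parse(txt)
  pos <- 1
  read_node <- function() {
    stopifnot(tokens[pos] == "(")
    pos <<- pos + 1
    node <- list(name = tokens[pos], children = list())
    pos <<- pos + 1
    repeat {
      if (pos > length(tokens)) stop("unbalanced parentheses")
      if (tokens[pos] == ")") break
      if (tokens[pos] == "(") {
        child <- read_node()               # nested constituent
      } else {
        child <- list(name = tokens[pos], children = list())  # word leaf
        pos <<- pos + 1
      }
      node$children[[length(node$children) + 1]] <- child
    }
    pos <<- pos + 1                        # consume the closing ")"
    node
  }
  read_node()
}

# Hypothetical helper: collect the leaf labels (the words) left to right.
tree_leaves <- function(node) {
  if (length(node$children) == 0) return(node$name)
  unlist(lapply(node$children, tree_leaves))
}

# The string below is copied from the getParse() output shown in the question.
parse_txt <- "(ROOT\r\n (S\r\n (NP (DT A) (JJ rare) (JJ black) (NN squirrel))\r\n (VP (VBZ has)\r\n (VP (VBN become)\r\n (NP (DT a) (JJ regular) (NN visitor))\r\n (PP (TO to)\r\n (NP (DT a) (JJ suburban) (NN garden)))))\r\n (. .)))\r\n\r\n"
tree <- parse_bracketed(parse_txt)
words <- tree_leaves(tree)
print(words)
```

Each list element carries a label and its children, so `tree$children[[1]]$name` is the `S` node and the leaves are the words of the sentence in order.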
EDIT
By adding the two lines below, I can use the function provided in the post "Visualize the structure of the parse tree".
parse_tree <- gsub("[\r\n]", "", parse_tree)
parse_tree <- gsub("ROOT", "TOP", parse_tree)
library(igraph)
library(NLP)
# parse2graph() is the helper function defined in the linked post
parse2graph(parse_tree,
title = sprintf("'%s'", s), margin=-0.05,
vertex.color=NA, vertex.frame.color=NA,
vertex.label.font=2, vertex.label.cex=1.5, asp=0.5,
edge.width=1.5, edge.color='black', edge.arrow.size=0)
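The effect of the two `gsub` calls at the top of this snippet can be checked in isolation. The sketch below uses a shortened, made-up parse string rather than real coreNLP output. The `ROOT` to `TOP` rename is presumably needed because the helper from the linked post expects the root label that openNLP produces, which is an assumption here.

```r
# Shortened, illustrative stand-in for the getParse() output.
raw <- "(ROOT\r\n  (S\r\n    (NP (DT A) (NN squirrel))))\r\n\r\n"

flat <- gsub("[\r\n]", "", raw)      # remove carriage returns and newlines
flat <- gsub("ROOT", "TOP", flat)    # rename the root label

print(flat)
```

Only the line-break characters are removed, so the indentation spaces remain in the flattened string.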
However, I still want to convert the parse tree to the data.tree format provided by the data.tree package.