NLTK Convert tree to array?

To begin with, I build the tree and convert it to a string. The function takes the already tagged sentence and prints the resulting tree:

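import nltk

def LanguageCreateTree(tokenizedSentence):
    # GRAMMAR is the chunk grammar string, defined elsewhere (not shown here)
    cp = nltk.RegexpParser(GRAMMAR)
    result = cp.parse(tokenizedSentence)  # parse() returns an nltk.Tree
    result = str(result)                  # bracketed string form of the tree
    print(result)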

>>> A red cat with a hat
(S A/DT (VP red/VBN (NP cat/NN)) with/IN a/DT hat/JJ)

How can I build a nested list (a list containing lists) from this string? I need to end up with something like this:

[['A','DT'], ['VP', ['red','VBN'], ['NP', ['cat','NN']]], ['with','IN'], ['a','DT'], ['hat','JJ']]
1 answer

This is much simpler than you think :-) The NLTK class Tree is a list (more precisely, it is derived from the list class), and it has exactly the structure you are after. Just use the usual list methods on the result of cp.parse(). Here's an example (building a tree on the fly for illustration):

>>> from nltk import Tree
>>> t = Tree.fromstring("(S A/DT (VP red/VBN (NP cat/NN)) with/IN a/DT hat/JJ)")

>>> print(t[1])
(VP red/VBN (NP cat/NN))
>>> print(t[1][0])   # Element 0 of the subtree at index 1
red/VBN
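
Because Tree derives from list, the ordinary list operations should apply to it directly. A small sketch reusing the same tree string as above (the variable names are only illustrative):

```python
from nltk import Tree

t = Tree.fromstring("(S A/DT (VP red/VBN (NP cat/NN)) with/IN a/DT hat/JJ)")

print(isinstance(t, list))   # Tree is a subclass of list
print(len(t))                # number of direct children of the S node
print(t.label())             # the node label is kept separately from the children

# Iterating yields the direct children: subtrees and leaves mixed together
for child in t:
    kind = "subtree" if isinstance(child, Tree) else "leaf"
    print(kind, child)
```

Note that the label ("S", "VP", ...) is not an element of the list; it is read with label(), while indexing and iteration only see the children.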

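In this example the leaves are plain strings with the POS tag still attached. print() shows the pretty-printed form; to see how a Tree is actually structured, use repr():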

>>> print(repr(t[1]))
Tree('VP', ['red/VBN', Tree('NP', ['cat/NN'])])
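
If a plain nested list in exactly the format from the question is really needed, a short recursive walk over the tree is one way to get it. This is only a sketch: tree_to_list is a hypothetical helper name, not part of NLTK, and it assumes leaves are either (word, tag) tuples, as cp.parse() produces for a tagged sentence, or "word/TAG" strings, as Tree.fromstring() produces here.

```python
from nltk import Tree

def tree_to_list(node):
    """Recursively convert a Tree into nested plain lists (hypothetical helper)."""
    if isinstance(node, Tree):
        # Subtree: put the label first, then the converted children
        return [node.label()] + [tree_to_list(child) for child in node]
    if isinstance(node, tuple):
        # Leaf from a tagged sentence: ('cat', 'NN') -> ['cat', 'NN']
        return list(node)
    # Leaf from Tree.fromstring: 'cat/NN' -> ['cat', 'NN'] (assumes a '/' is present)
    word, _, tag = node.rpartition("/")
    return [word, tag]

t = Tree.fromstring("(S A/DT (VP red/VBN (NP cat/NN)) with/IN a/DT hat/JJ)")

# The desired output in the question omits the root label 'S', so drop element 0
nested = tree_to_list(t)[1:]
print(nested)

# The same helper applied to a tree whose leaves are (word, tag) tuples
t2 = Tree("S", [("A", "DT"), Tree("NP", [("cat", "NN")])])
nested2 = tree_to_list(t2)[1:]
print(nested2)
```

In most cases the conversion is unnecessary, since the Tree can already be indexed and iterated like the nested list it would produce.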

Source: https://habr.com/ru/post/1607146/

