How many leaves are there before the subtree?

I use nltk trees to read syntactic parses of text (using Tree.fromstring()), and I am trying to find the leaf position of a given subtree within the larger tree. Essentially, I would like the opposite of leaf_treeposition().

In a tree t, I have a subtree np, and what I want is the index x such that:

t.leaves()[x] == np.leaves()[0] # x = ???(t, np)

I would rather not use t.leaves().index(...), because np may occur several times in the sentence, and I need the right occurrence, not the first one.

What I have is the tree position of np inside t (t being a ParentedTree), np.treeposition(), such that:

t[np.treeposition()] == np

I think a tedious solution would be to sum the leaves of all left siblings of np at every level. Or I could iterate through all the leaves until leaf_treeposition(leaf) equals np.treeposition()+"[0]"*, but that sounds suboptimal.

Is there a better way?

1 answer

Edit: In the end, there is a simple solution:

  • Build the tree position of the first leaf of your subtree.
  • Look it up in the list of all leaf tree positions of the tree.

Setup:

>>> from nltk.tree import ParentedTree
>>> t = ParentedTree.fromstring('(S (NP (D the) (N dog)) (VP (V chased) (NP (D the) (N cat))))')
>>> np_pos = (1,1)
>>> np = t[np_pos]
>>> print(np)
(NP (D the) (N cat))

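Step 1 combines the tree position of np with the position of the first leaf inside np. Because the lookup is done on tree positions (step 2), it works even if np occurs several times in the tree. For step 2, the Tree API lets you get the tree positions of all leaves by passing "leaves" as the order argument of treepositions(). The index x is then simply the index of target_leafpos in that list.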

>>> target_leafpos = np.treeposition() + np.leaf_treeposition(0) # Step 1
>>> all_leaf_treepositions = t.treepositions("leaves")           # Step 2
>>> x = all_leaf_treepositions.index(target_leafpos)
>>> print(x)
3

Or, as a one-liner:

x = t.treepositions("leaves").index( np.treeposition()+np.leaf_treeposition(0) )
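
The one-liner can be wrapped in a small helper and checked against a sentence in which the same NP occurs twice. This is only a sketch: the function names `first_leaf_index` and `first_leaf_index_by_siblings` and the example sentence are made up for illustration, and the second function is the "sum the leaves of the left siblings" idea from the question, included purely as a cross-check.

```python
from nltk.tree import ParentedTree


def first_leaf_index(tree, subtree):
    """Index in tree.leaves() of the first leaf of subtree (hypothetical helper)."""
    # Absolute position of the first leaf = position of the subtree in the
    # tree + position of the first leaf relative to the subtree.
    target = subtree.treeposition() + subtree.leaf_treeposition(0)
    return tree.treepositions("leaves").index(target)


def first_leaf_index_by_siblings(subtree):
    """Same index, computed by counting the leaves of left siblings at every level."""
    count = 0
    node = subtree
    while node.parent() is not None:
        parent = node.parent()
        # Visit every child of the parent that sits to the left of node.
        for i in range(node.parent_index()):
            child = parent[i]
            # A child is either a subtree or a bare leaf (a string).
            count += len(child.leaves()) if isinstance(child, ParentedTree) else 1
        node = parent
    return count


# Hypothetical sentence in which "the dog" appears twice.
t = ParentedTree.fromstring(
    "(S (NP (D the) (N dog)) (VP (V chased) (NP (D the) (N dog))))"
)
second_np = t[1, 1]

print(t.leaves().index(second_np.leaves()[0]))  # naive lookup finds the first "the"
print(first_leaf_index(t, second_np))
print(first_leaf_index_by_siblings(second_np))
```

Here the naive `t.leaves().index(...)` lookup points at the first NP, while both helpers agree on the position of the second one.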

Source: https://habr.com/ru/post/1623523/