VP|<NP-PP> is a single nonterminal symbol. The vertical bar does not mean alternatives in the traditional sense. Rather, NLTK puts it there to indicate where the rule came from, i.e. "This new nonterminal symbol was derived from a combination of VP and NP-PP." It is the LHS of a new production rule that NLTK created to convert your grammar to Chomsky Normal Form (CNF).
Take a look at the productions of the tree, pre-CNF:
ROOT -> S
S -> NP VP
NP -> DT NNS
DT -> 'the'
NNS -> 'kids'
VP -> VBD NP PP ***
VBD -> 'opened'
NP -> DT NN
DT -> 'the'
NN -> 'box'
PP -> IN NP
IN -> 'on'
NP -> DT NN
DT -> 'the'
NN -> 'floor'
In particular, look at the rule VP -> VBD NP PP (marked *** above), which is NOT in CNF: in CNF, the RHS of every production must be either exactly two nonterminal symbols or a single terminal.
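To make the CNF condition concrete, here is a minimal sketch in plain Python (it does not use NLTK). The helper name `is_cnf_rule` and the explicit `nonterminals` set are assumptions made for illustration, not part of NLTK's API.

```python
def is_cnf_rule(rhs, nonterminals):
    """Return True if rhs is exactly two nonterminals or a single terminal.

    Hypothetical helper for illustration only; not an NLTK function.
    """
    if len(rhs) == 2:
        return all(sym in nonterminals for sym in rhs)
    if len(rhs) == 1:
        return rhs[0] not in nonterminals
    return False


# Assumed symbol inventory, taken from the productions in the answer.
nonterminals = {"VP", "VBD", "NP", "PP", "VP|<NP-PP>"}

print(is_cnf_rule(("VBD", "NP", "PP"), nonterminals))    # False: three symbols
print(is_cnf_rule(("VBD", "VP|<NP-PP>"), nonterminals))  # True: two nonterminals
print(is_cnf_rule(("opened",), nonterminals))            # True: one terminal
```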
The two rules (7): VP|<NP-PP> -> NP PP and (8): VP -> VBD VP|<NP-PP> in your question are, taken together, functionally equivalent to the more general rule VP -> VBD NP PP.
When a VP is expanded, applying rule (8) results in:
VBD VP|<NP-PP>
And VP|<NP-PP> is the LHS of the newly created production rule (7), which leads to:
VBD NP PP
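The splitting step can be sketched in plain Python. This is only an illustration that mimics the symbol naming seen in the question; it is not NLTK's actual implementation, and the function name `binarize` and its `child_char` parameter are hypothetical.

```python
def binarize(lhs, rhs, child_char="|"):
    """Right-factor lhs -> rhs into rules with at most two RHS symbols.

    Each artificial symbol is named after the original LHS plus the
    children it still has to cover, e.g. 'VP|<NP-PP>'.
    """
    rules = []
    current = lhs
    remaining = list(rhs)
    while len(remaining) > 2:
        first, rest = remaining[0], remaining[1:]
        # One new symbol standing for "the rest of the children".
        new_symbol = "{}{}<{}>".format(lhs, child_char, "-".join(rest))
        rules.append((current, (first, new_symbol)))
        current = new_symbol
        remaining = rest
    rules.append((current, tuple(remaining)))
    return rules


rules = binarize("VP", ["VBD", "NP", "PP"])
for left, right in rules:
    print(left, "->", " ".join(right))
# VP -> VBD VP|<NP-PP>
# VP|<NP-PP> -> NP PP
```

Note that the whole string VP|<NP-PP> is created as one name; nothing in the sketch treats the bar as a separator between alternatives.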
In particular, if you isolate the rule itself, you can inspect the symbol on its LHS (which really is a single symbol):
>>> tree.chomsky_normal_form()
>>> prod = tree.productions()
>>> x = prod[7]  # VP|<NP-PP> -> NP PP
>>> x.lhs().symbol()  # Singular!
u'VP|<NP-PP>'