Can a LL table analyzer handle repetition without proper recursion?

I understand how the LL recursive descent handler can handle the rules of this form:

A = B*; 

with a simple loop that checks whether to continue the loop or not, based on whether the token matches the header in the terminal in FIRST-set B. However, I am interested, based on pairs of LL-parsers: how can the rules of this form work there? As far as I know, the only way to deal with a repetition like this in one is through recursion on the right, but it will ruin the associativity in cases where the tree of associations of the right associative interaction is not desirable.

I would like to know, because I'm currently trying to write a parser based on the LL (1) table, and I'm not sure how to handle such a case without changing the shape of the parse tree.

+6
source share
1 answer

Grammar

Let me decompose your EBNF grammar into a simple BNF and suppose b is the terminal and <e> is the empty string:

 A -> X X -> BX X -> <e> B -> b 

This grammar creates strings of finite b any length.

LL Table (1)

To build a table, we will need to generate the first and subsequent sets ( creating LL (1) analysis table ).

Sets first

First(α) is a set of terminals that start lines obtained from any line of characters in the grammar α .

 First(A) : b, <e> First(X) : b, <e> First(B) : b 

Follow the sets

Follow(A) is a set of terminals a that can immediately appear to the right of nonterminal A

 Follow(A) : $ Follow(X) : $ Follow(B) : b$ 

Table

Now we can build the table on the basis of sets, $ is the end of the input marker.

 +---+---------+----------+ | | b | $ | +---+---------+----------+ | A | A -> X | A -> X | | X | X -> BX | X -> <e> | | B | B -> b | | +---+---------+----------+ 

The action of the parser always depends on the top of the analysis stack and the next input symbol.

  • Terminal on top of the analysis stack:
    • Matches the input character: stack pop, advances to the next input character
    • No match: parsing error
  • Non-terminal over parsing stack:
    • The analysis table contains information about production: apply production to the stack
    • Cell is empty: parsing error
  • $ on top of the parsing stack:
    • $ is the input character: accept input
    • $ not an input character: parsing error

Analysis example

We analyze the input signal bb . The start column of the analysis contains the start character and the end marker A $ .

 +-------+-------+-----------+ | Stack | Input | Action | +-------+-------+-----------+ | A $ | bb$ | A -> X | | X $ | bb$ | X -> BX | | BX $ | bb$ | B -> b | | b X $ | bb$ | consume b | | X $ | b$ | X -> BX | | BX $ | b$ | B -> b | | b X $ | b$ | consume b | | X $ | $ | X -> <e> | | $ | $ | accept | +-------+-------+-----------+ 

Conclusion

As you can see, the rules of form A = B* can be analyzed without problems. The resulting parsing tree for entering bb will be:

parse tree

+3
source

Source: https://habr.com/ru/post/976895/


All Articles