ANTLR4 mutually left recursive parsing error

I have this ANTLR 4 grammar:

constantFixedExpresion : term (('+'|'-') term)+; term : factor (('*'|'//'|'REM')factor)+; factor : ('+'|'-')* ( wholeNumberConstant | constantFixedExpresion | 'TOFIXED' (stringConstant | bitCodeConstant) | identifier) ('FIT'constantFixedExpresion)*; 

I get the following error:

error (119): LanguageA.g4: The following rule sets are mutually left recursive [constantFixedExpresion, factor, term]

I tried so many ways, but can't fix it. What is the problem and how can I solve it?

+16
source share
1 answer

Antlr is an LL (*) analyzer that is in many ways "better" than an LL (k) analyzer , but still has many drawbacks. One of them is the fact that it cannot deal with left recursion (in fact, version 4 can deal with left recursion under the same rule). The error says that you have left grammar recursion, a curse for LL parsers.

This is caused by this construct in your grammar:

 constantFixedExpression: term ...; term: factor ...; factor: ('+' | '-')* (constantFixedExpression | ...) ...; 

Since the * operator means 0 or more, I can instantiate it with 0, so the parser will do this: "try constantFixedExpression , so it needs to try term , so it needs to try factor , so it needs to try constantFixedEXpression , so [... ] "and you have an infinite loop.


Fortunately, context-free formal grammars have an equivalent conversion to remove left recursion! This can be expressed in general:

 A -> Aa | b -- becomes -- A -> bR R -> aR | ε 

Or in Antlr notation:

 A: Aa | b; // becomes A: bR; R: (aR)?; 

More information on this process can be found in books on automation / grammar or on Wikipedia .


I will leave a correction to your grammar using refactoring to remove left recursion as your work. However, I want to touch on one more point: Antlr 4 can do left recursion! As I said before, version 4 can deal with left recursion under the same rule . There are ways to specify the priority and associativity of statements other than direct analysis, as you do in Antlr4. Let's see how it works:

 expr: NUMBER |<assoc=right> expr '^' expr | expr '*' expr | expr '/' expr | expr '+' expr | expr '-' expr; 

This is an example of a basic calculator grammar. The operators at the top are the operators with the highest priority, and the operators at the bottom are the ones with the lower priority. This means that 2+2*3 will be analyzed as 2+(2*3) and not (2+2)*3 . The construction <assoc=right> means an operator in right associativity, therefore 1^2^3 will be analyzed as 1^(2^3) and not (1^2)^3 .

As you can see, it is much easier to specify operators with left recursion, so Antlr 4 really helps in these moments! I recommend rewriting your grammar to use this feature.

+25
source

Source: https://habr.com/ru/post/976959/


All Articles