Assignment as an expression in an ANTLR grammar

I am trying to extend the grammar of a tiny language so that assignment is treated as an expression. It would then be valid to write:

    a = b = 1;        // -> a = (b = 1)
    a = 2 * (b = 1);  // contrived but valid
    a = 1 = 2;        // invalid

Assignment differs from other operators in two respects: it is right-associative (not a big deal), and its left-hand side must be a variable. So I changed the grammar as follows:

    statement: assignmentExpr | functionCall ...;
    assignmentExpr: Identifier indexes? '=' expression;
    expression: assignmentExpr | condExpr;

This does not work because it contains a non-LL(*) decision. I also tried this variant:

    assignmentExpr: Identifier indexes? '=' (expression | condExpr);

but I got the same error. I am interested in:

  • This specific problem
  • Given a grammar with a non-LL(*) decision, how to find the two paths that cause the problem
  • How to fix it
2 answers

The key point here is that you need to "assure" the parser that, inside an expression, there is something ahead that satisfies the expression. This can be done with syntactic predicates (the ( ... )=> parts in the add and mult rules).

Quick demo:

    grammar TL;

    options {
      output=AST;
    }

    tokens {
      ROOT;
      ASSIGN;
    }

    parse
      : stat* EOF -> ^(ROOT stat+)
      ;

    stat
      : expr ';' -> expr
      ;

    expr
      : add
      ;

    add
      : mult ((('+' | '-') mult)=> ('+' | '-')^ mult)*
      ;

    mult
      : atom ((('*' | '/') atom)=> ('*' | '/')^ atom)*
      ;

    atom
      : (Id -> Id) ('=' expr -> ^(ASSIGN Id expr))?
      | Num
      | '(' expr ')' -> expr
      ;

    Comment : '//' ~('\r' | '\n')* {skip();};
    Id      : 'a'..'z'+;
    Num     : '0'..'9'+;
    Space   : (' ' | '\t' | '\r' | '\n')+ {skip();};

which will parse the input:

    a = b = 1;        // -> a = (b = 1)
    a = 2 * (b = 1);  // contrived but valid

into the following AST:

(image of the resulting AST; not preserved)
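For readers who have not met syntactic predicates: a ( ... )=> predicate roughly asks the parser to try the bracketed fragment speculatively, rewind, and only then commit to the alternative. The Python sketch below is a hand-written illustration of that idea and not ANTLR's generated code. The `Parser` class, its `predicate` method, and the nested-tuple tree shape are hypothetical names chosen for this example, and `//` comments are not tokenized.

```python
import re


class Parser:
    """Hand-written analogue of the TL grammar above (illustrative only)."""

    def __init__(self, text):
        # Tokens: numbers, lowercase identifiers, or any single symbol.
        self.toks = re.findall(r"\d+|[a-z]+|\S", text)
        self.pos = 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def op(self, allowed):
        if self.peek() not in allowed:
            raise SyntaxError(f"expected one of {allowed}, got {self.peek()!r}")
        return self.take()

    def predicate(self, *steps):
        """Mimic ( ... )=> : run the steps speculatively, then always rewind."""
        start = self.pos
        try:
            for step in steps:
                step()
            return True
        except SyntaxError:
            return False
        finally:
            self.pos = start

    def expr(self):
        return self.add()

    def add(self):
        left = self.mult()
        # (('+' | '-') mult)=> ('+' | '-')^ mult
        while self.predicate(lambda: self.op(("+", "-")), self.mult):
            left = (self.op(("+", "-")), left, self.mult())
        return left

    def mult(self):
        left = self.atom()
        # (('*' | '/') atom)=> ('*' | '/')^ atom
        while self.predicate(lambda: self.op(("*", "/")), self.atom):
            left = (self.op(("*", "/")), left, self.atom())
        return left

    def atom(self):
        tok = self.peek()
        if tok is not None and tok.isdigit():
            return int(self.take())
        if tok is not None and tok.isalpha():
            name = self.take()
            if self.peek() == "=":          # Id ('=' expr)?
                self.take("=")
                return ("=", name, self.expr())
            return name
        self.take("(")
        inner = self.expr()
        self.take(")")
        return inner


def parse(text):
    """Parse a sequence of `expr ';'` statements into a list of trees."""
    p = Parser(text)
    stats = []
    while p.peek() is not None:
        stats.append(p.expr())
        p.take(";")
    return stats
```

Under these assumptions, `parse("a = b = 1;")` nests the second assignment inside the first, while `a = 1 = 2;` raises a `SyntaxError`, because only an `Id` may be followed by `=` inside `atom`.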


I think you can change your grammar to achieve the same result without syntactic predicates:

    statement: Expr ';' | functionCall ';'...;
    Expr: Identifier indexes? '=' Expr | condExpr ;
    condExpr: .... and so on;

I modified Bart's example with this idea:

    grammar TL;

    options {
      output=AST;
    }

    tokens {
      ROOT;
    }

    parse
      : stat+ EOF -> ^(ROOT stat+)
      ;

    stat
      : expr ';'
      ;

    expr
      : Id Assign expr -> ^(Assign Id expr)
      | add
      ;

    add
      : mult (('+' | '-')^ mult)*
      ;

    mult
      : atom (('*' | '/')^ atom)*
      ;

    atom
      : Id
      | Num
      | '('! expr ')' !
      ;

    Assign  : '=' ;
    Comment : '//' ~('\r' | '\n')* {skip();};
    Id      : 'a'..'z'+;
    Num     : '0'..'9'+;
    Space   : (' ' | '\t' | '\r' | '\n')+ {skip();};

And for the input:

    a=b=4;
    a = 2 * (b = 1);

you get the following parse tree:

(image of the resulting parse tree; not preserved)
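To see why the rule `expr : Id Assign expr | add` needs no predicate here, it may help to write the decision out by hand: two tokens of lookahead (an `Id` followed by `=`) are enough to choose the assignment alternative, and the recursive call on the right makes `=` right-associative. The Python sketch below is only an illustration under those assumptions, not what ANTLR generates. The `parse` function and the nested-tuple trees are hypothetical, `//` comments are not tokenized, and the `;` tokens are dropped from the result.

```python
import re


def parse(text):
    """Parse `expr ';'` statements; return one nested-tuple tree per statement."""
    toks = re.findall(r"\d+|[a-z]+|\S", text)
    pos = 0

    def peek(offset=0):
        i = pos + offset
        return toks[i] if i < len(toks) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def is_id(tok):
        return tok is not None and re.fullmatch(r"[a-z]+", tok) is not None

    def expr():
        # expr : Id Assign expr | add
        # Two tokens of lookahead select the first alternative.
        if is_id(peek()) and peek(1) == "=":
            name = take()
            take("=")
            return ("=", name, expr())  # recursion gives right-associativity
        return add()

    def add():
        left = mult()
        while peek() in ("+", "-"):
            left = (take(), left, mult())
        return left

    def mult():
        left = atom()
        while peek() in ("*", "/"):
            left = (take(), left, atom())
        return left

    def atom():
        tok = peek()
        if is_id(tok):
            return take()
        if tok is not None and tok.isdigit():
            return int(take())
        take("(")
        inner = expr()
        take(")")
        return inner

    trees = []
    while peek() is not None:
        trees.append(expr())
        take(";")
    return trees
```

With this sketch, `a=b=4;` groups as `a = (b = 4)`, and `a = 1 = 2;` is rejected because the left-hand side of the second `=` is not an `Id`. The `indexes?` part of the original grammar is left out.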


Source: https://habr.com/ru/post/1384884/

