Background
I am trying to write a simple grammar using AntlrWorks for Boolean equations that check sets of values for the existence (or absence) of certain elements. I have created a combined lexer/parser grammar that produces the desired AST. I have also written an accompanying tree grammar that seems to work (it passes the debug function in AntlrWorks).
Problem
However, when I try to hook the two together in a test program (i.e. lexing, parsing and tree parsing in the same program), I get errors like:
node from line 1:5 required (...)+ loop did not match anything at input 'and'
and
node from after line 1:8 mismatched tree node: UP expecting <DOWN>
As a sanity check, I had the test program print the result of toStringTree() on the generated AST and of toTokenTypeString() on the resulting TreeNodeStream.
What I found is that the token type values listed for the TreeNodeStream do not match the token type values enumerated in the auto-generated tree grammar code.
Example
Sample input: "true and false"
Output of toStringTree() on the tree returned by the parser: (and true false)
Output of toTokenTypeString() on the TreeNodeStream associated with the above AST: 19 2 22 20 3 8
This token stream should be AND <DOWN> 'true' 'false' <UP> NEWLINE
But the tree parser sees it as CLOSEPAREN <DOWN> OR 'false' <UP> OPENPAREN
(based on looking up the node token types in the constants defined in the code generated from the tree grammar) and throws the error
1:5 required (...)+ loop did not match anything at input 'and'
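The mismatch described above can be reproduced in isolation with a small lookup sketch. It decodes the same type string through two name tables: one holding the numbering the parser side appears to use, and one holding the numbering the generated walker appears to use. Only the six type numbers mentioned in this example are filled in; the class name `TokenTypeDecoder` and the table names are hypothetical, and the real tables would come from the generated code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: decodes a toTokenTypeString()-style list of numbers
// through a given type-number-to-name table.
class TokenTypeDecoder {
    // Numbering as the parser side appears to use it (taken from the example above).
    static final Map<Integer, String> PARSER_NAMES = new LinkedHashMap<>();
    // Numbering as the generated walker appears to interpret it (same example).
    static final Map<Integer, String> WALKER_NAMES = new LinkedHashMap<>();

    static {
        PARSER_NAMES.put(2, "<DOWN>");
        PARSER_NAMES.put(3, "<UP>");
        PARSER_NAMES.put(8, "NEWLINE");
        PARSER_NAMES.put(19, "AND");
        PARSER_NAMES.put(20, "'false'");
        PARSER_NAMES.put(22, "'true'");

        WALKER_NAMES.put(2, "<DOWN>");
        WALKER_NAMES.put(3, "<UP>");
        WALKER_NAMES.put(8, "OPENPAREN");
        WALKER_NAMES.put(19, "CLOSEPAREN");
        WALKER_NAMES.put(20, "'false'");
        WALKER_NAMES.put(22, "OR");
    }

    // Translates each whitespace-separated type number into its name;
    // numbers missing from the table are kept visible as <unknown:N>.
    static String decode(String typeString, Map<Integer, String> names) {
        StringJoiner out = new StringJoiner(" ");
        for (String field : typeString.trim().split("\\s+")) {
            int type = Integer.parseInt(field);
            out.add(names.getOrDefault(type, "<unknown:" + type + ">"));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String types = "19 2 22 20 3 8";
        System.out.println("parser view: " + decode(types, PARSER_NAMES));
        System.out.println("walker view: " + decode(types, WALKER_NAMES));
    }
}
```

Run as-is, this prints the two readings side by side, which makes it clear that the node stream itself is identical and only the numbering used to interpret it differs.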
Bottom line
Why does my tree parser not correctly identify my AST?
Below is my source. I appreciate any feedback on the stupid mistakes I must have made :)
Lexer / Parser Grammar

grammar INTc;

options {
    output=AST;
    ASTLabelType=CommonTree;
}

tokens {
    OR='or';
    AND='and';
    NOT='not';
    ALLIN='+';
    PARTIN='^';
    NOTIN='!';
    SET;
    OPENPAREN='(';
    CLOSEPAREN=')';
    OPENSET='{';
    CLOSESET='}';
}

@header {
package INTc;
}

@lexer::header {
package INTc;
}

@members {
}

prog: stat+ ;

stat: expr
    | NEWLINE
    ;

expr : orExpr ;

orExpr returns [boolean value]
    : a=andExpr(OR^ b=andExpr)*
    ;

andExpr returns [boolean value]
    : a=notExpr (AND^ b=notExpr)*
    ;

notExpr returns [boolean value]
    : a=atom
    | '!' a=atom -> ^(NOT atom)
    ;

atom returns [boolean value]
    : ALLIN OPENSET ((INT)(','INT)*) CLOSESET -> ^(ALLIN ^(SET INT+))
    | PARTIN OPENSET ((INT)(','INT)*) CLOSESET -> ^(PARTIN ^(SET INT+))
    | NOTIN OPENSET ((INT)(','INT)*) CLOSESET -> ^(NOTIN ^(SET INT+))
    | TIMERANGE
    | OPENPAREN! e=expr CLOSEPAREN!
    | 'true'
    | 'false'
    ;

ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;

DIGIT : ('0'..'9');

INT : DIGIT+ ;

NEWLINE : '\r'? '\n' ;

WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};

COMMENT
    : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    | '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
    ;
Tree Grammar

tree grammar INTcWalker;

options {
    tokenVocab=INTc;
    ASTLabelType=CommonTree;
}

@header {
package INTc;
import java.util.ArrayList;
import java.util.Arrays;
}

@members {
    ArrayList<String> intSet;
    boolean isFit = false;

    public boolean getResult() { return isFit; }

    public void setINTSet(ArrayList newSet) {
        intSet = newSet;
        isFit = false;
    }

    public ArrayList getINTSET(){return intSet;}
}

prog : stat+ ;

stat : expr { isFit = $expr.value;
Test Program

public class setTest {
    public static void main(String args[]) throws Exception {
        INTcLexer lex = new INTcLexer(new ANTLRFileStream("input.txt"));
        CommonTokenStream tokens = new CommonTokenStream(lex);
        INTcParser parser = new INTcParser(tokens);
        INTcParser.prog_return r = parser.prog();
        CommonTree t = (CommonTree)r.getTree();
        CommonTreeNodeStream nodes = new CommonTreeNodeStream(t);
        INTcWalker evaluator = new INTcWalker(nodes);
        System.out.println(t.toStringTree());
        System.out.println(nodes.toTokenTypeString());
        nodes.reset();
        try {
            evaluator.prog();
        } catch (RecognitionException e) {
            e.printStackTrace();
        }
        System.out.println(evaluator.getResult());
    }
}
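A further check that might help narrow this down is to compare the token vocabularies that the two grammars were generated with. The sketch below assumes that ANTLR has written one `NAME=number` line per token into a `.tokens` file for each grammar (for example `INTc.tokens` and `INTcWalker.tokens`; both file names are assumptions here), and reports every token name that is numbered differently in the two files. The class `TokensFileDiff` and its methods are hypothetical helpers, not part of ANTLR.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: compares two token vocabularies given as NAME=number lines.
class TokensFileDiff {
    // Parses NAME=number lines. The split is on the LAST '=' so that a
    // quoted literal such as '='=7 would still parse correctly.
    static Map<String, Integer> parse(String content) {
        Map<String, Integer> result = new TreeMap<>();
        for (String line : content.split("\\R")) {
            line = line.trim();
            int eq = line.lastIndexOf('=');
            if (line.isEmpty() || eq <= 0) {
                continue; // skip blank or malformed lines
            }
            String name = line.substring(0, eq);
            int type = Integer.parseInt(line.substring(eq + 1).trim());
            result.put(name, type);
        }
        return result;
    }

    // Lists every name present in both vocabularies whose number differs.
    static List<String> mismatches(Map<String, Integer> a, Map<String, Integer> b) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null && !other.equals(e.getValue())) {
                out.add(e.getKey() + ": " + e.getValue() + " vs " + other);
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: TokensFileDiff <first.tokens> <second.tokens>");
            return;
        }
        String first = new String(Files.readAllBytes(Paths.get(args[0])));
        String second = new String(Files.readAllBytes(Paths.get(args[1])));
        for (String m : mismatches(parse(first), parse(second))) {
            System.out.println(m);
        }
    }
}
```

If this prints nothing, both files agree on every shared name, and the discrepancy would more likely lie between those files and the generated Java classes (for instance, stale generated code from an earlier build).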