ANTLR3 C Target - the tree returned by the parser "skips" the root element

I am trying to use the ANTLR3 C Target to get an understanding of ASTs, but I am running into some difficulties.

I have a simple SQL-like grammar file:

    grammar sql;

    options {
        language = C;
        output = AST;
        ASTLabelType = pANTLR3_BASE_TREE;
    }

    sql    : VERB fields;
    fields : FIELD (',' FIELD)*;

    VERB   : 'SELECT' | 'UPDATE' | 'INSERT';
    FIELD  : CHAR+;
    fragment CHAR : 'a'..'z';

and it works as expected in ANTLRWorks.

In my C code, I have:

    const char pInput[] = "SELECT one,two,three";
    pANTLR3_INPUT_STREAM pNewStrm =
        antlr3NewAsciiStringInPlaceStream((pANTLR3_UINT8) pInput, sizeof(pInput), NULL);
    psqlLexer lex = sqlLexerNew(pNewStrm);
    pANTLR3_COMMON_TOKEN_STREAM tstream =
        antlr3CommonTokenStreamSourceNew(ANTLR3_SIZE_HINT, TOKENSOURCE(lex));
    psqlParser ps = sqlParserNew(tstream);
    sqlParser_sql_return ret = ps->sql(ps);
    pANTLR3_BASE_TREE pTree = ret.tree;
    cout << "Tree: " << pTree->toStringTree(pTree)->chars << endl;
    ParseSubTree(0, pTree);

This prints a flat tree structure when I recurse through the tree with ->getChildCount and ->children->get:

    void ParseSubTree(int level, pANTLR3_BASE_TREE pTree)
    {
        ANTLR3_UINT32 childcount = pTree->getChildCount(pTree);

        for (int i = 0; i < childcount; i++)
        {
            pANTLR3_BASE_TREE pChild =
                (pANTLR3_BASE_TREE) pTree->children->get(pTree->children, i);

            // Indent according to the depth of the node.
            for (int j = 0; j < level; j++)
            {
                std::cout << " - ";
            }
            std::cout << pChild->getText(pChild)->chars << std::endl;

            int f = pChild->getChildCount(pChild);
            if (f > 0)
            {
                ParseSubTree(level + 1, pChild);
            }
        }
    }

Program output:

    Tree: SELECT one , two , three
    SELECT
    one
    ,
    two
    ,
    three

Now, if I modify the grammar file:

    sql : VERB^ fields;

then the call to ParseSubTree displays only the child nodes that come from fields.

Program output:

    Tree: (SELECT one , two , three)
    one
    ,
    two
    ,
    three

My question is: why does ANTLR give me only the child nodes in the second case, effectively skipping the SELECT token? I would be very grateful for any pointers that help me make sense of the tree returned by ANTLR.

Useful information: ANTLRWorks 1.4.2, ANTLR C Target 3.3, MSVC 10

2 answers

Placing output=AST; in the options section does not by itself produce an actual AST. It only causes ANTLR to create CommonTree tokens instead of CommonTokens (or, in your case, the equivalent C structs).

Once you use output=AST;, the next step is to put tree operators or rewrite rules in your parser rules; these are what give shape to your AST.

See the previous Q&A for how to create a proper AST.

For example, the following grammar (with rewrite rules):

    options {
      output=AST;
      // ...
    }

    sql // make VERB the root
      :  VERB fields -> ^(VERB fields)
      ;

    fields // omit the comma from the AST
      :  FIELD (',' FIELD)* -> FIELD+
      ;

    VERB  : 'SELECT' | 'UPDATE' | 'INSERT';
    FIELD : CHAR+;
    SPACE : ' ' {$channel=HIDDEN;};
    fragment CHAR : 'a'..'z';

will parse the following input:

 UPDATE field, foo , bar 

into the following AST, with UPDATE as the root and the three fields as its children:

    (UPDATE field foo bar)


I think it is important to understand that the tree you see in ANTLRWorks is not the AST; it is the parse tree. The ".tree" in your code is the AST, but it may differ from what you expect. To shape the AST, you need to mark the root nodes with the ^ operator in strategic places, or use rewrite rules.

You can read here


Source: https://habr.com/ru/post/1386738/
