How can I perform an ANTLR parser action for each element in a rule that can match more than one element?

Question

I am trying to write an ANTLR parser rule that matches a list of things, and I want to write a parser action that processes each item in the list independently.

Some sample input for these rules:

$(A1 A2 A3)

I would like this to lead to an evaluation that contains a list of three MyIdentEvaluator objects — one for each of A1, A2, and A3.

Here is a snippet of my grammar:

    my_list returns [IEvaluator e]
        : { $e = new MyListEvaluator(); }
          '$' LPAREN
          op=my_ident+
          {
              /* want to do something here for each 'my_ident'. */
              /* the following seems to see only the 'A3' my_ident */
              $e.Add($op.e);
          }
          RPAREN
        ;

    my_ident returns [IEvaluator e]
        : IDENT { $e = new MyIdentEvaluator($IDENT.text); }
        ;

I think my_ident is defined correctly, because I see three MyIdentEvaluators created as expected for my input line, but only the last my_ident is ever added to the list (A3 in my sample input).

How can I best process each of these elements independently, either by changing the grammar or by changing the parsing?

It also occurs to me that my vocabulary for these concepts is not what it should be, so if it looks like I'm misusing a term, I probably am.

EDIT in response to Wayne's comment:

I tried using op+=my_ident+ . In that case, $op in my action becomes an IList (in C#) that contains instances of Antlr.Runtime.Tree.CommonTree . This gives me one entry in $op per matched token, so I do see my three matches, but I don't get the MyIdentEvaluator instances that I really want. I was hoping to find a rule attribute in the ANTLR docs that would help with this, but I found nothing that helped me work with this IList .

Result...

Based on chollida's answer, I ended up with this, which works well:

    my_list returns [IEvaluator e]
        : { $e = new MyListEvaluator(); }
          '$' LPAREN
          (op=my_ident { $e.Add($op.e); } )+
          RPAREN
        ;

The Add method is now called once for each my_ident match.

parsing antlr grammar

Chris farmer Feb 01 '10 at 19:31

2 answers

    my_list returns [IEvaluator e]
        : '$' LPAREN ops+=my_ident+ RPAREN
          { e = new MyListEvaluator(list_ops); }
        ;

I am doing something similar in Java, and I had to check the generated code to find out that ANTLR 3 generates a variable called "list_NAME" (where NAME = ops in this case), which is a list of all the values returned by the sub-rule. I think this is the same in C#, although I could be wrong. You would expect the variable to be called simply "ops", but that variable contains only the last matched value of the rule (at least in Java).

JoelPM May 29 '10 at 14:04

chollida · Accepted Answer · 2010-02-02T18:22:36+0000

If I were writing this, I would split the rule into an individual match followed by a list pattern:

    my_list returns [IEvaluator e]
        : { $e = new MyListEvaluator(); }
          '$' LPAREN
          op=my_ident { $e.Add($op.e); }
          (opNext=my_ident { $e.Add($opNext.e); })*
          RPAREN
        ;

    my_ident returns [IEvaluator e]
        : IDENT { $e = new MyIdentEvaluator($IDENT.text); }
        ;

Here, instead of using ANTLR's built-in + , we do the iteration ourselves. We match the first element and add it to the list, then we match each subsequent element and add it as well.
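
To see why the placement of the action matters, here is a rough Java sketch of the loop shape in each case. It is hand-written for illustration and is not actual ANTLR output: the class and method names are made up, and plain strings stand in for the MyIdentEvaluator objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ActionPlacementDemo {

    // Roughly the shape of: op=my_ident+ { $e.Add($op.e); }
    // The label is reassigned on every match; the action runs once, after the loop.
    static List<String> actionAfterLoop(List<String> idents) {
        List<String> added = new ArrayList<>();
        String op = null;
        for (String ident : idents) {
            op = ident;        // each match overwrites the previous one
        }
        if (op != null) {
            added.add(op);     // the single action only sees the last match
        }
        return added;
    }

    // Roughly the shape of: (op=my_ident { $e.Add($op.e); })+
    // The action sits inside the repeated block, so it runs once per match.
    static List<String> actionInsideLoop(List<String> idents) {
        List<String> added = new ArrayList<>();
        for (String ident : idents) {
            String op = ident;
            added.add(op);     // the action runs for every element
        }
        return added;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("A1", "A2", "A3");
        System.out.println(actionAfterLoop(input));   // prints [A3]
        System.out.println(actionInsideLoop(input));  // prints [A1, A2, A3]
    }
}
```

The first method reproduces the symptom from the question (only A3 is added); the second corresponds to moving the action inside the parenthesized, repeated part of the rule.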
