Parse program with tokens in fixed positions in a string

Question

I am new to ANTLR and have to write a parser for legacy assembly code that can have line numbers in fixed columns. In addition, certain columns are significant: they determine whether a line is a comment, a continuation, and so on. How can I detect them?

To give a few examples:

000001 proc proc1

000002 * comment

* comment without line numbers continuation marker set ==> X Arbitrary text as continuation

Thanks XAN

antlr4

Antlr novis Sep 19 '14 at 17:20

1 answer

Nilo paim · Answer 1 · 2015-06-10T20:20:02+0000

I ran into something similar when writing an ANTLR grammar to parse COBOL sources. COBOL has characteristics much like yours (fixed columns, significant columns, etc.).

The only solution I found for this problem was to "pre-process" the input and turn it into something that ANTLR can parse without problems.

Example: in COBOL, an asterisk in column 7 marks the line as a comment line. I replaced that asterisk with "→" and stated in my grammar that "→" introduces a comment line.
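A minimal sketch of such a pre-processing pass, written in Python purely for illustration. The function name `preprocess` and the constants are hypothetical; the column number and the "→" marker are taken from the COBOL example above and would need to be adjusted to the column layout of the assembly dialect in question.

```python
COMMENT_COLUMN = 7    # 1-based indicator column (assumption: COBOL-style layout)
COMMENT_MARKER = "→"  # replacement character described in the answer


def preprocess(source: str) -> str:
    """Replace a '*' in the indicator column with a marker the grammar can match."""
    idx = COMMENT_COLUMN - 1  # convert to a 0-based string index
    out = []
    for line in source.splitlines():
        if len(line) > idx and line[idx] == "*":
            # Swap one character for one character, so line length and the
            # column positions reported by the parser stay unchanged.
            line = line[:idx] + COMMENT_MARKER + line[idx + 1:]
        out.append(line)
    return "\n".join(out)
```

An asterisk anywhere else on a line is left alone, so only the column-dependent meaning is rewritten. The grammar can then treat the marker as an ordinary token, for example with a lexer rule along the lines of `COMMENT_LINE : '→' ~[\r\n]* -> skip ;` (the rule name is an assumption). This works only as long as the marker character never occurs in the original source.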
