Parse program with tokens in fixed positions in a string

I am new to ANTLR and have to write a parser for legacy assembly code that can have line numbers in fixed columns. In addition, certain columns are significant: they determine whether a line is a comment, a continuation, and so on. How can I detect them?

To give a few examples:

000001 proc proc1

000002 * comment

* comment without line numbers continuation marker set ==> X Arbitrary text as continuation 

Thanks, XAN

1 answer

I ran into something similar when writing an ANTLR grammar to parse COBOL sources. COBOL has characteristics like yours (fixed columns, significant column positions, etc.).

The only solution I found for this problem was to “pre-process” the input and turn it into something that ANTLR can parse without problems.

Example: in COBOL, an asterisk in column 7 indicates that the line is a comment line. I replaced the asterisk itself with “→” and specified in my grammar that “→” marks a comment line.
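To make the idea concrete, here is a minimal sketch of such a pre-processing pass in Python. It is not the code I actually used: the function names `preprocess_line` and `preprocess` are made up for illustration, and the indicator column defaults to 7 only because that is the COBOL convention mentioned above. For your assembly dialect you would have to substitute whatever columns it really uses.

```python
# Hypothetical pre-processing pass: names and defaults are illustrative only.
COMMENT_MARKER = "\u2192"  # the "→" character the grammar treats as "comment line"


def preprocess_line(line: str, indicator_col: int = 7) -> str:
    """Replace an asterisk in the indicator column with the comment marker.

    indicator_col is 1-based (column 7 in COBOL); adjust it for other formats.
    """
    text = line.rstrip("\n")
    idx = indicator_col - 1  # convert the 1-based column to a 0-based index
    if len(text) > idx and text[idx] == "*":
        # Only the indicator column is rewritten; asterisks elsewhere are untouched.
        return text[:idx] + COMMENT_MARKER + text[idx + 1:]
    return text


def preprocess(source: str, indicator_col: int = 7) -> str:
    """Apply preprocess_line to every line of the source text."""
    return "\n".join(
        preprocess_line(line, indicator_col) for line in source.splitlines()
    )
```

The output of `preprocess` is what gets fed to the ANTLR lexer, so the grammar only needs a rule keyed on “→” and never has to reason about column positions. Replacing a single character in place also keeps every line the same length, so column numbers in error messages still match the original source.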


Source: https://habr.com/ru/post/1202996/
