My colleague PaulS asked me the following:
I am writing a parser for an existing language (SystemVerilog, an IEEE standard), and the specification contains a rule with a structure similar to this:
cover_point = [[data_type] identifier ':' ] 'coverpoint' identifier ';' ;
data_type = 'int' | 'float' | identifier ;
identifier = ?/\w+/? ;
The problem is that when parsing the following legal line:
anIdentifier: coverpoint another_identifier;
anIdentifier matches data_type (through its identifier alternative), so Grako then expects another identifier after it and fails. It does not go back and try to parse the line without the data_type part.
I can rewrite the rule as follows:
cover_point_rewrite = [data_type identifier ':' | identifier ':' ] 'coverpoint' identifier ';' ;
but I wonder:
- whether this behaviour is intentional, and
- whether there is a better syntax for it.
Is this a problem with PEG in general, or with the tool (Grako)?
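To make the behaviour above concrete, here is a minimal sketch of PEG-style matching written by hand in Python. It is not Grako's implementation; the helper names (lit, kw, regex, seq, choice, opt, matches) are hypothetical and exist only for this illustration. It assumes the usual PEG semantics: an optional or an ordered choice commits to its first success and is not retried when something later in the sequence fails.

```python
import re

# Every parser takes (text, pos) and returns the new position, or None on failure.

def skip_ws(text, pos):
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos

def lit(s):
    """Match a literal token such as ':' or ';'."""
    def parse(text, pos):
        pos = skip_ws(text, pos)
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def regex(pattern):
    compiled = re.compile(pattern)
    def parse(text, pos):
        pos = skip_ws(text, pos)
        m = compiled.match(text, pos)
        return m.end() if m else None
    return parse

def kw(word):
    """Match a keyword, requiring a word boundary after it."""
    return regex(re.escape(word) + r"\b")

def seq(*parsers):
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    """Ordered choice: the first alternative that succeeds wins."""
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse

def opt(parser):
    """Optional: on success keep the result; it is never revisited later."""
    def parse(text, pos):
        end = parser(text, pos)
        return pos if end is None else end
    return parse

identifier = regex(r"\w+")
data_type = choice(kw("int"), kw("float"), identifier)

# cover_point = [[data_type] identifier ':' ] 'coverpoint' identifier ';' ;
cover_point = seq(
    opt(seq(opt(data_type), identifier, lit(":"))),
    kw("coverpoint"), identifier, lit(";"),
)

# cover_point_rewrite = [data_type identifier ':' | identifier ':' ] 'coverpoint' identifier ';' ;
cover_point_rewrite = seq(
    opt(choice(seq(data_type, identifier, lit(":")),
               seq(identifier, lit(":")))),
    kw("coverpoint"), identifier, lit(";"),
)

def matches(rule, text):
    """True if the rule consumes the whole text (ignoring trailing whitespace)."""
    end = rule(text, 0)
    return end is not None and skip_ws(text, end) == len(text)

line = "anIdentifier: coverpoint another_identifier;"
print(matches(cover_point, line))          # original rule
print(matches(cover_point_rewrite, line))  # rewritten rule
```

In this sketch the original rule rejects the line and the rewritten rule accepts it. The inner optional data_type consumes anIdentifier and stays committed, so the following identifier meets ':' and the whole bracketed group fails. The rewrite works because the alternatives of the choice are each tried from the same starting position.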