Problems with G0 and G1 rules in a Marpa grammar

I am trying to get what seems to be a very simple Marpa grammar to work. The code I'm using is below:

    use strict;
    use warnings;
    use Marpa::R2;
    use Data::Dumper;

    my $grammar = Marpa::R2::Scanless::G->new(
        {
            source => \(<<'END_OF_SOURCE'),
    :start ::= ExprSingle
    ExprSingle ~ Expr AndExpr
    Expr ~ word
    AndExpr ~ word*
    word ~ [\w]+
    :discard ~ ws
    ws ~ [\s]+
    END_OF_SOURCE
        }
    );

    my $reader = Marpa::R2::Scanless::R->new(
        {
            grammar => $grammar,
        }
    );

    my $input = 'foo';
    $reader->read(\$input);
    my $value = $reader->value;
    print Dumper $value;

This prints $VAR1 = \'foo'; so it recognizes a single word just fine. But I want it to recognize a sequence of words:

    my $input = 'foo bar';

Now the script prints:

 Error in SLIF G1 read: Parse exhausted, but lexemes remain, at position 4 

I think this is because ExprSingle uses the ~ (match) operator, which makes it part of the tokenization level, G0, instead of the structural level, G1. The :discard rule allows whitespace between G1 symbols, but not inside a G0 rule. So I changed the grammar as follows:

 ExprSingle ::= Expr AndExpr 

Now the error is no longer printed, but the resulting value is undef instead of something containing 'foo' and 'bar'. I honestly don't know what this means, because previously a failed parse raised an actual error.

I also tried changing the grammar to separate what I believe are the G0 and G1 rules, but still no luck:

    :start ::= ExprSingle
    ExprSingle ::= Expr AndExpr
    Expr ::= token
    AndExpr ::= token*
    token ~ word
    word ~ [\w]+
    :discard ~ ws
    ws ~ [\s]+

The final value is still undef. trace_terminals shows that both "foo" and "bar" are accepted as tokens. What do I need to do to fix this grammar (by which I mean getting a value that contains the strings "foo" and "bar", not just undef)?

1 answer

Rules return undef by default, so in your case getting \undef back from $reader->value() means your parse succeeded. That is, a return value of undef means failure, while a return value of \undef means the parse succeeded and evaluated to undef.
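To make the distinction concrete, here is a minimal sketch in plain Perl that does not use Marpa at all. The helper name describe_parse_result is hypothetical and only illustrates how one might tell a plain undef apart from a reference to undef:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Marpa): classify what value() handed back.
sub describe_parse_result {
    my ($value_ref) = @_;

    # Plain undef: there was no parse to evaluate.
    return 'no parse' if !defined $value_ref;

    # A reference to undef: the parse succeeded, but its value is undef.
    return 'parse ok, value is undef' if !defined ${$value_ref};

    # A reference to anything else: the parse succeeded with a real value.
    return 'parse ok, value is defined';
}

print describe_parse_result(undef),   "\n";
print describe_parse_result(\undef),  "\n";
print describe_parse_result(\'foo'),  "\n";
```

The same check could be applied to the result of $reader->value before dumping it.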

A good, quick way to start with more useful semantics is to add the following line:

    :default ::= action => ::array

This makes the parse produce an AST: each rule returns an array of its children's values instead of undef.
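Putting it together, the following is a sketch of the last grammar from the question with the :default line added at the top. It assumes Marpa::R2 is installed; the helper name parse_words is hypothetical and exists only to keep the example short.

```perl
use strict;
use warnings;
use Marpa::R2;
use Data::Dumper;

# The grammar from the question, plus a default action so that every
# G1 rule returns an array of its children's values instead of undef.
my $dsl = <<'END_OF_SOURCE';
:default ::= action => ::array
:start ::= ExprSingle
ExprSingle ::= Expr AndExpr
Expr ::= token
AndExpr ::= token*
token ~ word
word ~ [\w]+
:discard ~ ws
ws ~ [\s]+
END_OF_SOURCE

my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );

# Hypothetical helper: parse one input string and return whatever
# value() returns (a reference on success, undef if there is no parse).
sub parse_words {
    my ($input) = @_;
    my $reader = Marpa::R2::Scanless::R->new( { grammar => $grammar } );
    $reader->read( \$input );
    return $reader->value;
}

print Dumper( parse_words('foo bar') );
```

With this default action, the dumped value should be a reference to nested arrays that hold 'foo' and 'bar', rather than \undef.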


Source: https://habr.com/ru/post/1486705/

