There are two ways to handle errors in Marpa.
"Ruby Slippers" parsing
Marpa maintains a lot of context during the scan. We can use this context to find out which tokens the parser currently requires, and then decide whether to offer such a token to Marpa even if it is not present in the input. Consider, for example, a programming language that requires every statement to end with a semicolon. We can then use Ruby Slippers techniques to insert semicolons in certain places, for example at the end of a line or before a closing brace:
    use strict;
    use warnings;
    use Marpa::R2;
    use Data::Dump 'dd';

    my $grammar = Marpa::R2::Scanless::G->new({
        source => \q{
            :discard ~ ws
            Block         ::= Statement+ action => ::array
            Statement     ::= StatementBody (STATEMENT_TERMINATOR) action => ::first
            StatementBody ::= 'statement' action => ::first
                            | ('{') Block ('}') action => ::first
            STATEMENT_TERMINATOR ~ ';'
            event ruby_slippers = predicted STATEMENT_TERMINATOR
            ws ~ [\s]+
        },
    });

    my $recce = Marpa::R2::Scanless::R->new({ grammar => $grammar });

    my $input = q(
        statement;
        { statement }
        statement
        statement
    );

    for (
        $recce->read(\$input);
        $recce->pos < length $input;
        $recce->resume
    ) {
        ruby_slippers($recce, \$input);
    }
    ruby_slippers($recce, \$input);

    dd $recce->value;

    sub ruby_slippers {
        my ($recce, $input) = @_;
        my %possible_tokens_by_length;
        my @expected = @{ $recce->terminals_expected };
        for my $token (@expected) {
            pos($$input) = $recce->pos;
            if ($token eq 'STATEMENT_TERMINATOR') {
In the ruby_slippers function you can also count how often you needed to fudge a token. If this count exceeds a certain value, you can abandon the parse by throwing an error.
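The counting idea can be sketched in plain Perl, independent of Marpa. This is only an illustration: make_fudge_budget is a hypothetical helper name, not part of Marpa::R2, and the limit of 3 is an arbitrary assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Marpa::R2): returns a closure that
# counts artificially supplied tokens and dies once the limit is exceeded.
sub make_fudge_budget {
    my ($limit) = @_;
    my $count = 0;
    return sub {
        $count++;
        die "Too many fudged tokens ($count > $limit), giving up\n"
            if $count > $limit;
        return $count;    # number of tokens fudged so far
    };
}

# Assumed limit of 3; pick whatever suits your input.
my $note_fudge = make_fudge_budget(3);
```

One way to use it would be to call $note_fudge->() inside ruby_slippers each time a token is offered that was not actually in the input, so a badly malformed input fails early instead of being silently patched up.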
Skipping input
If your input may contain unparseable junk, you can try to skip over it whenever no token is found otherwise. To do this, the $recce->resume method accepts an optional position argument at which normal parsing will resume.
    use strict;
    use warnings;
    use Marpa::R2;
    use Data::Dump 'dd';
    use Try::Tiny;

    my $grammar = Marpa::R2::Scanless::G->new({
        source => \q{
            :discard ~ ws
            Sentence ::= WORD+ action => ::array
            WORD ~ 'foo':i | 'bar':i | 'baz':i | 'qux':i
            ws ~ [\s]+
        },
    });

    my $recce = Marpa::R2::Scanless::R->new({ grammar => $grammar });

    my $input = '1) Foo bar: baz and qux, therefore qux (foo!) implies bar.';

    try { $recce->read(\$input) };
    while ($recce->pos < length $input) {
While the same effect can be achieved with a :discard lexeme that matches anything, skipping in our client code allows us to abort the parse if we need to make too many attempts.
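A bounded version of the skipping loop might look like the sketch below. It is written against a callback rather than against Marpa directly, so resume_with_skip_limit and its parameters are hypothetical names of my own; the callback is assumed to return the new position on success and to die on failure.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Marpa::R2): tries $resume_at->($pos) at
# increasing positions until it succeeds, and gives up once more than
# $max_skips characters would have to be skipped.
# Returns the new position and the number of characters skipped.
sub resume_with_skip_limit {
    my ($resume_at, $pos, $end, $max_skips) = @_;
    my $skipped = 0;
    while ($pos < $end) {
        # eval yields undef if the callback dies at this position
        my $new_pos = eval { $resume_at->($pos) };
        return ($new_pos, $skipped) if defined $new_pos;
        die "Gave up at position $pos after skipping $skipped characters\n"
            if $skipped >= $max_skips;
        $skipped++;
        $pos++;    # advance past one character of junk and retry
    }
    return ($end, $skipped);
}

# With a Marpa recognizer, the callback could plausibly be:
#   sub { $recce->resume($_[0]); $recce->pos }
```

The skip limit is per call here; if you want a budget for the whole input, keep a running total of the returned skip counts in the calling loop.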