Is there a good CPAN module for implementing state machines when parsing text?

When parsing text, I frequently need to implement mini state machines in the generic form shown in the code below.

Is there a CPAN module that is considered "best practice" and is well suited to implementing state machine logic like this in a simple and elegant way?

I would prefer solutions less complicated than Parse::RecDescent, but if none exist and Parse::RecDescent is much easier to apply to this problem than I thought, I am very willing to consider it instead of rolling my own, as I have done so far.

Example of generic parsing code:

    my $state = 1;
    while (my $token = get_next_token()) { # Usually the next line
        if ($state == 1) {
            do_state1_processing();
            if (token_matches_transition_1_to_2($token)) {
                do_state_1_to_2_transition_processing();
                $state = 2;   # assignment, not the comparison ==
                next;
            } elsif (token_matches_transition_1_to_4($token)) {
                do_state_1_to_4_transition_processing();
                $state = 4;
                next;
            } else {
                do_state1_continuation();
                next;
            }
        } elsif ($state == 5) {
            do_state5_processing();
            if (token_matches_transition_5_to_6($token)) {
                do_state_5_to_6_transition_processing();
                $state = 6;
                next;
            } elsif (token_matches_transition_5_to_4($token)) {
                do_state_5_to_4_transition_processing();
                $state = 4;
                next;
            } else {
                do_state5_continuation();
                next;
            }
        } else {
        }
    }
4 answers

I would recommend taking a look at Marpa and Marpa::XS.

Just look at this simple calculator grammar:

    my $grammar = Marpa::XS::Grammar->new(
        {
            start          => 'Expression',
            actions        => 'My_Actions',
            default_action => 'first_arg',
            rules          => [
                { lhs => 'Expression', rhs => [qw'Term'] },
                { lhs => 'Term',       rhs => [qw'Factor'] },
                { lhs => 'Factor',     rhs => [qw'Number'] },
                { lhs => 'Term',   rhs => [qw'Term Add Term'],          action => 'do_add' },
                { lhs => 'Factor', rhs => [qw'Factor Multiply Factor'], action => 'do_multiply' },
            ],
        }
    );

You will need to implement the tokenizer yourself.
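A hand-rolled tokenizer for a grammar like this can be quite small. The following is only a minimal sketch in core Perl: the `tokenize` function and the `[type, value]` pair layout are assumptions made for illustration, not part of Marpa, and only the token names (`Number`, `Add`, `Multiply`) are taken from the grammar above.

```perl
use strict;
use warnings;

# Hypothetical tokenizer: turns "2 * 3 + 4" into [type, value] pairs.
# The token type names match the terminals used in the grammar above.
sub tokenize {
    my ($input) = @_;
    my @tokens;
    pos($input) = 0;
    while (pos($input) < length($input)) {
        # \G anchors at the current position; /c keeps pos() on a failed match.
        if    ($input =~ /\G\s+/gc)   { next }                              # skip whitespace
        elsif ($input =~ /\G(\d+)/gc) { push @tokens, ['Number', $1] }
        elsif ($input =~ /\G\+/gc)    { push @tokens, ['Add', '+'] }
        elsif ($input =~ /\G\*/gc)    { push @tokens, ['Multiply', '*'] }
        else {
            die sprintf "Unexpected character '%s' at offset %d\n",
                substr($input, pos($input), 1), pos($input);
        }
    }
    return @tokens;
}

# Usage: print one token per line.
print "$_->[0]\t$_->[1]\n" for tokenize("2 * 3 + 4");
```

Each pair would then be handed to the recognizer one at a time; how exactly that is done depends on the Marpa version in use.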


You can use Class::StateMachine:

    package Foo;
    use parent 'Class::StateMachine';

    sub new {
        my $class = shift;
        Class::StateMachine::bless {}, $class, 'state_1';
    }

    sub do_state_processing :OnState('state_1') {
        my $self = shift;
        if (...) { $self->event_1 }
        elsif (...) { $self->event_2 }
        ...
    }

    sub do_state_processing :OnState('state_2') {
        ...
    }

    sub event_1 :OnState('state_1') {
        my $self = shift;
        $self->state('state_2');
    }

    sub event_2 :OnState('state_2') {
        my $self = shift;
        $self->state('state_3');
    }

    sub enter_state :OnState('state_1') {
        print "entering state 1";
        ...
    }

    sub enter_state :OnState('state_2') {
        ...
    }

    package main;

    my $sm = Foo->new;
    ...
    while (my $token = get_next_token()) {
        $sm->do_state_processing;
    }

That said, a module aimed specifically at text processing would probably be more appropriate for your particular case.


(With help) I wrote something a few years ago called the Perl Formal Language Toolkit, so that might serve as some kind of basis. However, I think what you really want is a tool like the Ragel Finite State Machine Compiler. Unfortunately, Ragel does not output Perl. It is a desire of mine to implement a Perl target for Ragel, and also to provide similar (but more Perl-oriented) features in my bit-rotting module.
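Until something like that exists, the core idea (transitions described as data instead of nested if/elsif blocks) can be approximated in plain Perl. This is only a sketch under stated assumptions: `run_machine`, the table layout, and the BEGIN/END example are all made up for illustration and are unrelated to Ragel or to the toolkit mentioned above.

```perl
use strict;
use warnings;

# Hypothetical transition table: state => list of [predicate, next state, action].
# Rules are tried in order; the first predicate that matches the token wins.
sub run_machine {
    my ($table, $state, @tokens) = @_;
    my @log;
    TOKEN: for my $token (@tokens) {
        for my $rule (@{ $table->{$state} || [] }) {
            my ($matches, $next, $action) = @$rule;
            next unless $matches->($token);
            $action->($token, \@log) if $action;   # optional transition action
            $state = $next;
            next TOKEN;
        }
        # No rule matched: stay in the current state (the "continuation" branch).
    }
    return ($state, \@log);
}

# Example: collect the tokens between BEGIN and END markers.
my %table = (
    outside => [
        [ sub { $_[0] eq 'BEGIN' }, 'inside' ],
    ],
    inside => [
        [ sub { $_[0] eq 'END' }, 'outside' ],
        [ sub { 1 }, 'inside', sub { push @{ $_[1] }, $_[0] } ],
    ],
);

my ($final, $collected) = run_machine(\%table, 'outside', qw(a BEGIN b c END d));
print "$final: @$collected\n";   # prints "outside: b c"
```

Adding a state then means adding a key to the table rather than another elsif branch, which is the main appeal of the approach.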


I wrote Parser::MGC mainly because I found that getting Parse::RecDescent to produce proper error messages was rather difficult, and I did not like its unusual custom grammar, embedded in quoted strings that contain Perl code, sitting alongside other Perl code. A P::MGC program is just Perl code, but it is written as a recursive descent over the grammar structure, similar to P::RD.
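To illustrate what "recursive descent written as plain Perl code" means in general, here is a small hand-written sketch in core Perl. It deliberately does not use Parser::MGC's API; the grammar, the function names (`parse_expr`, `parse_term`, `parse_factor`, `fail`, `evaluate`) and the error format are all assumptions chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical grammar, one Perl sub per rule:
#   expr   := term   ( '+' term   )*
#   term   := factor ( '*' factor )*
#   factor := NUMBER | '(' expr ')'
# Each sub takes a reference to the input string and advances its pos().

sub parse_expr {
    my ($s) = @_;
    my $value = parse_term($s);
    $value += parse_term($s) while $$s =~ /\G\s*\+/gc;
    return $value;
}

sub parse_term {
    my ($s) = @_;
    my $value = parse_factor($s);
    $value *= parse_factor($s) while $$s =~ /\G\s*\*/gc;
    return $value;
}

sub parse_factor {
    my ($s) = @_;
    if ($$s =~ /\G\s*(\d+)/gc) { return 0 + $1 }
    if ($$s =~ /\G\s*\(/gc) {
        my $value = parse_expr($s);
        $$s =~ /\G\s*\)/gc or fail($s, "expected ')'");
        return $value;
    }
    fail($s, "expected a number or '('");
}

# Report the failure together with the position where parsing stopped.
sub fail {
    my ($s, $message) = @_;
    die sprintf "%s at offset %d\n", $message, pos($$s) // 0;
}

sub evaluate {
    my ($input) = @_;
    pos($input) = 0;
    my $value = parse_expr(\$input);
    $input =~ /\G\s*\z/gc or fail(\$input, "unexpected trailing input");
    return $value;
}

print evaluate("2 * (3 + 4)"), "\n";   # prints 14
```

Because every rule is an ordinary sub, the error message can be produced exactly where the parse goes wrong, which is the kind of control the answer above is describing.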


Source: https://habr.com/ru/post/908885/

