Is there a convenient way to write a regular expression that will try to match as much of the regular expression as possible?
Example:
    my $re = qr/a ([a-z]+) (\d+)/;
    match_longest($re, "a")          => ()
    match_longest($re, "a word")     => ("word")
    match_longest($re, "a word 123") => ("word", "123")
    match_longest($re, "a 123")      => ()
That is, $re is treated as a sequence of regular expressions, and match_longest tries to match as much of that sequence as it can. In a sense, the match never fails; it is only a question of how many of the regular expressions matched. Once one of them fails to match, the parts that did not match are undef.
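To pin down the intended semantics, here is a minimal sketch that matches the parts one after another and stops at the first failure. It makes several assumptions that are not in the question: the parts are passed as separate qr// objects (with the literal spaces inside them), the first part is anchored at the start of the string, and the helper name match_longest_seq and its argument order are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: match each part in turn, starting at offset 0,
# and return the captures collected before the first part that fails.
sub match_longest_seq {
    my ($str, @parts) = @_;
    my @captures;
    pos($str) = 0;
    for my $part (@parts) {
        # \G anchors at the end of the previous match; in scalar context
        # /g advances pos() on success and /c keeps it on failure.
        last unless $str =~ /\G$part/gc;
        # @- and @+ hold the start/end offsets of each capture group.
        push @captures,
            map { defined $-[$_] ? substr($str, $-[$_], $+[$_] - $-[$_]) : undef }
            1 .. $#+;
    }
    return @captures;
}

my @parts = (qr/a/, qr/ ([a-z]+)/, qr/ (\d+)/);
print join(",", match_longest_seq("a word 123", @parts)), "\n";   # word,123
```

This loop is not the single combined regular expression asked about; it only serves as a reference for what the combined expression should return.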
I know that I can write a function that takes a sequence of regular expressions and builds a single regular expression that does the match_longest job. Here is a sketch of the idea:
Suppose you have three regular expressions: $r1, $r2 and $r3. The single regular expression that does the match_longest job would have the following structure:
$r = ($r1 $r2 $r3)? | $r1 ($r2 $r3) | $r1 $r2 $r3?
Unfortunately, the size of this expression is quadratic in the number of regular expressions. Is it possible to be more efficient?
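One possibility, offered as a sketch rather than a definitive answer, is to nest each later part inside an optional non-capturing group, giving a shape like $r1(?:$r2(?:$r3)?)?, whose size grows linearly with the number of parts. The code below assumes the parts are supplied as separate qr// objects, anchors the combined expression with \A, makes every part optional so that the match never fails, and drops undefined captures to reproduce the lists in the example. The name build_longest and the body of match_longest are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical builder: wrap the parts from the last one outward,
# producing \A(?:r1(?:r2(?:r3)?)?)? with one copy of each part.
sub build_longest {
    my @parts = @_;
    my $re = '';
    for my $part (reverse @parts) {
        $re = "(?:$part$re)?";
    }
    return qr/\A$re/;
}

# Sketch of match_longest: a list-context match returns every capture
# group; groups inside parts that were skipped come back as undef.
# Assumes the combined expression contains at least one capture group.
sub match_longest {
    my ($re, $str) = @_;
    return grep { defined } ($str =~ $re);
}

my $re = build_longest(qr/a/, qr/ ([a-z]+)/, qr/ (\d+)/);
print join(",", match_longest($re, "a word 123")), "\n";   # word,123
print scalar(() = match_longest($re, "a 123")), "\n";      # 0
```

Because each optional group is greedy, the engine tries the next part before falling back to the empty alternative, so the longest matching prefix of the sequence is kept. Removing the grep leaves the undef entries in place for the parts that did not match.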
Erikr