Creating a recursive LPeg pattern

In a normal PEG (parsing expression grammar), this is a valid grammar:

    values <- number (comma values)*
    number <- [0-9]+
    comma  <- ','

However, if I try to write this using LPeg, the recursive rule fails:

    local lpeg   = require'lpeg'
    local comma  = lpeg.P(',')
    local number = lpeg.R('09')^1
    local values = number * (comma * values)^-1
    --> bad argument #2 to '?' (lpeg-pattern expected, got nil)

Although in this simple example I could rewrite the rule so as not to use recursion, I have some existing grammars that I would prefer not to rewrite.

How can I write a self-referencing rule in LPeg?

2 answers

Use a grammar.

Using Lua variables, you can define patterns incrementally, with each new pattern using previously defined ones. However, this technique cannot define recursive patterns. For recursive patterns, we need real grammars.
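As a minimal sketch, the grammar from the question might be written as an LPeg grammar table like this. The rule names simply mirror the PEG in the question, lpeg.V is the reference mechanism described further down, and the trailing * -1 is an added assumption that forces the whole input to match:

```lua
local lpeg = require 'lpeg'
local P, R, V = lpeg.P, lpeg.R, lpeg.V

-- Each PEG rule becomes one table entry; V'name' refers to a rule by name.
local values = P{
  'values',                                      -- name of the initial rule
  values = V'number' * (V'comma' * V'values')^0, -- recursion goes through V
  number = R'09'^1,
  comma  = P',',
} * -1                                           -- assumption: require end of input

print(values:match('1,10,20,301'))  -- position just past the match on success
print(values:match('1,,2'))         -- nil: the input is not a valid list
```

Because the recursion goes through V'values' rather than through a Lua local, nothing has to exist before the table is built.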

LPeg represents grammars with tables, where each entry is a rule.

Calling lpeg.V(v) creates a pattern that represents the nonterminal (or variable) with index v in a grammar. Since the grammar does not yet exist when this function is evaluated, the result is an open reference to the corresponding rule.

A table is fixed when it is converted to a pattern (either by calling lpeg.P or by using it where a pattern is expected). Then every open reference created by lpeg.V(v) is corrected to refer to the rule indexed by v in the table.
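A small sketch of what an open reference means in practice. The variable names here are only illustrative, and the exact error text is not shown because it may differ between LPeg versions:

```lua
local lpeg = require 'lpeg'

-- The reference is created before any grammar table exists.
local ref    = lpeg.V'values'
local number = lpeg.R'09'^1

-- Used on its own, the reference is still open, so matching raises an error.
local ok = pcall(ref.match, ref, '1,2,3')
print(ok)  -- false

-- Once the table is converted by lpeg.P, the same reference is resolved
-- to the rule stored under the key 'values'.
local grammar = lpeg.P{ 'values', values = number * (',' * ref)^-1 } * -1
print(grammar:match('1,2,3'))
```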

When a table is fixed, the result is a pattern that matches its initial rule. The entry with index 1 in the table defines its initial rule. If that entry is a string, it is taken to be the name of the initial rule. Otherwise, LPeg assumes that entry 1 itself is the initial rule.
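The two ways of specifying the initial rule can be sketched side by side. The nested-parentheses grammar and the names named and positional are chosen only for illustration:

```lua
local lpeg = require 'lpeg'

-- Entry 1 is a string: it names the initial rule.
local named = lpeg.P{
  'parens',
  parens = '(' * lpeg.V'parens'^-1 * ')',
}

-- Entry 1 is a pattern: it is itself the initial rule, referenced as V(1).
local positional = lpeg.P{ '(' * lpeg.V(1)^-1 * ')' }

print(named:match('((()))'), positional:match('((()))'))
print(named:match('(()'), positional:match('(()'))
```

Both grammars should accept the same inputs; the positional form is convenient for one-rule grammars, while names tend to read better once there are several rules.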

As an example, the following grammar matches strings of a's and b's that contain the same number of a's and b's:

    equalcount = lpeg.P{
      "S";   -- initial rule name
      S = "a" * lpeg.V"B" + "b" * lpeg.V"A" + "",
      A = "a" * lpeg.V"S" + "b" * lpeg.V"A" * lpeg.V"A",
      B = "b" * lpeg.V"S" + "a" * lpeg.V"B" * lpeg.V"B",
    } * -1

This is equivalent to the following grammar in the standard PEG notation:

    S <- 'a' B / 'b' A / ''
    A <- 'a' S / 'b' A A
    B <- 'b' S / 'a' B B
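A quick way to try the equalcount grammar, as a sketch. The grammar is repeated so the snippet runs on its own, and the test strings are made up for illustration:

```lua
local lpeg = require 'lpeg'

local equalcount = lpeg.P{
  "S";   -- initial rule name
  S = "a" * lpeg.V"B" + "b" * lpeg.V"A" + "",
  A = "a" * lpeg.V"S" + "b" * lpeg.V"A" * lpeg.V"A",
  B = "b" * lpeg.V"S" + "a" * lpeg.V"B" * lpeg.V"B",
} * -1   -- -1 succeeds only at the end of the subject

-- match returns the position after the match on success, nil on failure.
print(equalcount:match('abab'))  -- two a's and two b's
print(equalcount:match('ba'))    -- one of each
print(equalcount:match('aab'))   -- unbalanced, so the match fails
```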

I know this is a late answer, but here is one way to refer back to a rule:

    local lpeg   = require'lpeg'
    local comma  = lpeg.P(',')
    local number = lpeg.R('09')^1
    local values = lpeg.P{ lpeg.C(number) * (comma * lpeg.V(1))^-1 }
    local t = { values:match('1,10,20,301') }

Basically, a minimal grammar is passed to lpeg.P (a grammar is just a special kind of table), and it refers to its first rule by number instead of by name, i.e. lpeg.V(1).

The sample adds a simple lpeg.C capture on the number terminal and collects all the results in the local table t for later use. (Note that lpeg.Ct is not used, which is not a big deal, but still... part of the sample, I guess.)
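For comparison, a sketch of the same grammar wrapped in lpeg.Ct, so that the match itself returns a table instead of multiple values. This variant is not from the answer above; it only illustrates the alternative the note alludes to:

```lua
local lpeg = require 'lpeg'

local comma  = lpeg.P(',')
local number = lpeg.R('09')^1

-- lpeg.Ct gathers every capture made inside the grammar into one table.
local values = lpeg.Ct(lpeg.P{ lpeg.C(number) * (comma * lpeg.V(1))^-1 })

local t = values:match('1,10,20,301')
for i, v in ipairs(t) do
  print(i, v)  -- captures are strings, in order of appearance
end
```

Without lpeg.Ct the captures come back as separate return values, which is why the original sample wraps the call in { ... }.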


Source: https://habr.com/ru/post/1203821/
