Are parsing expression grammars suitable for parsing the shell command language?

The POSIX shell command language is not easy to parse, mainly because of the tight connection between lexing and parsing.

However, parsing expression grammars (PEGs) usually do not require a separate scanning (lexing) pass. By combining lexing and parsing, it seems like I could avoid these problems. The language I use (Rust) has a well-supported PEG library. However, I know of three difficulties that may make a PEG library a poor fit:

  • Shells should be able to parse line by line, without reading characters beyond the end of the current line.
  • Aliases are purely lexical and, in certain contexts, can cause a token to be replaced by an arbitrary sequence of other tokens.
  • Shell reserved words are recognized only in certain contexts.

Is a PEG suitable for parsing the shell command language with these requirements in mind, or would a hand-written recursive descent parser be more appropriate?

1 answer

Yes, a PEG can be used, and none of the issues you mention should be a problem. In particular:

1) Parsing line by line: most PEG tools do not have built-in whitespace skipping. All whitespace, including newlines, must be handled explicitly by you, which means you can treat a newline however you like.

2) You should not use the PEG parse tree as your AST. Instead, you should walk the parse tree and build an AST from it. For aliases, once parsing is complete and you are building your AST, you can detect the alias and insert the appropriate expansion for it.

3) Reserved words are not reserved unless you reserve them. That is, if you have a context where either a reserved word or some other alphanumeric token may appear, you must check for the reserved words explicitly first and only then for an arbitrary alphanumeric token, because once a PEG decides it has a match, it will not backtrack. Where a reserved word is not allowed, simply do not check for it, and your general alphanumeric token rule will match instead.
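
To make points 1 and 3 concrete, here is a rough sketch in plain Rust that mimics PEG rules by hand rather than using any particular PEG library. All names (`Token`, `RESERVED`, `command_word`, `argument`, `parse_line`) are made up for illustration, the reserved word list is only a small subset, and a "word" is simplified to alphanumerics and underscores, which is far narrower than real shell words.

```rust
/// Illustrative token type; not taken from any library.
#[derive(Debug, PartialEq)]
enum Token {
    Reserved(&'static str),
    Word(String),
}

/// A small, assumed subset of the shell's reserved words.
const RESERVED: [&str; 4] = ["if", "then", "else", "fi"];

fn is_word_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// blank <- [ \t]*   (newline is deliberately NOT skipped here)
fn skip_blanks(input: &str) -> &str {
    input.trim_start_matches(|c: char| c == ' ' || c == '\t')
}

/// word <- [A-Za-z0-9_]+
fn word(input: &str) -> Option<(&str, &str)> {
    let end = input.find(|c: char| !is_word_char(c)).unwrap_or(input.len());
    if end == 0 {
        None
    } else {
        Some((&input[..end], &input[end..]))
    }
}

/// reserved <- ("if" / "then" / "else" / "fi") ![A-Za-z0-9_]
/// The negative lookahead keeps "iffy" from matching as "if".
fn reserved(input: &str) -> Option<(&'static str, &str)> {
    for kw in RESERVED {
        if let Some(rest) = input.strip_prefix(kw) {
            if !rest.starts_with(is_word_char) {
                return Some((kw, rest));
            }
        }
    }
    None
}

/// command_word <- reserved / word   (ordered choice: reserved words first)
fn command_word(input: &str) -> Option<(Token, &str)> {
    if let Some((kw, rest)) = reserved(input) {
        return Some((Token::Reserved(kw), rest));
    }
    word(input).map(|(w, rest)| (Token::Word(w.to_string()), rest))
}

/// argument <- word   (reserved words are simply not checked in this context)
fn argument(input: &str) -> Option<(Token, &str)> {
    word(input).map(|(w, rest)| (Token::Word(w.to_string()), rest))
}

/// line <- blank command_word (blank argument)* "\n"?
/// Returns the tokens of one line plus the input that follows the newline.
fn parse_line(input: &str) -> Option<(Vec<Token>, &str)> {
    let (first, mut rest) = command_word(skip_blanks(input))?;
    let mut tokens = vec![first];
    loop {
        rest = skip_blanks(rest);
        match argument(rest) {
            Some((tok, r)) => {
                tokens.push(tok);
                rest = r;
            }
            None => break,
        }
    }
    // Consume at most one newline; nothing after it is examined.
    let rest = rest.strip_prefix('\n').unwrap_or(rest);
    Some((tokens, rest))
}
```

Under these assumptions, `if` in command position comes back as `Token::Reserved("if")`, while in `echo if` it is an ordinary `Token::Word`, because the `argument` rule never consults the reserved word list. Since `skip_blanks` stops at `'\n'`, `parse_line` hands back the remaining input untouched, so the caller decides when to read and parse the next line.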


Source: https://habr.com/ru/post/983524/

