Lexing newlines in Scala StdLexical?

I am trying to lex (and then parse) a C-like language. C has preprocessor directives, where line breaks are significant, and then actual code, where they are just whitespace.

One way to do this would be a two-pass process, like the early C compilers: have a separate preprocessor handle the # directives, and then lex its output.

However, I wondered whether this could be done in a single lexer. I am very pleased with the Scala parser combinator code, but I'm not sure how StdLexical handles whitespace.

Can someone write a simple code example that could lex a #include line (where the newline is significant) and some trivial code (where newlines are ignored)? Or is it impossible, and is it better to go with the two-pass approach?

1 answer

OK, I solved it myself; answering here for posterity.

In StdLexical, you already have the ability to specify what counts as whitespace in your lexer. All you have to do is override the token method appropriately. Here is some sample code (with the irrelevant bits removed):

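// Try preprocessor control lines first, then fall back to ordinary tokens.
override def token: CeeLexer.Parser[Token] = controlLine
  // | ... (where ... is whatever you want to keep of the original method)
def controlLine = hashInclude

// word(...) and nonEolws (presumably whitespace other than '\n') are helpers
// from the rest of the lexer, not shown here; they are not part of StdLexical.
// The trailing '\n' is matched explicitly, so the newline is significant here.
def hashInclude: CeeLexer.Parser[HashInclude] =
  ('#' ~ word("include") ~ rep(nonEolws) ~ '\"' ~ rep(chrExcept('\"', '\n', EofCh)) ~ '\"' ~ '\n' |
   '#' ~ word("include") ~ rep(nonEolws) ~ '<' ~ rep(chrExcept('>', '\n', EofCh)) ~ '>' ~ '\n') ^^ {
    case hash ~ include ~ whs ~ openQ ~ fname ~ closeQ ~ eol => // code to handle #include
  }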
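
For anyone who wants something they can compile, here is a minimal self-contained sketch of the same idea. It assumes the scala-parser-combinators module is on the classpath. The fields of HashInclude, the bodies of word and nonEolws, the tokenize helper, and the choice of "int", "=" and ";" as the only keyword and delimiters are all my own assumptions for illustration, not something taken from the snippet above.

```scala
import scala.util.parsing.combinator.lexical.StdLexical
import scala.util.parsing.input.CharArrayReader.EofCh

object CeeLexer extends StdLexical {
  // Hypothetical token for this sketch; StdLexical does not define it.
  case class HashInclude(file: String, system: Boolean) extends Token {
    def chars: String = file
  }

  // Just enough vocabulary to lex "int x = 1;" (an assumption for the demo).
  reserved += "int"
  delimiters ++= List("=", ";")

  // Whitespace that is NOT a line break (hypothetical helper).
  def nonEolws: Parser[Char] =
    elem("non-EOL whitespace", ch => ch <= ' ' && ch != '\n' && ch != EofCh)

  // Matches an exact character sequence (hypothetical helper).
  def word(s: String): Parser[String] = accept(s.toList) ^^ (_.mkString)

  // The directive consumes its own terminating '\n', so the newline is
  // significant here even though `whitespace` skips newlines elsewhere.
  def hashInclude: Parser[Token] =
    '#' ~> rep(nonEolws) ~> word("include") ~> rep(nonEolws) ~> (
      '"' ~> rep(chrExcept('"', '\n', EofCh)) <~ '"' ^^ (cs => HashInclude(cs.mkString, system = false))
    | '<' ~> rep(chrExcept('>', '\n', EofCh)) <~ '>' ^^ (cs => HashInclude(cs.mkString, system = true))
    ) <~ rep(nonEolws) <~ '\n'

  // Try control lines first, then everything StdLexical already handles.
  override def token: Parser[Token] = hashInclude | super.token

  // Runs the scanner to the end of input and collects the tokens.
  def tokenize(src: String): List[Token] = {
    def loop(s: Scanner, acc: List[Token]): List[Token] =
      if (s.atEnd) acc.reverse else loop(s.rest, s.first :: acc)
    loop(new Scanner(src), Nil)
  }
}

object Demo extends App {
  // The declaration is deliberately split across lines: newlines there are skipped.
  CeeLexer.tokenize("#include <stdio.h>\nint x\n  = 1;\n").foreach(println)
}
```

Because the scanner skips whitespace (newlines included) before every token, this sketch does not require the # to be the first thing on its line; if that matters, the two-pass approach or some extra state in the lexer would be needed.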

Source: https://habr.com/ru/post/1741048/

