Where can I learn the basics of writing a lexer?

I want to learn how to write a lexer. My university course had an assignment in which we had to write a parser (and a lexer to go with it), but it was given to us without any instruction or feedback (beyond the mark), so I did not learn much from it.

After searching on this topic, I could only find some fairly complicated write-ups that focus on areas I feel are a few steps ahead of where I am. I am looking for a discussion of the basics of writing a lexer for a very simple language, which I can use as a foundation for learning to tokenize more complex languages.

At this point, I'm not interested in best practices or optimization techniques; I would rather focus on the fundamentals. What are some good resources to get me started?

+68
compiler-construction language-agnostic lexer
Jun 02 2018-11-11T00:
2 answers

Basically, there are two main approaches to writing a lexer:

  1. Writing one by hand, in which case I recommend this short tutorial.
  2. Using a lexer generator tool such as lex. In this case, I recommend reading the tutorials for the particular tool you choose.
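
To give a feel for the first approach, here is a minimal sketch of a hand-written lexer in Python for simple arithmetic expressions. The token kinds (`NUMBER`, `IDENT`, `OP`) and the function name `tokenize_expr` are made up for this illustration and do not come from any of the tutorials mentioned here.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text); the kinds are illustrative names


def tokenize_expr(source: str) -> List[Token]:
    """Scan the input one character at a time and group characters into tokens."""
    tokens: List[Token] = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            # Whitespace only separates tokens; skip it.
            i += 1
        elif ch.isdigit():
            # Consume a run of digits as one NUMBER token.
            start = i
            while i < len(source) and source[i].isdigit():
                i += 1
            tokens.append(("NUMBER", source[start:i]))
        elif ch.isalpha() or ch == "_":
            # Consume letters, digits and underscores as one IDENT token.
            start = i
            while i < len(source) and (source[i].isalnum() or source[i] == "_"):
                i += 1
            tokens.append(("IDENT", source[start:i]))
        elif ch in "+-*/()=":
            # Single-character operators and punctuation.
            tokens.append(("OP", ch))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r} at offset {i}")
    return tokens


if __name__ == "__main__":
    for token in tokenize_expr("x = 3 + 42*(y1 - 7)"):
        print(token)
```

The whole idea is the loop: look at the current character, decide which kind of token starts there, consume as many characters as belong to it, and repeat. A lexer generator produces roughly this kind of code from a list of patterns instead.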

I would also like to recommend the Kaleidoscope tutorial from the LLVM documentation. It walks through the implementation of a simple language and, in particular, demonstrates how to write a small lexer. There are C++ and Objective Caml versions of the tutorial.
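
As a rough idea of how small such a lexer can be, here is a Python sketch with token categories loosely modeled on the Kaleidoscope one (the `def` and `extern` keywords, identifiers, numbers, `#` comments, and any other character passed through as itself). This is not the tutorial's code, which is C++ and reads the input character by character; the rule table, the token names and the function `tokenize_kaleidoscope_like` are assumptions made for illustration, using the standard `re` module.

```python
import re
from typing import Iterator, Tuple

# Rule order matters: earlier alternatives win, so keywords come before IDENT.
# All names below are illustrative, not taken from the LLVM tutorial.
RULES = [
    ("COMMENT", r"#[^\n]*"),               # '#' up to the end of the line
    ("NUMBER",  r"[0-9.]+"),               # deliberately loose: digits and dots
    ("DEF",     r"def\b"),
    ("EXTERN",  r"extern\b"),
    ("IDENT",   r"[A-Za-z][A-Za-z0-9]*"),
    ("SKIP",    r"\s+"),                   # whitespace, including newlines
    ("CHAR",    r"."),                     # anything else, e.g. '(' or '+'
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in RULES))


def tokenize_kaleidoscope_like(source: str) -> Iterator[Tuple[str, str]]:
    """Yield (kind, text) pairs, ending with an explicit EOF token."""
    for match in MASTER.finditer(source):
        kind = match.lastgroup  # name of the rule that matched
        if kind in ("COMMENT", "SKIP"):
            continue  # comments and whitespace produce no tokens
        yield kind, match.group()
    yield "EOF", ""


if __name__ == "__main__":
    program = "def foo(x) x + 4.5 # a comment\nextern sin(a)"
    for token in tokenize_kaleidoscope_like(program):
        print(token)
```

Describing tokens as an ordered list of patterns is also, in spirit, what the input to a lexer generator looks like, so this sketch sits somewhere between the two approaches above.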

A classic textbook on this subject is Compilers: Principles, Techniques, and Tools, also known as the Dragon Book. However, it probably falls into the category of “fairly complicated” material.

+61
Jun 02 2018-11-11T00:

The Dragon Book is probably the definitive guide to this, although it can be a bit overwhelming. Language Implementation Patterns and Programming Language Pragmatics are also excellent resources.

+9
Jun 02 2018-11-11T00: