Creating a simple domain-specific language

I am curious to learn about creating a domain-specific language. At the moment, the domain is quite simple: just some variables, a few loops, and if statements.

Edit: The language will be non-English with very simple syntax.

I intend to target the Java virtual machine, that is, to compile into Java bytecode.

Currently, I know how to write some simple grammars using ANTLR.

I know that ANTLR creates a lexer and a parser, but how do I go further?

  • About semantic analysis: do I need to write it by hand, or are there tools for generating it?
  • How can I convert the output of the lexer and parser into Java bytecode?
  • I know there are libraries such as ASM and BCEL, but what is the procedure for using them?
  • Is there a framework for this? If so, which is the easiest?
3 answers

You should try Xtext, the Eclipse-based DSL toolkit. Version 2 is quite powerful and stable. Its homepage has many resources to get you started, including some video tutorials. Since the Eclipse ecosystem revolves around Java, this seems like the best choice for you.

You can also try MPS, but it is a projectional editor, and beginners may find it harder to use. However, it is no less powerful than Xtext.


If your goal is to learn as much as possible about compilers, then you really need to take the hard route: write a parser by hand (without ANTLR and the like), write your own semantic passes, and write your own code generation.
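To give a feel for what "a parser by hand" means, here is a minimal sketch of a recursive-descent parser for integer arithmetic. It is only an illustration: the class name `TinyParser`, the grammar, and the decision to evaluate while parsing (instead of building a tree first) are all assumptions made for this example, not something the question prescribes.

```java
// Hypothetical example: a hand-written recursive-descent parser that
// evaluates integer expressions directly while parsing them.
final class TinyParser {
    private final String src;
    private int pos = 0;

    TinyParser(String src) { this.src = src; }

    // expr := term (('+' | '-') term)*
    int parseExpr() {
        int value = parseTerm();
        while (true) {
            char c = peek();
            if (c == '+') { pos++; value += parseTerm(); }
            else if (c == '-') { pos++; value -= parseTerm(); }
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*
    int parseTerm() {
        int value = parseFactor();
        while (true) {
            char c = peek();
            if (c == '*') { pos++; value *= parseFactor(); }
            else if (c == '/') { pos++; value /= parseFactor(); }
            else return value;
        }
    }

    // factor := NUMBER | '(' expr ')'
    int parseFactor() {
        if (peek() == '(') {
            pos++;
            int value = parseExpr();
            expect(')');
            return value;
        }
        int start = pos; // peek() has already skipped leading whitespace
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (start == pos) {
            throw new IllegalArgumentException("number expected at " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    // Skips whitespace and returns the next character, or '\0' at end of input.
    private char peek() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    private void expect(char ch) {
        if (peek() != ch) {
            throw new IllegalArgumentException("'" + ch + "' expected at " + pos);
        }
        pos++;
    }

    // Parses the whole input and rejects trailing garbage.
    static int eval(String text) {
        TinyParser parser = new TinyParser(text);
        int value = parser.parseExpr();
        if (parser.peek() != '\0') {
            throw new IllegalArgumentException("unexpected input at " + parser.pos);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(TinyParser.eval("(1 + 2) * 3"));
    }
}
```

Each grammar rule becomes one method, and operator precedence falls out of which method calls which. A real compiler would return tree nodes from these methods rather than numbers, so that separate semantic passes and a code generator can work on the result.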

Otherwise, you would do better to embed your DSL in an existing extensible language and reuse its parser, its semantics, and its code generation. For example, you can easily implement an almost arbitrarily complex DSL on top of Clojure macros (and since Clojure itself compiles to JVM bytecode, you get that for free).


A DSL with simple syntax may or may not have simple semantics.

Simple semantics may or may not mean easy translation into the target language. Such translations are "technically easy" only if the DSL and the target language share many common data types and execution models. (Constraint systems have simple semantics, but translating them into Fortran is very difficult!) (You should also ask yourself: if translating the DSL is that simple, why have the DSL at all?)
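As an illustration of the "technically easy" case, here is a small sketch of an expression tree translated into stack-machine instructions. Because the JVM is itself a stack machine, a post-order walk of the tree is nearly all that is needed. Everything here is an assumption made for the example: the class names, the three supported operators, and the output format, which is a list of text mnemonics modeled on JVM instructions (`iload`, `ldc`, `iadd`, ...) rather than a real class file. Producing actual bytecode would additionally need a library such as ASM or BCEL.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example: emit JVM-style mnemonics (as text) for an expression tree.
final class StackCodeGen {
    interface Expr { void emit(StackCodeGen gen); }

    static final class Num implements Expr {
        final int value;
        Num(int value) { this.value = value; }
        public void emit(StackCodeGen gen) { gen.code.add("ldc " + value); }
    }

    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
        public void emit(StackCodeGen gen) { gen.code.add("iload " + gen.slotOf(name)); }
    }

    static final class BinOp implements Expr {
        final char op;
        final Expr left;
        final Expr right;
        BinOp(char op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        public void emit(StackCodeGen gen) {
            // Post-order: push both operands, then apply the operator.
            left.emit(gen);
            right.emit(gen);
            switch (op) {
                case '+': gen.code.add("iadd"); break;
                case '-': gen.code.add("isub"); break;
                case '*': gen.code.add("imul"); break;
                default: throw new IllegalArgumentException("unsupported operator: " + op);
            }
        }
    }

    final List<String> code = new ArrayList<>();
    private final Map<String, Integer> slots = new LinkedHashMap<>();

    // Assigns local-variable slots in order of first use.
    int slotOf(String name) {
        Integer slot = slots.get(name);
        if (slot == null) {
            slot = slots.size();
            slots.put(name, slot);
        }
        return slot;
    }

    static List<String> compile(Expr expr) {
        StackCodeGen gen = new StackCodeGen();
        expr.emit(gen);
        gen.code.add("ireturn");
        return gen.code;
    }

    public static void main(String[] args) {
        // x + 2 * y
        Expr expr = new BinOp('+', new Var("x"), new BinOp('*', new Num(2), new Var("y")));
        for (String line : compile(expr)) {
            System.out.println(line);
        }
    }
}
```

The translation is this short only because the source (arithmetic on integers) and the target (an integer stack machine) already agree on data types and evaluation order. A constraint language compiled to the same target would need a solver, not a tree walk.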

If you want to build DSLs (in your case you will keep it simple, because you are learning), you want DSL compiler infrastructure that has everything you need, including support for complex translations. What is needed to handle the translation of all DSLs into all possible target languages is clearly an enormous set of machinery.

However, there is a lot that is clearly useful:

  • Strong parsing machinery (who wants to wrestle with grammars whose structure is dictated by the weaknesses of the parser? If you do not know what this means, read about LL(1) grammars as an example)
  • Automatic construction of a representation (for example, an abstract syntax tree) of the parsed DSL
  • Ability to access, modify, and build new ASTs
  • Ability to capture information about symbols and their meaning (symbol tables)
  • Ability to build analyses over the AST for the DSL, to support translations that require information from "far away" in the tree in order to influence the translation at a specific point in the tree
  • Ability to restructure the AST easily to achieve local optimizations
  • Ability to build and analyze control-flow and data-flow information, if the DSL has some procedural aspects and the code generation requires deep reasoning or optimization
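A few of the items above (an AST, a symbol table, and a local optimization) can be sketched in miniature as follows. This is an illustration only: the node types, the `SymbolTable` class, and the choice of constant folding as the optimization are assumptions made for the example, and a real infrastructure would offer far more than this.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: a tiny AST with a symbol-table check and constant folding.
final class MiniAnalysis {
    interface Node {}

    static final class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
    }

    static final class Ref implements Node {
        final String name;
        Ref(String name) { this.name = name; }
    }

    static final class Add implements Node {
        final Node left;
        final Node right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
    }

    // Symbol table: records each declared variable name with its type.
    static final class SymbolTable {
        private final Map<String, String> types = new HashMap<>();

        void declare(String name, String type) {
            if (types.putIfAbsent(name, type) != null) {
                throw new IllegalStateException("duplicate declaration: " + name);
            }
        }

        boolean isDeclared(String name) { return types.containsKey(name); }
    }

    // Semantic check: every variable reference must have been declared.
    static void check(Node node, SymbolTable table) {
        if (node instanceof Ref) {
            String name = ((Ref) node).name;
            if (!table.isDeclared(name)) {
                throw new IllegalStateException("undeclared variable: " + name);
            }
        } else if (node instanceof Add) {
            check(((Add) node).left, table);
            check(((Add) node).right, table);
        }
    }

    // Local optimization: rewrite Add(Num, Num) into a single Num, bottom-up.
    static Node fold(Node node) {
        if (!(node instanceof Add)) {
            return node;
        }
        Node left = fold(((Add) node).left);
        Node right = fold(((Add) node).right);
        if (left instanceof Num && right instanceof Num) {
            return new Num(((Num) left).value + ((Num) right).value);
        }
        return new Add(left, right);
    }

    public static void main(String[] args) {
        SymbolTable table = new SymbolTable();
        table.declare("x", "int");
        Node tree = new Add(new Ref("x"), new Add(new Num(2), new Num(3)));
        check(tree, table);          // passes: x is declared
        Node folded = fold(tree);    // x + (2 + 3) becomes x + 5
        System.out.println(((Num) ((Add) folded).right).value);
    }
}
```

Even at this scale the pattern is visible: the parser's job ends once the tree exists, and everything after that is a sequence of passes that read or rewrite the tree with the help of side tables.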

Most of the tools available for "building DSL generators" provide some kind of parsing, perhaps tree building, and then leave you to fill in the rest. This puts you in the position of having a small, clean DSL that takes forever to implement. That is not good. You really want all of this infrastructure.

Our DMS Software Reengineering Toolkit has all the infrastructure sketched above, and more. (It obviously does not, and cannot, have the moon.) You can see a complete, all-on-one-page example of a simple DSL that exercises some of the important parts of this machinery.


Source: https://habr.com/ru/post/895715/

