How does the compiler compile the compiler?

Coming from high-level programming, I am interested in learning about low-level programming. I want to know how a compiler itself gets compiled.

After reading some articles on the wiki, I understand that numerical machine code is considered the lowest-level language, but there must be a compiler that compiles this machine code. What language is that compiler written in?

+4
4 answers

As a rule, compiler authors take one of two routes:

  • Write the entire compiler in some other, existing language. This is usually the easiest option.

  • Write just enough code in some other language to get a minimal working translator, and use this "stage" as the basis for writing the compiler proper in the very language it is meant to compile. This is more complicated and usually takes longer, but it lets you shake out bugs and shortcomings in the language by exercising it in a real project.

The very first program that translated code had to be written, at least in part, directly in machine code: the actual numbers that tell the processor what to do. This is the lowest level, because there is no "compiler" * for machine code; it is just numbers arranged in a particular way, and the processor contains circuitry that acts on them without outside help.

* There are programs that help design the hardware that interprets and executes instructions, but they arguably fall outside the definition of a compiler. Such programs generate hardware descriptions (circuits and so on), whereas a compiler outputs directly executable files.

+10

You can always use your favorite compiler A to write another compiler, say B. You give B extra features so that it becomes your new favorite, then use B to write compiler C, and so on.

How does it all start? In the early days, people simply filled memory with raw numbers for the CPU to interpret directly. This is why program source is often called "code". Once a minimal compiler has been programmed this way, it can be run to build another compiler written in the language it compiles. That one can in turn be used to build a higher-level one, and so on.

In fact, entering raw instruction codes into memory can itself be seen as a level-zero compilation process in which the person is the compiler.

As a rule, the compiler for a given language is written in that same language. This is true of C, for example. It is somewhat more than a coincidence: anyone who knows a language well enough to dare to write a compiler for it probably counts that language among their favorites to work in. This is only the typical case, not a requirement, since there are many languages to choose from, including some especially well suited to building compilers.
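To make "raw numbers interpreted by the CPU" concrete, here is a minimal sketch of a fetch-decode-execute loop. The machine is entirely made up for illustration: the opcode numbers (0 = HALT, 1 = LOAD, 2 = ADD, 3 = DEC), the single accumulator, and the function name `run` are assumptions, not any real processor's instruction set.

```python
def run(memory):
    """Execute a list of numbers on a hypothetical one-register machine."""
    acc = 0  # the single accumulator register
    pc = 0   # program counter: index of the next number to read

    while True:
        opcode = memory[pc]
        if opcode == 0:        # HALT: stop and report the accumulator
            return acc
        elif opcode == 1:      # LOAD n: put the next number into acc
            acc = memory[pc + 1]
            pc += 2
        elif opcode == 2:      # ADD n: add the next number to acc
            acc += memory[pc + 1]
            pc += 2
        elif opcode == 3:      # DEC: subtract one from acc
            acc -= 1
            pc += 1
        else:
            raise ValueError(f"unknown opcode {opcode} at address {pc}")


# A "hand-compiled" program: LOAD 40; ADD 3; DEC; HALT
program = [1, 40, 2, 3, 3, 0]
print(run(program))
```

Writing the list `[1, 40, 2, 3, 3, 0]` by hand is the "person as compiler" step: nothing translated it, someone just looked up the numbers.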

+7

Numerical machine code is binary: 1s and 0s. Compiling means translating code into a lower-level form, and there is no form lower than machine code, so machine code itself is not compiled.

From the wiki article you cited: on the Zilog Z80 processor, the machine code 00000101, which causes the CPU to decrement the B processor register, would be represented in assembly language as DEC B.

So you need a compiler (strictly speaking, an assembler) when you write in Z80 assembly language: the DEC B instruction is translated to 00000101, not the other way around.
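The direction of translation can be shown with a toy assembler. This is only a sketch: it knows four operand-free Z80 instructions (DEC B from the quote above, plus NOP, INC B and HALT), and the names `OPCODES` and `assemble` are chosen here for illustration. A real assembler also handles operands, labels and much more.

```python
# Mnemonic -> one-byte Z80 opcode, for a tiny subset of instructions.
OPCODES = {
    "NOP":   0b00000000,
    "INC B": 0b00000100,
    "DEC B": 0b00000101,
    "HALT":  0b01110110,
}


def assemble(source):
    """Translate assembly text (one instruction per line) into bytes."""
    output = []
    for line in source.splitlines():
        text = line.split(";")[0]                 # drop any ';' comment
        mnemonic = " ".join(text.upper().split())  # normalize case and spacing
        if not mnemonic:
            continue                              # skip blank lines
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown instruction: {mnemonic!r}")
        output.append(OPCODES[mnemonic])
    return bytes(output)


code = assemble("DEC B")
print(format(code[0], "08b"))
```

The assembler is itself an ordinary program, so it has to exist in some already-runnable form before it can translate anything; that is the bootstrapping question the other answers address.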

+6

Numerical machine code is a series of on and off states in a circuit, which is what all electronic data is at the lowest level. There is no “compiler” for this lowest-level language. Instead, the computer's circuits are combined and structured so that they “interpret” the code by reading the ons and offs, which are implemented as high or low electrical states. These high and low states open or close different gates and circuits, making the hardware behave differently. See logic gates for more details.
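As a rough illustration of how gates alone can "do" something with high and low states, the sketch below builds an 8-bit adder out of a single simulated NAND gate. All function names are assumptions made for this example, and a real processor's circuitry is far more involved; the point is only that no software sits underneath.

```python
def nand(a, b):
    """The only primitive: output is low (0) only when both inputs are high (1)."""
    return 0 if (a and b) else 1


# Every other gate is wired up from NAND gates.
def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))

def xor_(a, b):
    return and_(or_(a, b), nand(a, b))


def full_adder(a, b, carry_in):
    """Add three bits; return (sum_bit, carry_out)."""
    partial = xor_(a, b)
    sum_bit = xor_(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return sum_bit, carry_out


def add_bits(x, y, width=8):
    """Ripple-carry addition of two integers, one bit at a time."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result  # the final carry is dropped, so the result wraps at 2**width


print(add_bits(5, 3))
```

Decrementing a register can be expressed with the same wiring: in 8 bits, adding 0b11111111 wraps around and is equivalent to subtracting one, so `add_bits(5, 0b11111111)` yields 4.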

+2

Source: https://habr.com/ru/post/1490997/