How do interpreted languages (like Ruby) work?

Question

I'm getting to know Ruby. I know it is an interpreted language. I know that compiled languages end up being translated into machine code, but what does the Ruby interpreter do? I read that the interpreter is written in C, but is each Ruby line converted to C, which is then compiled to machine code? I also heard about JIT, but if that adds a lot of complexity to the answer, you don't need to cover it. I am looking for what happens with my Ruby code.

+6

compiler-construction ruby programming-languages interpreter

Frank klöse Jul 05 '11 at 8:38

1 answer

delnan · Accepted Answer · 2011-07-05T15:44:29+0000

It converts Ruby code into some simpler, "intermediate" representation (in recent versions it compiles to bytecode). It also builds a virtual machine in your computer's memory, which mimics a physical machine and executes this representation.
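To look at this intermediate representation yourself, the standard C implementation of Ruby (MRI, 1.9 and later) exposes its bytecode compiler as `RubyVM::InstructionSequence`. The snippet below is a minimal sketch that assumes MRI; other implementations such as JRuby do not provide this class, and the exact listing differs between Ruby versions.

```ruby
# MRI-specific: RubyVM is an implementation detail of the C interpreter.
source = "a = 1; b = 2; a + b"

# Compile the source text to a bytecode instruction sequence.
iseq = RubyVM::InstructionSequence.compile(source)

# Human-readable listing of the bytecode instructions.
listing = iseq.disasm
puts listing

# The compiled instruction sequence can be run directly on the VM.
result = iseq.eval
puts result
```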

This machine mirrors a physical one, at least as far as that is reasonable and useful. It often has memory for instructions, a program counter, a stack for storing intermediate values and return addresses, etc. Some more sophisticated machines also have registers. There is a fixed and relatively primitive instruction set (primitive compared to languages like Ruby, that is; it is not comparable to actual processor instruction sets). Like a processor, the virtual machine loops endlessly:

It reads the current instruction (identified by the program counter).
It decodes it (although this is usually much simpler than in real processors, at least CISC ones).
It executes it (possibly manipulating the stack and/or registers in the process).
It updates the program counter.
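The loop above can be sketched as a toy stack machine. This is only an illustration: the instruction names (`:push`, `:add`, `:mul`, `:halt`) and the `run` method are made up for this example and are not the real instruction set of Ruby's VM.

```ruby
# A toy stack machine. Instructions are [opcode, argument] pairs.
def run(program)
  stack = [] # intermediate values
  pc = 0     # program counter: index of the current instruction

  loop do
    op, arg = program[pc] # read the current instruction

    case op               # decode it, then execute it
    when :push
      stack.push(arg)
    when :add
      b = stack.pop
      a = stack.pop
      stack.push(a + b)
    when :mul
      b = stack.pop
      a = stack.pop
      stack.push(a * b)
    when :halt
      return stack.pop
    else
      raise ArgumentError, "unknown instruction: #{op.inspect}"
    end

    pc += 1               # update the program counter
  end
end

# Bytecode for (2 + 3) * 4
program = [
  [:push, 2],
  [:push, 3],
  [:add],
  [:push, 4],
  [:mul],
  [:halt]
]

puts run(program)
```

Note how much work the host machine does per toy instruction: an array lookup, a `case` dispatch, several method calls, and the counter update.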

With an interpreter, all of this happens through a layer of indirection. Your actual physical processor has no idea what it is doing. The VM is itself software, so each of the above steps is delegated to the CPU and takes several (for fairly high-level bytecode instructions, perhaps dozens or hundreds of) physical processor cycles. And this happens every time an instruction is executed.

Enter JIT compilation. The simplest form just replaces each bytecode instruction with a (slightly optimized) copy of the code that would be executed when the interpreter encounters it. This already gives a speed gain, e.g. the manipulation of the program counter can be omitted. But there are even smarter variants.
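As a rough analogy for this simplest form, the sketch below "compiles" a toy bytecode program by pasting one code template per instruction into a single piece of straight-line code. A real JIT emits machine code; here Ruby source text stands in for it, and the instruction names, `TEMPLATES`, and `compile` are all hypothetical. It only handles programs without jumps.

```ruby
# One code template per instruction: what an interpreter would do for it.
TEMPLATES = {
  push: ->(arg) { "stack.push(#{Integer(arg)})" },
  add:  ->(_)   { "b = stack.pop; a = stack.pop; stack.push(a + b)" },
  mul:  ->(_)   { "b = stack.pop; a = stack.pop; stack.push(a * b)" },
  halt: ->(_)   { "return stack.pop" }
}.freeze

# Translate the whole program once, up front, into a single lambda.
def compile(program)
  body = program.map do |op, arg|
    template = TEMPLATES.fetch(op) do
      raise ArgumentError, "unknown instruction: #{op.inspect}"
    end
    template.call(arg)
  end
  eval("lambda { stack = []\n#{body.join("\n")}\n}")
end

# Bytecode for (2 + 3) * 4
program = [[:push, 2], [:push, 3], [:add], [:push, 4], [:mul], [:halt]]

compiled = compile(program)
puts compiled.call
```

The generated code has no dispatch and no program counter at all: the order of the pasted templates already encodes which instruction comes next.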

Tracing JITs, for example, start out as regular interpreters and additionally monitor the program they run. If one notices that the program spends a lot of time in a certain section of code (almost always a loop, or a function called from loops), it starts recording what it does during that section: it generates a trace. When it gets back to the point where recording started (after one iteration of the loop), it calls it a day and compiles the trace to machine code. But since it saw how the program actually behaves at runtime, it can generate code that exactly matches this behavior. Take, for example, a loop that adds integers. The machine code will not contain any of the type checks or function calls that the interpreter would actually perform. At least it will not contain most of them. To ensure correctness, the JIT adds checks (guards) that the conditions under which the trace was recorded (for example, that the variables involved are integers) still hold. When such a check fails, it bails out and resumes interpretation until another trace is recorded. But until that happens, it may have run a hundred iterations at a speed that rivals hand-written C code.
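The guard-and-bail-out idea can be imitated in plain Ruby. This is only a simulation under loud assumptions: the "trace" here is written by hand rather than recorded at runtime, nothing is compiled to machine code, and so there is no real speed gain. `GuardFailed`, `generic_add`, `traced_add`, and `sum` are hypothetical names for this sketch.

```ruby
# Raised when a condition the trace was recorded under no longer holds.
class GuardFailed < StandardError; end

# Generic path: what the interpreter does on every iteration,
# dispatching on the operand type each time.
def generic_add(a, b)
  case a
  when Integer, Float then a + b
  when String then a + b.to_s
  else raise TypeError, "cannot add to #{a.class}"
  end
end

# Specialized path, standing in for a trace recorded while summing
# integers: one cheap guard, then the bare addition.
def traced_add(a, b)
  raise GuardFailed unless a.is_a?(Integer) && b.is_a?(Integer)
  a + b
end

def sum(values)
  tracing = true
  values.reduce do |acc, value|
    begin
      tracing ? traced_add(acc, value) : generic_add(acc, value)
    rescue GuardFailed
      tracing = false         # bail out of the trace ...
      generic_add(acc, value) # ... and resume generic interpretation
    end
  end
end

puts sum([1, 2, 3])   # stays on the specialized path throughout
puts sum([1, 2, 2.5]) # the guard fails on 2.5 and the generic path takes over
```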
