Don't both JIT and non-JIT interpreters end up producing machine code?

Well, I've read a few discussions about the differences between JIT and non-JIT interpreters, and why JIT usually improves performance.

However my question is:

Ultimately, doesn't an interpreter that does not support JIT still have to turn the bytecode (line by line) into executable machine/native code, just like a JIT compiler does? I have seen posts and tutorials that say it does, and posts that say it does not. The latter argue that the JVM interpreter executes the bytecode directly, without ever involving native code.

If non-JIT interpreters turn every line into machine code anyway, it seems that the main advantages of JIT are ...

  • Intelligently caching the compiled machine code for either all of the bytecode (plain JIT) or just the frequently executed parts (hotspot / adaptive optimization), so the compile-to-machine-code step does not have to be repeated every time.

  • Any optimizations that JIT compilers can apply while converting bytecode to machine code.

Is that right? It seems that the difference between how non-JIT and JIT-capable interpreters translate bytecode into machine code would be small (apart from possible optimizations, or JITting whole blocks versus going line by line).

Thanks in advance.

+6
2 answers

A non-JIT interpreter does not convert bytecode to machine code at all. You can imagine a non-JIT bytecode interpreter working something like this (I will use Java-like pseudocode):

    int[] bytecodes = { ... };
    int ip = 0; // instruction pointer
    while (true) {
        int code = bytecodes[ip];
        switch (code) {
            case 0:
                // do something
                ip += 1;
                break;
            case 1:
                // do something else
                ip += 1;
                break;
            // and so on...
        }
    }

So, for every bytecode it executes, the interpreter has to fetch the code, switch on its value to decide what to do, and increment its "instruction pointer" before proceeding to the next iteration.
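
To make that concrete, here is a minimal runnable sketch of such a dispatch loop for a made-up three-instruction machine (the opcodes and their meanings are invented purely for illustration):

    // A tiny hypothetical bytecode set, invented for illustration:
    //   0 = INC   (add 1 to the accumulator)
    //   1 = PRINT (print the accumulator)
    //   2 = HALT  (stop)
    public class TinyInterpreter {
        public static void main(String[] args) {
            int[] bytecodes = { 0, 0, 1, 2 }; // INC, INC, PRINT, HALT
            int acc = 0; // the virtual machine's accumulator register
            int ip = 0;  // software instruction pointer
            while (true) {
                int code = bytecodes[ip];            // fetch
                switch (code) {                      // decode/dispatch
                    case 0: acc += 1;                ip += 1; break;
                    case 1: System.out.println(acc); ip += 1; break;
                    case 2: return;                  // HALT
                }
            }
        }
    }

Running it prints 2. Note that every instruction, even the trivial INC, pays for the array fetch, the switch dispatch, and the instruction-pointer update.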

With a JIT, all of that overhead is reduced to zero. It just takes the contents of the corresponding switch branches (the parts that say "// do something"), strings them together in memory, and jumps to the start of the first one. No software instruction pointer is needed, only the CPU's hardware instruction pointer. No bytecodes are fetched from memory, and there is no switching on their values.
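
Continuing the hypothetical INC/PRINT/HALT example above: for the program { INC, INC, PRINT, HALT }, a JIT would conceptually emit the bodies of the taken switch branches back to back as native code. Its output would behave like this straight-line method (in reality the JIT emits machine code, not Java):

    // What the JIT's output for { INC, INC, PRINT, HALT } behaves like:
    // no fetch, no switch, no software instruction pointer.
    static void jittedProgram() {
        int acc = 0;
        acc += 1;                 // body of case 0 (INC)
        acc += 1;                 // body of case 0 (INC)
        System.out.println(acc);  // body of case 1 (PRINT)
        // body of case 2 (HALT): just fall off the end
    }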

Writing a virtual machine is not difficult (unless it has to be extremely fast), and it can be an interesting exercise. I did it once for an embedded project where the program code had to be very compact.

+8

Decades ago, it seems to have been widely believed that compilers turned an entire program into machine code, while interpreters translated one statement into machine code, executed it, discarded it, translated the next, and so on. That notion was 99% wrong, but there are two tiny kernels of truth to it. On some microprocessors, some instructions required that operands such as addresses be encoded directly in the code. For example, the 8080 had an instruction to read or write an I/O address specified as an immediate byte (0x00-0xFF), but no instruction to read or write an I/O address held in a register. It was therefore common for language interpreters, when user code did something like "OUT 123,45", to store the instructions "OUT 7Bh / RET" in three bytes of memory, load the accumulator with 2Dh, and make a CALL to the first of those instructions. In that situation the interpreter really did generate a machine-code instruction in order to execute an interpreted statement, but such code generation was essentially limited to things like the IN and OUT instructions.
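
As a rough sketch of that trick: the opcode bytes below are the real 8080 encodings (D3h for "OUT d8", C9h for "RET"), but the surrounding Java is purely illustrative, since the real interpreter did this in 8080 machine code and Java obviously cannot CALL into a byte array.

    // Illustrative sketch of the 8080 OUT trick described above.
    class OutThunkSketch {
        // Builds the three-byte routine "OUT port / RET" that the
        // interpreter would place in RAM and then CALL, with the
        // value to write already loaded into the accumulator.
        static byte[] buildOutThunk(int port) {
            return new byte[] {
                (byte) 0xD3,  // OUT d8 opcode
                (byte) port,  // e.g. 0x7B (123) for "OUT 123,45"
                (byte) 0xC9   // RET opcode
            };
        }
    }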

Many of the common Microsoft BASIC interpreters for the 6502 (and possibly the 8080) made somewhat more extensive use of code stored in RAM, but that code was largely independent of the program being executed; most of it did not change during program execution. The address of the next byte to fetch, however, was stored in-line as part of the fetch subroutine, which allowed an absolute-mode "LDA" instruction to be used, saving at least one cycle on each byte fetch.

0

Source: https://habr.com/ru/post/906687/

