How PyPy Can Be Faster Than CPython

I read "PyPy - how can it beat CPython?" and countless other things, but I can't figure out how something written in Python can be faster than Python itself.

The only explanation I can think of is that PyPy somehow bypasses C and compiles directly to assembly language instructions. If that's the case, then it would make sense.

Can someone explain to me how PyPy works? I need a simple answer.

I love Python and want to start contributing. PyPy looks like an amazing place to start, regardless of whether my code gets accepted or not, but I couldn't make sense of it from the brief reading I did.

+4
4 answers

The easiest way to understand PyPy is to forget that it is implemented in Python.

In fact, it isn't; it is implemented in RPython. RPython can be run by a Python interpreter, but arbitrary Python code cannot be compiled by the RPython compiler (part of the PyPy framework). RPython is a subset of Python, but the parts that are left out are significant enough that programming in RPython is very different from ordinary Python programming.
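As a rough illustration (a hand-picked sample, not an exhaustive statement of RPython's rules), here is perfectly ordinary Python that the RPython toolchain would refuse to translate:

```python
# Ordinary Python, but not valid RPython.

# 1. A variable cannot change type: RPython infers one static type per variable.
x = 42
x = "now a string"        # fine in Python, a translation error in RPython

# 2. Lists must stay homogeneous enough for the type inferencer.
mixed = [1, "two", 3.0]   # fine in Python, not translatable as-is

# 3. Runtime dynamism such as eval(), or adding attributes to arbitrary
#    objects on the fly, is not available in translated RPython code.
result = eval("x * 2")
```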

So, since Python code cannot be treated as RPython code, and idiomatic RPython programs look and feel very different from idiomatic Python programs, you can almost completely ignore the relationship between the two languages and think about a compiled-language example instead.

Pretend I invented a new language, Frobble, together with a compiler for it, and that I wrote a Python interpreter in Frobble. I claim that my FrobblePython interpreter is often significantly faster than the CPython interpreter.

Does that seem strange or impossible to you? Of course not. The new Python interpreter could be either faster or slower than the CPython interpreter (or, more likely, faster in some cases and slower in others, by varying margins). Whether it is faster will depend on how I implemented FrobblePython, as well as on the performance characteristics of the code produced by my Frobble compiler.

This is exactly how you should think of the PyPy interpreter. The fact that the language it is implemented in, RPython, can also be interpreted by a Python interpreter (with the same external results as compiling the RPython program and running it) is completely irrelevant to understanding how fast it is. All that matters is how the PyPy interpreter is implemented and the performance characteristics of the code produced by the RPython compiler (for example, the fact that the RPython compiler can automatically add a certain kind of JIT compiler to the interpreters it compiles).
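To make the "automatically add a JIT" part concrete: an RPython interpreter only has to annotate its main dispatch loop with a couple of hints, and the translation toolchain generates a tracing JIT from them. Below is a minimal sketch based on the documented rpython.rlib.jit API; the bytecode format and variable names are invented for illustration:

```python
from rpython.rlib.jit import JitDriver

# "greens" identify a position in the interpreted program; "reds" are the
# interpreter's mutable state. The translator builds a tracing JIT from these hints.
jitdriver = JitDriver(greens=['pc', 'bytecode'], reds=['stack'])

def interpret(bytecode):
    pc = 0
    stack = []
    while pc < len(bytecode):
        # Mark the top of the dispatch loop so the generated JIT can trace hot loops.
        jitdriver.jit_merge_point(pc=pc, bytecode=bytecode, stack=stack)
        opcode = bytecode[pc]
        # ... dispatch on opcode, updating pc and stack ...
        pc += 1
```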

+11

The answer "he has JIT" is technically correct, but insufficient. PyPy, executed as Python code by the Python interpreter, can JIT compile the Python code that it interprets (in fact, JIT tests often run this way), but are still terribly slow (it may take a few minutes to start the interpretation).

The missing piece, which comes before the JIT and is actually required for the JIT, is writing the interpreter in a restricted subset of Python (called RPython) and then compiling it to C code. That way you get a program that runs at roughly the abstraction level of C (even though it was written at a higher level of abstraction). Without the JIT, this interpreter has historically been, and AFAIK still is, somewhat slower than CPython, but not several orders of magnitude slower (as an interpreter running on top of another interpreter would be).
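For a sense of what "write it in RPython, then compile it to C" looks like in practice, here is the shape of a minimal RPython target file, loosely following the pattern from the RPython documentation (the file name and exact translation command are illustrative):

```python
# targetexample.py -- translated with something like: rpython targetexample.py
# The toolchain type-checks the RPython code, translates it to C, and hands the
# C to an ordinary C compiler, producing a standalone executable.
import os

def entry_point(argv):
    os.write(1, "hello from translated RPython\n")
    return 0

def target(*args):
    # The translation toolchain calls this to discover the program's entry point.
    return entry_point, None
```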

Your remark about compiling directly to assembly is confused. Assembly code is not automatically faster than C code - in fact, you would have a hard time beating today's C compilers at generating assembly, and C code is much easier to write and/or generate, without even getting into the portability mess. The hard problem is not turning Python into C or assembly (look at Nuitka); the hard problem is re-expressing the program more efficiently without changing its semantics. Going straight to assembly does not solve any of the hard problems there, it makes the relatively easy task of generating code for the more efficient program harder, and it very rarely enables any optimizations that you could not also express in C.
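A small example of why "just turn it into C" does not by itself make Python fast - the expensive part is the dynamic semantics, which a straight translation must preserve (this is only a conceptual illustration):

```python
def total(items):
    s = 0
    for x in items:
        # In CPython (or in a naive Python-to-C translation) every iteration must
        # go through generic dynamic dispatch: look up type(x), find its __add__,
        # handle overflow, allocate a new object, and so on.
        s = s + x
    return s

# A tracing JIT watching this loop at run time can observe that every x is an int,
# emit machine code specialized for ints (behind a cheap guard), and skip the
# generic machinery. A static translator cannot assume that, because total() might
# later be called with floats, Decimals, or user-defined objects.
```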

The PyPy JIT does generate machine code, but the PyPy executable itself is compiled from C code by a C compiler. The PyPy developers would be fools to try to compete with existing C compilers on a single platform, let alone several. Fortunately, they are not, and they know it. The reasons JITs emit machine code directly are different and much better ones (for starters, there are several optimizations available in a JIT context that you simply cannot express in C).

By the way, most of what I wrote above is also covered in the answers to the question you linked to.

+6

PyPy itself is written in RPython, a restricted subset of Python. Although you could run it on top of CPython, that is very slow, so instead the RPython is translated to C, bypassing that layer of interpretation. In theory this could be faster than CPython, but in practice it is still somewhat slower. On top of that, a just-in-time compiler (also written in RPython) compiles the interpreted Python down to machine code.

In short, at run time there is no actual double interpretation going on, so that is not a problem.

+3

PyPy has a JIT (just-in-time) compiler. A JIT compiler can perform optimizations at run time (precisely because the code has not been precompiled).

The code is not compiled to assembly or C up front. It is interpreted at first (it runs in the PyPy interpreter), and the interpreter can then compile the hot parts just in time.
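For example, a hot numeric loop like the one below is the kind of code where the difference shows up most clearly: under CPython every iteration goes through the generic bytecode interpreter, while PyPy traces the loop once it is hot and runs compiled machine code afterwards (the actual speedup depends on the workload and the PyPy version):

```python
import time

def busy_loop(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

start = time.time()
busy_loop(10 ** 7)
print("elapsed: %.3f s" % (time.time() - start))  # try this under both python and pypy
```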

http://en.wikipedia.org/wiki/Just-in-time_compilation

http://en.wikipedia.org/wiki/Interpreted_language

+2

Source: https://habr.com/ru/post/1436872/

