Can C/C++ software be compiled into bytecode for later execution? (Architecture-independent Unix software.)

I would like to compile existing software into a representation that can later be run on different architectures (and operating systems).

For that I need (byte)code that can be easily run or emulated on another architecture or OS (LLVM IR? Some RISC assembly?)

Some random ideas:

  • Compiling to JVM bytecode and running it with Java. Too restrictive? Are C compilers available?
  • MS CIL. Are C compilers available?
  • LLVM? Can the intermediate representation be run later?
  • Compiling to a RISC architecture such as MMIX. What about system calls?

Then there is the problem of mapping system calls, although the BSDs, for example, have system call translation layers.

Are there any already-working systems that compile C/C++ into something that can later be run with an interpreter on a different architecture?


Edit

Can I compile existing Unix software into a not-so-low-level binary that can be "emulated" more easily than by running a full x86 emulator? Something more like a JVM than Xen HVM.
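To make the idea concrete, here is a minimal sketch of the kind of thing being asked about: a toy stack-machine bytecode with an interpreter that routes "system calls" through a host-side translation function. Everything in it is invented for illustration (the opcodes, `SYS_PRINT_INT`, `host_syscall`, `run`); it is not an existing format or tool, only a picture of the moving parts.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical opcodes for a toy stack machine (not a real bytecode format). */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_SYS, OP_HALT };

/* Hypothetical guest call numbers, independent of any host OS. */
enum { SYS_PRINT_INT = 1 };

#define STACK_MAX 64

/* Host-side translation layer: maps a guest call number to host behaviour. */
static int host_syscall(int32_t number, int32_t arg)
{
    switch (number) {
    case SYS_PRINT_INT:
        printf("%d\n", (int)arg);
        return 0;
    default:
        return -1; /* unknown guest call */
    }
}

/* Interprets `code`; on OP_HALT stores the top of stack (or 0 if the stack
 * is empty) in *result and returns 0. Returns -1 on any malformed program. */
static int run(const int32_t *code, size_t len, int32_t *result)
{
    int32_t stack[STACK_MAX];
    size_t sp = 0, pc = 0;

    while (pc < len) {
        switch (code[pc++]) {
        case OP_PUSH: /* operand: the literal to push */
            if (pc >= len || sp >= STACK_MAX) return -1;
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:
            if (sp < 2) return -1;
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            if (sp < 2) return -1;
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_SYS: /* operand: guest call number; argument: top of stack */
            if (pc >= len || sp < 1) return -1;
            if (host_syscall(code[pc++], stack[sp - 1]) != 0) return -1;
            break;
        case OP_HALT:
            *result = sp ? stack[sp - 1] : 0;
            return 0;
        default:
            return -1; /* unknown opcode */
        }
    }
    return -1; /* ran off the end without OP_HALT */
}
```

Running the program `{ OP_PUSH, 6, OP_PUSH, 7, OP_MUL, OP_SYS, SYS_PRINT_INT, OP_HALT }` through `run()` prints 42 and leaves 42 in `result`. Only `host_syscall` would need to be rewritten per host OS; the bytecode itself stays the same.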

+6
6 answers

Several C-to-JVM compilers are listed on the JVM's Wikipedia page. I have never tried any of them, but they sound like an interesting exercise to build.

Because of its close association with the Java language, the JVM performs the strict runtime checks mandated by the Java specification. This requires C-to-bytecode compilers to provide their own "lax machine abstraction", for example by producing compiled code that uses a Java array to represent main memory (so pointers can be compiled to integers) and by linking the C library to a centralized Java class that emulates system calls. Most or all of the compilers listed there take a similar approach.
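The "array as main memory" idea can be modelled roughly as follows. This is only a sketch written in C for brevity: a real C-to-JVM compiler would emit bytecode operating on a Java array, and the names `memory`, `vptr`, `vm_alloc`, `store_i32` and `load_i32` are hypothetical, not taken from any of those compilers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 1024u

/* Stands in for the Java array that would hold the program's "main memory". */
static uint8_t memory[MEM_SIZE];

/* A C pointer lowered to a plain integer offset into memory[]. */
typedef uint32_t vptr;

/* Offset 0 is left unused so that 0 can play the role of NULL. */
static vptr heap_top = 4;

/* Hypothetical bump allocator standing in for malloc; returns 0 when full. */
static vptr vm_alloc(uint32_t size)
{
    if (size > MEM_SIZE - heap_top) return 0;
    vptr p = heap_top;
    heap_top += size;
    return p;
}

/* Lowered form of `*(int32_t *)p = value`. */
static void store_i32(vptr p, int32_t value)
{
    /* On the JVM, the array access itself would be bounds-checked. */
    assert(p <= MEM_SIZE - sizeof value);
    memcpy(&memory[p], &value, sizeof value);
}

/* Lowered form of `*(int32_t *)p`. */
static int32_t load_i32(vptr p)
{
    int32_t value;
    assert(p <= MEM_SIZE - sizeof value);
    memcpy(&value, &memory[p], sizeof value);
    return value;
}
```

With this model, `p + 1` on an `int32_t *` becomes the integer addition `p + 4`, and a stray pointer can corrupt only the array, never the JVM's own heap.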

+5

C compiled to LLVM bitcode is not platform independent. Take a look at Google's Portable Native Client; they are trying to solve exactly that.

Adobe has Alchemy, which lets you compile C to Flash.

There are C-to-Java and even C-to-JavaScript compilers. However, due to differences in memory management, they are not very usable.

+4

LLVM is not a good solution to this problem. As beautiful as LLVM IR is, it is by no means machine independent, nor was it intended to be. It is very easy, and in some languages actually necessary, to generate target-dependent LLVM IR: sizeof(void *), for example, will be 4 or 8 or whatever by the time the code has been compiled into IR.

LLVM also does nothing to provide OS independence.

One interesting possibility might be QEMU. You can compile a program for a specific architecture and then use QEMU's user-space emulation to run it on different architectures. Unfortunately, this may solve the target machine problem, but it does not solve the OS problem: QEMU's Linux user-mode emulation only works on Linux systems.
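A small sketch of why this happens: every value below is a compile-time constant that the C front end fixes for the chosen target before any IR exists, so the emitted IR (for example from `clang -S -emit-llvm`) already contains the literal numbers. The struct and function names are made up for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* A typical linked-list node; its layout depends on the target's pointer size. */
struct node {
    struct node *next;
    char tag;
};

/* Commonly 32 on 32-bit targets and 64 on 64-bit targets. */
static size_t pointer_bits(void)
{
    return sizeof(void *) * CHAR_BIT;
}

/* Size including padding; commonly 8 on 32-bit and 16 on 64-bit targets. */
static size_t node_size(void)
{
    return sizeof(struct node);
}

/* Offset of `tag`, which sits directly after the pointer member. */
static size_t tag_offset(void)
{
    return offsetof(struct node, tag);
}
```

Code that indexes into an array of `struct node` multiplies by `node_size()`, so bitcode built for a 64-bit target computes the wrong addresses if it is reinterpreted on a 32-bit one.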

The JVM is probably the best choice for target and OS independence if you want to distribute binary files.

+2

As Ankur mentions, C++/CLI may be a solution. You can use Mono to run it on Linux, as long as it has no native bits. But unless you already have a code base that you are trying to port at minimal cost, using it may be counterproductive. If it makes sense in your situation, you should go with Java or C#.

Most people who go with C++ do so for performance reasons, but unless you are working with very low-level stuff, you will finish coding sooner in a higher-level language. This in turn gives you time to optimize, so that by the time you would have been done in C++, you will have an even faster version in whatever higher-level language you choose.

+2

WebAssembly is trying to solve this now by creating a standard bytecode format for the web. Unlike JVM bytecode, WebAssembly is lower level, working at the abstraction level of C/C++ rather than Java, so it is closer to what is usually called "assembly language", which is what C/C++ code is normally compiled to.

+2

The real problem is that C and C++ are not architecture-independent languages. You can write reasonably portable code in them, but the compiler also hardcodes aspects of the machine into your code. Think of sizeof(long), for example. In addition, as Richard mentions, there is no OS independence. So unless the libraries you use follow the same conventions and exist on multiple platforms, you will not be able to run the application.
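As an illustration of "portable enough": `long` is 32 bits on 64-bit Windows but 64 bits on 64-bit Linux, so data written through a `long` differs between the two. A common workaround, sketched here with made-up helper names, is to use the fixed-width types from `<stdint.h>` and spell out the byte order by hand.

```c
#include <stdint.h>

/* Writes a 32-bit value as four bytes, least significant byte first,
 * regardless of the host's endianness or the width of `long`. */
static void put_u32_le(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v);
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}

/* Reads back a value written by put_u32_le. */
static uint32_t get_u32_le(const uint8_t in[4])
{
    return (uint32_t)in[0]
         | (uint32_t)in[1] << 8
         | (uint32_t)in[2] << 16
         | (uint32_t)in[3] << 24;
}
```

This keeps the source portable, but the compiled binary is still tied to one architecture and one OS, which is the point made above.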

It would be best to write your code in a more portable language or provide binaries for the platforms you care about.

+1

Source: https://habr.com/ru/post/891106/

