The availability of some platform-specific features, such as SSE or AVX, can be determined at runtime, which is very useful if you do not want to compile and ship different objects for different features.
The following code, for example, lets me check for AVX. It compiles with gcc, which provides the cpuid.h header:
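    #include <stdbool.h>
    #include <cpuid.h>

    /* Returns true if CPUID leaf 1 reports the AVX feature bit. */
    bool has_avx(void)
    {
        unsigned int eax, ebx, ecx, edx;

        /* __get_cpuid returns 0 if the requested leaf is unsupported. */
        if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
            return false;
        return (ecx & bit_AVX) != 0;
    }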
Instead of cluttering the code with runtime checks like the one above, which run repeatedly, are slow and introduce branches (the result can be cached to reduce the overhead, but the branch remains), I figured I could use the infrastructure provided by the dynamic linker / loader.
Calls to functions with external linkage on ELF platforms are already indirect and go through the Procedure Linkage Table (PLT) and the Global Offset Table (GOT).
Suppose there are two internal functions: a basic one, _do_something_basic, which always works, and a somewhat optimized version, _do_something_avx, which uses AVX. I could export a generic do_something symbol and alias it to the basic one:
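For comparison, here is a minimal sketch of the cached check I would like to avoid. The names sum, sum_basic and sum_avx are hypothetical and only serve as an illustration; the "AVX" variant is a plain C stand-in so that the sketch builds without -mavx. Even with the cached result, every call still pays for a branch:

```c
#include <stdbool.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Query CPUID leaf 1 for the AVX bit; reports false on non-x86 targets. */
static bool has_avx(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    return (ecx & bit_AVX) != 0;
#else
    return false;
#endif
}

/* Hypothetical baseline kernel: sums n integers. */
static long sum_basic(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Hypothetical stand-in for an AVX kernel (plain C here, same result). */
static long sum_avx(const int *v, size_t n)
{
    return sum_basic(v, n);
}

/* Public entry point: probes the CPU once, but branches on every call.
 * The cache is not synchronized, which is a simplification of this sketch. */
long sum(const int *v, size_t n)
{
    static int cached = -1;            /* -1 means "not probed yet" */
    if (cached < 0)
        cached = has_avx() ? 1 : 0;
    return cached ? sum_avx(v, n) : sum_basic(v, n);
}
```

The first call to sum runs CPUID; later calls only test the cached flag, but the conditional in the hot path is exactly what I am trying to get rid of.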
static void _do_something_basic(…) {
While my library or program is being loaded, I would like to check the availability of AVX once with has_avx and, depending on the result, point the do_something symbol to _do_something_avx instead.
Even better would be if I could point the initial version of the do_something symbol to a self-modifying function that checks for AVX using has_avx and replaces itself with either _do_something_basic or _do_something_avx.
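To make the idea concrete, this is roughly the behavior I have in mind, approximated in plain C with a function pointer instead of the PLT / GOT. It is only a sketch under assumptions: the int (int) signature, the _do_something_resolve helper and the function bodies are made up for illustration, and the "AVX" variant is a plain C stand-in.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Query CPUID leaf 1 for the AVX bit; reports false on non-x86 targets. */
static bool has_avx(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    return (ecx & bit_AVX) != 0;
#else
    return false;
#endif
}

/* Hypothetical baseline implementation. */
static int _do_something_basic(int x)
{
    return x * 2;
}

/* Hypothetical stand-in for the AVX implementation (same result). */
static int _do_something_avx(int x)
{
    return x * 2;
}

static int _do_something_resolve(int x);

/* Callers go through this pointer; it initially targets the resolver. */
int (*do_something)(int) = _do_something_resolve;

/* First call lands here: pick an implementation, overwrite the pointer,
 * then forward the call. Later calls bypass the resolver entirely. */
static int _do_something_resolve(int x)
{
    do_something = has_avx() ? _do_something_avx : _do_something_basic;
    return do_something(x);
}
```

After the first call, do_something(…) is a single indirect call with no CPU check. What I am asking is whether the same replacement can be done on the slot the dynamic linker already maintains, so that callers can keep using an ordinary function symbol.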
In theory this should be possible, but how can I find the location of the PLT / GOT programmatically? Is there an ABI / API provided by the ELF loader, e.g. ld-linux.so.2, that I could use for this? Do I need a linker script to get the PLT / GOT location? And what about security: can I even write to the PLT / GOT if I obtain a pointer to it?
Perhaps some project has already done this or something very similar.
I fully understand that the solution will be very platform-specific, but since I already have to deal with low-level platform details such as instruction set features, that is fine.