Background. I am trying to implement a system as described in this previous answer. In short, I have an application that links against a shared library (currently on Linux). I would like that shared library to switch between multiple implementations at runtime (for example, based on whether the host CPU supports a particular instruction set).
In my simplest case, I have three different shared library files:
- libtest.so: This is the "vanilla" version of the library that will be used as the fallback case.
- libtest_variant.so: This is the "optimized" version of the library that I would like to select at runtime if the processor supports it. It is ABI-compatible with libtest.so.
- libtest_dispatch.so: This is the library that is responsible for choosing which variant of the library to use at runtime.
In accordance with the approach suggested in the linked answer above, I do the following:
- The final application is linked against libtest.so.
- The DT_SONAME field of libtest.so is set to libtest_dispatch.so. So when I run the application, it loads libtest_dispatch.so instead of the actual dependency libtest.so.
- libtest_dispatch.so is configured to have a constructor function that looks like this (pseudocode):
    __attribute__((constructor)) void init()
    {
        if (can_use_variant)
            dlopen("libtest_variant" SHLIB_EXT, RTLD_NOW | RTLD_GLOBAL);
        else
            dlopen("libtest" SHLIB_EXT, RTLD_NOW | RTLD_GLOBAL);
    }
The dlopen() call loads the shared library that provides the appropriate implementation, and the application moves on.
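As an illustration of the selection step, here is a minimal sketch of how the `can_use_variant` condition from the pseudocode might be computed. It assumes an x86 target built with GCC or Clang and uses AVX2 as a stand-in for "a particular instruction set"; the AVX2 choice, the `pick_library` helper, and the hard-coded `.so` suffix are my own assumptions for the example and not part of the setup described above.

```cpp
#include <string>

// Hypothetical helper: reports whether the host CPU supports AVX2.
// __builtin_cpu_init / __builtin_cpu_supports are GCC/Clang builtins
// available on x86 targets; on other targets this falls back to false.
static bool can_use_variant() {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();  // safe to call explicitly; needed when running early (e.g. in constructors)
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;
#endif
}

// Hypothetical helper: maps the decision to the file name that the
// dispatcher's constructor would pass to dlopen().
static std::string pick_library(bool use_variant) {
    return use_variant ? "libtest_variant.so" : "libtest.so";
}
```

In the dispatcher's constructor, the result would be used along the lines of `dlopen(pick_library(can_use_variant()).c_str(), RTLD_NOW | RTLD_GLOBAL)`.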
Result: It works! If I put an identically named function in each shared library, I can verify at runtime that the appropriate version is executed, based on the conditions used by the dispatch library.
Problem: The above works for a toy example like the one demonstrated in the linked question. Specifically, it works fine if the libraries export only functions. However, once variables are in play (whether global variables with C linkage or C++ constructs such as typeinfo), I get unresolved symbol errors at runtime.
The code below demonstrates the problem:
libtest.h
    extern int bar;
    int foo();
libtest.cc
    #include <iostream>

    int bar = 2;

    int foo()
    {
        std::cout << "function call came from libtest" << std::endl;
        return 0;
    }
libtest_variant.cc
    #include <iostream>

    int bar = 1;

    int foo()
    {
        std::cout << "function call came from libtest_variant" << std::endl;
        return 0;
    }
libtest_dispatch.cc
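    #include <dlfcn.h>
    #include <iostream>
    #include <stdlib.h>

    // Assumed fallback: SHLIB_EXT is not defined anywhere else in this example.
    #ifndef SHLIB_EXT
    #define SHLIB_EXT ".so"
    #endif

    // Runs when libtest_dispatch.so is loaded and picks the implementation to load.
    __attribute__((constructor)) void init()
    {
        if (getenv("USE_VARIANT"))
            dlopen("libtest_variant" SHLIB_EXT, RTLD_NOW | RTLD_GLOBAL);
        else
            dlopen("libtest" SHLIB_EXT, RTLD_NOW | RTLD_GLOBAL);
    }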
test.cc
    #include "libtest.h"
    #include <iostream>

    int main()
    {
        std::cout << "bar: " << bar << std::endl;
        foo();
    }
I build the libraries and the test application using the following:
    g++ -fPIC -shared -o libtest.so libtest.cc -Wl,-soname,libtest_dispatch.so
    g++ -fPIC -shared -o libtest_variant.so libtest_variant.cc
    g++ -fPIC -shared -o libtest_dispatch.so libtest_dispatch.cc -ldl
    g++ test.cc -o test -L. -ltest -Wl,-rpath,.
Then I try to run the test using the following command lines:
    > ./test
    ./test: symbol lookup error: ./test: undefined symbol: bar
    > USE_VARIANT=1 ./test
    ./test: symbol lookup error: ./test: undefined symbol: bar
Failure. If I remove all instances of the global variable bar and dispatch only the foo() function, then everything works. I am trying to find out exactly why, and whether I can get the effect I want in the presence of global variables.
Debugging: While trying to diagnose the problem, I experimented with the LD_DEBUG environment variable while running the test program. The problem seems to boil down to the following:
The dynamic linker performs relocations for global variables from shared libraries very early in the loading process, before it invokes the constructors of those shared libraries. It therefore tries to look up the global variable symbols before my dispatch library has had a chance to run its constructor and load the library that actually provides those symbols.
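To make that ordering concrete, here is a toy model of the startup sequence as I understand it from the LD_DEBUG output. It is only a simulation: `ToyLoader` and `simulate_startup` are hypothetical names, nothing here calls into the real dynamic linker, and it assumes function calls are bound lazily (the default unless something like LD_BIND_NOW or `-z now` is in effect).

```cpp
#include <set>
#include <string>
#include <vector>

// Toy stand-in for the dynamic linker's global symbol scope (not real ld.so).
struct ToyLoader {
    std::set<std::string> defined;   // symbols provided by libraries loaded so far
    std::vector<std::string> log;    // record of each lookup attempt

    void load(const std::vector<std::string>& symbols) {
        defined.insert(symbols.begin(), symbols.end());
    }

    bool resolve(const std::string& sym) {
        const bool ok = defined.count(sym) > 0;
        log.push_back((ok ? "resolved " : "undefined ") + sym);
        return ok;
    }
};

// Returns true if the simulated program reaches main() and can call foo().
bool simulate_startup(bool app_uses_data_symbol, ToyLoader& ld) {
    // 1. Direct dependencies are loaded: only the dispatch library,
    //    which defines neither bar nor foo itself.
    ld.load({});
    // 2. Data relocations are processed eagerly, before any constructor runs.
    if (app_uses_data_symbol && !ld.resolve("bar")) return false;
    // 3. Constructors run: the dispatcher dlopen()s the real implementation.
    ld.load({"bar", "foo"});
    // 4. main() runs; the first call to foo() is bound lazily at this point.
    return ld.resolve("foo");
}
```

In this model, `simulate_startup(true, ld)` fails at step 2 with "undefined bar", which mirrors the error above, while `simulate_startup(false, ld)` succeeds because the only lookup happens after the constructor has run.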
This seems to be a big roadblock. Is there a way to change this process so that my dispatcher can run first?
I know that I could preload the library using LD_PRELOAD . However, this is a cumbersome requirement for the environment in which my software will ultimately run. I would like to find another solution, if possible.
Upon further examination, it turns out that even if I LD_PRELOAD the library, I have the same problem. The constructor is still not executed before the global variable is resolved. Preloading simply pushes the desired library to the top of the list of libraries.