Is there a reliable way to find out which libraries an ELF binary might dlopen()?

Question

Basically, I want to get a list of libraries that a binary can load.

The unreliable approach below, which I came up with, seems to work (with possible false positives):

comm -13 <(ldd elf_file | sed 's|\s*\([^ ]*\)\s.*|\1|'| sort -u) <(strings -a elf_file | egrep '^(|.*/)lib[^:/]*\.so(|\.[0-9]+)$' | sort -u)

This is unreliable, but it gives useful information even if the binary has been stripped.

Is there any reliable way to get this information without false positives?

EDIT: more context.

Firefox is moving from gstreamer to ffmpeg. I was wondering which versions of libavcodec.so would work. libxul.so uses dlopen() for many optional features, and the library names are hardcoded, so the above command helps in this case.

I also have a general interest in package management and binary dependencies. I know that you can get direct dependencies with readelf -d and transitive dependencies with ldd. I was wondering about optional dependencies, hence the question.

+5

c posix gnu elf

Not important Dec 18 '15 at 17:54

2 answers

(I am focusing on Linux. I believe most of my answer applies to every POSIX system, but on Mac OS X dlopen expects .dylib dynamic library files, not .so shared objects.)

A program can even emit some C code into a temporary file /tmp/foo1234.c, fork a compilation of that /tmp/foo1234.c into a shared library /tmp/foo1234.so with a command such as gcc -O -shared -fPIC /tmp/foo1234.c -o /tmp/foo1234.so (a command generated and run while your program is executing), possibly remove /tmp/foo1234.c since it is no longer needed, and dlopen that /tmp/foo1234.so (and perhaps even remove /tmp/foo1234.so after the dlopen), all in the same process. My GCC MELT plugin for gcc does exactly that, so does Bigloo, and the GCCJIT library does something close.

Thus, your quest is impossible and does not even make sense.

Is there any reliable way to get this information without false positives?

No, there is no reliable way to get such information without false positives (you can prove that this is equivalent to the halting problem or some other undecidable problem). See also Rice's theorem.

In practice, most dlopen calls are made on plugins selected by some configuration. The plugin files may not be named literally in the configuration file (for example, some program Foo may have a convention that a plugin named bar in its foo.conf configuration file is provided by the foo-bar.so shared object).

However, you may find some heuristic approximation. Most programs that call dlopen have some kind of plugin convention that requires specific symbol names in the plugin. You can search for shared objects that define these names. Of course, you will get false positives.

For example, the zsh shell accepts plugins called zsh modules. The example module shows that functions named enables_, boot_, features_, etc. are expected in zsh modules. You can use nm -D to find *.so files that provide them (and thus find the plugins that zsh can probably load).

(I am not sure that this approach is worth the effort. In practice you should know which plugins are useful on your system, and for which applications.)

By the way, you can run any command under strace(1) to see the system calls it makes, and hence which plugins it loads. You can also use ltrace(1) or pmap(1) (on a given process), or simply run cat /proc/1234/maps for the process with pid 1234 to inspect its virtual address space, and hence the plugins it has already loaded. See proc(5).

Note that strace, ltrace, and pmap exist on Linux, but many POSIX systems have similar programs.
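As a rough illustration of the /proc/1234/maps approach mentioned above, here is a minimal C sketch that lists the shared objects currently mapped into a process. The function names are made up for this example, it assumes a Linux-style /proc, and matching on ".so" anywhere in the path is only a crude heuristic. It reports what is loaded at the moment it runs, so it says nothing about libraries the program might load later.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scan text in /proc/<pid>/maps format and report each mapped file whose
 * path contains ".so" (a heuristic, not a real ELF check). Consecutive
 * segments of the same file are reported once. Paths are written to `out`,
 * which may be NULL to only count. Returns the number of paths reported. */
int list_mapped_shared_objects(FILE *maps, FILE *out)
{
    char line[4096];
    char previous[4096] = "";
    int count = 0;

    assert(maps != NULL);
    while (fgets(line, sizeof line, maps) != NULL) {
        line[strcspn(line, "\n")] = '\0';
        /* The pathname is the last field and the only one that contains '/'. */
        const char *path = strchr(line, '/');
        if (path == NULL || strstr(path, ".so") == NULL)
            continue;
        if (strcmp(path, previous) == 0)
            continue;               /* another segment of the same file */
        strcpy(previous, path);     /* fits: path is a substring of line */
        if (out != NULL)
            fprintf(out, "%s\n", path);
        count++;
    }
    return count;
}

/* Convenience wrapper (hypothetical name): open /proc/<pid>/maps for `pid`.
 * Returns -1 if the file cannot be opened (no such process, no permission). */
int list_shared_objects_of_pid(long pid, FILE *out)
{
    char maps_path[64];
    snprintf(maps_path, sizeof maps_path, "/proc/%ld/maps", pid);

    FILE *maps = fopen(maps_path, "r");
    if (maps == NULL)
        return -1;
    int count = list_mapped_shared_objects(maps, out);
    fclose(maps);
    return count;
}
```

Calling list_shared_objects_of_pid(1234, stdout) would print one path per line for the process with pid 1234, subject to the usual permission checks on /proc.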

In addition, a program can generate some machine code at runtime and execute it (SBCL does this at every REPL interaction!). Your program can also use some JIT techniques (e.g. libjit, llvm, asmjit, GCCJIT, or handwritten code ...) to do the same. Thus, plugin-like behavior can occur without dlopen (and you could imitate dlopen with mmap calls and some ELF relocation processing).

Additions:

If you install Firefox from its packaged version (for example, the iceweasel package on Debian), its package will most likely handle the dependencies.

+3

Basile Starynkevitch Dec 18 '15 at 18:15

Jay · Accepted Answer · 2015-12-18T18:03:05+0000

ldd tells you the libraries that your binary was linked against. These are not the ones that the program could open with dlopen.

The signature of dlopen is

 void *dlopen(const char *filename, int flag);

So you could, still unreliably, run strings on the binary, but this can fail if the library name is not a static string and is instead built or read from somewhere during program execution. That last situation means that the answer to your question is "no": not reliably. (The library file name could be read from the network, from a Unix socket, or even decompressed on the fly, for example. Anything is possible, although I would not recommend any of these ideas myself.)

Edit: also, as John Bolinger noted, library names could be read from a configuration file.
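To make the point about names built at run time concrete, here is a small C sketch of a hypothetical helper that assembles a library name from a base name and a version number. Both the helper and the naming scheme are assumptions for illustration; no particular program is claimed to work this way. The only string that ends up in the binary is the format "lib%s.so.%d", which the egrep pattern in the question does not match.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "lib<base>.so.<version>" into buf.
 * `base` might come from a configuration file, the environment, or the
 * network, so the final name never appears as a literal in the binary.
 * Returns 0 on success, -1 if the name is rejected or does not fit. */
int build_library_name(char *buf, size_t size, const char *base, int version)
{
    assert(buf != NULL && base != NULL);

    /* Refuse empty names and anything containing a path separator. */
    if (base[0] == '\0' || strchr(base, '/') != NULL)
        return -1;

    int written = snprintf(buf, size, "lib%s.so.%d", base, version);
    if (written < 0 || (size_t)written >= size)
        return -1;                  /* formatting error or truncated */
    return 0;
}
```

A caller would then pass the resulting buffer to dlopen, for instance in a loop that tries several version numbers until one loads. A static scan of the binary cannot tell which of those names will be tried.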

Edit: you can also try replacing the dlopen function with one of your own (the Boehm garbage collector does this with malloc, for example), so that it opens the library and also records its name somewhere. But if the program did not open a particular library during that run, you still will not know about it.
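One common way to do this kind of replacement on Linux with glibc is symbol interposition through LD_PRELOAD. The sketch below is only an illustration under that assumption: the log format and the counter are invented for the example, and because the real dlopen now sees the wrapper as its caller, lookups that depend on the calling object (such as its RUNPATH) may behave differently than in the untraced program.

```c
#define _GNU_SOURCE             /* needed for RTLD_NEXT in glibc */
#include <dlfcn.h>
#include <stdio.h>

/* Number of dlopen calls seen so far (illustrative bookkeeping only). */
static unsigned long dlopen_calls_seen;

/* Same signature as the real dlopen. When this object comes first in the
 * symbol search order, calls to dlopen resolve to this wrapper. */
void *dlopen(const char *filename, int flags)
{
    static void *(*real_dlopen)(const char *, int);

    if (real_dlopen == NULL) {
        /* Find the next definition of dlopen after this object. */
        real_dlopen = (void *(*)(const char *, int))dlsym(RTLD_NEXT, "dlopen");
        if (real_dlopen == NULL) {
            fprintf(stderr, "trace: cannot find the real dlopen\n");
            return NULL;
        }
    }

    void *handle = real_dlopen(filename, flags);

    /* Record the request; a NULL filename means "the main program". */
    dlopen_calls_seen++;
    fprintf(stderr, "trace: dlopen(%s) %s\n",
            filename != NULL ? filename : "(main program)",
            handle != NULL ? "succeeded" : "failed");
    return handle;
}
```

Assuming the file is saved as trace_dlopen.c (a name chosen here), it could be built with gcc -shared -fPIC trace_dlopen.c -o trace_dlopen.so -ldl and used as LD_PRELOAD=./trace_dlopen.so some_program. The log only covers libraries requested during that particular run.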
