The post “10 LDFLAGS I love” by Jessie Frazelle reminded me that I know a thing or two about the dynamic linker: things that most people using Linux rarely need, but when they do need them, they really need them. I came by this knowledge a couple of decades ago, back when I was much more active within Debian than I am today. Debian has always paid a lot of attention to handling shared libraries, and the distribution has pretty strict rules as to how to package them, which led me to dig a bit deeper into how this part of the system works. Very little in this post is specific to Debian, and most of the concepts are applicable to systems other than Linux.
The first thing you’ll learn reading through Debian’s policy regarding
shared libraries is that each library has a shared object name, or
SONAME for short. You can obtain this information by using readelf or
objdump, both part of binutils. The library’s SONAME is stored in the
dynamic section of the ELF file. For example, consider libz:
$ readelf -d /lib/x86_64-linux-gnu/libz.so.1.2.8 | fgrep '(SONAME)'
0x000000000000000e (SONAME) Library soname: [libz.so.1]
In Debian, the library is found in the /lib/x86_64-linux-gnu directory
(this also comes from policy).
Note how the name includes a number (1), but not the full version number
(1.2.8). You’ll find libraries where this number doesn’t match the
release number, like libpng 1.6.23, which ships a library called
libpng16.so.16. The SONAME is a convention that is meant to identify a
library’s binary interface (ABI). It changes when there are incompatible
changes, for example, if you remove a function (programs linked
against old versions with the function wouldn’t run with newer
libraries without the function). You also want to change the SONAME
when you add functions (programs linked against newer versions with
the function wouldn’t run with older libraries without the function).
Historically some developers have shown resistance to changing the
number in the SONAME when required, and this has led to a lot of trouble
and pain. There are ways to avoid changing the number even when it’s
necessary to do so, but that requires some planning ahead of time (if
you are interested, read at least section 3 of Ulrich Drepper’s paper
“How To Write Shared
Libraries”).
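The two directions of breakage described above can be sketched in a few lines of Python. This is only an illustration: it compares exported symbol names, while a real ABI also covers function signatures, structure layouts and more. The `frob_*` names and the function `abi_breakage` are made up for the example.

```python
def abi_breakage(old_exports, new_exports):
    """Compare the exported symbol names of two builds of a library.

    Returns (removed, added). Symbols that vanished break programs linked
    against the old build; symbols that appeared break programs linked
    against the new build when they run with the old one.
    """
    old, new = set(old_exports), set(new_exports)
    removed = sorted(old - new)
    added = sorted(new - old)
    return removed, added


# Hypothetical symbol lists for two builds of a made-up library.
v1 = ["frob_open", "frob_close", "frob_read"]
v2 = ["frob_open", "frob_close", "frob_write"]

removed, added = abi_breakage(v1, v2)
print("breaks old programs on the new library:", removed)
print("breaks new programs on the old library:", added)
```

If both lists come back empty, the set of exported names is unchanged, which is necessary but not sufficient for the two builds to be interchangeable.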
Who cares about this SONAME? The dynamic linker/loader does, for one. This is a program that loads the programs that you want to run. When you do this:
$ /bin/echo 'Hello, world!'
very broadly speaking, what happens is that the kernel sees that this is an ELF file, then it looks for a piece of information called the “program interpreter” (or “interp” for short; readelf labels it “Requesting program interpreter”) and it calls that program with your program as its argument. Where’s this program?
$ readelf -l /bin/echo | fgrep 'Requesting program interpreter:'
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
If you look at that file, you’ll see that this is actually a symlink
into the directory /lib/x86_64-linux-gnu (again, Debian policy). And
yes, this is a program, not a library, even if it lives under /lib. If
you go ahead and run that program, you’ll see something like this:
$ /lib/x86_64-linux-gnu/ld-2.22.so
Usage: ld.so [OPTION]... EXECUTABLE-FILE [ARGS-FOR-PROGRAM...]
You have invoked `ld.so', the helper program for shared library
executables. This program usually lives in the file `/lib/ld.so', and
special directives in executable files using ELF shared libraries tell
the system's program loader to load the helper program from this file.
This helper program loads the shared libraries needed by the program
executable, prepares the program to run, and runs it. You may invoke
this helper program directly from the command line to load and run an
ELF executable file; this is like executing that file itself, but always
uses this helper program from the file you specified, instead of the
helper program file specified in the executable file you run. This is
mostly of use for maintainers to test new versions of this helper
program; chances are you did not intend to run this program.
  --list                list all dependencies and how they are resolved
  --verify              verify that given object really is a dynamically
                        linked object we can handle
  --inhibit-cache       Do not use /etc/ld.so.cache
  --library-path PATH   use given PATH instead of content of the
                        environment variable LD_LIBRARY_PATH
  --inhibit-rpath LIST  ignore RUNPATH and RPATH information in object
                        names in LIST
  --audit LIST          use objects named in LIST as auditors
That’s actually interesting! It tells you that you can call this program with another program as its argument… and something happens?
$ /lib/x86_64-linux-gnu/ld-2.22.so /bin/echo 'Hello, world!'
Hello, world!
Instead of letting the kernel read the contents of /bin/echo to locate
the desired interpreter, you are invoking an interpreter directly, and it
is executing your program. Notice that I said “an” and not “the”, as it
is sometimes useful to use an interpreter different than the one
indicated in the program, for testing new versions of the loader, as the
output itself suggests, or when you have your hands on an executable
compiled for a different system that has the loader in a different
location.
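To make the kernel's side of this a bit more concrete: the interpreter path is stored in the file as a program header of type PT_INTERP. The following Python sketch pulls it out by hand. It assumes a 64-bit little-endian ELF file (as on x86_64) and does no validation beyond the magic number; the function name `read_interp` is mine, not part of any tool.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def read_interp(data):
    """Return the interpreter path stored in an ELF image, or None.

    Assumes a 64-bit little-endian ELF file (such as x86_64 Linux).
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("only 64-bit little-endian ELF is handled here")

    # e_phoff lives at offset 32; e_phentsize and e_phnum at 54 and 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # p_type, p_flags and p_offset are the first three fields.
        p_type, _flags, p_offset = struct.unpack_from("<IIQ", data, base)
        if p_type == PT_INTERP:
            # p_filesz sits 32 bytes into the program header.
            (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
            return data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None
```

Reading /bin/echo in binary mode and passing its contents to `read_interp` should give the same path that readelf printed earlier, provided the file matches the assumptions above. A statically linked program has no such header, in which case the function returns None.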
This program has a handful of flags. Two of them should immediately
catch your eye: --verify and --list.
--verify tells you (via exit value) if the dynamic loader can handle
the binary in the argument. For example, this loader handles 64-bit
executables but it cannot handle 32-bit ones (there’s a different
loader for those; if you are interested, start by reading Debian’s
multiarch wiki page).
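Since the answer only comes back as an exit status, --verify is easiest to use from a script. Here is a minimal Python sketch; the helper name `loader_accepts` is hypothetical, and the loader path you pass in depends on your system (the one used in this post is /lib/x86_64-linux-gnu/ld-2.22.so). It treats an exit status of 0 as "yes", which is a simplification: as far as I recall, glibc uses other non-zero values to distinguish further cases.

```python
import subprocess


def loader_accepts(loader, program):
    """Ask a dynamic loader whether it can handle `program`.

    The answer comes back only as the exit status; 0 is taken to mean yes.
    """
    result = subprocess.run(
        [loader, "--verify", program],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Calling it with the 64-bit loader and a 32-bit executable would be expected to return False on a system like the one described here.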
--list ranks high on my favorites list. It tells you which files are
used to satisfy the binary’s NEEDED entries:
$ readelf -d /bin/echo | fgrep '(NEEDED)'
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
The NEEDED entries are located in the dynamic section of the ELF file.
The one here says that echo needs an object with the SONAME libc.so.6.
It doesn’t say which file it needs. This is where the dynamic loader
comes in: it looks at the SONAMEs needed by the binary and it searches
the libraries installed in the system for the ones that provide those
SONAMEs. --list then tells you which files are actually used when you
run the program:
$ /lib/x86_64-linux-gnu/ld-2.22.so --list /bin/echo
linux-vdso.so.1 (0x00007ffe303b8000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdc378dd000)
/lib64/ld-linux-x86-64.so.2 => /lib/x86_64-linux-gnu/ld-2.22.so (0x0000558ad7fa8000)
“Wait a minute! Isn’t that what ldd does?” So much so, that this is
exactly what ldd does (well, not exactly exactly, but I’ll get
there): in modern systems, ldd is a shell script that uses the
dynamic loader to do its work. (As a footnote, this is why it’s
important for you to not couple a script with a particular interpreter
using an extension like .sh, .pl or .py: you rarely care about how
something is implemented, you care about the interfaces it implements.)
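The lines printed by --list (and by ldd) are regular enough to turn into a mapping from requested name to resolved file. The Python sketch below does that for the three lines shown earlier. The regular expression is my own guess at the format based on that output, not something taken from glibc, and it deliberately skips lines it doesn't recognize (such as a "not found" entry).

```python
import re

# "name => path (0xaddr)", or just "name (0xaddr)" for objects without a path.
LINE = re.compile(r"^\s*(\S+)(?:\s+=>\s+(\S+))?\s+\((0x[0-9a-fA-F]+)\)\s*$")


def parse_list_output(text):
    """Map each requested name to the file that satisfied it (or None)."""
    resolved = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            name, path, _address = match.groups()
            resolved[name] = path
    return resolved


sample = """\
linux-vdso.so.1 (0x00007ffe303b8000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdc378dd000)
/lib64/ld-linux-x86-64.so.2 => /lib/x86_64-linux-gnu/ld-2.22.so (0x0000558ad7fa8000)
"""

for name, path in parse_list_output(sample).items():
    print(name, "->", path)
```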
There’s another flag in that list that is pretty nifty: --audit. But
that’s a post in and of itself, so I’ll not mention anything else about
it here.
Is that it? Not by a long shot.
My favorite interface to the dynamic loader isn’t the flags that you see
listed above, but environment variables, and among those my real
favorite is LD_DEBUG:
$ LD_DEBUG=help /lib/x86_64-linux-gnu/ld-2.22.so
Valid options for the LD_DEBUG environment variable are:
  libs        display library search paths
  reloc       display relocation processing
  files       display progress for input file
  symbols     display symbol table processing
  bindings    display information about symbol binding
  versions    display version dependencies
  scopes      display scope information
  all         all previous options combined
  statistics  display relocation statistics
  unused      determined unused DSOs
  help        display this help message and exit
To direct the debugging output into a file instead of standard output a
filename can be specified using the LD_DEBUG_OUTPUT environment
variable.
For example, libs will show you how the libraries are actually
located:
$ LD_DEBUG=libs /lib/x86_64-linux-gnu/ld-2.22.so \
--inhibit-cache /bin/echo 'Hello, world!'
24749: find library=libc.so.6 [0]; searching
24749: search path=/lib/x86_64-linux-gnu/tls/x86_64:/lib/x86_64-linux-gnu/tls:/lib/x86_64-linux-gnu/x86_64:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu/tls/x86_64:/usr/lib/x86_64-linux-gnu/tls:/usr/lib/x86_64-linux-gnu/x86_64:/usr/lib/x86_64-linux-gnu:/lib/tls/x86_64:/lib/tls:/lib/x86_64:/lib:/usr/lib/tls/x86_64:/usr/lib/tls:/usr/lib/x86_64:/usr/lib (system search path)
24749: trying file=/lib/x86_64-linux-gnu/tls/x86_64/libc.so.6
24749: trying file=/lib/x86_64-linux-gnu/tls/libc.so.6
24749: trying file=/lib/x86_64-linux-gnu/x86_64/libc.so.6
24749: trying file=/lib/x86_64-linux-gnu/libc.so.6
24749:
24749:
24749: calling init: /lib/x86_64-linux-gnu/libc.so.6
24749:
24749:
24749: initialize program: /bin/echo
24749:
24749:
24749: transferring control: /bin/echo
24749:
Hello, world!
As you can see, the dynamic loader has a path that it uses to search for
libraries. There are several mechanisms used to configure this path,
but describing them goes beyond the scope of this post. Same goes for
why it’s looking for directories called tls and x86_64. If you are
interested, look in the manual page for ld.so(8).
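The "trying file=" lines in the trace are the visible part of a simple first-match walk over the search path. A rough Python model of that walk follows. It is only a sketch: the real loader also checks that each candidate is a usable ELF object, and the function name `find_library` and its `exists` parameter are inventions for this example (the parameter lets you try the logic without touching the filesystem).

```python
import os


def find_library(soname, search_path, exists=os.path.exists):
    """Return (path, tried): the first existing candidate and every probe made.

    `search_path` is a colon-separated list of directories, as in the
    "search path=" line of the trace.
    """
    tried = []
    for directory in search_path.split(":"):
        candidate = os.path.join(directory, soname)
        tried.append(candidate)
        if exists(candidate):
            return candidate, tried
    return None, tried


# A shortened version of the search path from the trace.
path = ":".join([
    "/lib/x86_64-linux-gnu/tls/x86_64",
    "/lib/x86_64-linux-gnu/tls",
    "/lib/x86_64-linux-gnu/x86_64",
    "/lib/x86_64-linux-gnu",
    "/usr/lib/x86_64-linux-gnu",
])
# Pretend only one file exists, as on the system in the post.
installed = {"/lib/x86_64-linux-gnu/libc.so.6"}

found, tried = find_library("libc.so.6", path, exists=installed.__contains__)
for candidate in tried:
    print("trying file=" + candidate)
print("found:", found)
```

With these inputs the walk stops at the fourth directory, the same point where the real trace stops.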
The dynamic loader will perform this search for each and every SONAME listed in the ELF file, meaning that programs linked to lots and lots of libraries will load slightly slower than programs with fewer of them (more on this below).
You probably noticed that I sneaked in the --inhibit-cache flag there.
Since you are running programs much more often than you are adding
shared libraries to the system, instead of repeating this process each
time you call a program, there’s a cache (/etc/ld.so.cache) that you
can update by calling ldconfig(8).
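The idea behind the cache is the usual trade: scan the directories once, then answer every later question with a single lookup. The Python sketch below models only that idea. It is not the format of /etc/ld.so.cache, and it uses file names as a stand-in for SONAMEs, whereas ldconfig reads the SONAME out of each file. The name `build_cache` and the fake directory listing are hypothetical.

```python
import os


def build_cache(directories, list_dir=os.listdir):
    """Scan `directories` once and remember where each library name lives.

    Earlier directories win, mirroring a first-match search.
    """
    cache = {}
    for directory in directories:
        try:
            names = list_dir(directory)
        except OSError:
            continue  # missing or unreadable directory: nothing to record
        for name in sorted(names):
            if ".so" in name:
                cache.setdefault(name, os.path.join(directory, name))
    return cache


# A made-up directory listing, so the example does not depend on the host.
fake_fs = {
    "/lib": ["libc.so.6", "README"],
    "/usr/lib": ["libc.so.6", "libz.so.1"],
}
cache = build_cache(["/lib", "/usr/lib"], list_dir=lambda d: fake_fs.get(d, []))

# One dictionary lookup replaces the directory walk.
print(cache.get("libz.so.1"))
print(cache.get("libc.so.6"))
```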
It’s worth noting that you do not need to invoke the dynamic loader by hand to use this functionality; it’s enough to set the environment variable to the correct value, like this:
$ LD_DEBUG=libs /bin/echo 'Hello, world!'
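The same thing works from a script: set the variable in the child's environment and collect the trace separately from the program's own output. The Python sketch below assumes that the loader writes its trace to standard error, which is what I have seen with glibc; on a system with a different C library the trace may simply be empty. The helper name `run_with_ld_debug` is made up.

```python
import os
import subprocess


def run_with_ld_debug(argv, categories="libs"):
    """Run argv with LD_DEBUG set; return (stdout, stderr) as text."""
    env = dict(os.environ, LD_DEBUG=categories)
    result = subprocess.run(argv, env=env, capture_output=True, text=True)
    return result.stdout, result.stderr


out, trace = run_with_ld_debug(["/bin/echo", "Hello, world!"])
print(out, end="")
print(len(trace.splitlines()), "lines of loader trace")
```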
Now that the loader has found all the libraries that the program
requires, it proceeds to process the symbols indicated in the program.
This process is absolutely fascinating and full of quirks and turns, and
I cannot cover all its details in here, so I’ll limit myself to briefly
examining the output of the loader when you use the symbols option.
This output is way too long for me to include it here, so I’ll just show
a short illustrative snippet:
$ LD_DEBUG=symbols /bin/echo
25379: symbol=_res; lookup in file=/bin/echo [0]
25379: symbol=_res; lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
25379: symbol=stderr; lookup in file=/bin/echo [0]
25379: symbol=error_one_per_line; lookup in file=/bin/echo [0]
25379: symbol=error_one_per_line; lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
...
What’s happening here is that for each undefined symbol in the loaded objects (not just the program, but all the objects that have been loaded up to this point), the dynamic loader is looking for that symbol in all the loaded objects until it finds one that matches the missing one. This is actually the reason why a program linking to lots and lots of shared libraries will load slower: the loader has to scan through a larger set of symbols to decide which one it can use for the missing ones. Relatively speaking, it wasn’t that long ago that this process was optimized to use better data structures to solve this problem (absolutely speaking, it was several years ago).
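A toy model makes the cost visible. In the Python sketch below each loaded object is just a set of symbol names, and every "lookup in file=" line of the trace corresponds to one probe. The contents of the sets and the libpad names are invented, and the real loader uses hash tables within each object rather than anything this naive; the point is only how the number of probes grows with the number of objects.

```python
def resolve(undefined, scope):
    """Resolve each undefined symbol against `scope`, in order.

    `scope` is a list of (object_name, set_of_defined_symbols) pairs.
    Returns (bindings, probes): where each symbol was found (None if
    nowhere) and how many per-object lookups that took in total.
    """
    bindings = {}
    probes = 0
    for symbol in undefined:
        bindings[symbol] = None
        for object_name, defined in scope:
            probes += 1
            if symbol in defined:
                bindings[symbol] = object_name
                break
    return bindings, probes


wanted = ["_res", "stderr", "error_one_per_line"]

# Made-up symbol sets, arranged to mirror the five trace lines.
scope = [
    ("/bin/echo", {"stderr"}),
    ("libc.so.6", {"_res", "error_one_per_line", "getenv", "free"}),
]
bindings, probes = resolve(wanted, scope)
print(bindings, probes)

# The same symbols with three unrelated (hypothetical) libraries loaded first.
padded = scope[:1] + [("libpad%d.so" % i, set()) for i in range(3)] + scope[1:]
print(resolve(wanted, padded)[1])
```

The first call needs 5 probes and the second 11, even though the extra libraries define nothing the program uses.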
You can find the missing symbols using readelf, like this:
$ readelf -s /bin/echo
Symbol table '.dynsym' contains 57 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __uflow@GLIBC_2.2.5 (2)
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND getenv@GLIBC_2.2.5 (2)
3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND free@GLIBC_2.2.5 (2)
...
Here you see that getenv and free are undefined. You can also see
that /bin/echo wants specific versions of those symbols.
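If you want that list in a form you can work with, the rows are easy to pick apart. The Python sketch below extracts the undefined symbols and their versions from text shaped like the readelf output above. The column positions are assumed from that sample, and both function names are my own.

```python
def split_versioned(name):
    """Split 'getenv@GLIBC_2.2.5' into ('getenv', 'GLIBC_2.2.5').

    Names without a version come back as (name, None).
    """
    symbol, at, version = name.partition("@")
    return symbol, (version.lstrip("@") or None) if at else None


def undefined_symbols(readelf_output):
    """Yield (symbol, version) for every UND row of `readelf -s` output."""
    for line in readelf_output.splitlines():
        fields = line.split()
        # Rows look like: Num: Value Size Type Bind Vis Ndx Name [(n)]
        if (len(fields) >= 8
                and fields[0].rstrip(":").isdigit()
                and fields[6] == "UND"):
            yield split_versioned(fields[7])


sample = """\
Symbol table '.dynsym' contains 57 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __uflow@GLIBC_2.2.5 (2)
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND getenv@GLIBC_2.2.5 (2)
3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND free@GLIBC_2.2.5 (2)
"""

for symbol, version in undefined_symbols(sample):
    print(symbol, version)
```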
If you look closely at the output of symbols, you’ll see that the
dynamic loader is looking first in the ELF file itself, and then in
the loaded libraries. The practical effect of this is that the order in
which you pass -l flags to your compiler is important because that
will determine the order in which the loader uses the libraries for
symbol resolution. The less practical effect is that, if you know how,
you could define a symbol in your program that overrides symbols in the
libraries that your program is linking with (a hint for one possible way
to do this is found in Jessie’s post).
There’s another helpful environment variable that the dynamic loader
pays attention to: LD_TRACE_LOADED_OBJECTS. This is what ldd
actually uses to display the list of libraries that are loaded. If you
compare the output of setting this variable and the output of --list
you’ll notice that they are slightly different. ldd will optionally
set LD_VERBOSE=1 (to output version information) and LD_DEBUG=unused
(to output unused shared libraries).
Another favorite of mine is LD_PRELOAD. It is used to specify the name
of a library to be loaded before any other library. This is useful if
you want to replace a function provided in a library with a different
implementation. This is done for example by the program fakeroot.
Among other things, fakeroot replaces the function getuid(2) with a
version that returns 0, no matter which user is calling the program,
meaning that the code will think it’s running as the root user. This is
extensively used in Debian to be able to package programs without having
to depend on administrative access to the host where the packaging
scripts are running (which is a really good thing™). In the past I used
LD_PRELOAD to implement an OpenGL debugger, which provided all the
OpenGL library functions, logged the calls and routed the calls to the
actual OpenGL library in the system by dlopen’ing the running program.
Using a similar concept you could intercept calls to modify the
program’s behavior at runtime, without recompiling, for example to
change the blending mode and add transparency to all the rendered
fragments (I’ll leave the reason as to why you would want to do this up
to your imagination).
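Why preloading works follows from the lookup order: a preloaded library sits ahead of the libraries the program asked for, so its definition of a symbol is found first. The Python sketch below models just that ordering. The export tables are invented, libfakeroot.so is a stand-in name rather than the real file fakeroot installs, and /bin/id is only an example of a program that calls getuid.

```python
def lookup_scope(program, needed, preload=""):
    """Return the order in which objects are searched for a symbol.

    `preload` mimics LD_PRELOAD: a whitespace- or colon-separated list
    of libraries placed right after the program and ahead of `needed`.
    """
    preloaded = preload.replace(":", " ").split()
    return [program] + preloaded + list(needed)


def bind(symbol, scope, exports):
    """Pick the first object in `scope` whose export set has `symbol`."""
    for obj in scope:
        if symbol in exports.get(obj, ()):
            return obj
    return None


# Hypothetical export tables; libfakeroot.so is a stand-in name.
exports = {
    "libfakeroot.so": {"getuid"},
    "libc.so.6": {"getuid", "getenv"},
}

plain = lookup_scope("/bin/id", ["libc.so.6"])
faked = lookup_scope("/bin/id", ["libc.so.6"], preload="libfakeroot.so")
print(bind("getuid", plain, exports))
print(bind("getuid", faked, exports))
print(bind("getenv", faked, exports))
```

Only the symbols that the preloaded library defines are intercepted; everything else still resolves to the original library.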
The documentation for the linker and the dynamic loader is full of golden nuggets like these. When I started looking at this topic so many years ago Google didn’t exist as a company and blog posts were still several years in the future, and I know I wish someone had given me a roadmap for the things I needed to read to understand the topic. Instead of all that, I took the long way around and I read source code. Today you have Google, you have blog posts and papers, and you have nicely organized documentation. I encourage you to follow the links in this post and dig into them!