ld.so: glibc's dynamic linker/loader

The post “10 LDFLAGS I love” by Jessie Frazelle reminded me that I know a thing or two about the dynamic linker that many people using Linux rarely need, but when they need them, it’s because they really need them. I came by this knowledge a couple of decades ago, back when I was much more active within Debian than I am today. Debian has always paid a lot of attention to handling shared libraries, and the distribution has pretty strict rules as to how to package them, which led me to dig a bit deeper into how this part of the system works. Very little in this post is specific to Debian and most of the concepts are applicable to systems other than Linux.

The first thing you’ll learn reading through Debian’s policy regarding shared libraries is that each library has a shared object name, or SONAME for short. You can obtain this information by using readelf or objdump, both part of binutils. The library’s SONAME is stored in the dynamic section of the ELF file. For example, consider libz:

$ readelf -d /lib/x86_64-linux-gnu/libz.so.1.2.8 | fgrep '(SONAME)'
 0x000000000000000e (SONAME)             Library soname: [libz.so.1]

In Debian, the library is found in the /lib/x86_64-linux-gnu directory (this also comes from policy).
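
The same field can be read with objdump, the other tool mentioned above. This is a minimal sketch, assuming a glibc-based system where ldd reports a libc.so path for /bin/sh. libc is used only because it is nearly always installed, and the variable name lib is just for illustration:

```shell
# Find the libc file that /bin/sh is linked against (assumes glibc naming).
lib=$(ldd /bin/sh | awk '/libc\.so/ { print $3; exit }')

# -p prints the object's private headers, which include the dynamic section.
objdump -p "$lib" | grep SONAME
```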

Note how the name includes a number (1), but not the full version number (1.2.8). You’ll find libraries where this number doesn’t match the release number, like libpng 1.6.23, which ships a library called libpng16.so.16. The SONAME is a convention that is meant to identify a library’s binary interface (ABI). It changes when there are incompatible changes, for example, if you remove a function (programs linked against old versions with the function wouldn’t run with newer libraries without the function). You also want to change the SONAME when you add functions (programs linked against newer versions with the function wouldn’t run with older libraries without the function). Historically some developers have shown resistance to changing the number in the SONAME when required, and this has led to a lot of trouble and pain. There are ways to avoid changing the number even when it’s necessary to do so, but that requires some planning ahead of time (if you are interested, read at least section 3 of Ulrich Drepper’s paper “How To Write Shared Libraries”).

Who cares about this SONAME? The dynamic linker/loader does, for one. This is a program that loads the programs that you want to run. When you do this:

$ /bin/echo 'Hello, world!'

very broadly speaking, what happens is that the kernel sees that this is an ELF file, then it looks for a piece of information called the “program interpreter” (or “interp” for short; readelf labels it “Requesting program interpreter”) and it calls that program with your program as argument. Where’s this program?

$ readelf -l /bin/echo | fgrep 'Requesting program interpreter:'
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]

If you look at that file, you’ll see that this is actually a symlink into the directory /lib/x86_64-linux-gnu (again, Debian policy). And yes, this is a program, not a library, even if it lives under /lib. If you go ahead and run that program, you’ll see something like this:

$ /lib/x86_64-linux-gnu/ld-2.22.so
Usage: ld.so [OPTION]... EXECUTABLE-FILE [ARGS-FOR-PROGRAM...]
You have invoked `ld.so', the helper program for shared library
executables.  This program usually lives in the file `/lib/ld.so', and
special directives in executable files using ELF shared libraries tell
the system's program loader to load the helper program from this file.
This helper program loads the shared libraries needed by the program
executable, prepares the program to run, and runs it.  You may invoke
this helper program directly from the command line to load and run an
ELF executable file; this is like executing that file itself, but always
uses this helper program from the file you specified, instead of the
helper program file specified in the executable file you run.  This is
mostly of use for maintainers to test new versions of this helper
program; chances are you did not intend to run this program.

  --list                list all dependencies and how they are resolved
  --verify              verify that given object really is a dynamically
                        linked object we can handle
  --inhibit-cache       Do not use /etc/ld.so.cache
  --library-path PATH   use given PATH instead of content of the
                        environment variable LD_LIBRARY_PATH
  --inhibit-rpath LIST  ignore RUNPATH and RPATH information in object
                        names in LIST
  --audit LIST          use objects named in LIST as auditors

That’s actually interesting! It tells you that you can call this program with another program as its argument… and something happens?

$ /lib/x86_64-linux-gnu/ld-2.22.so /bin/echo 'Hello, world!'
Hello, world!

Instead of letting the kernel read the contents of /bin/echo to locate the desired interpreter, you are invoking an interpreter directly, and it is executing your program. Notice that I said “an” and not “the”, as it is sometimes useful to use an interpreter different from the one indicated in the program, for testing new versions of the loader, as the output itself suggests, or when you have your hands on an executable compiled for a different system that has the loader in a different location.
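
The ld-2.22.so file name above is tied to one glibc version, so it will differ on your system. This is a small sketch that reads the interpreter path from the binary instead of hard-coding it. The sed expression and the variable name interp are my own choices, and it assumes readelf prints the line in the format shown earlier:

```shell
# Extract the interpreter path that readelf reports for /bin/echo.
interp=$(readelf -l /bin/echo |
	sed -n 's/.*Requesting program interpreter: \(.*\)\]/\1/p')
echo "interpreter: $interp"

# Run the program through that loader explicitly.
"$interp" /bin/echo 'Hello, world!'
```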

This program has a handful of flags. Two of them should immediately catch your eye: --verify and --list.

--verify tells you (via exit value) if the dynamic loader can handle the binary in the argument. For example, this loader handles 64-bit executables but it cannot handle 32-bit ones (there’s a different loader for those; if you are interested, start by reading Debian’s multiarch wiki page).
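
Since --verify prints nothing, you have to look at the exit status yourself. This is a sketch under the same assumptions as before (readelf available, interpreter path parsed with a sed expression of my own); a zero status means the loader accepts the file:

```shell
# Locate the loader named in /bin/echo rather than hard-coding its path.
interp=$(readelf -l /bin/echo |
	sed -n 's/.*Requesting program interpreter: \(.*\)\]/\1/p')

# --verify is silent; the answer is in the exit status.
if "$interp" --verify /bin/echo; then
	echo "/bin/echo: can be loaded by $interp"
else
	echo "/bin/echo: cannot be loaded by $interp"
fi
```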

--list ranks high on my favorites list. It tells you which files are used to satisfy the binary’s NEEDs:

$ readelf -d /bin/echo | fgrep '(NEEDED)'
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]

The NEEDED entries are located in the dynamic section of the ELF file. The one here says that echo needs an object with the SONAME libc.so.6. It doesn’t say which file it needs. This is where the dynamic loader comes in: it looks at the SONAMEs needed by the binary and it looks at the libraries installed in the system looking for the ones that provide those SONAMEs. --list then tells you which files are actually used when you run the program:

$ /lib/x86_64-linux-gnu/ld-2.22.so --list /bin/echo
        linux-vdso.so.1 (0x00007ffe303b8000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdc378dd000)
        /lib64/ld-linux-x86-64.so.2 => /lib/x86_64-linux-gnu/ld-2.22.so (0x0000558ad7fa8000)

“Wait a minute! Isn’t that what ldd does?” So much so, that this is exactly what ldd does (well, not exactly exactly, but I’ll get there): on modern systems, ldd is a shell script that uses the dynamic loader to do its work. (As a footnote, this is why it’s important for you to not couple a script with a particular interpreter using an extension like .sh, .pl or .py: you rarely care about how something is implemented, you care about the interfaces it implements.)

There’s another flag in that list that is pretty nifty: --audit. But that’s a post in and by itself, so I’ll not mention anything else about it here.

Is that it? Not by a long shot.

My favorite interface to the dynamic loader isn’t the flags that you see listed above, but environment variables, and among those my real favorite is LD_DEBUG:

$ LD_DEBUG=help /lib/x86_64-linux-gnu/ld-2.22.so
Valid options for the LD_DEBUG environment variable are:

  libs        display library search paths
  reloc       display relocation processing
  files       display progress for input file
  symbols     display symbol table processing
  bindings    display information about symbol binding
  versions    display version dependencies
  scopes      display scope information
  all         all previous options combined
  statistics  display relocation statistics
  unused      determined unused DSOs
  help        display this help message and exit

To direct the debugging output into a file instead of standard output a
filename can be specified using the LD_DEBUG_OUTPUT environment
variable.

For example, libs will show you how the libraries are actually located:

$ LD_DEBUG=libs /lib/x86_64-linux-gnu/ld-2.22.so \
	--inhibit-cache /bin/echo 'Hello, world!'
     24749:     find library=libc.so.6 [0]; searching
     24749:      search path=/lib/x86_64-linux-gnu/tls/x86_64:/lib/x86_64-linux-gnu/tls:/lib/x86_64-linux-gnu/x86_64:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu/tls/x86_64:/usr/lib/x86_64-linux-gnu/tls:/usr/lib/x86_64-linux-gnu/x86_64:/usr/lib/x86_64-linux-gnu:/lib/tls/x86_64:/lib/tls:/lib/x86_64:/lib:/usr/lib/tls/x86_64:/usr/lib/tls:/usr/lib/x86_64:/usr/lib          (system search path)
     24749:       trying file=/lib/x86_64-linux-gnu/tls/x86_64/libc.so.6
     24749:       trying file=/lib/x86_64-linux-gnu/tls/libc.so.6
     24749:       trying file=/lib/x86_64-linux-gnu/x86_64/libc.so.6
     24749:       trying file=/lib/x86_64-linux-gnu/libc.so.6
     24749:
     24749:
     24749:     calling init: /lib/x86_64-linux-gnu/libc.so.6
     24749:
     24749:
     24749:     initialize program: /bin/echo
     24749:
     24749:
     24749:     transferring control: /bin/echo
     24749:
Hello, world!

As you can see, the dynamic loader has a path that it uses to search for libraries. There are several mechanisms used to configure this path, but describing them goes beyond the scope of this post. Same goes for why it’s looking for directories called tls and x86_64. If you are interested, look in the manual page for ld.so(8).

The dynamic loader will perform this search for each and every SONAME listed in the ELF file, meaning that programs linked to lots and lots of libraries will load slightly slower than programs with fewer of those (more on this below).

You probably noticed that I sneaked in the --inhibit-cache flag there. Since you are running programs much more often than you are adding shared libraries to the system, instead of repeating this process each time you call a program, there’s a cache (/etc/ld.so.cache) that you can update by calling ldconfig(8). The flag tells the loader to ignore that cache, which is why the full directory search shows up in the output above.
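
To see which file the cache currently maps a SONAME to, you can ask ldconfig to print it. This is a small sketch; /sbin/ldconfig is spelled out because /sbin is often missing from a regular user’s PATH, and libc.so.6 is just an example SONAME:

```shell
# -p (--print-cache) lists the SONAME-to-file mappings stored in the cache.
/sbin/ldconfig -p | grep 'libc\.so\.6'
```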

It’s worth noting that you do not need to invoke the dynamic loader by hand to use this functionality; it’s enough to set the environment variable to the correct value, like this:

$ LD_DEBUG=libs /bin/echo 'Hello, world!'
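
The debugging lines get mixed with the program’s own output, which becomes noisy quickly. The LD_DEBUG_OUTPUT variable mentioned in the help text sends them to a file instead. According to ld.so(8), the loader appends a dot and the process ID to the name you give. This sketch uses a temporary directory, and the ld-debug prefix is an arbitrary choice:

```shell
# Keep the trace files in a scratch directory.
outdir=$(mktemp -d)

# Only "Hello, world!" reaches the terminal; the trace goes to a file.
LD_DEBUG=libs LD_DEBUG_OUTPUT="$outdir/ld-debug" /bin/echo 'Hello, world!'

# The file name ends in the PID of the traced process.
ls "$outdir"
cat "$outdir"/ld-debug.*
```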

Now that the loader has found all the libraries that the program requires, it proceeds to process the symbols indicated in the program. This process is absolutely fascinating and full of quirks and turns, and I cannot cover all its details here, so I’ll limit myself to briefly examining the output of the loader when you use the symbols option. This output is way too long for me to include it here, so I’ll just show a short illustrative snippet:

$ LD_DEBUG=symbols /bin/echo
     25379:     symbol=_res;  lookup in file=/bin/echo [0]
     25379:     symbol=_res;  lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
     25379:     symbol=stderr;  lookup in file=/bin/echo [0]
     25379:     symbol=error_one_per_line;  lookup in file=/bin/echo [0]
     25379:     symbol=error_one_per_line;  lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
     ...

What’s happening here is that for each undefined symbol in the loaded objects (not just the program, but all the objects that have been loaded up to this point), the dynamic loader is looking for that symbol in all the loaded objects until it finds one that matches the missing one. This is actually the reason why a program linking to lots and lots of shared libraries will load slower: the loader has to scan through a larger set of symbols to decide which one it can use for the missing ones. Relatively speaking, it wasn’t that long ago that this process was optimized to use better data structures to solve this problem (absolutely speaking, it was several years ago).

You can find the missing symbols using readelf, like this:

$ readelf -s /bin/echo

Symbol table '.dynsym' contains 57 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __uflow@GLIBC_2.2.5 (2)
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND getenv@GLIBC_2.2.5 (2)
     3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND free@GLIBC_2.2.5 (2)
     ...

Here you see that getenv and free are undefined. You can also see that /bin/echo wants specific versions of those symbols.
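
If you only want the undefined entries, nm (also part of binutils) can filter them for you. This is a short sketch; the head is there just to keep the output brief:

```shell
# -D reads the dynamic symbol table; --undefined-only keeps the UND entries.
nm -D --undefined-only /bin/echo | head
```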

If you look closely at the output of symbols, you’ll see that the dynamic loader is looking first in the ELF file itself, and then in the loaded libraries. The practical effect of this is that the order in which you pass -l flags to your compiler is important because that will determine the order in which the loader uses the libraries for symbol resolution. The less practical effect is that, if you know how, you could define a symbol in your program that overrides symbols in the libraries that your program is linking with (a hint for one possible way to do this is found in Jessie’s post).

There’s another helpful environment variable that the dynamic loader pays attention to: LD_TRACE_LOADED_OBJECTS. This is what ldd actually uses to display the list of libraries that are loaded. If you compare the output of setting this variable and the output of --list you’ll notice that they are slightly different. ldd will optionally set LD_VERBOSE=1 (to output version information) and LD_DEBUG=unused (to output unused shared libraries).
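
Setting the variable by hand looks like this. It is a minimal sketch, assuming /bin/echo is dynamically linked against glibc:

```shell
# With the variable set, the loader prints the resolved libraries and exits
# without running the program itself, so no greeting is printed.
LD_TRACE_LOADED_OBJECTS=1 /bin/echo 'Hello, world!'
```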

Another favorite of mine is LD_PRELOAD. It is used to specify the name of a library to be loaded before any other library. This is useful if you want to replace a function provided in a library with a different implementation. This is done for example by the program fakeroot. Among other things, fakeroot replaces the function getuid(2) with a version that returns 0, no matter which user is calling the program, meaning that the code will think it’s running as the root user. This is extensively used in Debian to be able to package programs without having to depend on administrative access to the host where the packaging scripts are running (which is a really good thing™). In the past I used LD_PRELOAD to implement an OpenGL debugger, which provided all the OpenGL library functions, logged the calls and routed the calls to the actual OpenGL library in the system by dlopen’ing the running program. Using a similar concept you could intercept calls to modify the program’s behavior at runtime, without recompiling, for example to change the blending mode and add transparency to all the rendered fragments (I’ll leave the reason as to why you would want to do this up to your imagination).

The documentation for the linker and the dynamic loader is full of golden nuggets like these. When I started looking at this topic so many years ago Google didn’t exist as a company and blog posts were still several years ahead in the future, and I know I wish someone had given me a roadmap for the things I needed to read to understand the topic. Instead of all that, I took the long way around and I read source code. Today you have Google, you have blog posts and papers, and you have nicely organized documentation. I encourage you to follow the links in this post and dig into them!