This is a long-overdue follow-up to last year’s post on glibc’s dynamic linker/loader. With the Go 1.8 release around the corner, which adds support for dynamic plugins, this is a good time to revisit the topic.

Plugins in Go are described by Ian Lance Taylor in the “Go Execution Modes” design document. They are part of a larger effort to support dynamic shared objects in Go. As Ian points out, Go 1.4 supported three execution modes:

  • A statically linked Go binary
  • A dynamically linked Go binary, linked with the C library for DNS and user name lookups
  • A Go binary linked with arbitrary non-Go code, either statically or dynamically

Simplifying a little bit, if you have never cared about cgo, you’ve likely been working with the first or second mode without being aware of it. Go 1.5 added support for an alternative DNS resolver that doesn’t require cgo, meaning the second mode is not as tightly coupled with DNS as it used to be (if you do want to use the C library’s resolver, you still need it). Cgo support is required for the second mode, but if your code doesn’t use packages that depend on cgo, you might have cgo support in the compiler and still have binaries working in the first mode. It’s even possible to force the use of the Go DNS resolver via build tags.

Go 1.5 also added the following modes:

  • Go code linked into, and called from, a non-Go program
  • Go code as a shared library plugin with a C style API
  • Building Go packages as a shared library

The first one is your Go code as a static library that you can link into another program and call it with a C style API. The second one is your Go code acting as a shared object loaded at runtime by another program. And the third one is your Go code built as a shared object loaded by the dynamic linker at the program’s load time. The last two are very similar, but not exactly the same thing.

Go 1.6 added another mode:

  • A Go program built as a PIE

In this mode your Go code is built in such a way that it’s position independent, meaning it can be loaded at any memory address and it will work, which is something that security-conscious applications care about.

There’s still one mode listed in the design document missing in 1.7:

  • Go code that uses a shared library plugin

This is the new mode implemented in Go 1.8: you can write a plugin in Go, and you can load that plugin from your Go program. In order to support this the plugin package was added to the standard library.

Plugins

Since it’s the new shiny feature, let’s take a look at plugins first.

This program snippet is mostly what’s required to work with a plugin:

// Load the DSO specified by a filename.
p, err := plugin.Open(fn)
if err != nil {
	fmt.Printf("plugin.Open: %s\n", err)
	return
}

// Look up a symbol named "Hello". If we get something, we don't know
// what we got (a variable or function).
h, err := p.Lookup("Hello")
if err != nil {
	fmt.Printf("p.Lookup: %s\n", err)
	return
}

// Msger is an interface that specifies that a func Msg() string must be
// implemented. Type-assert that interface to verify that the symbol
// obtained above is of the correct kind.
m, ok := h.(Msger)
if !ok {
	fmt.Println("E: Expecting Msger interface, but got something else.")
	return
}

// We have what we want. Use it.
fmt.Printf("%s: %s\n", fn, m.Msg())

(full code here)

You can compile the above code in the usual way, e.g. go build demo.go. If you do this you’ll notice something interesting:

$ go build demo.go

$ file demo
demo: [...], dynamically linked, [...]

$ readelf -d demo | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]

This program needs to import "plugin", and that’s what causes it to be dynamically linked. libdl.so.2 is a C library that provides the necessary functionality to load code at runtime.

What about the plugin itself? In the example above, the program expects the loaded plugin to have a symbol named “Hello” and that symbol should be a variable implementing a specific interface. The information required to make this determination is stored in the plugin.

One possible such plugin might look like this:

package main

import "C"

type EnglishMsger struct{}

func (EnglishMsger) Msg() string {
	return "hello, world! (from plugin)"
}

var Hello EnglishMsger

(full code here)

Another one is this:

package main

import "C"

type SpanishMsger struct{}

func (SpanishMsger) Msg() string {
	return "¡hola, mundo! (desde un plugin)"
}

var Hello SpanishMsger

(full code here)

Either one will work, and which one is loaded is determined at runtime. Note that the only thing that these plugins have in common is that both have a global variable named Hello. The types of the variables are different, but they satisfy the same interface. You could have used a function, and the only thing different would have been that you would need to type-assert to a different type.
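To make the function case concrete without having to build a plugin, here is a minimal sketch that simulates what a lookup hands back. It assumes the behaviour documented for the plugin package: a variable comes back as a pointer to that variable, and a function comes back as the function value itself. The symbols map, the lookup helper and HelloFunc are hypothetical stand-ins, not part of the demo code.

```go
package main

import (
	"errors"
	"fmt"
)

// Msger is the interface the host program expects.
type Msger interface {
	Msg() string
}

// EnglishMsger mirrors the type defined in the plugin.
type EnglishMsger struct{}

func (EnglishMsger) Msg() string { return "hello, world! (from variable)" }

// Hello is the variable flavour of the exported symbol.
var Hello EnglishMsger

// HelloFunc is a hypothetical function flavour of the exported symbol.
func HelloFunc() string { return "hello, world! (from function)" }

// symbols is a hypothetical stand-in for the symbol table of a loaded plugin.
var symbols = map[string]interface{}{
	"Hello":     &Hello,    // variables come back as pointers
	"HelloFunc": HelloFunc, // functions come back as function values
}

// lookup mimics the shape of (*plugin.Plugin).Lookup.
func lookup(name string) (interface{}, error) {
	s, ok := symbols[name]
	if !ok {
		return nil, errors.New("symbol not found: " + name)
	}
	return s, nil
}

// message accepts either flavour; only the type assertion differs.
func message(name string) (string, error) {
	sym, err := lookup(name)
	if err != nil {
		return "", err
	}
	switch v := sym.(type) {
	case Msger:
		return v.Msg(), nil
	case func() string:
		return v(), nil
	default:
		return "", fmt.Errorf("unexpected type %T for symbol %q", sym, name)
	}
}

func main() {
	for _, name := range []string{"Hello", "HelloFunc"} {
		msg, err := message(name)
		if err != nil {
			fmt.Println("E:", err)
			continue
		}
		fmt.Printf("%s: %s\n", name, msg)
	}
}
```

The pointer to Hello satisfies Msger because methods declared on a value receiver are also in the method set of the pointer type.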

These plugins can be compiled like this:

$ go build -buildmode=plugin english.go

$ go build -buildmode=plugin spanish.go

This will produce files called english.so and spanish.so respectively, but you can use -o and name your plugin whatever you want (if english.plugin is what you want, that works, the file doesn’t even need to be named .so).

With the program shown above, you can use these plugins like this:

$ ./demo english.so
hello, world! (from plugin)

$ ./demo spanish.so
¡hola, mundo! (desde un plugin)

Needless to say, you can implement much more than a multilingual hello world.

I’d like to emphasize a point here: you can look up exported functions or variables, and you must type-assert them before being able to use them. If your symbol is a function, you can use it like any other function value. If your symbol is a variable, you can use it like you would use any other instance of the corresponding type: you can read the value, you can call the methods defined for that type, and if it satisfies an interface, you can use it anywhere the interface is valid.

In short, to use a plugin, all you have to do is:

  • Load the plugin
  • Lookup a symbol by name
  • Type-assert the symbol to the type that you expect
  • Use the loaded code
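The four steps above can be sketched as one small, complete program. This is only a sketch: the loadMsg helper and the usage message are my own naming, while plugin.Open, Lookup and the Msger interface are the ones discussed in the text.

```go
package main

import (
	"fmt"
	"os"
	"plugin"
)

// Msger is the interface the plugin's Hello symbol must satisfy.
type Msger interface {
	Msg() string
}

// loadMsg is a hypothetical helper that runs the four steps in order.
func loadMsg(path string) (string, error) {
	// 1. Load the plugin.
	p, err := plugin.Open(path)
	if err != nil {
		return "", fmt.Errorf("plugin.Open: %w", err)
	}

	// 2. Look up a symbol by name.
	sym, err := p.Lookup("Hello")
	if err != nil {
		return "", fmt.Errorf("p.Lookup: %w", err)
	}

	// 3. Type-assert the symbol to the type that you expect.
	m, ok := sym.(Msger)
	if !ok {
		return "", fmt.Errorf("symbol Hello has type %T, which does not implement Msger", sym)
	}

	// 4. Use the loaded code.
	return m.Msg(), nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Printf("usage: %s plugin.so\n", os.Args[0])
		return
	}
	msg, err := loadMsg(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "E:", err)
		os.Exit(1)
	}
	fmt.Printf("%s: %s\n", os.Args[1], msg)
}
```

Returning an error from each step, instead of printing and returning early, keeps the loading logic reusable from code other than main.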

Not your regular dlopen

If you are familiar with how the equivalent C code would work, you might have noticed a couple of things.

First, I’m using “english.so” as the argument to the demo program, and that gets passed to plugin.Open. With your regular dlopen usage, if you pass a path without a slash in it, it will apply certain lookup rules to locate the file. plugin.Open is not dlopen and it won’t apply those rules. Instead it will take whatever path you pass to it and canonicalize it: it will remove any . and .. it might contain, traverse any symlinks and make it absolute. So the example above behaves as if I had used $PWD/english.so instead.
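To get a feel for what that canonicalization does to a path, here is a rough approximation using only path/filepath. It is not what plugin.Open runs internally (as far as I can tell it relies on the C library’s realpath), and the canonicalize name is mine. One difference worth knowing: filepath.Abs removes .. lexically before symlinks are resolved, while realpath resolves symlinks as it walks the path.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// canonicalize approximates the treatment described above: make the path
// absolute, drop "." and ".." elements, and resolve symlinks.
func canonicalize(name string) (string, error) {
	// Abs joins the name with the current directory if needed and cleans it.
	abs, err := filepath.Abs(name)
	if err != nil {
		return "", err
	}
	// EvalSymlinks fails if the path does not exist, much like realpath.
	return filepath.EvalSymlinks(abs)
}

func main() {
	name := "."
	if len(os.Args) > 1 {
		name = os.Args[1]
	}
	p, err := canonicalize(name)
	if err != nil {
		fmt.Fprintln(os.Stderr, "E:", err)
		os.Exit(1)
	}
	fmt.Println(p)
}
```

Running it with english.so as the argument from the directory that contains the plugin should print the absolute path that ends up being used.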

Second, the program is looking up Hello. With a Go programmer’s mindset this might look normal, as Hello is in fact the name of the variable, but… is it? It’s defined in package main, so its name would normally be main.Hello (even if you cannot actually use that name). If you build the plugins yourself, you’ll find that that’s not the name recorded in the binaries. The plugin package performs some mapping between the names as recorded in the binaries, and the names of the symbols as requested by the loading program. This is why you can say that you want "Hello" as shown in the code.

Examining the plugin package, you’ll notice an interesting restriction: if both your program and the plugin use package foo, the foo packages used to compile the program and the plugin must be identical, meaning at runtime the hashes for package foo stored in both binaries will be compared and an error will be produced if they don’t match. This will become relevant in a minute.
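As a mental model of that restriction, the check can be pictured as comparing two tables of package hashes. This is a toy model, not the runtime’s actual code: the pkgHashes type, the checkCompatible function and the hash strings are all made up for illustration, and only the wording of the error follows what plugin.Open reports.

```go
package main

import (
	"fmt"
	"sort"
)

// pkgHashes maps an import path to the hash recorded for that package.
// Hypothetical representation; the real data lives inside the binaries.
type pkgHashes map[string]string

// checkCompatible reports an error if a package present in both the host
// program and the plugin was built from different inputs.
func checkCompatible(host, plug pkgHashes) error {
	names := make([]string, 0, len(plug))
	for name := range plug {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order for the reported package

	for _, name := range names {
		hostHash, shared := host[name]
		if !shared {
			continue // only packages used by both sides are compared
		}
		if hostHash != plug[name] {
			return fmt.Errorf("plugin was built with a different version of package %s", name)
		}
	}
	return nil
}

func main() {
	host := pkgHashes{"runtime": "aaa", "foo": "111"}
	good := pkgHashes{"runtime": "aaa", "foo": "111", "bar": "zzz"}
	bad := pkgHashes{"runtime": "aaa", "foo": "222"}

	fmt.Println("good plugin:", checkCompatible(host, good))
	fmt.Println("bad plugin: ", checkCompatible(host, bad))
}
```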

Shared libraries

How about building a dynamically linked Go program with dynamically linked Go code? It should be dead easy, right?

$ go install -buildmode=shared github.com/mem/dso/lib

(full code here)

The output of that command is this:

multiple roots ${GOPATH}/pkg/linux_amd64_dynlink & ${GOROOT}/pkg/linux_amd64_dynlink

This is the compiler’s way of saying that it’s trying to install packages to two different locations.

What this really means is that since I’ve never built the standard library with -buildmode=shared, the compiler is trying to do it as part of the command shown above, and since my toy lib lives in GOPATH and the standard library lives in GOROOT, it’s trying to install packages to two different locations.

Second try:

$ go install -buildmode=shared std

$ go install -buildmode=shared github.com/mem/dso/lib
multiple roots ${GOPATH}/pkg/linux_amd64_dynlink & ${GOROOT}/pkg/linux_amd64_dynlink

Same error, but now it’s trying to say something different (for the same reason): in order to link dynamically against the standard library (and whatever other thing your package uses), you have to pass the -linkshared flag.

Third time:

$ go install -buildmode=shared std

$ go install -buildmode=shared -linkshared github.com/mem/dso/lib

Yay!

What’s happening here is that I’m trying to build a shared library (-buildmode=shared) and that library links to other shared libraries (the standard library), therefore it’s necessary to say that you really want to do that (-linkshared).

The last command will create a file called libgithub.com-mem-dso-lib.so in the $GOPATH/pkg/linux_amd64_dynlink directory, along with two other files (lib.a and lib.shlibname) in the $GOPATH/pkg/linux_amd64_dynlink/github.com/mem/dso directory.

If you follow these steps, you are now ready to use the shiny new library:

$ go install -linkshared github.com/mem/dso/cmd/demo-lib

(full code here)

This creates a file $GOPATH/bin/demo-lib. Upon closer inspection:

$ readelf -d $GOPATH/bin/demo-lib | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libstd.so]
 0x0000000000000001 (NEEDED)             Shared library: [libgithub.com-mem-dso-lib.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]


$ readelf -d $GOPATH/pkg/linux_amd64_dynlink/libgithub.com-mem-dso-lib.so |
  grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libstd.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]

The demo-lib binary links not one but two Go shared libraries. If you run the program it will work just fine. These shared libraries are located in non-standard locations ($GOROOT/pkg/... and $GOPATH/pkg/...), and the program works just fine because the compiler added an RPATH entry in the dynamic section of the executable listing those two directories where the libraries are installed.
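If you’d rather inspect the dynamic section from Go than from readelf, the standard library’s debug/elf package exposes the same entries. The following is a small sketch; the dynInfo type and the inspect function are my own naming. It looks at both DT_RPATH and DT_RUNPATH because which of the two the external linker emits can depend on how that linker is configured.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// dynInfo collects the dynamic section entries discussed in the text.
type dynInfo struct {
	Needed  []string // DT_NEEDED: libraries to load at startup
	Soname  []string // DT_SONAME: the library's own name, if any
	Rpath   []string // DT_RPATH: extra library search directories
	Runpath []string // DT_RUNPATH: the newer flavour of DT_RPATH
}

// inspect reads the entries from the ELF file at path.
func inspect(path string) (*dynInfo, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	info := &dynInfo{}
	if info.Needed, err = f.ImportedLibraries(); err != nil {
		return nil, err
	}
	if info.Soname, err = f.DynString(elf.DT_SONAME); err != nil {
		return nil, err
	}
	if info.Rpath, err = f.DynString(elf.DT_RPATH); err != nil {
		return nil, err
	}
	if info.Runpath, err = f.DynString(elf.DT_RUNPATH); err != nil {
		return nil, err
	}
	return info, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Printf("usage: %s elf-file\n", os.Args[0])
		return
	}
	info, err := inspect(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "E:", err)
		os.Exit(1)
	}
	fmt.Println("NEEDED: ", info.Needed)
	fmt.Println("SONAME: ", info.Soname)
	fmt.Println("RPATH:  ", info.Rpath)
	fmt.Println("RUNPATH:", info.Runpath)
}
```

Pointing it at $GOPATH/bin/demo-lib should list the same libraries as the readelf output above, plus the search directories recorded at link time.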

As hinted above, libstd.so is the Go standard library. This is a very unfortunate choice for a name, as it is exceedingly generic – imagine if every language called their standard library libstd.so instead of libc.so, libstdc++.so, etc. It’s also unfortunate that the name does not include some kind of version related to the Go version used to build the binary (for example libstd-1.7.so, libstd-1.8.so, etc). I have not checked, but I doubt that programs built with Go 1.7 will work fine with Go 1.8’s libstd.so (ignoring other issues that possibly exist as well).

What’s in a SONAME?

Examining the other library (libgithub.com-mem-dso-lib.so), you’ll find that it does not have a SONAME (see previous post). You can add one if you need to, for example, if you need to have multiple incompatible versions of the same package installed as shared libraries:

$ go install \
	-ldflags '-extldflags -Wl,-soname,libgithub.com-mem-dso-lib.so.0' \
	-buildmode=shared \
	-linkshared \
	github.com/mem/dso/lib

Note that the output filename will still be libgithub.com-mem-dso-lib.so. This creates a problem. If you relink your executable and try to run it, you’ll see:

$ bin/demo-lib
bin/demo-lib: error while loading shared libraries: libgithub.com-mem-dso-lib.so.0: cannot open shared object file: No such file or directory

What’s happening here is that the dynamic linker is looking for a file named libgithub.com-mem-dso-lib.so.0 and it’s not there. What you need to do is rename the library, and install a symlink from libgithub.com-mem-dso-lib.so to libgithub.com-mem-dso-lib.so.0 (again, see the previous post as to why).
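The rename-and-symlink step can be scripted. Here is a minimal sketch in Go, where installVersioned and demo are hypothetical helpers of mine; demo works on an empty placeholder file in a throwaway temporary directory, not on the real library, so it is safe to run anywhere.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// installVersioned renames dir/name to dir/soname and leaves behind a
// symlink called name that points at soname.
func installVersioned(dir, name, soname string) error {
	if err := os.Rename(filepath.Join(dir, name), filepath.Join(dir, soname)); err != nil {
		return err
	}
	// A relative target keeps the link valid if the directory is moved.
	return os.Symlink(soname, filepath.Join(dir, name))
}

// demo exercises installVersioned on a placeholder file and returns the
// target of the resulting symlink.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "soname-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	const name = "libgithub.com-mem-dso-lib.so"
	const soname = name + ".0"

	// Stand-in for the library produced by go install.
	if err := os.WriteFile(filepath.Join(dir, name), nil, 0o644); err != nil {
		return "", err
	}
	if err := installVersioned(dir, name, soname); err != nil {
		return "", err
	}
	return os.Readlink(filepath.Join(dir, name))
}

func main() {
	target, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "E:", err)
		os.Exit(1)
	}
	fmt.Println("libgithub.com-mem-dso-lib.so ->", target)
}
```

With this layout the dynamic linker finds the file named after the SONAME at runtime, while the unversioned name stays available at build time.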

It remains to be seen if this will be a problem for Go, as there’s no single strategy around versioning packages. There are two large options for expressing version numbers:

  • Some people use versions embedded in import paths, and, as shown above, this would be reflected in the filename for the shared library, meaning this would have a chance of working.

  • Other people create branches in the repositories hosting the packages (for example v1, v2, etc) and use vendoring tools to keep the packages pinned to the correct branch. This would not be reflected in the filenames. People who have to support multiple installed versions of the same package would have to devise their own solution to the problem. One such group of people are Linux distributions, like Debian and Fedora. Debian has shown in practice this to be a very painful and problematic path to walk.

Note that both options are orthogonal to semantic versioning, which has some traction with the community. It refers to how to pick the version number, but not to how to express that number in a package: a different import path, a different branch, or something else. The new tool dep has shown a preference (US17) for the second strategy, but so far it’s just a preference.

In either case, without a versioning strategy, it’s not clear how to translate from:

import "github.com/mem/dso/lib"

to a filename like libgithub.com-mem-dso-lib.so.3.1.0 and from there to a SONAME.
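The mechanical half of that translation is easy to write down; the missing piece is where the version comes from. In this sketch, libName reproduces the naming seen in the file names above (slashes in the import path become dashes), while versionedNames is purely hypothetical: it assumes the usual ELF convention of the full version in the file name and the major version in the SONAME, and takes the version as an explicit argument because the import path does not provide one.

```go
package main

import (
	"fmt"
	"strings"
)

// libName mirrors the file names observed for -buildmode=shared:
// "github.com/mem/dso/lib" becomes "libgithub.com-mem-dso-lib.so".
func libName(importPath string) string {
	return "lib" + strings.ReplaceAll(importPath, "/", "-") + ".so"
}

// versionedNames is a hypothetical scheme, not something the go tool does.
// The version numbers must be supplied by the caller.
func versionedNames(importPath string, major, minor, patch int) (filename, soname string) {
	base := libName(importPath)
	filename = fmt.Sprintf("%s.%d.%d.%d", base, major, minor, patch)
	soname = fmt.Sprintf("%s.%d", base, major)
	return filename, soname
}

func main() {
	filename, soname := versionedNames("github.com/mem/dso/lib", 3, 1, 0)
	fmt.Println("file:  ", filename)
	fmt.Println("soname:", soname)
}
```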

Libraries using libraries

Once you have shared libraries, there’s an expectation that your shared libraries can link to other shared libraries. For example, it is almost certain that you’ll have libpng installed in your system. If you examine it, you’ll find something like this:

$ readelf -d /usr/lib/x86_64-linux-gnu/libpng16.so.16 | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libz.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]

This says that the libpng16.so.16 links dynamically against the libz.so.1, libm.so.6 and libc.so.6 libraries. When the library was built, there was a step in the process that read something similar to this:

$ gcc -o libpng16.so.16.26.0 {other parameters} -lz -lm

In most common cases, this says “leave a note (in the form of NEEDED above) in the ELF file that instructs the dynamic linker to look for libz.so.1 at runtime”.

What does this look like in Go?

Consider this structure for discussion purposes:

github.com/mem/dso/cmd/demo-outer-inner
github.com/mem/dso/outer
github.com/mem/dso/outer/inner

demo-outer-inner is a main package that imports github.com/mem/dso/outer, which in turn imports github.com/mem/dso/outer/inner.

If your $GOPATH/pkg directory is empty, and you compile demo-outer-inner, the result looks like this:

$ rm -rf $GOPATH/pkg $GOPATH/bin

$ go install -linkshared github.com/mem/dso/cmd/demo-outer-inner

$ readelf -d $GOPATH/bin/demo-outer-inner | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libstd.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]

The program is linked against the Go standard library, the C standard library and nothing else. In particular, the compiler didn’t decide to build a shared library out of the outer or inner packages and link those to the resulting program.

What if the shared libraries are there before building demo-outer-inner?

Let’s try that:

$ rm -rf $GOPATH/pkg $GOPATH/bin

$ go install -buildmode=shared -linkshared github.com/mem/dso/outer

$ go install -linkshared github.com/mem/dso/cmd/demo-outer-inner

$ readelf -d $GOPATH/bin/demo-outer-inner | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libstd.so]
 0x0000000000000001 (NEEDED)             Shared library: [libgithub.com-mem-dso-outer.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]

Ok, that’s interesting. If the library is already there, with the same command line, the compiler does decide to use it instead of embedding it in the executable as before. But note that inner still didn’t magically show up.

Let’s see…

$ rm -rf $GOPATH/pkg $GOPATH/bin

$ go install -buildmode=shared -linkshared github.com/mem/dso/outer/inner

$ go install -buildmode=shared -linkshared github.com/mem/dso/outer

$ go install -linkshared github.com/mem/dso/cmd/demo-outer-inner

$ readelf -d $GOPATH/bin/demo-outer-inner | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libstd.so]
 0x0000000000000001 (NEEDED)             Shared library: [libgithub.com-mem-dso-outer.so]
 0x0000000000000001 (NEEDED)             Shared library: [libgithub.com-mem-dso-outer-inner.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]

I have to confess that this is a little annoying. I understand how this happens, but I still find it annoying. How annoying? If you first build the inner package as a shared library, but not the outer one, then when you build the program, you’ll see that it embeds outer in the binary and dynamically links to inner. This has nothing to do with the directory structure of the packages, that’s just there to make a point. You’ll get the same results if they were laid out side by side, with outer importing inner.

This is not what happens if you build the packages in the default mode. If you try this:

$ go install github.com/mem/dso/outer

$ find $GOPATH/pkg -type f
$GOPATH/pkg/linux_amd64/github.com/mem/dso/outer.a
$GOPATH/pkg/linux_amd64/github.com/mem/dso/outer/inner.a

Notice how you get two archives: one for outer and one for inner. Same thing happens if you build the program without building the other packages first.

To make matters more confusing, the above also happens if you use -linkshared. What I mean is this:

$ rm -rf $GOPATH/pkg $GOPATH/bin

$ go install -linkshared github.com/mem/dso/cmd/demo-outer-inner

$ find $GOPATH/pkg $GOPATH/bin -type f
$GOPATH/pkg/linux_amd64_dynlink/github.com/mem/dso/outer.a
$GOPATH/pkg/linux_amd64_dynlink/github.com/mem/dso/outer/inner.a
$GOPATH/bin/demo-outer-inner

Notice the two archives under $(GOOS)_$(GOARCH)_dynlink, matching the two archives under $(GOOS)_$(GOARCH) in the other case. These are built in the same way as they would if you asked for shared mode, with the exception that a shared library is not produced. It’s possible to have shared and non-shared versions of the same library (just as you can have static and dynamic libraries for C code).

The point here is that build order matters. If your intention is to have shared libraries for your packages, you must build them before anything that uses them, including other packages that you want to install as libraries. If you are going to be building shared libraries for Go packages, you must be very intentional about it.
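One way to be intentional about it is to compute the order from the import graph, so that every package comes after everything it imports. The sketch below does a depth-first, dependencies-first walk over a hand-written graph. The buildOrder function is my own, and in a real setup the graph would have to come from somewhere else (for example the output of go list), which this sketch does not attempt.

```go
package main

import (
	"fmt"
	"os"
	"sort"
)

// buildOrder returns the packages so that each one appears after all the
// packages it imports. imports maps an import path to its direct imports.
func buildOrder(imports map[string][]string) ([]string, error) {
	const (
		unvisited = iota
		inProgress
		done
	)
	state := make(map[string]int)
	var order []string

	var visit func(pkg string) error
	visit = func(pkg string) error {
		switch state[pkg] {
		case done:
			return nil
		case inProgress:
			return fmt.Errorf("import cycle through %s", pkg)
		}
		state[pkg] = inProgress
		for _, dep := range imports[pkg] {
			if err := visit(dep); err != nil {
				return err
			}
		}
		state[pkg] = done
		order = append(order, pkg) // emitted only after its dependencies
		return nil
	}

	// Visit in sorted order so the result is deterministic.
	pkgs := make([]string, 0, len(imports))
	for pkg := range imports {
		pkgs = append(pkgs, pkg)
	}
	sort.Strings(pkgs)
	for _, pkg := range pkgs {
		if err := visit(pkg); err != nil {
			return nil, err
		}
	}
	return order, nil
}

func main() {
	graph := map[string][]string{
		"github.com/mem/dso/cmd/demo-outer-inner": {"github.com/mem/dso/outer"},
		"github.com/mem/dso/outer":                {"github.com/mem/dso/outer/inner"},
		"github.com/mem/dso/outer/inner":          {},
	}
	order, err := buildOrder(graph)
	if err != nil {
		fmt.Fprintln(os.Stderr, "E:", err)
		os.Exit(1)
	}
	for _, pkg := range order {
		fmt.Println(pkg)
	}
}
```

For the example packages this puts inner first, then outer, then the program, which matches the sequence of go install commands that produced both shared libraries.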

Revisiting plugins

Let’s take another look at those shared libraries:

$ rm -rf $GOPATH/pkg $GOPATH/bin

$ go install -buildmode=shared -linkshared github.com/mem/dso/outer/inner

$ go install -buildmode=shared -linkshared github.com/mem/dso/outer

$ go install -linkshared github.com/mem/dso/cmd/demo-outer-inner

$ find $GOPATH/bin $GOPATH/pkg -type f -print0 | xargs -r0 ls -s
20 $GOPATH/bin/demo-outer-inner
 4 $GOPATH/pkg/linux_amd64_dynlink/github.com/mem/dso/outer.a
 4 $GOPATH/pkg/linux_amd64_dynlink/github.com/mem/dso/outer/inner.a
 4 $GOPATH/pkg/linux_amd64_dynlink/github.com/mem/dso/outer/inner.shlibname
 4 $GOPATH/pkg/linux_amd64_dynlink/github.com/mem/dso/outer.shlibname
16 $GOPATH/pkg/linux_amd64_dynlink/libgithub.com-mem-dso-outer-inner.so
20 $GOPATH/pkg/linux_amd64_dynlink/libgithub.com-mem-dso-outer.so

The numbers are the sizes of the files in kB. Those are pretty small for Go binaries! The reason is that they don’t carry the standard library with them:

$ ls -sh $GOROOT/pkg/linux_amd64_dynlink/libstd.so
38M $GOROOT/pkg/linux_amd64_dynlink/libstd.so

What about plugins?

$ ls -sh github.com-mem-dso-plugin-english.plugin
1.7M github.com-mem-dso-plugin-english.plugin

That’s … big. Examining the file, it looks like it has quite a bit of the runtime package in it. It’s notable that it does not list libstd.so in its dynamic dependencies. Isn’t this the kind of thing that -linkshared “fixes”? Let’s see:

$ go build -buildmode=plugin -linkshared english.go
# command-line-arguments
runtime.islibrary: missing Go type information for global symbol: size 1

I haven’t been able to figure out what this error actually means, and I haven’t been able to figure out what the logic in the compiler that leads to this error is trying to do or what it’s trying to guard against.

There’s another gotcha. Remember that I said that if your program and your plugin use the same package, they must use identical versions? If you recompile the little demo program using -linkshared, you’ll get this:

$ go build -buildmode=plugin -o spanish.so github.com/mem/dso/plugin/spanish

$ go install -linkshared github.com/mem/dso/cmd/demo-plugin

$ demo-plugin spanish.so
plugin.Open: plugin.Open: plugin was built with a different version of package runtime/cgo

What the runtime is saying here is that the version of runtime/cgo in your program is different from the one in your plugin, and the reason for that is that they were compiled differently (with and without -linkshared). As far as I’ve been able to find out, there’s no solution for this.

Conclusion

When I first started using Go the first thing that bothered me a little wasn’t the size of the binaries, but the fact that it didn’t support dynamic linking. I understand that in some contexts the fact that Go uses statically linked binaries is a huge advantage. But there are situations where that isn’t as good, for example with a distribution like Debian, where in order to fix a security issue it becomes necessary to recompile all the packages that might contain a copy of the vulnerable code. Another one is embedded systems, where size matters. If you have a single Go binary, not having shared libraries is not an issue, but the moment you start to have more than a handful, things pile up quite quickly. In times when 128 MB of storage might seem small, maybe even too small, this might sound like a non-issue, but consider that you can have a working Linux image for a Raspberry Pi in less than 8 MB. With that in mind, having the possibility of building Go programs in a way that binary code can be shared is good. Having the possibility to load more code at runtime is even better.