Bringing a dynamic environment to C: My linker project

By Macoy Madson. Published on .

I'm writing a linker. It's an unusual linker. Its focus is not on producing executable files. Instead, its focus is to facilitate rapid iteration on a program without having to re-link or re-open it after making changes. In short, it hot-loads code at object-file granularity.

I realized from reading Andreas Fredriksson's Hot Runtime Linking that the dynamism I've been wanting in an ahead-of-time compiled environment could be achieved by taking over the linking and loading stages. I made an attempt to support hot-reloading in Cakelisp, but it was fragile and limited. Not only was working at the link/load stage a better fit, but it also meant I could hot-reload more than just code written in Cakelisp.

The goal of this linker is to bring a dynamic environment to compiled languages. I'm calling it a "linker/loader" because its purpose isn't to create an executable like most linkers. Its purpose is to link, load, and execute code, and to allow doing that in a continuously running process.

I believe that keeping the program running while you make changes to it will change how you think about the program. If iteration times are extremely low, you are more willing to try experiments. Over long development periods the time saved will add up, and the low friction of making changes will result in a better product.

The world without this linker

Hot-reloading in C family languages is typically done via dynamic loading. On Windows, the dynamic loading happens via LoadLibrary on .dll files, while GNU/Linux uses libdl to load .so ("shared object") files. However, this approach has many limitations:

Another alternative is just-in-time (JIT) compilation. The primary qualms I have with JIT systems are:

What you will be able to do with this linker

The current interface I have in mind is as follows:

The interface shouldn't require much up-front work to use with an existing project. Additional work would be necessary to allow the project to self-modify or introspect on its own image, however.

Inspirations and similar projects

I had many different ideas that resulted in my starting this project:

The following projects are similar to mine, though do not take exactly the same approach:

Other applications

This linker can facilitate program introspection. I plan on having symbols the linker itself provides to the program image that allow the program to inspect its own symbols. This opens the door to a whole variety of interesting things:

Things I'm still figuring out

I haven't yet touched the debugging aspect of this. I want certain features in my linker/loader which will necessitate my program image being different from a normally linked executable. That means I will need to do something custom to help debuggers find the debug symbols from wherever my loader has decided to place the executing code in memory. I've only glanced at the DWARF debugging info, and it's pretty complicated.

The intent with this linker/loader was primarily to aid during development, so I have been focused on supporting my primary development architecture, x86-64 (a.k.a. AMD64). Linkers are machine architecture-dependent, so each architecture would need to be added one-by-one once support for it is desired. This doesn't mean your program would only work on x86-64; it could support a superset of the architectures my linker supports, but you would need to use a different linker to create executables for the other architectures.

With my software, I do all initial development on GNU/Linux, then port to Windows after I have proven to myself that the concept is valuable. This means I have not done any work towards Windows Portable Executable or Common Object File Formats. If I find I can do the things I want on GNU/Linux (which uses ELF format executables), I will port the linker to Windows.

There are complexities around the data sections of the program image that I need to figure out. For example, you should be able to change functions as much as you want while still persisting data across reloads. However, if you change the presence or size of items in data, the linker will need to do some work to try to persist data which hasn't been affected. This will likely require some help from the debug symbols to determine where things are in data and guess at whether they have changed since last load. I need to do more experimentation before I can find the limitations of this system, but ideally, you can change data without needing to restart the program in many cases. [6]
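To make the data-persistence idea concrete, here is a minimal sketch of one possible policy: carry a variable's bytes over to the new image only when a variable with the same name and the same size existed in the old image. This is not the linker's actual code. The `DataSymbol` struct and `migrate_data` function are hypothetical names invented for illustration, and matching on name and size alone is an assumption; a real implementation would probably need debug information to detect layout changes that keep the size the same.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one variable living in a data section. */
typedef struct {
    const char *name; /* symbol name */
    size_t offset;    /* byte offset of the variable within its data section */
    size_t size;      /* size of the variable in bytes */
} DataSymbol;

/* Copy variable contents from the old data section into the new one.
 * A variable is preserved only if a symbol with the same name AND the
 * same size exists in both images. Everything else keeps whatever
 * initial value the new image already holds.
 * Returns the number of variables whose contents were carried over. */
size_t migrate_data(const unsigned char *old_data,
                    const DataSymbol *old_symbols, size_t old_count,
                    unsigned char *new_data,
                    const DataSymbol *new_symbols, size_t new_count)
{
    size_t preserved = 0;
    for (size_t i = 0; i < new_count; ++i) {
        for (size_t j = 0; j < old_count; ++j) {
            if (strcmp(new_symbols[i].name, old_symbols[j].name) == 0 &&
                new_symbols[i].size == old_symbols[j].size) {
                memcpy(new_data + new_symbols[i].offset,
                       old_data + old_symbols[j].offset,
                       new_symbols[i].size);
                ++preserved;
                break;
            }
        }
    }
    return preserved;
}
```

With this policy, a variable that merely moved to a different offset keeps its value, while a variable that was resized or newly added starts from the new image's initializer.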

Get in touch!

Let me know what you think by emailing me: macoy [at] macoy [dot] me.

You can see the current code here. As of publishing this article, it can load an ELF format object file for x86-64, process the file's relocations, and call into the object file correctly. It's not near release; I'll write a new blog post once that happens.
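For readers unfamiliar with what "processing relocations" involves, here is a minimal sketch of patching a single relocation into a loaded section. It is not taken from the project's code: `apply_relocation` and its parameters are hypothetical, and only two common x86-64 formulas are handled (absolute `S + A` and PC-relative `S + A - P`, where S is the symbol address, A the addend, and P the address being patched). Treating `R_X86_64_PLT32` like `R_X86_64_PC32` is a simplifying assumption that only holds when the symbol is called directly rather than through a PLT entry.

```c
#include <stdint.h>
#include <string.h>

/* Relocation type numbers as defined by the x86-64 System V psABI. */
enum {
    RELOC_X86_64_64 = 1,    /* R_X86_64_64:    S + A     (64-bit absolute) */
    RELOC_X86_64_PC32 = 2,  /* R_X86_64_PC32:  S + A - P (32-bit PC-relative) */
    RELOC_X86_64_PLT32 = 4  /* R_X86_64_PLT32: handled like PC32 in this sketch */
};

/* Hypothetical helper that patches one relocation.
 *   section         bytes of the loaded section to patch
 *   section_address address the section occupies in memory
 *   offset          r_offset from the relocation entry
 *   type            relocation type from r_info
 *   symbol_address  resolved address of the referenced symbol (S)
 *   addend          r_addend from the relocation entry (A)
 * Returns 0 on success, or -1 if the type is unsupported or the
 * computed value does not fit in the relocated field. */
int apply_relocation(uint8_t *section, uint64_t section_address,
                     uint64_t offset, uint32_t type,
                     uint64_t symbol_address, int64_t addend)
{
    uint64_t place = section_address + offset; /* P */

    switch (type) {
    case RELOC_X86_64_64: {
        uint64_t value = symbol_address + (uint64_t)addend;
        memcpy(section + offset, &value, sizeof(value));
        return 0;
    }
    case RELOC_X86_64_PC32:
    case RELOC_X86_64_PLT32: {
        int64_t value = (int64_t)(symbol_address + (uint64_t)addend - place);
        if (value < INT32_MIN || value > INT32_MAX)
            return -1; /* target is too far away for a 32-bit displacement */
        int32_t narrowed = (int32_t)value;
        memcpy(section + offset, &narrowed, sizeof(narrowed));
        return 0;
    }
    default:
        return -1;
    }
}
```

The range check matters for a loader that places code wherever memory is available: if a section lands more than 2 GiB from a symbol it references PC-relatively, the relocation cannot be expressed and the loader has to place things closer together or go through an indirection.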

  1. I solved this in Cakelisp by automatically converting static variables to heap allocations instead, but it's a dirty and fragile solution.

  2. An example of a JIT library for C is libgccjit (GCC-based). You could also build one based on LLVM. Both GCC and LLVM are enormous dependencies by my standards. Tiny C Compiler would be an example of a small library, but still a complex dependency.

  3. In theory, any compiled language which produces object files should automatically work with this linker. In practice, I think there are going to be incompatibilities with some languages which would need to be supported case-by-case. Anything which does link-time code generation, for example, would not work with this system until support is added.

  4. There are some additional complications here. My first pass would be to require the program itself to return control to the linker/loader so that the program's image can be safely edited. This means the program would need some code to recognize that it has been requested to reload and return all the way up the stack on all threads so that its code can be modified. Eventually, once I learn more, I may be able to do something better. Note that in this condition the program doesn't need to close its window or free all its state to be reloaded; it only needs to not be executing any code in sections which are going to be reloaded. It can then pick back up where it left off with all the same data.

  5. This is limited by compiler optimizations. For example, a module-local function (a function marked static in C) will be relatively referenced by the compiler and no record of the reference will reach the linker.

  6. One clear exception would be changing the size or structure of data which is referenced by other data. The linker would need a user-written migration function to know how to handle that, which is likely not worth writing when compared to the time required to re-launch the process.