source from https://github.com/green7ea/cpp-compilation, many thanks!
The C++ Build Process Explained
The last good thing written in C was Franz Schubert’s Symphony Number 9.
– Erwin Dieterich
Fifty years of programming language research, and we end up with C++?
– Richard A. O’Keefe
There are only two kinds of programming languages: those people always bitch about and those nobody uses.
– Bjarne Stroustrup
The C++ build process is built on top of the C build process, which was hacked together around 1972 on a PDP-11 with only tens of kilobytes of memory. Honestly, I’m surprised it aged so well.
The exact linking process in C and C++ isn’t standardized; it’s up to each toolchain to make things work. Most compilers behave the same way with small differences, but we’ll focus on the GNU tools to keep things simple.
An Overview
Computers in 1972 weren’t very powerful and so the process had to be broken down into smaller steps. This also allows you to rebuild only the parts that change.
- we have many C/C++ source files,
- for each source file, we run an independent compiler,
  - we run the pre-processor over the current file,
  - this pulls in every header and its parents into the file,
  - the file gets compiled and transformed into an object,
- we group every object we generated into an executable or library.
Entering a command to compile every source file separately is annoying, so we use a build tool that knows how to compile each of our source files. It can also regenerate only the objects whose source files have changed, which saves a lot of time when recompiling after minor changes.
Trivial C Program
Let’s use a trivial C program to illustrate the build process. We can build it with the following commands:
cc -Wall -O0 -std=c99 -g -c -o add.o add.c
The program is made up of three files: add.h, add.c and simple.c.
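The three files can be sketched roughly as follows. This is a hypothetical reconstruction rather than the original listings: the names add and sub match the symbol tables shown later in the article, but the function bodies, the values passed to printf and the helper run_simple (which stands in for main so that everything fits in one translation unit) are assumptions.

```cpp
#include <cstdio>

// --- add.h (assumed contents): declarations shared by both source files ---
int add(int a, int b);
int sub(int a, int b);

// --- add.c (assumed contents): the definitions that end up in add.o ---
int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }

// --- simple.c (assumed contents): uses add, sub and printf ---
// Named run_simple instead of main so the three parts fit in a single file.
int run_simple() {
    std::printf("%d %d\n", add(2, 3), sub(5, 3));
    return 0;
}
```

In the real layout, simple.c would only see the two declarations from add.h; the definitions live in a separate object file until link time.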
Why Headers?
TL;DR: we need to know the size of structures and function arguments.
When our main function wants to call the add function, it needs to know what it returns and what it takes as arguments before it can call the function. A typical (simplified, stack-based) call to add from our main would look like:
- in our main function,
  - push space for the return value onto the stack (an int),
  - push the parameters onto the stack (a and b),
  - push the return address (the next part of our main function),
  - jump to the add function,
- in our add function,
  - store the execution state,
  - add a and b, placing the result in the return value space,
  - restore the execution state,
  - jump to the return address,
- back in our main function,
  - pop the parameters (a and b),
  - use the return value that is now on the stack.
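The steps above can be mimicked with a toy stack. This is only a sketch of the bookkeeping, assuming one int per stack slot; the names Stack, toy_add and toy_call_add are made up for illustration, the value 1234 merely stands in for a return address, and a real ABI works with bytes and registers instead.

```cpp
#include <cstddef>
#include <vector>

// Toy model of the call stack: one int per slot (an assumption for illustration).
using Stack = std::vector<int>;

// Callee: finds its inputs at fixed offsets from the top of the stack.
// Layout on entry (bottom to top): [return value][a][b][return address]
void toy_add(Stack& stack) {
    const std::size_t top = stack.size();
    const int a = stack[top - 3];
    const int b = stack[top - 2];
    stack[top - 4] = a + b;  // place the result in the return value space
    stack.pop_back();        // "jump to the return address", consuming it
}

// Caller: builds the frame, "jumps" to the callee, then cleans up.
int toy_call_add(int a, int b) {
    Stack stack;
    stack.push_back(0);     // space for the return value (an int)
    stack.push_back(a);     // parameters
    stack.push_back(b);
    stack.push_back(1234);  // stand-in for the return address
    toy_add(stack);         // jump to the add function
    stack.pop_back();       // pop b
    stack.pop_back();       // pop a
    return stack.back();    // the return value is now on top of the stack
}
```

Notice that toy_add only works because both sides agree on the layout; that agreement is exactly what the caller needs the declaration for.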
The way to call a function and the method for building the stack make up most of an ABI (Application Binary Interface) definition. Every compiler is free to have its own ABI, and the same compiler usually has a different ABI for each processor it supports. Real ABIs typically pass the first few arguments and the return value in registers rather than on the stack, but the principle is the same. This makes things complex but efficient.
Since we are making space on the stack for our return value and parameters, we have to know their size. If any of those are a structure, we have to know what it’s made of so that we can know its size. We let the compiler know all this with function declarations and structure definitions.
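A small sketch of what "knowing the size" means in practice. Pair and Hidden are hypothetical types invented for this example: the compiler can size Pair because it has seen the definition, while Hidden is only declared, so only pointers to it can be used.

```cpp
#include <cstddef>

struct Pair { int first; int second; };  // full definition: the size is known
struct Hidden;                           // declaration only: the size is unknown here

// Passing Pair by value works because the compiler knows how much space to reserve.
int sum(Pair p) { return p.first + p.second; }

std::size_t pair_size() { return sizeof(Pair); }

// A pointer has a known size even when the pointed-to type is incomplete.
std::size_t hidden_pointer_size() { return sizeof(Hidden*); }

// sizeof(Hidden), or a parameter of type Hidden passed by value,
// would not compile at this point.
```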
You could put a function’s declaration in every source file that needs it but that’s a terrible idea since the declaration has to be the same everywhere if you want anything to work. Instead of having the same declaration everywhere, we put the declaration in a common file and include it where it is necessary. This common file is what we know as a header.
Sometimes, we only use a structure through pointers or references, which means we don’t really have to know that structure’s exact size. This leads to a clever trick known as pointer to implementation (PIMPL), which is really useful for speeding up compilation and hiding implementation details.
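Here is a minimal PIMPL sketch, assuming C++14 for std::make_unique. The Counter class and its file names are hypothetical; the point is that the header part only mentions a pointer to Impl, so files that include it never need to see (or recompile for) the implementation details.

```cpp
#include <memory>

// --- counter.h (hypothetical): all that users of the class get to see ---
class Counter {
public:
    Counter();
    ~Counter();  // defined in the source file, where Impl is a complete type
    void increment();
    int value() const;

private:
    struct Impl;                  // forward declaration only
    std::unique_ptr<Impl> impl_;  // a pointer has a known size
};

// --- counter.cpp (hypothetical): the only file that knows the layout ---
struct Counter::Impl {
    int count = 0;
};

Counter::Counter() : impl_(std::make_unique<Impl>()) {}
Counter::~Counter() = default;
void Counter::increment() { ++impl_->count; }
int Counter::value() const { return impl_->count; }
```

Adding a member to Impl only forces counter.cpp to be recompiled, at the cost of one extra allocation and indirection per object.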
Pre-processor
In those header files and source files, you’ve hopefully noticed lines that start with #. Whenever you see a directive that starts with #, we are dealing with the C pre-processor. The pre-processor handles the following:
- include files (#include),
- macro expansions (#define RADTODEG(x) ((x) * 57.29578)),
- conditional compilation (#if, #ifdef, etc.),
- line control (__FILE__, __LINE__).
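The directives listed above can be seen together in a short sketch. RADTODEG comes from the list; DEMO_VERBOSE, LOG_LEVEL and the three helper functions are hypothetical names chosen for this example.

```cpp
#include <string>

// Macro expansion: pure text substitution, hence the parentheses around x.
#define RADTODEG(x) ((x) * 57.29578)

// Conditional compilation: DEMO_VERBOSE is a hypothetical switch that would
// normally be set on the command line, for example with -DDEMO_VERBOSE.
#ifdef DEMO_VERBOSE
#define LOG_LEVEL "verbose"
#else
#define LOG_LEVEL "quiet"
#endif

double half_turn_in_degrees() { return RADTODEG(3.14159265); }

std::string log_level() { return LOG_LEVEL; }

// Line control: __FILE__ and __LINE__ are replaced by the name of the file
// being processed and the line they appear on.
std::string here() {
    return std::string(__FILE__) + ":" + std::to_string(__LINE__);
}
```

By the time the compiler proper sees this file, every macro has already been replaced by plain text.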
Basically, the compiler has a state which can be modified by these directives. Since every *.c file is treated independently, every *.c file that is being compiled has its own state. The headers that are included modify that file’s state. The pre-processor works at a textual level and replaces the tags in the source file with the result of basic functions based on the state of the compiler.
The #include directive is probably the most important. Luckily, it is really simple: it finds the file and replaces the #include line with the contents of that file.
Where does it find the files?
- #include <sum.h> looks for sum.h in a list of include directories,
- #include "sum.h" does the same but looks in the directory of the including file first.
C and C++ don’t actually define a mechanism for providing the list of include directories; that is up to the compiler (GCC uses -I flags, for example). This causes many problems with cross-platform development, which some build tools can solve.
Include Guards
When you include a header, there is usually a #ifndef and #define statement at the top of the file and a corresponding #endif at the bottom. We call this an include guard. It defines a macro the first time the header is processed so that including the same file a second time doesn’t redefine things that already exist and cause the compiler to panic.
#ifndef FILENAME_INCLUDED
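To see the guard at work, here is a sketch that pastes the text of a hypothetical header, point.h, twice into the same file, which is what two #include lines would do. The names POINT_H_INCLUDED, Point and point_sum are assumptions made for this example.

```cpp
// First "inclusion" of the hypothetical point.h: the macro is not defined
// yet, so the body is kept and the macro gets defined.
#ifndef POINT_H_INCLUDED
#define POINT_H_INCLUDED
struct Point { int x; int y; };
#endif

// Second "inclusion" of the very same text: the macro now exists, so the
// pre-processor throws the body away. Without the guard, the second
// definition of Point would be a redefinition error.
#ifndef POINT_H_INCLUDED
#define POINT_H_INCLUDED
struct Point { int x; int y; };
#endif

int point_sum(Point p) { return p.x + p.y; }
```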
This is a very useful trick but it’s also one of the more fundamental problems of C:
- you include a file a first time,
- it modifies the compiler state,
- you include the same file a second time,
- based on the compiler state, it pretends to be empty.
That is completely crazy - the file you include can change based on the state of the compiler. Not only that but the included files themselves can modify the state of the compiler (windows.h is infamous for doing this).
Because of this, compiling becomes slow and complex. Suppose that we want to compile two files which both include <string.h> and that <string.h> itself includes about 50 other files. We are not able to cache <string.h> without proving that the compiler state is the same when we include it!
So what started out as a simple, easy to implement solution turns out to scale really poorly. This wasn’t an issue in 1972 when the computers limited the complexity but almost 50 years later, it’s a big problem. C++20 introduced a module system to address this, but it’s difficult to change such a fundamental system in an established language and adoption has been slow.
Header Trees
When you include a header, this header can include others and it can quickly get messy. If we compile a file with the -H flag, we can visualize the resulting header tree:
gcc -H -O0 -std=c99 -g -c -o simple.o simple.c
- /usr/include/stdio.h
- /usr/include/bits/libc-header-start.h
- /usr/include/features.h
- /usr/include/sys/cdefs.h
- /usr/include/bits/wordsize.h
- /usr/include/bits/long-double.h
- /usr/include/gnu/stubs.h
- /usr/include/gnu/stubs-64.h
- /usr/include/sys/cdefs.h
- /usr/include/features.h
- /usr/lib/gcc/x86_64-pc-linux-gnu/8.2.1/include/stddef.h
- /usr/lib/gcc/x86_64-pc-linux-gnu/8.2.1/include/stdarg.h
- /usr/include/bits/types.h
- /usr/include/bits/wordsize.h
- /usr/include/bits/typesizes.h
- /usr/include/bits/types/__fpos_t.h
- /usr/include/bits/types/__mbstate_t.h
- /usr/include/bits/types/__fpos64_t.h
- /usr/include/bits/types/__FILE.h
- /usr/include/bits/types/FILE.h
- /usr/include/bits/types/struct_FILE.h
- /usr/include/bits/stdio_lim.h
- /usr/include/bits/sys_errlist.h
- /usr/include/bits/libc-header-start.h
- add.h
We can see that we go from 2 includes to 22. This can quickly get out of hand for big projects.
The difficulty is that you often include many headers indirectly through another header. For example, if you include ros.h, it includes boost, which balloons the number of headers to parse. To compile a single source file, you sometimes have to visit over 2000 header files. This makes compilation excruciatingly slow and this is where the PIMPL idiom can really help.
An Object File
After all this work, the compiler can do the actual compiling of our source file with all the headers pasted into it. Once the compilation is finished, we have an object file.
An object file is an organized way to store compiled machine code functions that aren’t yet linked together. We can examine the object file generated from the simple.c source file with the following command:
objdump -dr simple.o
simple.o
simple.o: file format elf64-x86-64
This is what the most complex file in our tiny example looks like. The important thing to notice is that the machine code is grouped into the <main> function and that there are calls to functions that aren’t defined in this file, like the call to sub copied below. The four zero bytes after the e8 opcode are a placeholder for the address, which the linker fills in later:
25: e8 00 00 00 00 callq 2a <main+0x2a>
Symbol Tables
An object file also contains a list of all the functions that are defined in that object and all the functions that are used but not defined. This is known as a symbol table.
To get our symbol tables, we use the following commands:
nm add.o > add.sym
add.o
Position | Type | Name |
---|---|---|
0 | Text | add |
14 | Text | sub |
simple.o
Position | Type | Name |
---|---|---|
- | Undefined | add |
- | Undefined | _GLOBAL_OFFSET_TABLE_ |
0 | Text | main |
- | Undefined | printf |
- | Undefined | sub |
A Linker’s Job
To get an executable, we put many object files together and link the undefined function calls to their implementations found in other object files. There are two ways to do this:
- Linking the functions directly together, so that a call jumps straight to the function.
- Keeping a table that contains our functions and looking up where to jump before jumping to the desired function.
The first option describes static linking. It is more efficient but less flexible, and it is used less often for libraries shared between programs.
The second option describes dynamic linking. It is a little bit slower but much more flexible and is the standard way to ship a library.
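The difference between the two options can be sketched in plain code. This is only an analogy, not how a real linker or loader is implemented: add and sub are given assumed bodies, and TableEntry, kTable, call_direct and lookup are hypothetical names made up for this example.

```cpp
#include <cstring>

int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }

// Option 1, in the spirit of static linking: the call site refers to the
// function directly, so the address is fixed once and for all.
int call_direct(int a, int b) { return add(a, b); }

// Option 2, in the spirit of dynamic linking: the call goes through a table
// that maps names to addresses and is consulted before jumping.
using BinaryFn = int (*)(int, int);

struct TableEntry {
    const char* name;
    BinaryFn function;
};

const TableEntry kTable[] = {{"add", add}, {"sub", sub}};

// Returns the function registered under `name`, or nullptr when it is missing.
BinaryFn lookup(const char* name) {
    for (const TableEntry& entry : kTable) {
        if (std::strcmp(entry.name, name) == 0) return entry.function;
    }
    return nullptr;
}
```

The table costs an extra lookup per call, but its entries can be swapped without touching the callers, which is where the flexibility comes from.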
Differences in C++
So far, we’ve been talking about C but luckily, C++ was designed to be compatible with the C build process. In fact, the first C++ compiler, Cfront, was a pre-compiler that translated C++ (originally called “C with Classes”) into C.
C++ introduces two big differences:
- templates,
- mangling.
Templates are complicated enough to have their own tutorial but mangling is pretty simple and more important. The following source file gives us an idea of mangling:
extern "C" int add_c(int a, int b)
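A source file along these lines would produce the symbols discussed next. It is a sketch: the signatures are taken from the nm table that follows, while the function bodies and parameter names are assumptions.

```cpp
// extern "C" disables mangling, so this symbol is simply called add_c.
extern "C" int add_c(int a, int b) { return a + b; }

// Three overloads sharing the name add: each gets its own mangled symbol.
int add(int a, int b) { return a + b; }
float add(float a, float b) { return a + b; }
int add(const int* a, const int& b) { return *a + b; }

// The same name again, this time inside a namespace.
namespace manu {
int add(int a, int b) { return a + b; }
}
```

All five functions can live in one object file because the linker sees five different symbol names.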
If we look at it with nm:
nm mangling.o
Position | Type | Name | Signature |
---|---|---|---|
0 | Text | add_c | int add_c(int, int) |
44 | Text | _Z3addff | float add(float, float) |
14 | Text | _Z3addii | int add(int, int) |
28 | Text | _Z3addPKiRS_ | int add(const int *, const int &) |
5e | Text | _ZN4manu3addEii | int manu::add(int, int) |
Basically, in C, functions are simply identified by their names. This prevents us from having namespaces and having a function with the same name but different arguments. C++ gets around this by using mangling. extern "C" turns off mangling so that C++ can be compatible with C.
Unfortunately, compilers don’t all mangle names the same way and so can be incompatible. Luckily, most compilers (MSVC being the notable exception) have standardized on the Itanium C++ ABI that you see above. Its rules are roughly:
- start with _Z since an underscore followed by a capital letter is reserved in C,
- an N after the Z indicates nested names,
- put a number that indicates the length of the next name,
  - this gives us a list of strings,
  - the last string is the function, class or struct name,
  - the previous ones are the namespaces or outer classes,
- if our names were nested, we insert an E,
- we indicate the type and modifiers of our arguments.
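You can check these rules by running them backwards. The sketch below assumes GCC or Clang, whose runtime exposes abi::__cxa_demangle in <cxxabi.h>; the wrapper name demangle is made up for this example, and other toolchains would need a different approach.

```cpp
#include <cxxabi.h>  // GCC/Clang specific: declares abi::__cxa_demangle

#include <cstdlib>
#include <string>

// Turns an Itanium-mangled symbol back into a readable signature.
// Names that are not mangled (such as add_c) are returned unchanged.
std::string demangle(const char* symbol) {
    int status = 0;
    char* text = abi::__cxa_demangle(symbol, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return symbol;
    }
    std::string result(text);
    std::free(text);  // the buffer was allocated with malloc
    return result;
}
```

For instance, demangle("_ZN4manu3addEii") gives back manu::add(int, int). The return type does not appear because it is not part of the mangled name of an ordinary function.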
Even with mangling, we don’t have enough size information for function calls to forgo headers. We are missing the size of the return value and the size of structures.
A Basis for Objects
To build an object system, we need static dispatch (when two functions have the same name, calling the one with matching arguments). This is crucial so that we can differentiate between a.method and b.method. If we didn’t have mangling, we couldn’t use the same method name in two different classes.
struct Num
Position | Type | Name | Signature |
---|---|---|---|
18 | Text | main | main |
0 | Text | _Z3addPK3Numii | add(const Num *, int, int) |
0 | Weak | _ZN3Num3addEii | Num::add(int, int) |
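The two add symbols in the table could come from code like the following. This is a sketch: the signatures follow the symbol table, but the value member and both bodies are assumptions.

```cpp
struct Num {
    int value;

    // Defined inside the class, so it is implicitly inline. With GCC, such
    // functions typically show up as weak symbols in nm.
    int add(int a, int b) { return value + a + b; }
};

// A free function with the same name. The explicit pointer plays the role
// that `this` plays in the method above.
int add(const Num* num, int a, int b) { return num->value + a + b; }
```

Both functions are called add in the source, yet the linker never confuses them because their mangled names differ.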
Putting Things Together with Build Systems
We don’t want to remember and execute the build commands by hand (at least I don’t). That’s why we have build tools:
- Make,
- Meson,
- Autotools,
- Bazel,
- CMake,
- Visual Studio,
- bash scripts,
- etc.
A build tool usually:
- has a list of source files,
- knows how to build each source file,
- keeps a dependency graph to rebuild only files that change,
- keeps a list of directories containing header files,
- keeps a list of external libraries to link to (static/dynamic),
- manages compiler flags (optimization level, warning level),
- knows which files to link into executables and libraries.
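The "rebuild only files that change" part boils down to comparing timestamps. Here is a toy sketch of that decision, assuming timestamps are plain integers and that a missing object is marked with -1; Rule, needs_rebuild and stale_objects are hypothetical names and not the API of any real build tool.

```cpp
#include <string>
#include <vector>

// One entry of the dependency graph: an object file and the files it is built from.
struct Rule {
    std::string object;             // for example "add.o"
    long object_time;               // last build time, -1 if the object does not exist yet
    std::vector<long> input_times;  // modification times of the source and its headers
};

// An object is stale when it is missing or when any input is newer than it.
bool needs_rebuild(const Rule& rule) {
    if (rule.object_time < 0) return true;
    for (long input_time : rule.input_times) {
        if (input_time > rule.object_time) return true;
    }
    return false;
}

// Lists the objects that have to be recompiled, in the order the rules were given.
std::vector<std::string> stale_objects(const std::vector<Rule>& rules) {
    std::vector<std::string> stale;
    for (const Rule& rule : rules) {
        if (needs_rebuild(rule)) stale.push_back(rule.object);
    }
    return stale;
}
```

Since headers count as inputs, touching a widely included header makes every object that depends on it stale, which is one more reason to keep header trees small.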
Some build tools offer additional features:
- program installation,
- cross platform support,
- cross compilation,
- dependency installation.
Ideally, at the end of all this, we can quickly and easily generate bug-free programs.
We’ve been trying to generate bug-free programs since at least 1972. We have come close a few times but have never truly managed. Thankfully, most of the time, the compiler, linker and build system aren’t to blame.