C and C++ static linking: just a copy?

When someone statically links a .lib, will the linker copy the whole contents of the lib into the final executable, or just the functions used in the object files?

The whole library? -- No.
Just the functions you called? -- No.
Something else? -- Yes.

It certainly doesn't throw in the whole library.

But it doesn't necessarily include just "the functions used in the object files" either.

The linker will make a recursively built list of which object modules in the library satisfy your undefined symbols.

Then, it will include each of those object modules.

Typically, a given object module will include more than one function, and if some of these are not called by the ones that you do call, you will get some number of functions (and data objects) that you didn't need.
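
To make the selection rule concrete, here is a toy model in C++ of how a linker might pick object modules out of a static library. It is only a sketch: the names ObjectModule, select_modules and example_library, and the module and symbol names (io.o, print_line and so on), are made up for illustration, and real linkers differ in scan order and many other details.

```cpp
#include <map>
#include <set>
#include <string>

// Toy model of one object module (archive member) in a static library.
struct ObjectModule {
    std::set<std::string> defines;  // symbols the module provides
    std::set<std::string> needs;    // symbols the module leaves undefined
};

// Returns the names of the modules pulled in to satisfy `undefined`,
// the symbols the program's own object files still need.
std::set<std::string> select_modules(
    const std::map<std::string, ObjectModule>& library,
    std::set<std::string> undefined)
{
    std::set<std::string> selected;
    std::set<std::string> defined;
    bool progress = true;
    while (progress) {  // rescan until no new module is added
        progress = false;
        for (const auto& entry : library) {
            const std::string& name = entry.first;
            const ObjectModule& mod = entry.second;
            if (selected.count(name)) continue;

            // Does this module define a symbol that is still undefined?
            bool wanted = false;
            for (const auto& sym : mod.defines)
                if (undefined.count(sym)) { wanted = true; break; }
            if (!wanted) continue;

            // The whole module comes in, not just the wanted function.
            selected.insert(name);
            for (const auto& sym : mod.defines) {
                undefined.erase(sym);
                defined.insert(sym);
            }
            // Its own unresolved references may require further modules.
            for (const auto& sym : mod.needs)
                if (!defined.count(sym)) undefined.insert(sym);
            progress = true;
        }
    }
    return selected;
}

// Hypothetical library: io.o holds two functions, format.o and math.o
// hold helpers that may or may not be needed.
std::map<std::string, ObjectModule> example_library() {
    std::map<std::string, ObjectModule> lib;
    lib["io.o"]     = ObjectModule{{"print_line", "print_table"}, {"format_int"}};
    lib["format.o"] = ObjectModule{{"format_int"}, {}};
    lib["math.o"]   = ObjectModule{{"sine", "cosine"}, {}};
    return lib;
}
```

Asking this model for print_line alone selects io.o and format.o, so print_table rides along unused, while math.o is left out entirely.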

By default, the linker typically does not remove dead code before building the final executable. That is, it will (usually) link in ALL symbols from the object files it pulls in, whether they are used in the final executable or not. However, linkers often provide optimization settings that make them try harder to discard unused code.

For GCC, this is accomplished in two stages:

First, compile the source, but tell the compiler to place each function and each data item in its own section within the translation unit. This covers functions (including class member functions) and variables, and is done with the following two compiler flags:

-fdata-sections -ffunction-sections

Second, link the translation units together using the linker optimization flag (this causes the linker to discard unreferenced sections):

-Wl,--gc-sections

So if you had one file called test.cpp that had two functions defined in it, but one of them was unused, you could omit the unused one with the following g++ command:

g++ -Os -fdata-sections -ffunction-sections test.cpp -o test -Wl,--gc-sections

(Note that -Os is an additional compiler flag that tells GCC to optimize for size)

As for MSVC, function level linking accomplishes the same thing. I believe the compiler flag for this is (to sort things into sections):

/Gy

And then the linker flag (to discard unused sections):

/OPT:REF

Linkers were invented in ancient times, when memory was especially precious. One of their primary functions was to prune out the modules you weren't using. That ability has been carried forward to the present day.

It's quite common for some library functions to rely on others though, and all the dependencies will be linked.

Sort of. It will, however, also need to fix up the addresses used by function calls, especially when the called functions live outside the static library (i.e. in another static library or in the executable).
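
As a rough picture of what that fixing up means, here is a toy model in C++. It is not how any real object format works (ELF and COFF relocations carry types, addends and more); the names Relocation and apply_relocations, the slot layout and the addresses are all assumptions made up for the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// One fix-up request: "write the final address of `symbol` into `slot`".
struct Relocation {
    std::size_t slot;
    std::string symbol;
};

// Toy image: each slot stands for a call operand that must hold an address.
// Returns a patched copy of the image.
std::vector<std::uint32_t> apply_relocations(
    std::vector<std::uint32_t> image,
    const std::vector<Relocation>& relocations,
    const std::map<std::string, std::uint32_t>& final_addresses)
{
    for (const Relocation& r : relocations) {
        auto it = final_addresses.find(r.symbol);
        if (it == final_addresses.end())
            throw std::runtime_error("undefined reference to " + r.symbol);
        image.at(r.slot) = it->second;  // patch the placeholder
    }
    return image;
}
```

A symbol that no linked module defines ends up as the familiar "undefined reference" error rather than a patched slot.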

Depends on the linker. Some linkers are lazy and just throw the whole library in. The other extreme is linkers that throw in only the necessary code into an executable.

A simple test is to write a program that uses puts and compare its size with a program that uses printf. If the executables are the same size, you have more of a lazy linker.

Example:

puts_test.cpp

#include <cstdio>
using namespace std;

int main(void)
{
  puts("Hello World");  // puts appends the newline itself
  return 0;
}

printf_test.cpp

#include <cstdio>
using namespace std;

int main(void)
{
  printf("%s\n", "Hello World");
  return 0;
}

With the above example, the puts function does not require extra code for parsing format strings or converting numerics into text. This is the baseline because it requires a minimal library function.

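The example using printf requires more functionality: printf has to parse the format string before outputting text. (Some compilers, GCC among them, can replace a printf call this simple with puts; compiling with -fno-builtin avoids that.)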

The expected result is that the printf executable should be larger than the puts executable. Most toolchains will haul in all the code for the printf function (such as the code for displaying floats) even though that portion of the code is not used. More intelligent (and costly) toolchains will break up the printf function and only include the parts that are used or required. In the example above, such a toolchain should only include the parts for processing text and not the code that formats integers and floating-point values.

A lazy linker, or a debug build, will copy the entire library even for the puts example, thus making the two executables the same size.

Symbol comparison

The *nix platforms and Cygwin provide tools for obtaining the symbols from executables. One such utility is nm. Run nm on each executable, directing the output to a text file, then compare the two text files. With a lazy linker, both executables should have the same symbols, although their locations may differ (which is not important to the issue).

It will include only the functions and symbols that are used (unless told otherwise, but that can be tricky).

Side issue:

This can actually be a problem if you have, for example, some classes that just register themselves with a factory. No one references these classes directly, so their object files won't be included and the classes never get registered in the factory. There are ways around this (usually by declaring some anonymous variable in the header file that references the source file).
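
Here is a minimal single-file sketch of that pattern and of the header-variable workaround. Everything in it (Shape, Circle, Registrar, registry, create, force_link_circle, circle_anchor, and the file names in the comments) is hypothetical. In one file the registration always runs; the problem only shows up when the "circle.cpp" part is compiled into a static library and nothing else refers to it.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};

using Creator = std::function<std::unique_ptr<Shape>()>;

// Function-local static, so the map exists before any Registrar uses it.
std::map<std::string, Creator>& registry() {
    static std::map<std::string, Creator> instance;
    return instance;
}

// Constructing a Registrar adds an entry to the factory as a side effect.
struct Registrar {
    Registrar(const std::string& key, Creator creator) {
        assert(creator && "creator must be callable");
        registry()[key] = std::move(creator);
    }
};

// Look up a creator by name; returns nullptr if nothing was registered.
std::unique_ptr<Shape> create(const std::string& key) {
    auto it = registry().find(key);
    if (it == registry().end()) return nullptr;
    return it->second();
}

// ---- imagine everything below living alone in circle.cpp, in a .lib ----
struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// No other file names this object, so the linker has no undefined symbol
// that would make it pull circle.o out of the library.
static Registrar circle_registrar(
    "circle", [] { return std::unique_ptr<Shape>(new Circle); });

// Workaround hook: a symbol defined in circle.cpp that others can reference.
int force_link_circle() { return 0; }

// ---- imagine this line living in circle.h, included by the application ----
// The initializer references circle.cpp, giving the linker a reason to
// include circle.o (and with it, circle_registrar).
namespace { const int circle_anchor = force_link_circle(); }
```

With the anchor in place, create("circle") returns a Circle; without it (and with Circle in a separate library module), the lookup would come back empty even though the code compiles and links cleanly.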