xref: /freebsd/contrib/llvm-project/lld/docs/NewLLD.rst (revision 06c3fb2749bda94cb5201f81ffdb8fa6c3161b2e)
10b57cec5SDimitry AndricThe ELF, COFF and Wasm Linkers
20b57cec5SDimitry Andric==============================
30b57cec5SDimitry Andric
40b57cec5SDimitry AndricThe ELF Linker as a Library
50b57cec5SDimitry Andric---------------------------
60b57cec5SDimitry Andric
70b57cec5SDimitry AndricYou can embed LLD to your program by linking against it and calling the linker's
8*06c3fb27SDimitry Andricentry point function ``lld::lldMain``.
90b57cec5SDimitry Andric
10480093f4SDimitry AndricThe current policy is that it is your responsibility to give trustworthy object
110b57cec5SDimitry Andricfiles. The function is guaranteed to return as long as you do not pass corrupted
120b57cec5SDimitry Andricor malicious object files. A corrupted file could cause a fatal error or SEGV.
130b57cec5SDimitry AndricThat being said, you don't need to worry too much about it if you create object
140b57cec5SDimitry Andricfiles in the usual way and give them to the linker. It is naturally expected to
150b57cec5SDimitry Andricwork, or otherwise it's a linker's bug.
160b57cec5SDimitry Andric
170b57cec5SDimitry AndricDesign
180b57cec5SDimitry Andric======
190b57cec5SDimitry Andric
200b57cec5SDimitry AndricWe will describe the design of the linkers in the rest of the document.
210b57cec5SDimitry Andric
220b57cec5SDimitry AndricKey Concepts
230b57cec5SDimitry Andric------------
240b57cec5SDimitry Andric
250b57cec5SDimitry AndricLinkers are fairly large pieces of software.
260b57cec5SDimitry AndricThere are many design choices you have to make to create a complete linker.
270b57cec5SDimitry Andric
280b57cec5SDimitry AndricThis is a list of design choices we've made for ELF and COFF LLD.
290b57cec5SDimitry AndricWe believe that these high-level design choices achieved a right balance
300b57cec5SDimitry Andricbetween speed, simplicity and extensibility.
310b57cec5SDimitry Andric
320b57cec5SDimitry Andric* Implement as native linkers
330b57cec5SDimitry Andric
340b57cec5SDimitry Andric  We implemented the linkers as native linkers for each file format.
350b57cec5SDimitry Andric
360b57cec5SDimitry Andric  The linkers share the same design but share very little code.
370b57cec5SDimitry Andric  Sharing code makes sense if the benefit is worth its cost.
380b57cec5SDimitry Andric  In our case, the object formats are different enough that we thought the layer
390b57cec5SDimitry Andric  to abstract the differences wouldn't be worth its complexity and run-time
400b57cec5SDimitry Andric  cost.  Elimination of the abstract layer has greatly simplified the
410b57cec5SDimitry Andric  implementation.
420b57cec5SDimitry Andric
430b57cec5SDimitry Andric* Speed by design
440b57cec5SDimitry Andric
450b57cec5SDimitry Andric  One of the most important things in archiving high performance is to
460b57cec5SDimitry Andric  do less rather than do it efficiently.
470b57cec5SDimitry Andric  Therefore, the high-level design matters more than local optimizations.
480b57cec5SDimitry Andric  Since we are trying to create a high-performance linker,
490b57cec5SDimitry Andric  it is very important to keep the design as efficient as possible.
500b57cec5SDimitry Andric
510b57cec5SDimitry Andric  Broadly speaking, we do not do anything until we have to do it.
520b57cec5SDimitry Andric  For example, we do not read section contents or relocations
530b57cec5SDimitry Andric  until we need them to continue linking.
540b57cec5SDimitry Andric  When we need to do some costly operation (such as looking up
550b57cec5SDimitry Andric  a hash table for each symbol), we do it only once.
560b57cec5SDimitry Andric  We obtain a handle (which is typically just a pointer to actual data)
570b57cec5SDimitry Andric  on the first operation and use it throughout the process.
580b57cec5SDimitry Andric
590b57cec5SDimitry Andric* Efficient archive file handling
600b57cec5SDimitry Andric
610b57cec5SDimitry Andric  LLD's handling of archive files (the files with ".a" file extension) is
620b57cec5SDimitry Andric  different from the traditional Unix linkers and similar to Windows linkers.
630b57cec5SDimitry Andric  We'll describe how the traditional Unix linker handles archive files, what the
640b57cec5SDimitry Andric  problem is, and how LLD approached the problem.
650b57cec5SDimitry Andric
660b57cec5SDimitry Andric  The traditional Unix linker maintains a set of undefined symbols during
670b57cec5SDimitry Andric  linking.  The linker visits each file in the order as they appeared in the
680b57cec5SDimitry Andric  command line until the set becomes empty. What the linker would do depends on
690b57cec5SDimitry Andric  file type.
700b57cec5SDimitry Andric
710b57cec5SDimitry Andric  - If the linker visits an object file, the linker links object files to the
720b57cec5SDimitry Andric    result, and undefined symbols in the object file are added to the set.
730b57cec5SDimitry Andric
740b57cec5SDimitry Andric  - If the linker visits an archive file, it checks for the archive file's
750b57cec5SDimitry Andric    symbol table and extracts all object files that have definitions for any
760b57cec5SDimitry Andric    symbols in the set.
770b57cec5SDimitry Andric
780b57cec5SDimitry Andric  This algorithm sometimes leads to a counter-intuitive behavior.  If you give
790b57cec5SDimitry Andric  archive files before object files, nothing will happen because when the linker
800b57cec5SDimitry Andric  visits archives, there is no undefined symbols in the set.  As a result, no
810b57cec5SDimitry Andric  files are extracted from the first archive file, and the link is done at that
820b57cec5SDimitry Andric  point because the set is empty after it visits one file.
830b57cec5SDimitry Andric
840b57cec5SDimitry Andric  You can fix the problem by reordering the files,
850b57cec5SDimitry Andric  but that cannot fix the issue of mutually-dependent archive files.
860b57cec5SDimitry Andric
870b57cec5SDimitry Andric  Linking mutually-dependent archive files is tricky.  You may specify the same
880b57cec5SDimitry Andric  archive file multiple times to let the linker visit it more than once.  Or,
890b57cec5SDimitry Andric  you may use the special command line options, `--start-group` and
900b57cec5SDimitry Andric  `--end-group`, to let the linker loop over the files between the options until
910b57cec5SDimitry Andric  no new symbols are added to the set.
920b57cec5SDimitry Andric
930b57cec5SDimitry Andric  Visiting the same archive files multiple times makes the linker slower.
940b57cec5SDimitry Andric
950b57cec5SDimitry Andric  Here is how LLD approaches the problem. Instead of memorizing only undefined
960b57cec5SDimitry Andric  symbols, we program LLD so that it memorizes all symbols.  When it sees an
970b57cec5SDimitry Andric  undefined symbol that can be resolved by extracting an object file from an
980b57cec5SDimitry Andric  archive file it previously visited, it immediately extracts the file and links
990b57cec5SDimitry Andric  it.  It is doable because LLD does not forget symbols it has seen in archive
1000b57cec5SDimitry Andric  files.
1010b57cec5SDimitry Andric
1020b57cec5SDimitry Andric  We believe that LLD's way is efficient and easy to justify.
1030b57cec5SDimitry Andric
1040b57cec5SDimitry Andric  The semantics of LLD's archive handling are different from the traditional
1050b57cec5SDimitry Andric  Unix's.  You can observe it if you carefully craft archive files to exploit
1060b57cec5SDimitry Andric  it.  However, in reality, we don't know any program that cannot link with our
1070b57cec5SDimitry Andric  algorithm so far, so it's not going to cause trouble.
1080b57cec5SDimitry Andric
1090b57cec5SDimitry AndricNumbers You Want to Know
1100b57cec5SDimitry Andric------------------------
1110b57cec5SDimitry Andric
1120b57cec5SDimitry AndricTo give you intuition about what kinds of data the linker is mainly working on,
1130b57cec5SDimitry AndricI'll give you the list of objects and their numbers LLD has to read and process
1140b57cec5SDimitry Andricin order to link a very large executable. In order to link Chrome with debug
1150b57cec5SDimitry Andricinfo, which is roughly 2 GB in output size, LLD reads
1160b57cec5SDimitry Andric
1170b57cec5SDimitry Andric- 17,000 files,
1180b57cec5SDimitry Andric- 1,800,000 sections,
1190b57cec5SDimitry Andric- 6,300,000 symbols, and
1200b57cec5SDimitry Andric- 13,000,000 relocations.
1210b57cec5SDimitry Andric
1220b57cec5SDimitry AndricLLD produces the 2 GB executable in 15 seconds.
1230b57cec5SDimitry Andric
1240b57cec5SDimitry AndricThese numbers vary depending on your program, but in general,
1250b57cec5SDimitry Andricyou have a lot of relocations and symbols for each file.
1260b57cec5SDimitry AndricIf your program is written in C++, symbol names are likely to be
1270b57cec5SDimitry Andricpretty long because of name mangling.
1280b57cec5SDimitry Andric
1290b57cec5SDimitry AndricIt is important to not waste time on relocations and symbols.
1300b57cec5SDimitry Andric
1310b57cec5SDimitry AndricIn the above case, the total amount of symbol strings is 450 MB,
1320b57cec5SDimitry Andricand inserting all of them to a hash table takes 1.5 seconds.
1330b57cec5SDimitry AndricTherefore, if you causally add a hash table lookup for each symbol,
1340b57cec5SDimitry Andricit would slow down the linker by 10%. So, don't do that.
1350b57cec5SDimitry Andric
1360b57cec5SDimitry AndricOn the other hand, you don't have to pursue efficiency
1370b57cec5SDimitry Andricwhen handling files.
1380b57cec5SDimitry Andric
1390b57cec5SDimitry AndricImportant Data Structures
1400b57cec5SDimitry Andric-------------------------
1410b57cec5SDimitry Andric
1420b57cec5SDimitry AndricWe will describe the key data structures in LLD in this section.  The linker can
1430b57cec5SDimitry Andricbe understood as the interactions between them.  Once you understand their
1440b57cec5SDimitry Andricfunctions, the code of the linker should look obvious to you.
1450b57cec5SDimitry Andric
1460b57cec5SDimitry Andric* Symbol
1470b57cec5SDimitry Andric
1480b57cec5SDimitry Andric  This class represents a symbol.
1490b57cec5SDimitry Andric  They are created for symbols in object files or archive files.
1500b57cec5SDimitry Andric  The linker creates linker-defined symbols as well.
1510b57cec5SDimitry Andric
1520b57cec5SDimitry Andric  There are basically three types of Symbols: Defined, Undefined, or Lazy.
1530b57cec5SDimitry Andric
1540b57cec5SDimitry Andric  - Defined symbols are for all symbols that are considered as "resolved",
1550b57cec5SDimitry Andric    including real defined symbols, COMDAT symbols, common symbols,
1560b57cec5SDimitry Andric    absolute symbols, linker-created symbols, etc.
1570b57cec5SDimitry Andric  - Undefined symbols represent undefined symbols, which need to be replaced by
1580b57cec5SDimitry Andric    Defined symbols by the resolver until the link is complete.
1590b57cec5SDimitry Andric  - Lazy symbols represent symbols we found in archive file headers
1600b57cec5SDimitry Andric    which can turn into Defined if we read archive members.
1610b57cec5SDimitry Andric
1620b57cec5SDimitry Andric  There's only one Symbol instance for each unique symbol name. This uniqueness
1630b57cec5SDimitry Andric  is guaranteed by the symbol table. As the resolver reads symbols from input
1640b57cec5SDimitry Andric  files, it replaces an existing Symbol with the "best" Symbol for its symbol
1650b57cec5SDimitry Andric  name using the placement new.
1660b57cec5SDimitry Andric
1670b57cec5SDimitry Andric  The above mechanism allows you to use pointers to Symbols as a very cheap way
1680b57cec5SDimitry Andric  to access name resolution results. Assume for example that you have a pointer
1690b57cec5SDimitry Andric  to an undefined symbol before name resolution. If the symbol is resolved to a
1700b57cec5SDimitry Andric  defined symbol by the resolver, the pointer will "automatically" point to the
1710b57cec5SDimitry Andric  defined symbol, because the undefined symbol the pointer pointed to will have
1720b57cec5SDimitry Andric  been replaced by the defined symbol in-place.
1730b57cec5SDimitry Andric
1740b57cec5SDimitry Andric* SymbolTable
1750b57cec5SDimitry Andric
1760b57cec5SDimitry Andric  SymbolTable is basically a hash table from strings to Symbols
1770b57cec5SDimitry Andric  with logic to resolve symbol conflicts. It resolves conflicts by symbol type.
1780b57cec5SDimitry Andric
1790b57cec5SDimitry Andric  - If we add Defined and Undefined symbols, the symbol table will keep the
1800b57cec5SDimitry Andric    former.
1810b57cec5SDimitry Andric  - If we add Defined and Lazy symbols, it will keep the former.
1820b57cec5SDimitry Andric  - If we add Lazy and Undefined, it will keep the former,
1830b57cec5SDimitry Andric    but it will also trigger the Lazy symbol to load the archive member
1840b57cec5SDimitry Andric    to actually resolve the symbol.
1850b57cec5SDimitry Andric
1860b57cec5SDimitry Andric* Chunk (COFF specific)
1870b57cec5SDimitry Andric
1880b57cec5SDimitry Andric  Chunk represents a chunk of data that will occupy space in an output.
1890b57cec5SDimitry Andric  Each regular section becomes a chunk.
1900b57cec5SDimitry Andric  Chunks created for common or BSS symbols are not backed by sections.
1910b57cec5SDimitry Andric  The linker may create chunks to append additional data to an output as well.
1920b57cec5SDimitry Andric
1930b57cec5SDimitry Andric  Chunks know about their size, how to copy their data to mmap'ed outputs,
1940b57cec5SDimitry Andric  and how to apply relocations to them.
1950b57cec5SDimitry Andric  Specifically, section-based chunks know how to read relocation tables
1960b57cec5SDimitry Andric  and how to apply them.
1970b57cec5SDimitry Andric
1980b57cec5SDimitry Andric* InputSection (ELF specific)
1990b57cec5SDimitry Andric
2000b57cec5SDimitry Andric  Since we have less synthesized data for ELF, we don't abstract slices of
2010b57cec5SDimitry Andric  input files as Chunks for ELF. Instead, we directly use the input section
2020b57cec5SDimitry Andric  as an internal data type.
2030b57cec5SDimitry Andric
2040b57cec5SDimitry Andric  InputSection knows about their size and how to copy themselves to
2050b57cec5SDimitry Andric  mmap'ed outputs, just like COFF Chunks.
2060b57cec5SDimitry Andric
2070b57cec5SDimitry Andric* OutputSection
2080b57cec5SDimitry Andric
2090b57cec5SDimitry Andric  OutputSection is a container of InputSections (ELF) or Chunks (COFF).
2100b57cec5SDimitry Andric  An InputSection or Chunk belongs to at most one OutputSection.
2110b57cec5SDimitry Andric
2120b57cec5SDimitry AndricThere are mainly three actors in this linker.
2130b57cec5SDimitry Andric
2140b57cec5SDimitry Andric* InputFile
2150b57cec5SDimitry Andric
2160b57cec5SDimitry Andric  InputFile is a superclass of file readers.
2170b57cec5SDimitry Andric  We have a different subclass for each input file type,
2180b57cec5SDimitry Andric  such as regular object file, archive file, etc.
2190b57cec5SDimitry Andric  They are responsible for creating and owning Symbols and InputSections/Chunks.
2200b57cec5SDimitry Andric
2210b57cec5SDimitry Andric* Writer
2220b57cec5SDimitry Andric
2230b57cec5SDimitry Andric  The writer is responsible for writing file headers and InputSections/Chunks to
2240b57cec5SDimitry Andric  a file.  It creates OutputSections, put all InputSections/Chunks into them,
2250b57cec5SDimitry Andric  assign unique, non-overlapping addresses and file offsets to them, and then
2260b57cec5SDimitry Andric  write them down to a file.
2270b57cec5SDimitry Andric
2280b57cec5SDimitry Andric* Driver
2290b57cec5SDimitry Andric
2300b57cec5SDimitry Andric  The linking process is driven by the driver. The driver:
2310b57cec5SDimitry Andric
2320b57cec5SDimitry Andric  - processes command line options,
2330b57cec5SDimitry Andric  - creates a symbol table,
2340b57cec5SDimitry Andric  - creates an InputFile for each input file and puts all symbols within into
2350b57cec5SDimitry Andric    the symbol table,
2360b57cec5SDimitry Andric  - checks if there's no remaining undefined symbols,
2370b57cec5SDimitry Andric  - creates a writer,
2380b57cec5SDimitry Andric  - and passes the symbol table to the writer to write the result to a file.
2390b57cec5SDimitry Andric
2400b57cec5SDimitry AndricLink-Time Optimization
2410b57cec5SDimitry Andric----------------------
2420b57cec5SDimitry Andric
2430b57cec5SDimitry AndricLTO is implemented by handling LLVM bitcode files as object files.
2440b57cec5SDimitry AndricThe linker resolves symbols in bitcode files normally. If all symbols
2450b57cec5SDimitry Andricare successfully resolved, it then runs LLVM passes
2460b57cec5SDimitry Andricwith all bitcode files to convert them to one big regular ELF/COFF file.
2470b57cec5SDimitry AndricFinally, the linker replaces bitcode symbols with ELF/COFF symbols,
2480b57cec5SDimitry Andricso that they are linked as if they were in the native format from the beginning.
2490b57cec5SDimitry Andric
2500b57cec5SDimitry AndricThe details are described in this document.
2515ffd83dbSDimitry Andrichttps://llvm.org/docs/LinkTimeOptimization.html
2520b57cec5SDimitry Andric
2530b57cec5SDimitry AndricGlossary
2540b57cec5SDimitry Andric--------
2550b57cec5SDimitry Andric
2560b57cec5SDimitry Andric* RVA (COFF)
2570b57cec5SDimitry Andric
2580b57cec5SDimitry Andric  Short for Relative Virtual Address.
2590b57cec5SDimitry Andric
2600b57cec5SDimitry Andric  Windows executables or DLLs are not position-independent; they are
2610b57cec5SDimitry Andric  linked against a fixed address called an image base. RVAs are
2620b57cec5SDimitry Andric  offsets from an image base.
2630b57cec5SDimitry Andric
2640b57cec5SDimitry Andric  Default image bases are 0x140000000 for executables and 0x18000000
2650b57cec5SDimitry Andric  for DLLs. For example, when we are creating an executable, we assume
2660b57cec5SDimitry Andric  that the executable will be loaded at address 0x140000000 by the
2670b57cec5SDimitry Andric  loader, so we apply relocations accordingly. Result texts and data
2680b57cec5SDimitry Andric  will contain raw absolute addresses.
2690b57cec5SDimitry Andric
2700b57cec5SDimitry Andric* VA
2710b57cec5SDimitry Andric
2720b57cec5SDimitry Andric  Short for Virtual Address. For COFF, it is equivalent to RVA + image base.
2730b57cec5SDimitry Andric
2740b57cec5SDimitry Andric* Base relocations (COFF)
2750b57cec5SDimitry Andric
2760b57cec5SDimitry Andric  Relocation information for the loader. If the loader decides to map
2770b57cec5SDimitry Andric  an executable or a DLL to a different address than their image
2780b57cec5SDimitry Andric  bases, it fixes up binaries using information contained in the base
2790b57cec5SDimitry Andric  relocation table. A base relocation table consists of a list of
2800b57cec5SDimitry Andric  locations containing addresses. The loader adds a difference between
2810b57cec5SDimitry Andric  RVA and actual load address to all locations listed there.
2820b57cec5SDimitry Andric
2830b57cec5SDimitry Andric  Note that this run-time relocation mechanism is much simpler than ELF.
2840b57cec5SDimitry Andric  There's no PLT or GOT. Images are relocated as a whole just
2850b57cec5SDimitry Andric  by shifting entire images in memory by some offsets. Although doing
2860b57cec5SDimitry Andric  this breaks text sharing, I think this mechanism is not actually bad
2870b57cec5SDimitry Andric  on today's computers.
2880b57cec5SDimitry Andric
2890b57cec5SDimitry Andric* ICF
2900b57cec5SDimitry Andric
2910b57cec5SDimitry Andric  Short for Identical COMDAT Folding (COFF) or Identical Code Folding (ELF).
2920b57cec5SDimitry Andric
2930b57cec5SDimitry Andric  ICF is an optimization to reduce output size by merging read-only sections
2940b57cec5SDimitry Andric  by not only their names but by their contents. If two read-only sections
2950b57cec5SDimitry Andric  happen to have the same metadata, actual contents and relocations,
2960b57cec5SDimitry Andric  they are merged by ICF. It is known as an effective technique,
2970b57cec5SDimitry Andric  and it usually reduces C++ program's size by a few percent or more.
2980b57cec5SDimitry Andric
2990b57cec5SDimitry Andric  Note that this is not an entirely sound optimization. C/C++ require
3000b57cec5SDimitry Andric  different functions have different addresses. If a program depends on
3010b57cec5SDimitry Andric  that property, it would fail at runtime.
3020b57cec5SDimitry Andric
3030b57cec5SDimitry Andric  On Windows, that's not really an issue because MSVC link.exe enabled
3040b57cec5SDimitry Andric  the optimization by default. As long as your program works
3050b57cec5SDimitry Andric  with the linker's default settings, your program should be safe with ICF.
3060b57cec5SDimitry Andric
3070b57cec5SDimitry Andric  On Unix, your program is generally not guaranteed to be safe with ICF,
3080b57cec5SDimitry Andric  although large programs happen to work correctly.
3090b57cec5SDimitry Andric  LLD works fine with ICF for example.
3100b57cec5SDimitry Andric
3110b57cec5SDimitry AndricOther Info
3120b57cec5SDimitry Andric----------
3130b57cec5SDimitry Andric
3140b57cec5SDimitry Andric.. toctree::
3150b57cec5SDimitry Andric   :maxdepth: 1
3160b57cec5SDimitry Andric
3170b57cec5SDimitry Andric   missingkeyfunction
318