Linker Script implementation notes and policy
=============================================

LLD implements a large subset of the GNU ld linker script notation. The LLD
implementation policy is to implement linker script features as they are
documented in the ld `manual <https://sourceware.org/binutils/docs/ld/Scripts.html>`_.
We consider it a bug if the LLD implementation does not agree with the manual
and the difference is not mentioned in the exceptions below.

The ld manual is not a complete specification and is not sufficient to build
an implementation. In particular, some features are only defined by the
implementation and have changed over time.

The LLD implementation policy for properties of linker scripts that are not
defined by the documentation is to follow the GNU ld implementation wherever
possible. We reserve the right to make different implementation choices where
it is appropriate for LLD. Intentional deviations will be documented in this
file.

Symbol assignment
~~~~~~~~~~~~~~~~~

A symbol assignment looks like:

::

  symbol = expression;
  symbol += expression;

The first form defines ``symbol``. If ``symbol`` is already defined, it will be
overridden. The second form requires ``symbol`` to be already defined.

For a simple assignment like ``alias = aliasee;``, the ``st_type`` field is
copied from the original symbol. Any arithmetic operation (e.g. ``+ 0``) will
reset ``st_type`` to ``STT_NOTYPE``.

The ``st_size`` field is set to 0.

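As an illustration, the following sketch shows both cases. The names ``foo``,
``foo_alias`` and ``foo_untyped`` are hypothetical, and ``foo`` is assumed to
be an ``STT_FUNC`` symbol defined in an input object file:

::

  /* Simple alias: st_type is copied from foo (STT_FUNC under the
     assumption above); st_size of foo_alias is 0. */
  foo_alias = foo;

  /* Arithmetic, even adding 0, resets st_type to STT_NOTYPE. */
  foo_untyped = foo + 0;
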
SECTIONS command
~~~~~~~~~~~~~~~~

A ``SECTIONS`` command looks like:

::

  SECTIONS {
    section-command
    section-command
    ...
  } [INSERT [AFTER|BEFORE] anchor_section;]

Each section-command can be a symbol assignment, an output section description,
or an overlay description.

When the ``INSERT`` keyword is present, the ``SECTIONS`` command describes some
output sections which should be inserted after or before the specified anchor
section. The insertion occurs after input sections have been mapped to output
sections but before orphan sections have been processed.

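For example, a minimal sketch that places a hypothetical output section
``.my_data`` right after ``.data`` (the section name and the choice of anchor
are assumptions made for illustration):

::

  SECTIONS {
    .my_data : { *(.my_data) }
  } INSERT AFTER .data;
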
In the case where no linker script has been provided or every ``SECTIONS``
command is followed by ``INSERT``, LLD applies built-in rules which are similar
to GNU ld's internal linker scripts.

- Align the first section in a ``PT_LOAD`` segment according to ``-z noseparate-code``,
  ``-z separate-code``, or ``-z separate-loadable-segments``
- Define ``__bss_start``, ``end``, ``_end``, ``etext``, ``_etext``, ``edata``, ``_edata``
- Sort ``.ctors.*``/``.dtors.*``/``.init_array.*``/``.fini_array.*`` and PowerPC64 specific ``.toc``
- Place input ``.text.*`` into output ``.text``, and handle certain variants
  (``.text.hot.``, ``.text.unknown.``, ``.text.unlikely.``, etc) in the presence of
  ``-z keep-text-section-prefix``.

Output section description
~~~~~~~~~~~~~~~~~~~~~~~~~~

The description of an output section looks like:

::

  section [address] [(type)] : [AT(lma)] [ALIGN(section_align)] [SUBALIGN(subsection_align)] {
    output-section-command
    ...
  } [>region] [AT>lma_region] [:phdr ...] [=fillexp] [,]

Output section address
----------------------

When an *OutputSection* *S* has ``address``, LLD will set sh_addr to ``address``.

The ELF specification says:

  The value of sh_addr must be congruent to 0, modulo the value of sh_addralign.

The presence of ``address`` can cause this condition to be unsatisfied, in
which case LLD will warn. GNU ld from Binutils 2.35 onwards will reduce
sh_addralign so that sh_addr=0 (modulo sh_addralign).

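A minimal sketch of the situation, assuming a hypothetical section ``.foo``
whose input sections require 16-byte alignment:

::

  SECTIONS {
    /* 0x1001 is not a multiple of 16, so sh_addr is not congruent to 0
       modulo sh_addralign. LLD sets sh_addr to 0x1001 and warns. */
    .foo 0x1001 : { *(.foo) }
  }
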
Output section alignment
------------------------

sh_addralign of an *OutputSection* *S* is the maximum of
``ALIGN(section_align)`` and the maximum alignment of the input sections in
*S*.

When an *OutputSection* *S* has both ``address`` and ``ALIGN(section_align)``,
GNU ld will set sh_addralign to ``ALIGN(section_align)``.

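For instance, in the following sketch (the section name ``.bar`` and the
alignment values are assumptions for illustration), if the largest input
section alignment is 8, sh_addralign of ``.bar`` is max(64, 8) = 64. If an
input section required 128-byte alignment instead, sh_addralign would be 128:

::

  SECTIONS {
    .bar : ALIGN(64) { *(.bar) }
  }
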
Output section LMA
------------------

A load address (LMA) can be specified by ``AT(lma)`` or ``AT>lma_region``.

- ``AT(lma)`` specifies the exact load address. If the linker script does not
  have a PHDRS command, then a new loadable segment will be generated.
- ``AT>lma_region`` specifies the LMA region. The lack of ``AT>lma_region``
  means the default region is used. Note, GNU ld propagates the previous LMA
  memory region when ``address`` is not specified. The LMA is set to the
  current location of the memory region aligned to the section alignment.
  If the linker script does not have a PHDRS command, then a new loadable
  segment will be generated if ``lma_region`` is different from the
  ``lma_region`` of the previous OutputSection.

The two keywords cannot be specified at the same time.

If neither ``AT(lma)`` nor ``AT>lma_region`` is specified:

- If the previous section is also in the default LMA region, and the two
  sections have the same memory regions, the difference between the LMA and the
  VMA is computed to be the same as the previous difference.
- Otherwise, the LMA is set to the VMA.

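The following sketch illustrates ``AT>lma_region``. The region names
``FLASH`` and ``RAM``, their origins and their lengths are hypothetical:

::

  MEMORY {
    FLASH (rx)  : ORIGIN = 0x08000000, LENGTH = 256K
    RAM   (rwx) : ORIGIN = 0x20000000, LENGTH = 64K
  }

  SECTIONS {
    /* No AT(lma) or AT>lma_region: the default LMA region is used. */
    .text : { *(.text .text.*) } > FLASH

    /* VMA is in RAM; the LMA is the current location of FLASH aligned to
       the section alignment. The LMA region differs from that of .text,
       so without a PHDRS command a new loadable segment is generated. */
    .data : { *(.data .data.*) } > RAM AT> FLASH
  }
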
Overwrite sections
~~~~~~~~~~~~~~~~~~

An ``OVERWRITE_SECTIONS`` command looks like:

::

  OVERWRITE_SECTIONS {
    output-section-description
    output-section-description
    ...
  }

Unlike a ``SECTIONS`` command, ``OVERWRITE_SECTIONS`` does not specify a
section order or suppress the built-in rules.

If a described output section also appears in a ``SECTIONS`` command, the
``OVERWRITE_SECTIONS`` command wins; otherwise, the output section
will be added somewhere following the usual orphan section placement rules.

If a described output section also appears in an ``INSERT
[AFTER|BEFORE]`` command, the description will be provided by the
description in the ``OVERWRITE_SECTIONS`` command while the insert command
still applies (possibly after orphan section placement). It is recommended to
leave the braces empty (i.e. ``section : {}``) for the insert command, because
its description will be ignored anyway.
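
A minimal sketch combining the two commands, assuming a hypothetical output
section ``.foo``, hypothetical symbols ``foo_start`` and ``foo_end``, and
``.data`` as the anchor:

::

  /* Provides the description of .foo but no section order. */
  OVERWRITE_SECTIONS {
    .foo : {
      foo_start = .;
      *(.foo)
      foo_end = .;
    }
  }

  /* Only positions .foo. The braces are left empty because this
     description is ignored in favor of the one above. */
  SECTIONS {
    .foo : {}
  } INSERT AFTER .data;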