--warn-backrefs
===============

``--warn-backrefs`` gives a warning when an undefined symbol reference is
resolved by a definition in an archive to the left of it on the command line.

A linker such as GNU ld makes a single pass over the input files from left to
right, maintaining the set of undefined symbol references from the files loaded
so far. When it encounters an archive, or an object file surrounded by
``--start-lib`` and ``--end-lib``, it searches that archive for definitions
that resolve those references; this may result in input files being loaded,
updating the set of undefined symbol references. When all resolving definitions
have been loaded from the archive, the linker moves on to the next file and
will not return to it. This means that an input file to the right of an archive
cannot have an undefined symbol resolved by an archive to the left of it. For
example:

    ld def.a ref.o

will result in an ``undefined reference`` error. If there are no cyclic
references, the archives can be ordered in such a way that there are no
backward references. If there are cyclic references then the ``--start-group``
and ``--end-group`` options can be used, or the same archive can be placed on
the command line twice.

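The single-pass behaviour described above can be sketched as a toy model. This
is not GNU ld code: ``Obj``, ``Archive`` and ``single_pass_link`` are
hypothetical names invented for illustration, and symbols are plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    name: str
    defs: set = field(default_factory=set)   # symbols this object defines
    refs: set = field(default_factory=set)   # symbols this object references


@dataclass
class Archive:
    name: str
    members: list                            # list of Obj


def single_pass_link(inputs):
    """Return the symbols still undefined after one left-to-right pass."""
    defined, undefined = set(), set()

    def load(obj):
        defined.update(obj.defs)
        undefined.update(obj.refs)
        undefined.difference_update(defined)

    for item in inputs:
        if isinstance(item, Obj):
            load(item)
            continue
        # Archive: keep extracting members while one of them defines a
        # currently undefined symbol, then move on and never come back.
        pending = list(item.members)
        progress = True
        while progress:
            progress = False
            for member in list(pending):
                if member.defs & undefined:
                    pending.remove(member)
                    load(member)
                    progress = True
    return undefined


def_a = Archive("def.a", [Obj("def.o", defs={"foo"})])
ref_o = Obj("ref.o", defs={"main"}, refs={"foo"})

print(single_pass_link([def_a, ref_o]))  # archive first: foo stays undefined
print(single_pass_link([ref_o, def_a]))  # archive last: nothing is left over
```

In this model, listing an archive a second time simply gives the pass another
chance to extract its members, which is why repeating an archive can resolve a
cyclic reference.
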
LLD remembers the symbol table of archives that it has previously seen, so if
there is a reference from an input file to the right of an archive, LLD will
still search that archive for resolving any undefined references. This means
that an archive only needs to be included once on the command line and the
``--start-group`` and ``--end-group`` options are redundant.

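A minimal sketch of this "remembered archive" behaviour, again as a toy model
rather than LLD's implementation. ``Obj``, ``Archive`` and ``remembering_link``
are hypothetical names, and the rule that the earliest archive wins when two
archives define the same symbol is an assumption of the model.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    name: str
    defs: set = field(default_factory=set)
    refs: set = field(default_factory=set)


@dataclass
class Archive:
    name: str
    members: list


def remembering_link(inputs):
    """Return (still-undefined symbols, names of loaded objects)."""
    defined, undefined, loaded = set(), set(), []
    lazy = {}  # symbol -> not-yet-loaded archive member that defines it

    def load(obj):
        loaded.append(obj.name)
        defined.update(obj.defs)
        undefined.update(obj.refs)
        undefined.difference_update(defined)

    def resolve():
        # Extract remembered members until no undefined symbol has a
        # remembered definition left.
        while True:
            sym = next((s for s in sorted(undefined) if s in lazy), None)
            if sym is None:
                return
            member = lazy[sym]
            for s in [k for k, m in lazy.items() if m is member]:
                del lazy[s]
            load(member)

    for item in inputs:
        if isinstance(item, Obj):
            load(item)
        else:
            for member in item.members:
                for sym in member.defs:
                    lazy.setdefault(sym, member)  # assumed: earliest wins
        resolve()
    return undefined, loaded


def_a = Archive("def.a", [Obj("def.o", defs={"foo"})])
ref_o = Obj("ref.o", defs={"main"}, refs={"foo"})

# The archive is to the left of the reference, yet the link still resolves.
print(remembering_link([def_a, ref_o]))
```
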
A consequence of the differing archive searching semantics is that the same
linker command line can result in different outcomes. A link may succeed with
LLD but fail with GNU ld or, even worse, both links may succeed while
selecting different objects from different archives that define the same
symbols.

The ``--warn-backrefs`` option provides information that helps identify cases
where LLD and GNU ld archive selection may differ.

    % ld.lld --warn-backrefs ... -lB -lA
    ld.lld: warning: backward reference detected: system in A.a(a.o) refers to B.a(b.o)

    % ld.lld --warn-backrefs ... --start-lib B/b.o --end-lib --start-lib A/a.o --end-lib
    ld.lld: warning: backward reference detected: system in A/a.o refers to B/b.o

    # To suppress the warning, you can specify --warn-backrefs-exclude=<glob> to match B/b.o or B.a(b.o)

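How such a check could work can be sketched with a toy model that records the
command-line position of every remembered archive member and reports a
reference that is resolved from an earlier position. Everything here is an
assumption made for illustration: ``find_backrefs`` is a hypothetical name,
Python's ``fnmatch`` only approximates the linker's glob syntax, the glob is
matched against the displayed name used in the warning, and refinements of the
real option are not modelled.

```python
import fnmatch
from dataclasses import dataclass, field


@dataclass
class Obj:
    name: str
    defs: set = field(default_factory=set)
    refs: set = field(default_factory=set)


@dataclass
class Archive:
    name: str
    members: list


def find_backrefs(inputs, exclude=()):
    """Link with remembered archives; return backward-reference warnings."""
    defined, undefined = set(), set()
    lazy = {}      # symbol -> (position, displayed name, member)
    warnings = []

    def extract(entry):
        pos, shown, member = entry
        for sym in [s for s, e in lazy.items() if e[2] is member]:
            del lazy[sym]
        load(member, pos, shown)

    def load(obj, pos, shown):
        defined.update(obj.defs)
        undefined.difference_update(defined)
        for sym in sorted(obj.refs):
            if sym in defined:
                continue
            if sym not in lazy:
                undefined.add(sym)
                continue
            entry = lazy[sym]
            excluded = any(fnmatch.fnmatchcase(entry[1], g) for g in exclude)
            if entry[0] < pos and not excluded:
                warnings.append(
                    f"backward reference detected: {sym} in {shown} "
                    f"refers to {entry[1]}")
            extract(entry)

    for pos, item in enumerate(inputs):
        if isinstance(item, Obj):
            load(item, pos, item.name)
            continue
        for member in item.members:
            shown = f"{item.name}({member.name})"
            for sym in member.defs - defined:
                lazy.setdefault(sym, (pos, shown, member))
        for sym in sorted(undefined):
            if sym in undefined and sym in lazy:
                extract(lazy[sym])  # forward reference: no warning


    return warnings


main_o = Obj("main.o", defs={"main"}, refs={"a_func"})
lib_b = Archive("B.a", [Obj("b.o", defs={"system"})])
lib_a = Archive("A.a", [Obj("a.o", defs={"a_func"}, refs={"system"})])

for line in find_backrefs([main_o, lib_b, lib_a]):
    print("warning:", line)
```

Passing ``exclude=["B.a(b.o)"]`` to the model suppresses the warning, and
swapping the two archives removes the backward reference altogether.
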
The ``--warn-backrefs`` option can also provide a check to enforce a
topological order of archives, which can be useful to detect layering
violations (albeit unable to catch all cases). There are two cases where GNU ld
will report an ``undefined reference`` error:

* If adding the dependency does not form a cycle: conceptually ``A`` is a
  higher-level library while ``B`` is at a lower level. When you are developing
  an application ``P`` which depends on ``A``, but does not directly depend on
  ``B``, your link may fail surprisingly with ``undefined symbol:
  symbol_defined_in_B`` if the used/linked part of ``A`` happens to need some
  components of ``B``. It is inappropriate for ``P`` to add a dependency on
  ``B`` since ``P`` does not use ``B`` directly.
* If adding the dependency forms a cycle, e.g. ``B->C->A ~> B``: ``A``
  is supposed to be at the lowest level while ``B`` is supposed to be at the
  highest level. When you are developing ``C_test`` testing ``C``, your link may
  fail surprisingly with ``undefined symbol`` if there is somehow a dependency on
  some components of ``B``. You could fix the issue by adding the missing
  dependency (``B``), but then every test (``A_test``, ``B_test``,
  ``C_test``) would link against every library. This defeats the purpose
  of splitting ``B``, ``C`` and ``A`` into separate libraries and makes binaries
  unnecessarily large. Moreover, the layering violation makes lower-level
  libraries (e.g. ``A``) vulnerable to changes to higher-level libraries (e.g.
  ``B``, ``C``).

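The topological-order idea in the two cases above can be illustrated with a
small check that is independent of any linker. ``layering_violations`` and the
dependency maps are hypothetical; ``X -> Y`` is read as "``X`` depends on
``Y``", so a well-ordered command line places ``Y`` to the right of ``X``.

```python
def layering_violations(order, deps):
    """Return (user, dependency) pairs where the dependency is listed first.

    order: libraries in command-line order, left to right.
    deps:  maps a library to the set of libraries it depends on.
    """
    position = {lib: i for i, lib in enumerate(order)}
    violations = []
    for user in order:
        for dep in sorted(deps.get(user, ())):
            if dep in position and position[dep] < position[user]:
                violations.append((user, dep))
    return violations


# Acyclic case: P uses A, and A uses B.
acyclic = {"P": {"A"}, "A": {"B"}}
print(layering_violations(["P", "A", "B"], acyclic))  # properly layered
print(layering_violations(["P", "B", "A"], acyclic))  # A refers back to B

# Cyclic case B->C->A ~> B: some pair is out of order whatever the order is.
cyclic = {"B": {"C"}, "C": {"A"}, "A": {"B"}}
print(layering_violations(["B", "C", "A"], cyclic))
```
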
Resolution:

* Add a dependency from ``A`` to ``B``.
* The reference may be unintended and can be removed.
* The dependency may be intentionally omitted because there are multiple
  libraries like ``B``.  Consider linking ``B`` with object semantics by
  surrounding it with ``--whole-archive`` and ``--no-whole-archive``.
* In the case of a circular dependency, merging the libraries is sometimes the
  best option.

There are two cases, each resembling a library sandwich, where GNU ld will
select a different object.

* ``A.a B A2.so``: ``A.a`` may be used as an interceptor (e.g. it provides some
  optimized libc functions and ``A2`` is libc).  ``B`` does not need to know
  about ``A.a``, and ``A.a`` may be pulled into the link by other parts of the
  program. For linker portability, consider ``--whole-archive`` and
  ``--no-whole-archive``.

* ``A.a B A2.a``: similar to the above case but ``--warn-backrefs`` does not
  flag the problem, because ``A2.a`` may be a replica of ``A.a``, which is
  redundant but benign. In some cases ``A.a`` and ``B`` should be surrounded by
  a pair of ``--start-group`` and ``--end-group``. This is especially common
  among system libraries (e.g.  ``-lc __isnanl references -lm``, ``-lc
  _IO_funlockfile references -lpthread``, ``-lc __gcc_personality_v0 references
  -lgcc_eh``, and ``-lpthread _Unwind_GetCFA references -lunwind``).

  In C++, this is likely an ODR violation. We probably need a dedicated option
  for ODR detection.
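
The ``A.a B A2.a`` sandwich can be reproduced with a toy model that switches
between the two search strategies. This is an illustrative sketch only:
``link`` is a hypothetical function, ``memcpy`` is a made-up example symbol,
and each input is a ``(kind, name, members)`` tuple whose members are
``(object name, defined symbols, referenced symbols)``.

```python
def link(inputs, remember_archives):
    """Return the names of the objects loaded into the link, in order."""
    defined, undefined, loaded = set(), set(), []
    lazy = []  # members of earlier archives that were not loaded yet

    def load(shown, defs, refs):
        loaded.append(shown)
        defined.update(defs)
        undefined.update(refs)
        undefined.difference_update(defined)

    def drain(candidates):
        # Extract candidates while one defines a currently undefined symbol;
        # the earliest candidate in the list wins.
        progress = True
        while progress:
            progress = False
            for cand in candidates:
                if cand[1] & undefined:
                    candidates.remove(cand)
                    load(*cand)
                    progress = True
                    break

    for kind, name, members in inputs:
        if kind == "obj":
            for _, defs, refs in members:
                load(name, defs, refs)
        else:
            new = [(f"{name}({m})", defs, refs) for m, defs, refs in members]
            if remember_archives:
                lazy.extend(new)
            else:
                drain(new)  # searched once, then forgotten
        if remember_archives:
            drain(lazy)
    return loaded


sandwich = [
    ("lib", "A.a", [("a.o", {"memcpy"}, set())]),
    ("obj", "b.o", [("b.o", {"main"}, {"memcpy"})]),
    ("lib", "A2.a", [("a2.o", {"memcpy"}, set())]),
]

print(link(sandwich, remember_archives=False))  # single pass picks A2.a(a2.o)
print(link(sandwich, remember_archives=True))   # remembering picks A.a(a.o)
```

Both model links succeed, yet they pull ``memcpy`` from different archives,
which is the silent divergence described above.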