WebAssembly lld port
====================

The WebAssembly version of lld takes WebAssembly binaries as inputs and produces
a WebAssembly binary as its output.  For the most part it tries to mimic the
behaviour of traditional ELF linkers and specifically the ELF lld port.  Where
possible the command line flags and the semantics should be the same.


Object file format
------------------

The WebAssembly object file format used by LLVM and LLD is specified as part of
the WebAssembly tool conventions on linking_.

This is the object format that LLVM produces when run with the
``wasm32-unknown-unknown`` target.

Usage
-----

The WebAssembly version of lld is installed as **wasm-ld**.  It shares many
common linker flags with **ld.lld** but also includes several
WebAssembly-specific options:

.. option:: --no-entry

  Don't search for the entry point symbol (by default ``_start``).

.. option:: --export-table

  Export the function table to the environment.

.. option:: --import-table

  Import the function table from the environment.

.. option:: --export-all

  Export all symbols (normally combined with ``--no-gc-sections``).

  Note that this will not export linker-generated mutable globals unless
  the resulting binary already includes the 'mutable-globals' feature,
  since that would otherwise create an invalid binary.

.. option:: --export-dynamic

  When building an executable, export any non-hidden symbols.  By default only
  the entry point and any symbols marked as exports (either via the command line
  or via the ``export_name`` source attribute) are exported.

.. option:: --global-base=<value>

  Address at which to place global data.

.. option:: --no-merge-data-segments

  Disable merging of data segments.

.. option:: --stack-first

  Place stack at start of linear memory rather than after data.

.. option:: --compress-relocations

  Relocation targets in the code section are 5 bytes wide in order to
  potentially accommodate the largest LEB128 value.  This option will cause the
  linker to shrink the code section to remove any padding from the final
  output.  However, because it affects code offsets, this option is not
  compatible with outputting debug information.
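
  The padding that this option removes can be illustrated with a small sketch.
  The following C code is not taken from lld, and the function names are
  hypothetical.  It encodes a 32-bit value both as a minimal unsigned LEB128
  and in the fixed 5-byte padded form described above:

  .. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Minimal unsigned LEB128: 7 payload bits per byte, low bits first. */
    size_t encode_uleb128(uint32_t value, uint8_t *out) {
      size_t n = 0;
      do {
        uint8_t byte = value & 0x7f;
        value >>= 7;
        if (value != 0)
          byte |= 0x80; /* continuation bit: more bytes follow */
        out[n++] = byte;
      } while (value != 0);
      return n;
    }

    /* Padded form: always 5 bytes, so any 32-bit value fits in the slot. */
    size_t encode_uleb128_padded(uint32_t value, uint8_t *out) {
      for (size_t i = 0; i < 5; i++) {
        uint8_t byte = value & 0x7f;
        value >>= 7;
        if (i < 4)
          byte |= 0x80; /* force continuation on all but the last byte */
        out[i] = byte;
      }
      return 5;
    }

  For the value ``3`` the minimal form is the single byte ``0x03``, while the
  padded form is ``0x83 0x80 0x80 0x80 0x00``.  Both decode to the same value;
  the four extra bytes are the kind of padding that can be dropped once the
  final value is known.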

.. option:: --allow-undefined

  Allow undefined symbols in linked binary.  This is the legacy
  flag which corresponds to ``--unresolved-symbols=ignore-all`` +
  ``--import-undefined``.

.. option:: --allow-undefined-file=<filename>

  Like ``--allow-undefined``, but the file specifies a flat list of
  symbols, one per line, which are allowed to be undefined.

.. option:: --unresolved-symbols=<method>

  This is a more fully featured version of ``--allow-undefined``.
  The semantics of the different methods are as follows:

  report-all:

     Report all unresolved symbols.  This is the default.  Normally the linker
     will generate an error message for each reported unresolved symbol but the
     option ``--warn-unresolved-symbols`` can change this to a warning.

  ignore-all:

     Resolve all undefined symbols to zero.  For data and function addresses
     this is trivial.  For direct function calls, the linker will generate a
     trapping stub function in place of the undefined function.

  import-dynamic:

     Undefined symbols generate WebAssembly imports, including undefined data
     symbols.  This is somewhat similar to the ``--import-undefined`` option but
     works for all symbol types.  This option puts limitations on the type of
     relocations that are allowed for imported data symbols.  Relocations that
     require absolute data addresses (i.e. all ``R_WASM_MEMORY_ADDR_I32``) will
     generate an error if they cannot be resolved statically.  For clang/llvm
     this means inputs should be compiled with ``-fPIC`` (i.e. ``pic`` or
     ``dynamic-no-pic`` relocation models).  This option is useful for linking
     binaries that are themselves static (non-relocatable) but whose undefined
     symbols are resolved by a dynamic linker.  Since the dynamic linking API is
     experimental, this option currently requires ``--experimental-pic`` to also
     be specified.

.. option:: --import-memory

  Import memory from the environment.

.. option:: --import-undefined

  Generate WebAssembly imports for undefined symbols, where possible.  For
  example, for function symbols this is always possible, but in general this
  is not possible for undefined data symbols.  Undefined data symbols will
  still be reported as normal (in accordance with ``--unresolved-symbols``).

.. option:: --initial-heap=<value>

  Initial size of the heap. Default: zero.

.. option:: --initial-memory=<value>

  Initial size of the linear memory. Default: the sum of stack, static data
  and heap sizes.

.. option:: --max-memory=<value>

  Maximum size of the linear memory. Default: unlimited.

.. option:: --no-growable-memory

  Set maximum size of the linear memory to its initial size, disallowing
  memory growth.

By default the function table is neither imported nor exported, but defined
for internal use only.

Behaviour
---------

In general, where possible, the WebAssembly linker attempts to emulate the
behaviour of a traditional ELF linker, and in particular the ELF port of lld.
For more specific details on how this is achieved see the tool conventions on
linking_.

Function Signatures
~~~~~~~~~~~~~~~~~~~

One way in which the WebAssembly linker differs from traditional native linkers
is that function signature checking is strict in WebAssembly.  It is a
validation error for a module to contain a call site that doesn't agree with
the target signature.  Even though this is undefined behaviour in C/C++, it is not
uncommon to find this in real-world C/C++ programs.  For example, a call site in
one compilation unit may call a function defined in another compilation
unit but pass too many arguments.

In order not to generate such invalid modules, lld has two modes of handling such
mismatches: it can simply error out or it can create stub functions that will
trap at runtime (functions that contain only an ``unreachable`` instruction)
and use these stub functions at the otherwise invalid call sites.

The default behaviour is to generate these stub functions and to produce
a warning.  The ``--fatal-warnings`` flag can be used to disable this behaviour
and error out if mismatches are found.

Exports
~~~~~~~

When building a shared library, any symbols marked as ``visibility=default`` will
be exported.

When building an executable, only the entry point (``_start``) and symbols with
the ``WASM_SYMBOL_EXPORTED`` flag are exported by default.  In LLVM the
``WASM_SYMBOL_EXPORTED`` flag is set by the ``wasm-export-name`` attribute which
in turn can be set using the ``__attribute__((export_name))`` clang attribute.

In addition, symbols can be exported via the linker command line using
``--export`` (which will error if the symbol is not found) or
``--export-if-defined`` (which will not).

Finally, just as with a native ELF linker, the ``--export-dynamic`` flag can be
used to export symbols in the executable which are marked as
``visibility=default``.
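
As a rough illustration of these rules, the following C sketch marks three
functions in different ways.  The function names and the ``WASM_EXPORT`` helper
macro are made up for this example; only the ``export_name`` and ``visibility``
attributes themselves come from clang.  The macro expands to nothing on
non-WebAssembly targets so that the file still compiles natively:

.. code-block:: c

  #if defined(__wasm__)
  #define WASM_EXPORT(name) __attribute__((export_name(name)))
  #else
  #define WASM_EXPORT(name) /* attribute only applies to WebAssembly targets */
  #endif

  /* Carries an explicit export name: exported from an executable by default. */
  WASM_EXPORT("add") int add(int a, int b) { return a + b; }

  /* Default visibility: exported from an executable with --export-dynamic. */
  __attribute__((visibility("default"))) int twice(int x) { return 2 * x; }

  /* Hidden visibility: not picked up by --export-dynamic. */
  __attribute__((visibility("hidden"))) int helper(int x) { return x + 1; }

When linked as an executable, ``add`` would be expected to appear among the
exports without further flags, ``twice`` only when ``--export-dynamic`` (or an
explicit ``--export=twice``) is passed, and ``helper`` would not be exported
by ``--export-dynamic``.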

Imports
~~~~~~~

By default no undefined symbols are allowed in the final binary.  The flag
``--allow-undefined`` results in a WebAssembly import being defined for each
undefined symbol.  It is then up to the runtime to provide such symbols.
``--allow-undefined-file`` is the same but allows a list of symbols to be
specified.

Alternatively, symbols can be marked in the source code with the
``import_name`` and/or ``import_module`` clang attributes, which signal that
they are expected to be undefined at static link time.
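
A minimal sketch of the attribute-based approach is shown below.  The module
name ``env``, the function ``host_log`` and the ``WASM_IMPORT`` macro are
hypothetical and chosen only for illustration.  The native stand-in definition
exists solely so that the example also links outside WebAssembly:

.. code-block:: c

  #if defined(__wasm__)
  #define WASM_IMPORT(module, name) \
    __attribute__((import_module(module), import_name(name)))
  #else
  #define WASM_IMPORT(module, name) /* no-op on non-WebAssembly targets */
  #endif

  /* Declared but not defined: expected to be provided by the runtime. */
  WASM_IMPORT("env", "host_log") void host_log(int code);

  #if !defined(__wasm__)
  /* Native stand-in so the sketch can be built and run without a host. */
  static int last_code;
  void host_log(int code) { last_code = code; }
  #endif

  /* Calls the imported function and passes the code through. */
  int report(int code) {
    host_log(code);
    return code;
  }

When compiled for WebAssembly, ``host_log`` is left undefined in the object
file, and the attributes tell the linker which module and field name to use
for the resulting import.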

Stub Libraries
~~~~~~~~~~~~~~

Another way to specify imports and exports is via a "stub library".  This
feature is inspired by the ELF stub objects which are supported by the Solaris
linker.  Stub libraries are text files that can be passed as normal linker
inputs, similar to how linker scripts can be passed to the ELF linker.  The stub
library is a stand-in for a set of symbols that will be available at runtime,
but doesn't contain any actual code or data.  Instead it contains just a list of
symbols, one per line.  Each symbol can specify zero or more dependencies.
These dependencies are symbols that must be defined, and exported, by the output
module if the symbol in question is imported/required by the output module.

For example, imagine the runtime provides an external symbol ``foo`` that
depends on ``malloc`` and ``free``.  This can be expressed simply as::

  #STUB
  foo: malloc,free

Here we are saying that ``foo`` is allowed to be imported (undefined) but that
if it is imported, then the output module must also export ``malloc`` and
``free`` to the runtime.  If ``foo`` is imported (undefined), but the output
module does not define ``malloc`` and ``free`` then the link will fail.

Stub libraries must begin with ``#STUB`` on a line by itself.

Garbage Collection
~~~~~~~~~~~~~~~~~~

Since WebAssembly is designed with size in mind the linker defaults to
``--gc-sections`` which means that all unused functions and data segments will
be stripped from the binary.

The symbols which are preserved by default are:

- The entry point (by default ``_start``).
- Any symbol which is to be exported.
- Any symbol transitively referenced by the above.

Weak Undefined Functions
~~~~~~~~~~~~~~~~~~~~~~~~

On native platforms, calls to weak undefined functions end up as calls to the
null function pointer.  With WebAssembly, direct calls must reference a defined
function (with the correct signature).  In order to handle this case the linker
will generate a stub function containing only the ``unreachable`` instruction
and use this for any direct references to an undefined weak function.

For example, a runtime call to a weak undefined function ``foo`` will end up
trapping on ``unreachable`` inside a linker-generated function called
``undefined:foo``.
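
The usual way for source code to avoid reaching such a stub is to test the
address of the weak function before calling it.  The sketch below assumes a
hypothetical optional function ``optional_hook`` that is declared weak and
never defined; the names are illustrative only:

.. code-block:: c

  #include <stddef.h>

  /* Hypothetical optional hook: declared weak, with no definition anywhere. */
  extern void optional_hook(void) __attribute__((weak));

  /* Returns 1 if the hook was present and called, 0 if it was undefined. */
  int call_hook_if_present(void) {
    /* The address of an undefined weak function compares equal to null. */
    if (optional_hook != NULL) {
      /* A direct call: with WebAssembly this is the kind of reference that
         the linker would point at a trapping stub when the hook is missing. */
      optional_hook();
      return 1;
    }
    return 0;
  }

Because the direct call is guarded by the address check, the trapping stub is
present in the output but should never be executed when the function is
missing.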

Missing features
----------------

- Merging of data sections similar to ``SHF_MERGE`` in the ELF world is not
  supported.
- No support for creating shared libraries.  The spec for shared libraries in
  WebAssembly is still in flux:
  https://github.com/WebAssembly/tool-conventions/blob/main/DynamicLinking.md

.. _linking: https://github.com/WebAssembly/tool-conventions/blob/main/Linking.md