WebAssembly lld port
====================

The WebAssembly version of lld takes WebAssembly binaries as inputs and produces
a WebAssembly binary as its output.  For the most part it tries to mimic the
behaviour of traditional ELF linkers and specifically the ELF lld port.  Where
possible the command line flags and the semantics should be the same.


Object file format
------------------

The WebAssembly object file format used by LLVM and LLD is specified as part of
the WebAssembly tool conventions on linking_.

This is the object format that LLVM produces when run with the
``wasm32-unknown-unknown`` target.

Usage
-----

The WebAssembly version of lld is installed as **wasm-ld**.  It shares many
common linker flags with **ld.lld** but also includes several
WebAssembly-specific options:

.. option:: --no-entry

  Don't search for the entry point symbol (by default ``_start``).

.. option:: --export-table

  Export the function table to the environment.

.. option:: --import-table

  Import the function table from the environment.

.. option:: --export-all

  Export all symbols (normally combined with ``--no-gc-sections``).

  Note that this will not export linker-generated mutable globals unless
  the resulting binary already includes the 'mutable-globals' feature,
  since that would otherwise create an invalid binary.

.. option:: --export-dynamic

  When building an executable, export any non-hidden symbols.  By default only
  the entry point and any symbols marked as exports (either via the command line
  or via the ``export_name`` source attribute) are exported.

.. option:: --global-base=<value>

  Address at which to place global data.

.. option:: --no-merge-data-segments

  Disable merging of data segments.

.. option:: --stack-first

  Place stack at start of linear memory rather than after data.

.. option:: --compress-relocations

  Relocation targets in the code section are 5 bytes wide in order to
  potentially accommodate the largest LEB128 value.  This option will cause the
  linker to shrink the code section to remove any padding from the final
  output.  However, because it affects code offsets, this option is not
  compatible with outputting debug information.

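  To make the padding concrete, the following C sketch encodes the same
  32-bit value twice: once as a fixed-width 5-byte unsigned LEB128 (the padded
  form described above) and once as a minimal LEB128 (the form that remains
  once padding is removed).  The function names are illustrative only and are
  not part of lld.

  .. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Minimal unsigned LEB128: emits only as many bytes as the value needs.
       Returns the number of bytes written (1 to 5 for a 32-bit value). */
    size_t encode_uleb128(uint32_t value, uint8_t *out) {
      size_t n = 0;
      do {
        uint8_t byte = (uint8_t)(value & 0x7f);
        value >>= 7;
        if (value != 0)
          byte |= 0x80; /* continuation bit: more bytes follow */
        out[n++] = byte;
      } while (value != 0);
      return n;
    }

    /* Padded unsigned LEB128: always emits exactly 5 bytes, so the field can
       later be patched with any 32-bit value without moving other bytes. */
    size_t encode_uleb128_padded5(uint32_t value, uint8_t *out) {
      for (size_t i = 0; i < 4; i++) {
        out[i] = (uint8_t)((value & 0x7f) | 0x80); /* forced continuation */
        value >>= 7;
      }
      out[4] = (uint8_t)(value & 0x7f); /* last byte, no continuation bit */
      return 5;
    }

  For the value 3, the padded form is the byte sequence ``83 80 80 80 00``
  while the minimal form is the single byte ``03``.  Dropping the four padding
  bytes shifts every later offset in the code section.
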
.. option:: --allow-undefined

  Allow undefined symbols in the linked binary.  This is the legacy
  flag which corresponds to ``--unresolved-symbols=ignore-all`` +
  ``--import-undefined``.

.. option:: --allow-undefined-file=<filename>

  Like ``--allow-undefined``, but the file specifies a flat list of
  symbols, one per line, which are allowed to be undefined.

.. option:: --unresolved-symbols=<method>

  This is a more full-featured version of ``--allow-undefined``.
  The semantics of the different methods are as follows:

  report-all:

     Report all unresolved symbols.  This is the default.  Normally the linker
     will generate an error message for each reported unresolved symbol but the
     option ``--warn-unresolved-symbols`` can change this to a warning.

  ignore-all:

     Resolve all undefined symbols to zero.  For data and function addresses
     this is trivial.  For direct function calls, the linker will generate a
     trapping stub function in place of the undefined function.

  import-dynamic:

     Undefined symbols generate WebAssembly imports, including undefined data
     symbols.  This is somewhat similar to the ``--import-undefined`` option but
     works for all symbol types.  This option puts limitations on the type of
     relocations that are allowed for imported data symbols.  Relocations that
     require absolute data addresses (i.e. all ``R_WASM_MEMORY_ADDR_I32``) will
     generate an error if they cannot be resolved statically.  For clang/llvm
     this means inputs should be compiled with ``-fPIC`` (i.e. ``pic`` or
     ``dynamic-no-pic`` relocation models).  This option is useful for linking
     binaries that are themselves static (non-relocatable) but whose undefined
     symbols are resolved by a dynamic linker.  Since the dynamic linking API is
     experimental, this option currently requires ``--experimental-pic`` to also
     be specified.

.. option:: --import-memory

  Import memory from the environment.

.. option:: --import-undefined

  Generate WebAssembly imports for undefined symbols, where possible.  For
  example, for function symbols this is always possible, but in general this
  is not possible for undefined data symbols.  Undefined data symbols will
  still be reported as normal (in accordance with ``--unresolved-symbols``).

.. option:: --initial-heap=<value>

  Initial size of the heap. Default: zero.

.. option:: --initial-memory=<value>

  Initial size of the linear memory. Default: the sum of stack, static data and heap sizes.

.. option:: --max-memory=<value>

  Maximum size of the linear memory. Default: unlimited.

.. option:: --no-growable-memory

  Set maximum size of the linear memory to its initial size, disallowing memory growth.

By default the function table is neither imported nor exported, but defined
for internal use only.

Behaviour
---------

In general, where possible, the WebAssembly linker attempts to emulate the
behaviour of a traditional ELF linker, and in particular the ELF port of lld.
For more specific details on how this is achieved see the tool conventions on
linking_.

Function Signatures
~~~~~~~~~~~~~~~~~~~

One way in which the WebAssembly linker differs from traditional native linkers
is that function signature checking is strict in WebAssembly.  It is a
validation error for a module to contain a call site that doesn't agree with
the target signature.  Even though this is undefined behaviour in C/C++, it is not
uncommon to find this in real-world C/C++ programs.  For example, a call site in
one compilation unit may call a function defined in another compilation
unit but with too many arguments.

In order not to generate such invalid modules, lld has two modes of handling such
mismatches: it can simply error out or it can create stub functions that will
trap at runtime (functions that contain only an ``unreachable`` instruction)
and use these stub functions at the otherwise invalid call sites.

The default behaviour is to generate these stub functions and to produce
a warning.  The ``--fatal-warnings`` flag can be used to disable this behaviour
and error out if mismatches are found.

Exports
~~~~~~~

When building a shared library any symbols marked as ``visibility=default`` will
be exported.

When building an executable, only the entry point (``_start``) and symbols with
the ``WASM_SYMBOL_EXPORTED`` flag are exported by default.  In LLVM the
``WASM_SYMBOL_EXPORTED`` flag is set by the ``wasm-export-name`` attribute which
in turn can be set using the ``__attribute__((export_name))`` clang attribute.

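As a minimal sketch of marking a symbol for export from source, the following
C file uses the clang attribute named above.  The ``WASM_EXPORT`` macro and the
function names ``add`` and ``helper`` are hypothetical and chosen only for
illustration.  The ``__wasm__`` guard is an assumption made here so that the
same file still builds with a native compiler, where the attribute has no
meaning.

.. code-block:: c

  /* Only apply the attribute when targeting WebAssembly. */
  #ifdef __wasm__
  #define WASM_EXPORT(name) __attribute__((export_name(name)))
  #else
  #define WASM_EXPORT(name)
  #endif

  /* Marked for export: appears in the output module under the name "add". */
  WASM_EXPORT("add")
  int add(int a, int b) {
    return a + b;
  }

  /* Not marked for export: exported only if requested by other means. */
  int helper(int x) {
    return add(x, x);
  }
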
In addition, symbols can be exported via the linker command line using
``--export`` (which will error if the symbol is not found) or
``--export-if-defined`` (which will not).

Finally, just as with a native ELF linker, the ``--export-dynamic`` flag can be
used to export symbols in the executable which are marked as
``visibility=default``.

Imports
~~~~~~~

By default no undefined symbols are allowed in the final binary.  The flag
``--allow-undefined`` results in a WebAssembly import being defined for each
undefined symbol.  It is then up to the runtime to provide such symbols.
``--allow-undefined-file`` is the same but allows a list of symbols to be
specified.

Alternatively symbols can be marked in the source code with the
``import_name`` and/or ``import_module`` clang attributes, which signal that
they are expected to be undefined at static link time.

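The source-level marking can be sketched in C as follows.  The
``WASM_IMPORT`` macro, the module name ``env``, and the functions ``host_log``
and ``report`` are hypothetical names used only for illustration.  The native
fallback definition is an assumption added so that the file can also be built
and run off-target; it plays no role in a WebAssembly build.

.. code-block:: c

  /* Only apply the attributes when targeting WebAssembly. */
  #ifdef __wasm__
  #define WASM_IMPORT(module, name) \
    __attribute__((import_module(module), import_name(name)))
  #else
  #define WASM_IMPORT(module, name)
  #endif

  /* Declared but not defined: expected to be supplied by the runtime as the
     import "host_log" from the module "env". */
  WASM_IMPORT("env", "host_log")
  void host_log(int code);

  #ifndef __wasm__
  /* Native stand-in for the runtime-provided function. */
  void host_log(int code) {
    (void)code;
  }
  #endif

  /* Calls the imported function and passes the code back to the caller. */
  int report(int code) {
    host_log(code);
    return code;
  }
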
Stub Libraries
~~~~~~~~~~~~~~

Another way to specify imports and exports is via a "stub library".  This
feature is inspired by the ELF stub objects which are supported by the Solaris
linker.  Stub libraries are text files that can be passed as normal linker
inputs, similar to how linker scripts can be passed to the ELF linker.  The stub
library is a stand-in for a set of symbols that will be available at runtime,
but doesn't contain any actual code or data.  Instead it contains just a list of
symbols, one per line.  Each symbol can specify zero or more dependencies.
These dependencies are symbols that must be defined, and exported, by the output
module if the symbol in question is imported/required by the output module.

For example, imagine the runtime provides an external symbol ``foo`` that
depends on ``malloc`` and ``free``.  This can be expressed simply as::

  #STUB
  foo: malloc,free

Here we are saying that ``foo`` is allowed to be imported (undefined) but that
if it is imported, then the output module must also export ``malloc`` and
``free`` to the runtime.  If ``foo`` is imported (undefined), but the output
module does not define ``malloc`` and ``free`` then the link will fail.

Stub libraries must begin with ``#STUB`` on a line by itself.

Garbage Collection
~~~~~~~~~~~~~~~~~~

Since WebAssembly is designed with size in mind the linker defaults to
``--gc-sections`` which means that all unused functions and data segments will
be stripped from the binary.

The symbols which are preserved by default are:

- The entry point (by default ``_start``).
- Any symbol which is to be exported.
- Any symbol transitively referenced by the above.

Weak Undefined Functions
~~~~~~~~~~~~~~~~~~~~~~~~

On native platforms, calls to weak undefined functions end up as calls to the
null function pointer.  With WebAssembly, direct calls must reference a defined
function (with the correct signature).  In order to handle this case the linker
will generate a stub function containing only the ``unreachable`` instruction
and use this for any direct references to an undefined weak function.

For example a runtime call to a weak undefined function ``foo`` will end up
trapping on ``unreachable`` inside a linker-generated function called
``undefined:foo``.

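A common way for source code to avoid reaching such a stub is to test the
address of the weak function before calling it.  The sketch below assumes that
the address of an undefined weak function compares equal to null, as it does on
ELF targets; that assumption is not stated above.  The names ``optional_hook``
and ``run_hook_if_present`` are hypothetical.

.. code-block:: c

  /* Weak declaration with no definition anywhere in the link. */
  __attribute__((weak)) extern void optional_hook(void);

  /* Calls the hook only when a definition was linked in.  Returns 1 if the
     hook ran and 0 if it was absent, so the direct call (and therefore the
     trapping stub) is never reached for an undefined hook. */
  int run_hook_if_present(void) {
    if (optional_hook) {
      optional_hook();
      return 1;
    }
    return 0;
  }
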
Missing features
----------------

- Merging of data sections similar to ``SHF_MERGE`` in the ELF world is not
  supported.
- No support for creating shared libraries.  The spec for shared libraries in
  WebAssembly is still in flux:
  https://github.com/WebAssembly/tool-conventions/blob/main/DynamicLinking.md

.. _linking: https://github.com/WebAssembly/tool-conventions/blob/main/Linking.md