WebAssembly lld port
====================

The WebAssembly version of lld takes WebAssembly binaries as inputs and produces
a WebAssembly binary as its output. For the most part it tries to mimic the
behaviour of traditional ELF linkers and specifically the ELF lld port. Where
possible the command line flags and the semantics should be the same.


Object file format
------------------

The WebAssembly object file format used by LLVM and LLD is specified as part of
the WebAssembly tool conventions on linking_.

This is the object format that LLVM will produce when run with the
``wasm32-unknown-unknown`` target.

Usage
-----

The WebAssembly version of lld is installed as **wasm-ld**. It shares many
common linker flags with **ld.lld** but also includes several
WebAssembly-specific options:

.. option:: --no-entry

  Don't search for the entry point symbol (by default ``_start``).

.. option:: --export-table

  Export the function table to the environment.

.. option:: --import-table

  Import the function table from the environment.

.. option:: --export-all

  Export all symbols (normally combined with ``--no-gc-sections``).

  Note that this will not export linker-generated mutable globals unless
  the resulting binary already includes the 'mutable-globals' feature,
  since that would otherwise create an invalid binary.

.. option:: --export-dynamic

  When building an executable, export any non-hidden symbols. By default only
  the entry point and any symbols marked as exports (either via the command line
  or via the ``export-name`` source attribute) are exported.

.. option:: --global-base=<value>

  Address at which to place global data.

.. option:: --no-merge-data-segments

  Disable merging of data segments.

.. option:: --stack-first

  Place stack at start of linear memory rather than after data.

.. option:: --compress-relocations

  Relocation targets in the code section are 5 bytes wide in order to
  potentially accommodate the largest LEB128 value. This option will cause the
  linker to shrink the code section to remove any padding from the final
  output. However, because it affects code offsets, this option is not
  compatible with outputting debug information.

.. option:: --allow-undefined

  Allow undefined symbols in the linked binary. This is the legacy
  flag which corresponds to ``--unresolved-symbols=ignore-all`` +
  ``--import-undefined``.

.. option:: --allow-undefined-file=<filename>

  Like ``--allow-undefined``, but the file specifies a flat list of
  symbols, one per line, which are allowed to be undefined.

.. option:: --unresolved-symbols=<method>

  This is a more full-featured version of ``--allow-undefined``.
  The semantics of the different methods are as follows:

  report-all:

     Report all unresolved symbols. This is the default. Normally the linker
     will generate an error message for each reported unresolved symbol but the
     option ``--warn-unresolved-symbols`` can change this to a warning.

  ignore-all:

     Resolve all undefined symbols to zero. For data and function addresses
     this is trivial. For direct function calls, the linker will generate a
     trapping stub function in place of the undefined function.

  import-dynamic:

     Undefined symbols generate WebAssembly imports, including undefined data
     symbols. This is somewhat similar to the ``--import-undefined`` option but
     works for all symbol types. This option puts limitations on the type of
     relocations that are allowed for imported data symbols. Relocations that
     require absolute data addresses (i.e. all ``R_WASM_MEMORY_ADDR_I32``) will
     generate an error if they cannot be resolved statically. For clang/llvm
     this means inputs should be compiled with ``-fPIC`` (i.e. ``pic`` or
     ``dynamic-no-pic`` relocation models). This option is useful for linking
     binaries that are themselves static (non-relocatable) but whose undefined
     symbols are resolved by a dynamic linker. Since the dynamic linking API is
     experimental, this option currently requires ``--experimental-pic`` to also
     be specified.

.. option:: --import-memory

  Import memory from the environment.

.. option:: --import-undefined

  Generate WebAssembly imports for undefined symbols, where possible. For
  example, for function symbols this is always possible, but in general this
  is not possible for undefined data symbols. Undefined data symbols will
  still be reported as normal (in accordance with ``--unresolved-symbols``).

.. option:: --initial-heap=<value>

  Initial size of the heap. Default: zero.

.. option:: --initial-memory=<value>

  Initial size of the linear memory. Default: the sum of stack, static data and
  heap sizes.

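
  As a worked illustration of that default, the following sketch adds the three
  sizes and rounds the total up to a whole number of 64 KiB WebAssembly pages.
  The rounding step and the helper names ``round_up_to_page`` and
  ``default_initial_memory`` are assumptions made for illustration. They are
  not taken from wasm-ld's implementation.

  .. code-block:: c

     #include <stdint.h>

     /* WebAssembly linear memory is sized in pages of 64 KiB. */
     #define WASM_PAGE_SIZE 65536u

     /* Hypothetical helper: round a byte count up to a whole page. */
     static uint64_t round_up_to_page(uint64_t bytes) {
       return (bytes + WASM_PAGE_SIZE - 1) / WASM_PAGE_SIZE * WASM_PAGE_SIZE;
     }

     /* Hypothetical helper: sum of stack, static data and heap sizes,
        rounded up to a page boundary (the rounding is an assumption). */
     uint64_t default_initial_memory(uint64_t stack, uint64_t data,
                                     uint64_t heap) {
       return round_up_to_page(stack + data + heap);
     }

  Under these assumptions, a 65536-byte stack, 1000 bytes of static data and no
  heap give 131072 bytes, that is, two pages.
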
.. option:: --max-memory=<value>

  Maximum size of the linear memory. Default: unlimited.

.. option:: --no-growable-memory

  Set maximum size of the linear memory to its initial size, disallowing memory
  growth.

By default the function table is neither imported nor exported, but defined
for internal use only.

Behaviour
---------

In general, where possible, the WebAssembly linker attempts to emulate the
behaviour of a traditional ELF linker, and in particular the ELF port of lld.
For more specific details on how this is achieved see the tool conventions on
linking_.

Function Signatures
~~~~~~~~~~~~~~~~~~~

One way in which the WebAssembly linker differs from traditional native linkers
is that function signature checking is strict in WebAssembly. It is a
validation error for a module to contain a call site that doesn't agree with
the target signature. Even though this is undefined behaviour in C/C++, it is not
uncommon to find this in real-world C/C++ programs. For example, a call site in
one compilation unit may call a function defined in another compilation unit
but pass too many arguments.

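
A minimal sketch of such a mismatch is shown below. The names ``add``,
``call_add`` and the ``CALLER_UNIT`` macro are hypothetical and serve only to
keep the two compilation units in one listing. Compiling the file once as is
and once with ``-DCALLER_UNIT`` yields two objects that each compile cleanly on
their own, yet disagree about the signature of ``add``.

.. code-block:: c

   #ifdef CALLER_UNIT
   /* Caller unit: a stale declaration with one argument too many. */
   int add(int a, int b, int c);

   int call_add(void) { return add(1, 2, 3); }
   #else
   /* Defining unit: the real function takes two arguments. */
   int add(int a, int b) { return a + b; }
   #endif

Neither unit sees the other's view of ``add``, so the disagreement only becomes
visible when the two objects are linked together.
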
In order not to generate such invalid modules, lld has two modes of handling such
mismatches: it can simply error out or it can create stub functions that will
trap at runtime (functions that contain only an ``unreachable`` instruction)
and use these stub functions at the otherwise invalid call sites.

The default behaviour is to generate these stub functions and to produce
a warning. The ``--fatal-warnings`` flag can be used to disable this behaviour
and error out if mismatches are found.

Exports
~~~~~~~

When building a shared library any symbols marked as ``visibility=default`` will
be exported.

When building an executable, only the entry point (``_start``) and symbols with
the ``WASM_SYMBOL_EXPORTED`` flag are exported by default. In LLVM the
``WASM_SYMBOL_EXPORTED`` flag is set by the ``wasm-export-name`` attribute which
in turn can be set using the ``__attribute__((export_name))`` clang attribute.

In addition, symbols can be exported via the linker command line using
``--export`` (which will error if the symbol is not found) or
``--export-if-defined`` (which will not).

Finally, just as with a native ELF linker, the ``--export-dynamic`` flag can be
used to export symbols in the executable which are marked as
``visibility=default``.

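
The export rules above can be sketched in source form as follows. The function
names and the ``WASM_EXPORT`` macro are hypothetical. The macro exists only so
that the file also compiles on non-WebAssembly targets, where the
``export_name`` attribute is not meaningful.

.. code-block:: c

   /* Expand to the clang attribute only when targeting WebAssembly. */
   #if defined(__wasm__)
   #define WASM_EXPORT(name) __attribute__((export_name(name)))
   #else
   #define WASM_EXPORT(name)
   #endif

   /* Marked with export_name: exported from an executable by default. */
   WASM_EXPORT("add")
   int add(int a, int b) { return a + b; }

   /* Default visibility: exported from an executable when linking with
      --export-dynamic, or when named via --export / --export-if-defined. */
   __attribute__((visibility("default")))
   int twice(int x) { return add(x, x); }

   /* Hidden visibility: not picked up by --export-dynamic. */
   __attribute__((visibility("hidden")))
   int helper(int x) { return x + 1; }
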
Imports
~~~~~~~

By default no undefined symbols are allowed in the final binary. The flag
``--allow-undefined`` results in a WebAssembly import being defined for each
undefined symbol. It is then up to the runtime to provide such symbols.
``--allow-undefined-file`` is the same but allows a list of symbols to be
specified.

Alternatively, symbols can be marked in the source code with the
``import_name`` and/or ``import_module`` clang attributes, which signal that
they are expected to be undefined at static link time.

Stub Libraries
~~~~~~~~~~~~~~

Another way to specify imports and exports is via a "stub library". This
feature is inspired by the ELF stub objects which are supported by the Solaris
linker. Stub libraries are text files that can be passed as normal linker
inputs, similar to how linker scripts can be passed to the ELF linker. The stub
library is a stand-in for a set of symbols that will be available at runtime,
but doesn't contain any actual code or data. Instead it contains just a list of
symbols, one per line. Each symbol can specify zero or more dependencies.
These dependencies are symbols that must be defined, and exported, by the output
module if the symbol in question is imported/required by the output module.

For example, imagine the runtime provides an external symbol ``foo`` that
depends on ``malloc`` and ``free``. This can be expressed simply as::

  #STUB
  foo: malloc,free

Here we are saying that ``foo`` is allowed to be imported (undefined) but that
if it is imported, then the output module must also export ``malloc`` and
``free`` to the runtime. If ``foo`` is imported (undefined), but the output
module does not define ``malloc`` and ``free`` then the link will fail.

Stub libraries must begin with ``#STUB`` on a line by itself.

Garbage Collection
~~~~~~~~~~~~~~~~~~

Since WebAssembly is designed with size in mind the linker defaults to
``--gc-sections`` which means that all unused functions and data segments will
be stripped from the binary.

The symbols which are preserved by default are:

- The entry point (by default ``_start``).
- Any symbol which is to be exported.
- Any symbol transitively referenced by the above.

Weak Undefined Functions
~~~~~~~~~~~~~~~~~~~~~~~~

On native platforms, calls to weak undefined functions end up as calls to the
null function pointer.
With WebAssembly, direct calls must reference a defined
function (with the correct signature). In order to handle this case the linker
will generate a stub function containing only the ``unreachable`` instruction
and use this for any direct references to an undefined weak function.

For example, a runtime call to a weak undefined function ``foo`` will end up
trapping on ``unreachable`` inside a linker-generated function called
``undefined:foo``.

Missing features
----------------

- Merging of data sections similar to ``SHF_MERGE`` in the ELF world is not
  supported.
- No support for creating shared libraries. The spec for shared libraries in
  WebAssembly is still in flux:
  https://github.com/WebAssembly/tool-conventions/blob/main/DynamicLinking.md

.. _linking: https://github.com/WebAssembly/tool-conventions/blob/main/Linking.md