.. SPDX-License-Identifier: GPL-2.0-only OR MIT
.. Copyright (C) 2025 TNG Technology Consulting GmbH

KernelSbom
==========

Introduction
------------

KernelSbom is a Python script ``scripts/sbom/sbom.py`` that can be
executed after a successful kernel build. When invoked, KernelSbom
analyzes all files involved in the build and generates Software Bill of
Materials (SBOM) documents in SPDX 3.0.1 format.
The generated SBOM documents capture:

* **Final output artifacts**, typically the kernel image and modules
* **All source files** that contributed to the build with metadata
  and licensing information
* **Details of the build process**, including intermediate artifacts
  and the build commands linking source files to the final output
  artifacts

KernelSbom is originally developed in the
`KernelSbom repository <https://github.com/TNG/KernelSbom>`_.

Requirements
------------

Python 3.10 or later. No libraries or other dependencies are required.

Basic Usage
-----------

Run the ``make sbom`` target.
For example::

    $ make defconfig O=kernel_build
    $ make sbom O=kernel_build -j$(nproc)

This will trigger a kernel build. After all build outputs have been
generated, KernelSbom produces three SPDX documents in the root
directory of the object tree:

* ``sbom-source.spdx.json``
  Describes all source files involved in the build and
  associates each file with its corresponding license expression.

* ``sbom-output.spdx.json``
  Captures all final build outputs (kernel image and ``.ko`` module files)
  and includes build metadata such as environment variables and
  a hash of the ``.config`` file used for the build.

* ``sbom-build.spdx.json``
  Imports files from the source and output documents and describes every
  intermediate build artifact.
  For each artifact, it records the exact
  build command used and establishes the relationship between
  input files and generated outputs.

When invoking the sbom target, it is recommended to perform
out-of-tree builds using ``O=<objtree>``. KernelSbom classifies files as
source files when they are located in the source tree and not in the
object tree. For in-tree builds, where the source and object trees are
the same directory, this distinction can no longer be made reliably.
In that case, KernelSbom does not generate a dedicated source SBOM.
Instead, source files are included in the build SBOM.

Standalone Usage
----------------

KernelSbom can also be used as a standalone script to generate
SPDX documents for specific build outputs. For example, after a
successful x86 kernel build, KernelSbom can generate SPDX documents
for the ``bzImage`` kernel image::

    $ SRCARCH=x86 python3 scripts/sbom/sbom.py \
        --src-tree . \
        --obj-tree ./kernel_build \
        --roots arch/x86/boot/bzImage \
        --generate-spdx \
        --generate-used-files \
        --prettify-json \
        --debug

Note that when KernelSbom is invoked outside of the ``make`` process,
the environment variables used during compilation are not available and
therefore cannot be included in the generated SPDX documents. It is
recommended to set at least the ``SRCARCH`` environment variable to the
architecture for which the build was performed.

For a full list of command-line options, run::

    $ python3 scripts/sbom/sbom.py --help

Output Format
-------------

KernelSbom generates documents conforming to the
`SPDX 3.0.1 specification <https://spdx.github.io/spdx-spec/v3.0.1/>`_
serialized as JSON-LD.

To reduce file size, the output documents use the JSON-LD ``@context``
to define custom prefixes for ``spdxId`` values.
While this is compliant
with the SPDX specification, only a limited number of tools in the
current SPDX ecosystem support custom JSON-LD contexts. To use such
tools with the generated documents, the custom JSON-LD context must
be expanded before providing the documents.
See https://lists.spdx.org/g/Spdx-tech/message/6064 for more information.

How it Works
------------

KernelSbom operates in two major phases:

1. **Generate the cmd graph**, a directed acyclic dependency graph.
2. **Generate SPDX documents** based on the cmd graph.

KernelSbom begins from the root artifacts specified by the user, e.g.,
``arch/x86/boot/bzImage``. For each root artifact, it collects all
dependencies required to build that artifact. The dependencies come
from multiple sources:

* **.cmd files**: The primary source is the ``.cmd`` file of the
  generated artifact, e.g., ``arch/x86/boot/.bzImage.cmd``. These files
  contain the exact command used to build the artifact and often include
  an explicit list of input dependencies. By parsing the ``.cmd``
  file, the full list of dependencies can be obtained.

* **.incbin statements**: The second source is the ``.incbin``
  (include binary) statements in ``.S`` assembly files.

* **Hardcoded dependencies**: Unfortunately, not all build dependencies
  can be found via ``.cmd`` files and ``.incbin`` statements. Some build
  dependencies are directly defined in Makefiles or Kbuild files.
  Parsing these files is considered too complex for the scope of this
  project. Instead, the remaining gaps of the graph are filled using a
  list of manually defined dependencies, see
  ``scripts/sbom/sbom/cmd_graph/hardcoded_dependencies.py``. This list is
  known to be incomplete. However, analysis of the cmd graph indicates
  roughly 99% completeness.
  For more information about the completeness analysis,
  see `KernelSbom #95 <https://github.com/TNG/KernelSbom/issues/95>`_.

Given the list of dependency files, KernelSbom recursively processes
each file, expanding the dependency chain all the way to the
version-controlled source files. The result is a complete dependency graph
where nodes represent files, and edges represent "file A was used to
build file B" relationships.

Using the cmd graph, KernelSbom produces three SPDX documents.
For every file in the graph, KernelSbom:

* Parses ``SPDX-License-Identifier`` headers,
* Computes file hashes,
* Estimates the file type based on extension and path,
* Records build relationships between files.

Each root output file is additionally associated with an SPDX Package
element that captures version information, license data, and copyright.

Advanced Usage
--------------

Including Kernel Modules
~~~~~~~~~~~~~~~~~~~~~~~~

The list of all ``.ko`` kernel modules produced during a build can be
extracted from the ``modules.order`` file within the object tree.
For example::

    $ echo "arch/x86/boot/bzImage" > sbom-roots.txt
    $ sed 's/\.o$/.ko/' ./kernel_build/modules.order >> sbom-roots.txt

Then use the generated roots file::

    $ SRCARCH=x86 python3 scripts/sbom/sbom.py \
        --src-tree . \
        --obj-tree ./kernel_build \
        --roots-file sbom-roots.txt \
        --generate-spdx

Equal Source and Object Trees
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When the source tree and object tree are identical (for example, when
building in-tree), source files can no longer be reliably distinguished
from generated files.
In this scenario, KernelSbom does not produce a dedicated
``sbom-source.spdx.json`` document.
Instead, both source files and build
artifacts are included together in ``sbom-build.spdx.json``, and
``sbom.used-files.txt`` lists all files referenced in the build document.

Unknown Build Commands
~~~~~~~~~~~~~~~~~~~~~~

Because the kernel supports a wide range of configurations and versions,
KernelSbom may encounter build commands in ``.cmd`` files that it does
not yet support. By default, KernelSbom will fail if an unknown build
command is encountered.

If you still wish to generate SPDX documents despite unsupported
commands, you can use the ``--do-not-fail-on-unknown-build-command``
option. KernelSbom will continue and produce the documents, although
the resulting SBOM will be incomplete.

This option should only be used when the missing portion of the
dependency graph is small and an incomplete SBOM is acceptable for
your use case.
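
As a minimal sketch, a standalone run that tolerates unknown build
commands might look like the following. It assumes an x86 build located
in ``./kernel_build`` and combines only options that appear elsewhere in
this document; the paths and the root artifact are placeholders to adapt
to your own build::

    $ SRCARCH=x86 python3 scripts/sbom/sbom.py \
        --src-tree . \
        --obj-tree ./kernel_build \
        --roots arch/x86/boot/bzImage \
        --generate-spdx \
        --do-not-fail-on-unknown-build-command

Running the same command without the last option first is a reasonable
way to see whether the build contains unsupported commands at all.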