.. SPDX-License-Identifier: GPL-2.0-only OR MIT
.. Copyright (C) 2025 TNG Technology Consulting GmbH

KernelSbom
==========

Introduction
------------

KernelSbom is a Python script, ``scripts/sbom/sbom.py``, that can be
executed after a successful kernel build. When invoked, KernelSbom
analyzes all files involved in the build and generates Software Bill of
Materials (SBOM) documents in SPDX 3.0.1 format.
The generated SBOM documents capture:

* **Final output artifacts**, typically the kernel image and modules
* **All source files** that contributed to the build, along with
  metadata and licensing information
* **Details of the build process**, including intermediate artifacts
  and the build commands linking source files to the final output
  artifacts

KernelSbom was originally developed in the
`KernelSbom repository <https://github.com/TNG/KernelSbom>`_.

Requirements
------------

KernelSbom requires Python 3.10 or later. No additional libraries or
other dependencies are needed.

Basic Usage
-----------

Run the ``make sbom`` target.
For example::

    $ make defconfig O=kernel_build
    $ make sbom O=kernel_build -j$(nproc)

This will trigger a kernel build. After all build outputs have been
generated, KernelSbom produces three SPDX documents in the root
directory of the object tree:

* ``sbom-source.spdx.json``
  Describes all source files involved in the build and
  associates each file with its corresponding license expression.

* ``sbom-output.spdx.json``
  Captures all final build outputs (kernel image and ``.ko`` module files)
  and includes build metadata such as environment variables and
  a hash of the ``.config`` file used for the build.

* ``sbom-build.spdx.json``
  Imports files from the source and output documents and describes every
  intermediate build artifact. For each artifact, it records the exact
  build command used and establishes the relationship between
  input files and generated outputs.

When invoking the sbom target, it is recommended to perform
out-of-tree builds using ``O=<objtree>``. KernelSbom classifies files as
source files when they are located in the source tree and not in the
object tree. For in-tree builds, where the source and object trees are
the same directory, this distinction can no longer be made reliably.
In that case, KernelSbom does not generate a dedicated source SBOM.
Instead, source files are included in the build SBOM.
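
The classification rule described above can be sketched in a few lines
of Python. This is only an illustration of the rule, not KernelSbom's
actual code; the function name ``is_source_file`` is hypothetical.

```python
from pathlib import Path


def is_source_file(path: str, src_tree: str, obj_tree: str) -> bool:
    """Return True if path lies inside src_tree but not inside obj_tree.

    Hypothetical helper that mirrors the rule in the text; it is not
    taken from the KernelSbom sources.
    """
    file_path = Path(path).resolve()
    in_src = file_path.is_relative_to(Path(src_tree).resolve())
    in_obj = file_path.is_relative_to(Path(obj_tree).resolve())
    return in_src and not in_obj
```

When both trees are the same directory, this check returns ``False``
for every file, which illustrates why a separate source SBOM cannot be
derived for in-tree builds.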

Standalone Usage
----------------

KernelSbom can also be used as a standalone script to generate
SPDX documents for specific build outputs. For example, after a
successful x86 kernel build, KernelSbom can generate SPDX documents
for the ``bzImage`` kernel image::

    $ SRCARCH=x86 python3 scripts/sbom/sbom.py \
        --src-tree . \
        --obj-tree ./kernel_build \
        --roots arch/x86/boot/bzImage \
        --generate-spdx \
        --generate-used-files \
        --prettify-json \
        --debug

Note that when KernelSbom is invoked outside of the ``make`` process,
the environment variables used during compilation are not available and
therefore cannot be included in the generated SPDX documents. It is
recommended to set at least the ``SRCARCH`` environment variable to the
architecture for which the build was performed.

For a full list of command-line options, run::

    $ python3 scripts/sbom/sbom.py --help

Output Format
-------------

KernelSbom generates documents conforming to the
`SPDX 3.0.1 specification <https://spdx.github.io/spdx-spec/v3.0.1/>`_
serialized as JSON-LD.

To reduce file size, the output documents use the JSON-LD ``@context``
to define custom prefixes for ``spdxId`` values. While this is compliant
with the SPDX specification, only a limited number of tools in the
current SPDX ecosystem support custom JSON-LD contexts. To use such
tools with the generated documents, the custom JSON-LD context must
be expanded before providing the documents.
See https://lists.spdx.org/g/Spdx-tech/message/6064 for more information.
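
One possible way to expand the custom prefixes is sketched below. It
rests on assumptions that are not confirmed by this document: that
``@context`` is a list (or single value) whose dictionary entries map a
prefix to an IRI base, and that compacted identifiers have the form
``prefix:suffix``. The function names are hypothetical. A full JSON-LD
processor would be the more robust choice.

```python
def collect_prefixes(entries: list) -> dict[str, str]:
    """Gather prefix -> IRI base mappings from JSON-LD context entries."""
    prefixes: dict[str, str] = {}
    for entry in entries:
        if isinstance(entry, dict):
            for key, value in entry.items():
                # Assumption: plain string values on non-keyword keys are prefixes.
                if isinstance(value, str) and not key.startswith("@"):
                    prefixes[key] = value
    return prefixes


def expand(value, prefixes: dict[str, str]):
    """Recursively replace 'prefix:suffix' strings with full IRIs.

    Simplification: every string value is inspected, so free text that
    happens to start with a known prefix would be rewritten as well.
    """
    if isinstance(value, dict):
        return {key: expand(item, prefixes) for key, item in value.items()}
    if isinstance(value, list):
        return [expand(item, prefixes) for item in value]
    if isinstance(value, str):
        prefix, sep, suffix = value.partition(":")
        if sep and prefix in prefixes:
            return prefixes[prefix] + suffix
    return value


def expand_document(doc: dict) -> dict:
    """Return a copy of doc with custom prefixes expanded and removed."""
    context = doc.get("@context", [])
    entries = context if isinstance(context, list) else [context]
    prefixes = collect_prefixes(entries)
    # Keep only non-dictionary context entries, such as a context URL.
    remaining = [entry for entry in entries if not isinstance(entry, dict)]
    result = {"@context": remaining[0] if len(remaining) == 1 else remaining}
    for key, value in doc.items():
        if key != "@context":
            result[key] = expand(value, prefixes)
    return result
```

A document loaded with ``json.load()`` can be passed to
``expand_document()`` and written back with ``json.dump()`` before it is
handed to a tool that does not support custom contexts.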

How it Works
------------

KernelSbom operates in two major phases:

1. **Generate the cmd graph**, a directed acyclic dependency graph.
2. **Generate SPDX documents** based on the cmd graph.

KernelSbom begins from the root artifacts specified by the user, e.g.,
``arch/x86/boot/bzImage``. For each root artifact, it collects all
dependencies required to build that artifact. The dependencies come
from multiple sources:

* **.cmd files**: The primary source is the ``.cmd`` file of the
  generated artifact, e.g., ``arch/x86/boot/.bzImage.cmd``. These files
  contain the exact command used to build the artifact and often include
  an explicit list of input dependencies. Parsing the ``.cmd`` file
  yields the full list of these dependencies.

* **.incbin statements**: The second source is ``.incbin`` (include
  binary) statements in ``.S`` assembly files.

* **Hardcoded dependencies**: Unfortunately, not all build dependencies
  can be found via ``.cmd`` files and ``.incbin`` statements. Some build
  dependencies are directly defined in Makefiles or Kbuild files.
  Parsing these files is considered too complex for the scope of this
  project. Instead, the remaining gaps of the graph are filled using a
  list of manually defined dependencies, see
  ``scripts/sbom/sbom/cmd_graph/hardcoded_dependencies.py``. This list is
  known to be incomplete. However, analysis of the cmd graph indicates
  roughly 99% completeness. For more information about the completeness
  analysis, see
  `KernelSbom #95 <https://github.com/TNG/KernelSbom/issues/95>`_.

Given the list of dependency files, KernelSbom recursively processes
each file, expanding the dependency chain all the way to the
version-controlled source files. The result is a complete dependency graph
where nodes represent files, and edges represent "file A was used to
build file B" relationships.
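
The expansion described above can be sketched as a graph traversal.
The sketch uses an explicit stack instead of recursion, and both
``build_cmd_graph`` and the ``get_dependencies`` callback are
hypothetical names; the callback stands in for whatever combination of
``.cmd`` parsing, ``.incbin`` scanning and hardcoded dependencies
supplies the inputs of a file.

```python
from collections.abc import Callable, Iterable


def build_cmd_graph(
    roots: Iterable[str],
    get_dependencies: Callable[[str], list[str]],
) -> dict[str, list[str]]:
    """Map every reachable file to the list of files used to build it."""
    graph: dict[str, list[str]] = {}
    stack = list(roots)
    while stack:
        path = stack.pop()
        if path in graph:
            continue  # already expanded; shared dependencies are visited once
        deps = get_dependencies(path)
        graph[path] = deps
        stack.extend(deps)
    return graph


# Toy example with made-up file names; real graphs contain many more nodes.
toy_inputs = {
    "bzImage": ["vmlinux.bin", "setup.bin"],
    "vmlinux.bin": ["main.o"],
    "main.o": ["main.c", "init.h"],
    "setup.bin": ["header.S", "init.h"],
}
toy_graph = build_cmd_graph(["bzImage"], lambda path: toy_inputs.get(path, []))
# Files without inputs are the leaves of the graph, i.e. source files.
toy_sources = sorted(path for path, deps in toy_graph.items() if not deps)
```

In this toy graph, the leaves are ``header.S``, ``init.h`` and
``main.c``, and ``init.h`` is expanded only once even though two
artifacts depend on it.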

Using the cmd graph, KernelSbom produces three SPDX documents.
For every file in the graph, KernelSbom:

* Parses ``SPDX-License-Identifier`` headers,
* Computes file hashes,
* Estimates the file type based on extension and path,
* Records build relationships between files.

Each root output file is additionally associated with an SPDX Package
element that captures version information, license data, and copyright.

Advanced Usage
--------------

Including Kernel Modules
~~~~~~~~~~~~~~~~~~~~~~~~

The list of all ``.ko`` kernel modules produced during a build can be
extracted from the ``modules.order`` file within the object tree.
For example::

    $ echo "arch/x86/boot/bzImage" > sbom-roots.txt
    $ sed 's/\.o$/.ko/' ./kernel_build/modules.order >> sbom-roots.txt

Then use the generated roots file::

    $ SRCARCH=x86 python3 scripts/sbom/sbom.py \
        --src-tree . \
        --obj-tree ./kernel_build \
        --roots-file sbom-roots.txt \
        --generate-spdx

Equal Source and Object Trees
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When the source tree and object tree are identical (for example, when
building in-tree), source files can no longer be reliably distinguished
from generated files.
In this scenario, KernelSbom does not produce a dedicated
``sbom-source.spdx.json`` document. Instead, both source files and build
artifacts are included together in ``sbom-build.spdx.json``, and
``sbom.used-files.txt`` lists all files referenced in the build document.

Unknown Build Commands
~~~~~~~~~~~~~~~~~~~~~~

Because the kernel supports a wide range of configurations and versions,
KernelSbom may encounter build commands in ``.cmd`` files that it does
not yet support. By default, KernelSbom will fail if an unknown build
command is encountered.

If you still wish to generate SPDX documents despite unsupported
commands, you can use the ``--do-not-fail-on-unknown-build-command``
option. KernelSbom will continue and produce the documents, although
the resulting SBOM will be incomplete.

This option should only be used when the missing portion of the
dependency graph is small and an incomplete SBOM is acceptable for
your use case.