=======================
DWARF module versioning
=======================

Introduction
============

When CONFIG_MODVERSIONS is enabled, symbol versions for modules
are typically calculated from preprocessed source code using the
**genksyms** tool.  However, this is incompatible with languages such
as Rust, where the source code has insufficient information about
the resulting ABI. With CONFIG_GENDWARFKSYMS (and CONFIG_DEBUG_INFO)
selected, **gendwarfksyms** is used instead to calculate symbol versions
from the DWARF debugging information, which contains the necessary
details about the final module ABI.

Dependencies
------------

gendwarfksyms depends on the libelf, libdw, and zlib libraries.

Here are a few examples of how to install these dependencies:

* Arch Linux and derivatives::

	sudo pacman --needed -S libelf zlib

* Debian, Ubuntu, and derivatives::

	sudo apt install libelf-dev libdw-dev zlib1g-dev

* Fedora and derivatives::

	sudo dnf install elfutils-libelf-devel elfutils-devel zlib-devel

* openSUSE and derivatives::

	sudo zypper install libelf-devel libdw-devel zlib-devel

Usage
-----

gendwarfksyms accepts a list of object files on the command line, and a
list of symbol names (one per line) on standard input::

	Usage: gendwarfksyms [options] elf-object-file ... < symbol-list

	Options:
	  -d, --debug          Print debugging information
	      --dump-dies      Dump DWARF DIE contents
	      --dump-die-map   Print debugging information about die_map changes
	      --dump-types     Dump type strings
	      --dump-versions  Dump expanded type strings used for symbol versions
	  -s, --stable         Support kABI stability features
	  -T, --symtypes file  Write a symtypes file
	  -h, --help           Print this message


Type information availability
=============================

While symbols are typically exported in the same translation unit (TU)
where they're defined, it's also perfectly fine for a TU to export
external symbols. For example, this is done when calculating symbol
versions for exports in stand-alone assembly code.

To ensure the compiler emits the necessary DWARF type information in the
TU where symbols are actually exported, gendwarfksyms adds a pointer
to exported symbols in the `EXPORT_SYMBOL()` macro using the following
macro::

	#define __GENDWARFKSYMS_EXPORT(sym)				\
		static typeof(sym) *__gendwarfksyms_ptr_##sym __used	\
			__section(".discard.gendwarfksyms") = &sym;


When a symbol pointer is found in DWARF, gendwarfksyms can use its
type for calculating symbol versions even if the symbol is defined
elsewhere. The name of the symbol pointer is expected to start with
`__gendwarfksyms_ptr_`, followed by the name of the exported symbol.
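
As a rough illustration of this naming convention, the following C sketch
recovers the exported symbol name from the name of a symbol pointer. The
helper `symbol_from_ptr_name()` is hypothetical and is not taken from the
gendwarfksyms sources; only the `__gendwarfksyms_ptr_` prefix comes from
the description above.

```c
#include <stddef.h>
#include <string.h>

/* Prefix that symbol pointer names are expected to start with. */
#define GENDWARFKSYMS_PTR_PREFIX "__gendwarfksyms_ptr_"

/*
 * Hypothetical helper: return the exported symbol name encoded in a
 * symbol pointer's name, or NULL if the name lacks the prefix or has
 * nothing after it.
 */
static const char *symbol_from_ptr_name(const char *name)
{
	const size_t len = strlen(GENDWARFKSYMS_PTR_PREFIX);

	if (strncmp(name, GENDWARFKSYMS_PTR_PREFIX, len) != 0)
		return NULL;
	return name[len] != '\0' ? name + len : NULL;
}
```

For a pointer named `__gendwarfksyms_ptr_my_symbol`, the helper returns
`my_symbol`; names without the prefix yield NULL.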

Symtypes output format
======================

Similarly to genksyms, gendwarfksyms supports writing a symtypes
file for each processed object that contains types for exported
symbols and each referenced type that was used in calculating symbol
versions. These files can be useful when trying to determine what
exactly caused symbol versions to change between builds. To generate
symtypes files during a kernel build, set `KBUILD_SYMTYPES=1`.

Matching the existing format, the first column of each line contains
either a type reference or a symbol name. Type references have a
one-letter prefix followed by "#" and the name of the type. Four
reference types are supported::

	e#<type> = enum
	s#<type> = struct
	t#<type> = typedef
	u#<type> = union

Type names with spaces in them are wrapped in single quotes, e.g.::

	s#'core::result::Result<u8, core::num::error::ParseIntError>'

The rest of the line contains a type string. Unlike genksyms, which
produces C-style type strings, gendwarfksyms uses the same simple parsed
DWARF format produced by **--dump-dies**, but with type references
instead of fully expanded strings.
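
The type reference syntax described in this section can be sketched in C
as follows. The function `format_type_ref()` is a hypothetical helper
written for this illustration and does not reflect how gendwarfksyms
itself builds the strings; it only encodes the prefix and quoting rules
stated above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a symtypes type reference into buf.
 * kind is one of 'e', 's', 't' or 'u'. Returns the snprintf() result,
 * i.e. the length of the full reference excluding the terminator.
 */
static int format_type_ref(char *buf, size_t size, char kind,
			   const char *name)
{
	/* Names containing spaces are wrapped in single quotes. */
	const char *quote = strchr(name, ' ') ? "'" : "";

	return snprintf(buf, size, "%c#%s%s%s", kind, quote, name, quote);
}
```

Calling it with kind `s` and a name without spaces, such as the
illustrative `list_head`, produces `s#list_head`, while a name that
contains a space is emitted in single quotes.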

Maintaining a stable kABI
=========================

Distribution maintainers often need the ability to make ABI-compatible
changes to kernel data structures due to LTS updates or backports. Using
the traditional `#ifndef __GENKSYMS__` to hide these changes from symbol
versioning won't work when processing object files. To support this
use case, gendwarfksyms provides kABI stability features designed to
hide changes that won't affect the ABI when calculating versions. These
features are all gated behind the **--stable** command line flag and are
not used in the mainline kernel. To use stable features during a kernel
build, set `KBUILD_GENDWARFKSYMS_STABLE=1`.

Examples for using these features are provided in the
**scripts/gendwarfksyms/examples** directory, including helper macros
for source code annotation. Note that as these features are only used to
transform the inputs for symbol versioning, the user is responsible for
ensuring that their changes actually won't break the ABI.

kABI rules
----------

kABI rules allow distributions to fine-tune certain parts
of gendwarfksyms output and thus control how symbol
versions are calculated. These rules are defined in the
`.discard.gendwarfksyms.kabi_rules` section of the object file and
consist of simple null-terminated strings with the following structure::

	version\0type\0target\0value\0

This string sequence is repeated as many times as needed to express all
the rules. The fields are as follows:

- `version`: Ensures backward compatibility for future changes to the
  structure. Currently expected to be "1".
- `type`: Indicates the type of rule being applied.
- `target`: Specifies the target of the rule, typically the fully
  qualified name of the DWARF Debugging Information Entry (DIE).
- `value`: Provides rule-specific data.

The following helper macros, for example, can be used to specify rules
in the source code::

	#define ___KABI_RULE(hint, target, value)                            \
		static const char __PASTE(__gendwarfksyms_rule_,             \
					  __COUNTER__)[] __used __aligned(1) \
			__section(".discard.gendwarfksyms.kabi_rules") =     \
				"1\0" #hint "\0" target "\0" value

	#define __KABI_RULE(hint, target, value) \
		___KABI_RULE(hint, #target, #value)


Currently, only the rules discussed in this section are supported, but
the format is extensible enough to allow further rules to be added as
the need arises.
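
To make the record layout concrete, here is a minimal C sketch that walks
a buffer holding the contents of the rules section and splits it into the
four fields of each rule. The names `struct kabi_rule` and
`next_kabi_rule()` are assumptions made for this example; this is not the
parser gendwarfksyms uses, only one way to read the documented format.

```c
#include <stddef.h>
#include <string.h>

/* One decoded rule; the pointers refer into the section buffer. */
struct kabi_rule {
	const char *version;
	const char *type;
	const char *target;
	const char *value;
};

/*
 * Hypothetical helper: decode the record starting at *pos in a buffer
 * of len bytes and advance *pos past it. Returns 1 on success and 0 if
 * the data is exhausted or a field is not null-terminated.
 */
static int next_kabi_rule(const char *buf, size_t len, size_t *pos,
			  struct kabi_rule *rule)
{
	const char **fields[] = { &rule->version, &rule->type,
				  &rule->target, &rule->value };
	size_t p = *pos;

	for (size_t i = 0; i < 4; i++) {
		const char *end;

		if (p >= len)
			return 0;
		end = memchr(buf + p, '\0', len - p);
		if (!end)
			return 0;
		*fields[i] = buf + p;
		p = (size_t)(end - buf) + 1;
	}
	*pos = p;
	return 1;
}
```

Calling `next_kabi_rule()` in a loop until it returns 0 visits every rule
in the buffer. A caller would presumably also check that `version` is
"1" before interpreting the remaining fields.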

Managing definition visibility
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A declaration can change into a full definition when additional includes
are pulled into the translation unit. This changes the versions of any
symbol that references the type even if the ABI remains unchanged. As
it may not be possible to drop includes without breaking the build, the
`declonly` rule can be used to specify a type as declaration-only, even
if the debugging information contains the full definition.

The rule fields are expected to be as follows:

- `type`: "declonly"
- `target`: The fully qualified name of the target data structure
  (as shown in **--dump-dies** output).
- `value`: This field is ignored.

Using the `__KABI_RULE` macro, this rule can be defined as::

	#define KABI_DECLONLY(fqn) __KABI_RULE(declonly, fqn, )

Example usage::

	struct s {
		/* definition */
	};

	KABI_DECLONLY(s);

Adding enumerators
~~~~~~~~~~~~~~~~~~

For enums, all enumerators and their values are included in calculating
symbol versions, which becomes a problem if we later need to add more
enumerators without changing symbol versions. The `enumerator_ignore`
rule allows us to hide named enumerators from the input.

The rule fields are expected to be as follows:

- `type`: "enumerator_ignore"
- `target`: The fully qualified name of the target enum
  (as shown in **--dump-dies** output) and the name of the
  enumerator field separated by a space.
- `value`: This field is ignored.

Using the `__KABI_RULE` macro, this rule can be defined as::

	#define KABI_ENUMERATOR_IGNORE(fqn, field) \
		__KABI_RULE(enumerator_ignore, fqn field, )

Example usage::

	enum e {
		A, B, C, D,
	};

	KABI_ENUMERATOR_IGNORE(e, B);
	KABI_ENUMERATOR_IGNORE(e, C);
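
The space-separated target comes from the way the preprocessor
stringifies `fqn field`. The following userspace sketch shows the bytes
such a rule would contain. `RULE_BYTES`, `ENUMERATOR_IGNORE_BYTES` and
`rule_field()` are stand-ins invented for this example: they mimic the
string construction of `__KABI_RULE` but leave out the section placement
and attributes, so they are not the kernel macros themselves.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace stand-in for __KABI_RULE: builds only the rule bytes and
 * omits the kernel's section and attribute annotations.
 */
#define RULE_BYTES(hint, target, value) \
	"1\0" #hint "\0" #target "\0" #value

/* Stand-in for KABI_ENUMERATOR_IGNORE with the same argument shape. */
#define ENUMERATOR_IGNORE_BYTES(fqn, field) \
	RULE_BYTES(enumerator_ignore, fqn field, )

/* Stringifying "e B" yields the target "e B"; the value is empty. */
static const char ignore_b[] = ENUMERATOR_IGNORE_BYTES(e, B);

/* Hypothetical helper: return field idx (0..3) of a single rule. */
static const char *rule_field(const char *rule, int idx)
{
	while (idx-- > 0)
		rule += strlen(rule) + 1;
	return rule;
}
```

Here `rule_field(ignore_b, 2)` points at the target string `e B`, and the
empty third macro argument leaves the value field as an empty string.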

If the enum additionally includes an end marker and new values must
be added in the middle, we may need to use the old value for the last
enumerator when calculating versions. The `enumerator_value` rule allows
us to override the value of an enumerator for version calculation:

- `type`: "enumerator_value"
- `target`: The fully qualified name of the target enum
  (as shown in **--dump-dies** output) and the name of the
  enumerator field separated by a space.
- `value`: Integer value used for the field.

Using the `__KABI_RULE` macro, this rule can be defined as::

	#define KABI_ENUMERATOR_VALUE(fqn, field, value) \
		__KABI_RULE(enumerator_value, fqn field, value)

Example usage::

	enum e {
		A, B, C, LAST,
	};

	KABI_ENUMERATOR_IGNORE(e, C);
	KABI_ENUMERATOR_VALUE(e, LAST, 2);
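
To see why the value 2 appears in this example, it can help to compare
the enum before and after the change. The sketch below assumes the enum
originally had only `A`, `B` and `LAST`; the `OLD_` and `NEW_` prefixes
are added here solely so that both versions can coexist in one file.

```c
#include <assert.h>

/* Assumed original enum: the end marker has the value 2. */
enum e_old { OLD_A, OLD_B, OLD_LAST };

/* After inserting C before the end marker, LAST moves to 3. */
enum e_new { NEW_A, NEW_B, NEW_C, NEW_LAST };

static_assert(OLD_LAST == 2, "value the original versions were based on");
static_assert(NEW_LAST == 3, "value after inserting C");
static_assert((int)NEW_A == (int)OLD_A && (int)NEW_B == (int)OLD_B,
	      "existing enumerators keep their values");
```

Ignoring `C` and overriding `LAST` back to 2 makes the versioning input
match the assumed original enum again.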

Managing structure size changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A data structure can be partially opaque to modules if its allocation is
handled by the core kernel, and modules only need to access some of its
members. In this situation, it's possible to append new members to the
structure without breaking the ABI, as long as the layout for the original
members remains unchanged.

To append new members, we can hide them from symbol versioning as
described in section :ref:`Hiding members <hiding_members>`, but we can't
hide the increase in structure size. The `byte_size` rule allows us to
override the structure size used for symbol versioning.

The rule fields are expected to be as follows:

- `type`: "byte_size"
- `target`: The fully qualified name of the target data structure
  (as shown in **--dump-dies** output).
- `value`: A positive decimal number indicating the structure size
  in bytes.

Using the `__KABI_RULE` macro, this rule can be defined as::

	#define KABI_BYTE_SIZE(fqn, value) \
		__KABI_RULE(byte_size, fqn, value)

Example usage::

	struct s {
		/* Unchanged original members */
		unsigned long a;
		void *p;

		/* Appended new members */
		KABI_IGNORE(0, unsigned long n);
	};

	KABI_BYTE_SIZE(s, 16);
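
The size passed to `KABI_BYTE_SIZE` is the size of the structure before
the new member was appended. The userspace sketch below compares the two
layouts; `struct s_orig` and `struct s_new` are names made up for this
illustration, and the new member is written as a plain field rather than
through `KABI_IGNORE`. The value 16 corresponds to a target where both
`unsigned long` and pointers are 8 bytes, which is an assumption about
the example rather than something the rule requires.

```c
#include <assert.h>
#include <stddef.h>

/* Layout before the change. */
struct s_orig {
	unsigned long a;
	void *p;
};

/* Layout after appending a member. */
struct s_new {
	unsigned long a;
	void *p;
	unsigned long n;
};

/* The original members must stay where they were. */
static_assert(offsetof(struct s_new, a) == offsetof(struct s_orig, a),
	      "a moved");
static_assert(offsetof(struct s_new, p) == offsetof(struct s_orig, p),
	      "p moved");

/* The appended member starts where the original structure ended. */
static_assert(offsetof(struct s_new, n) == sizeof(struct s_orig),
	      "n is not appended at the end");
```

On such a 64-bit target `sizeof(struct s_orig)` is 16, which is the
number the rule reports in place of the larger new size.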

Overriding type strings
~~~~~~~~~~~~~~~~~~~~~~~

In rare situations where distributions must make significant changes to
otherwise opaque data structures that have inadvertently been included
in the published ABI, keeping symbol versions stable using the more
targeted kABI rules can become tedious. The `type_string` rule allows us
to override the full type string for a type or a symbol, and even add
types for versioning that no longer exist in the kernel.

The rule fields are expected to be as follows:

- `type`: "type_string"
- `target`: The fully qualified name of the target data structure
  (as shown in **--dump-dies** output) or symbol.
- `value`: A valid type string (as shown in **--symtypes** output)
  to use instead of the real type.

Using the `__KABI_RULE` macro, this rule can be defined as::

	#define KABI_TYPE_STRING(type, str) \
		___KABI_RULE("type_string", type, str)

Example usage::

	/* Override type for a structure */
	KABI_TYPE_STRING("s#s",
		"structure_type s { "
			"member base_type int byte_size(4) "
				"encoding(5) n "
			"data_member_location(0) "
		"} byte_size(8)");

	/* Override type for a symbol */
	KABI_TYPE_STRING("my_symbol", "variable s#s");

The `type_string` rule should be used only as a last resort if maintaining
stable symbol versions cannot be reasonably achieved using other
means. Overriding a type string increases the risk of actual ABI breakages
going unnoticed as it hides all changes to the type.

Adding structure members
------------------------

Perhaps the most common ABI-compatible change is adding a member to a
kernel data structure. When changes to a structure are anticipated,
distribution maintainers can pre-emptively reserve space in the
structure and take it into use later without breaking the ABI. If
changes are needed to data structures without reserved space, existing
alignment holes can potentially be used instead. While kABI rules could
be added for these types of changes, using unions is typically a more
natural method. This section describes gendwarfksyms support for using
reserved space in data structures and hiding members that don't change
the ABI when calculating symbol versions.

Reserving space and replacing members
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Space is typically reserved for later use by appending integer types, or
arrays, to the end of the data structure, but any type can be used. Each
reserved member needs a unique name, but as the actual purpose is usually
not known at the time the space is reserved, for convenience, names that
start with `__kabi_` are left out when calculating symbol versions::

	struct s {
		long a;
		long __kabi_reserved_0; /* reserved for future use */
	};

The reserved space can be taken into use by wrapping the member in a
union, which includes the original type and the replacement member::

	struct s {
		long a;
		union {
			long __kabi_reserved_0; /* original type */
			struct b b; /* replaced field */
		};
	};

If the `__kabi_` naming scheme was used when reserving space, the name
of the first member of the union must start with `__kabi_reserved`. This
ensures the original type is used when calculating versions, but the name
is again left out. The rest of the union is ignored.

If we're replacing a member that doesn't follow this naming convention,
we also need to preserve the original name to avoid changing versions,
which we can do by changing the first union member's name to start with
`__kabi_renamed` followed by the original name.

The examples include `KABI_(RESERVE|USE|REPLACE)*` macros that help
simplify the process and also ensure the replacement member is correctly
aligned and its size won't exceed the reserved space.
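
The size and alignment conditions can also be checked by hand. The
following userspace sketch uses C11 static assertions for that purpose;
it is not a copy of the `KABI_*` macros from the examples directory, and
the contents of `struct b` are invented for the illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical replacement type; it has to fit in one long. */
struct b {
	int x;
};

struct s {
	long a;
	union {
		long __kabi_reserved_0; /* original type, listed first */
		struct b b;             /* replacement member */
	};
};

/* The replacement must not outgrow or out-align the reserved slot. */
static_assert(sizeof(struct b) <= sizeof(long),
	      "struct b exceeds the reserved space");
static_assert(alignof(struct b) <= alignof(long),
	      "struct b needs stricter alignment than the reserved slot");

/* The structure is therefore as large as before the replacement. */
static_assert(sizeof(struct s) == 2 * sizeof(long),
	      "struct s changed size");
```

If `struct b` later grows beyond the reserved `long`, the first assertion
fails at compile time instead of silently changing the layout.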

.. _hiding_members:

Hiding members
~~~~~~~~~~~~~~

Predicting which structures will require changes during the support
timeframe isn't always possible, in which case one might have to resort
to placing new members into existing alignment holes::

	struct s {
		int a;
		/* a 4-byte alignment hole */
		unsigned long b;
	};


While this won't change the size of the data structure, one needs to
be able to hide the added members from symbol versioning. Similarly
to reserved fields, this can be accomplished by wrapping the added
member in a union where one of the fields has a name starting with
`__kabi_ignored`::

	struct s {
		int a;
		union {
			char __kabi_ignored_0;
			int n;
		};
		unsigned long b;
	};

With **--stable**, both versions produce the same symbol version. The
examples include a `KABI_IGNORE` macro to simplify the code.
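
Whether a hole is actually available depends on the target. The sketch
below compares the two layouts in userspace; `struct s_before`,
`struct s_after` and `fills_existing_hole()` are names chosen for this
example only. It assumes a target where `unsigned long` is 8 bytes and
8-byte aligned, as on common 64-bit architectures. On targets without
the hole, the helper returns 0.

```c
#include <stddef.h>

/* Layout before the change; the hole follows a on 64-bit targets. */
struct s_before {
	int a;
	unsigned long b;
};

/* Layout with the new member placed where the hole was. */
struct s_after {
	int a;
	union {
		char __kabi_ignored_0;
		int n;
	};
	unsigned long b;
};

/*
 * Hypothetical check: returns 1 if adding n left both the structure
 * size and the offset of b unchanged, and 0 otherwise.
 */
static int fills_existing_hole(void)
{
	return sizeof(struct s_after) == sizeof(struct s_before) &&
	       offsetof(struct s_after, b) == offsetof(struct s_before, b);
}
```

A result of 0 would mean that the new member moved `b` or enlarged the
structure, so hiding it from symbol versioning would mask a real ABI
change.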
419