.\"-
.\" Copyright (c) 2021 The FreeBSD Foundation
.\"
.\" This documentation was written by Mark Johnston under sponsorship from
.\" the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd August 10, 2021
.Dt KMSAN 9
.Os
.Sh NAME
.Nm KMSAN
.Nd Kernel Memory SANitizer
.Sh SYNOPSIS
The
.Pa GENERIC-KMSAN
kernel configuration can be used to compile a KMSAN-enabled kernel using
.Pa GENERIC
as a base configuration.
Alternately, to compile KMSAN into the kernel, place the following line in your
kernel configuration file:
.Bd -ragged -offset indent
.Cd "options KMSAN"
.Ed
.Pp
.In sys/msan.h
.Ft void
.Fn kmsan_mark "const void *addr" "size_t size" "uint8_t code"
.Ft void
.Fn kmsan_orig "const void *addr" "size_t size" "int type" "uintptr_t pc"
.Ft void
.Fn kmsan_check "const void *addr" "size_t size" "const char *descr"
.Ft void
.Fn kmsan_check_bio "const struct bio *" "const char *descr"
.Ft void
.Fn kmsan_check_ccb "const union ccb *" "const char *descr"
.Ft void
.Fn kmsan_check_mbuf "const struct mbuf *" "const char *descr"
.Sh DESCRIPTION
.Nm
is a subsystem which leverages compiler instrumentation to detect uses of
uninitialized memory in the kernel.
Currently it is implemented only on the amd64 platform.
.Pp
When
.Nm
is compiled into the kernel, the compiler is configured to emit function
calls preceding memory accesses.
The functions are implemented by the
.Nm
runtime component and use hidden, byte-granular shadow state to determine
whether the source operand has been initialized.
When uninitialized memory is used as a source operand in certain operations,
such as control flow expressions or memory accesses, the runtime reports
an error.
Otherwise, the shadow state is propagated to the destination operand.
For example, a
variable assignment or a
.Fn memcpy
call which copies uninitialized memory will cause the destination buffer or
variable to be marked uninitialized.
.Pp
To report an error, the
.Nm
runtime will either trigger a kernel panic or print a message to the console,
depending on the value of the
.Sy debug.kmsan.panic_on_violation
sysctl.
In both cases, a stack trace and information about the origin of the
uninitialized memory are included.
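.Pp
As a brief illustration, the current reporting mode can be inspected with
.Xr sysctl 8 :
.Bd -literal -offset indent
sysctl debug.kmsan.panic_on_violation
.Ed
.Pp
Whether the variable can be changed at runtime or only as a boot-time
tunable is not stated here and should be verified on the target system.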
.Pp
In addition to compiler-detected uses of uninitialized memory,
various kernel I/O
.Dq exit points ,
such as
.Xr copyout 9 ,
perform validation of the input's shadow state and will raise an error if
any uninitialized bytes are detected.
.Pp
The
.Nm
option imposes a significant performance penalty.
Kernel code typically runs two or three times slower, and each byte mapped in
the kernel map requires two bytes of shadow state.
As a result,
.Nm
should be used only for kernel testing and development.
It is not recommended to enable
.Nm
in systems with less than 8GB of physical RAM.
.Sh FUNCTIONS
The
.Fn kmsan_mark
and
.Fn kmsan_orig
functions update
.Nm
shadow state.
.Fn kmsan_mark
marks an address range as valid or invalid according to the value of the
.Va code
parameter.
The valid values for this parameter are
.Dv KMSAN_STATE_INITED
and
.Dv KMSAN_STATE_UNINIT ,
which mark the range as initialized and uninitialized, respectively.
For example, when a piece of memory is freed to a kernel allocator, it will
typically have been marked initialized; before the memory is reused for a new
allocation, the allocator should mark it as uninitialized.
As another example, writes to host memory performed by devices, e.g., via DMA,
are not intercepted by the sanitizer; to avoid false positives, drivers should
mark device-written memory as initialized.
For many drivers this is handled internally by the
.Xr busdma 9
subsystem.
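.Pp
As a minimal sketch of the allocator case, a hypothetical object cache might
reset shadow state when it recycles a buffer.
The
.Vt struct toy_cache
type, its fields and the
.Fn toy_cache_get
function are invented for illustration and do not correspond to any real
kernel allocator; locking is omitted.
.Bd -literal -offset indent
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/msan.h>

/* Hypothetical single-slot cache; not a real kernel interface. */
struct toy_cache {
	void	*tc_slot;	/* cached free buffer, or NULL */
	size_t	 tc_size;	/* size of each buffer */
};

static void *
toy_cache_get(struct toy_cache *tc)
{
	void *p;

	p = tc->tc_slot;
	if (p == NULL)
		return (NULL);
	tc->tc_slot = NULL;

	/*
	 * The previous owner most likely initialized this memory.
	 * Mark it uninitialized so that stale contents are not
	 * treated as valid data belonging to the new owner.
	 */
	kmsan_mark(p, tc->tc_size, KMSAN_STATE_UNINIT);
	return (p);
}
.Ed
.Pp
A driver that cannot rely on
.Xr busdma 9
could, in the same spirit, call
.Fn kmsan_mark
with
.Dv KMSAN_STATE_INITED
on a receive buffer once the device has finished writing to it.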
.Pp
The
.Fn kmsan_orig
function updates
.Dq origin
shadow state.
In particular, it associates a given uninitialized buffer with a memory type
and code address.
This is used by the
.Nm
runtime to track the source of uninitialized memory and is only for debugging
purposes.
See
.Sx IMPLEMENTATION NOTES
for more details.
.Pp
The
.Fn kmsan_check
function and its sub-typed siblings validate the shadow state of the region(s)
of kernel memory passed as input parameters.
If any byte of the input is marked as uninitialized, the runtime will generate
a report.
These functions are useful during debugging, as they can be strategically
inserted into code paths to narrow down the source of uninitialized memory.
They are also used to perform validation in various kernel I/O paths, helping
ensure that, for example, packets transmitted over a network do not contain
uninitialized kernel memory.
.Fn kmsan_check
and related functions also take a
.Fa descr
parameter which is inserted into any reports raised by the check.
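.Pp
For instance, a check can be placed immediately before data leaves a
subsystem.
The following sketch assumes a hypothetical
.Vt struct toy_reply
and a hypothetical
.Fn toy_send_reply
function, neither of which exists in the kernel:
.Bd -literal -offset indent
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/msan.h>

/* Hypothetical reply structure, for illustration only. */
struct toy_reply {
	uint32_t	tr_status;
	uint16_t	tr_len;
	/* A 2-byte hole is here. */
};

static int
toy_send_reply(const struct toy_reply *tr, void *uaddr)
{
	/*
	 * Report any uninitialized byte in the reply, including
	 * padding.  The description string appears in the report.
	 */
	kmsan_check(tr, sizeof(*tr), "toy_send_reply");
	return (copyout(tr, uaddr, sizeof(*tr)));
}
.Ed
.Pp
Because
.Xr copyout 9
validates its input as well, the explicit check is redundant in production
code; its value during debugging is that it can be moved earlier in the
code path to find where the uninitialized bytes first appear.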
.Sh IMPLEMENTATION NOTES
.Ss Shadow Maps
The
.Nm
runtime makes use of two shadows of the kernel map.
Each address in the kernel map has a linear mapping to addresses in the
two shadows.
The first, simply called the shadow map, tracks the state of the corresponding
kernel memory.
A non-zero byte in the shadow map indicates that the corresponding byte of
kernel memory is uninitialized.
The
.Nm
instrumentation automatically propagates shadow state as the contents of kernel
memory are transformed and copied.
.Pp
The second shadow is called the origin map, and exists only to help debug
reports from the sanitizer.
To avoid false positives,
.Nm
does not raise reports for certain operations on uninitialized memory, such
as copying or arithmetic.
Thus, operations on uninitialized state which raise a report may be far removed
from the source of the bug, complicating debugging.
The origin map contains information which can help pinpoint the root cause of
a particular
.Nm
report; when generating a report, the runtime uses state from the origin map
to provide extra details.
.Pp
Unlike the shadow map, the origin map is not byte-granular, but consists of 4-byte
.Dq cells .
Each cell describes the corresponding four bytes of mapped kernel memory and
holds a type and compressed code address.
When kernel memory is allocated for some purpose, its origin is initialized
either by the compiler instrumentation or by runtime hooks in the allocator.
The type indicates the specific allocator, e.g.,
.Xr uma 9 ,
and the address provides the location in the kernel code where the memory was
allocated.
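.Pp
The relationship between the two maps can be modeled in user space.
The sketch below is not the kernel's implementation: the array sizes, the
.Fn toy_*
names and the use of a plain integer as an origin value are assumptions
chosen for illustration, and ranges are assumed to be non-empty and in
bounds.
.Bd -literal -offset indent
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	MEM_SIZE	64	/* toy "kernel map" size, in bytes */

static uint8_t	mem[MEM_SIZE];
static uint8_t	shadow[MEM_SIZE];	/* non-zero: uninitialized */
static uint32_t	origin[MEM_SIZE / 4];	/* one cell per 4 bytes */

/* Model an allocation: poison the range and record its origin. */
static void
toy_alloc(size_t off, size_t len, uint32_t orig)
{
	size_t i;

	memset(&shadow[off], 0xff, len);
	for (i = off / 4; i <= (off + len - 1) / 4; i++)
		origin[i] = orig;
}

/* Model a store: the written bytes become initialized. */
static void
toy_store(size_t off, const void *src, size_t len)
{
	memcpy(&mem[off], src, len);
	memset(&shadow[off], 0, len);
}

/* Model a copy: data and shadow state move together, no report. */
static void
toy_copy(size_t dst, size_t src, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		mem[dst + i] = mem[src + i];
		shadow[dst + i] = shadow[src + i];
		if (shadow[src + i] != 0)
			origin[(dst + i) / 4] = origin[(src + i) / 4];
	}
}

/* Model a check: return the origin of the first bad byte, or 0. */
static uint32_t
toy_check(size_t off, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (shadow[off + i] != 0)
			return (origin[(off + i) / 4]);
	return (0);
}
.Ed
.Pp
In this model, copying a partially initialized range does not produce a
report; only a later
.Fn toy_check
on the destination does, and the origin recorded at allocation time
identifies where the offending bytes came from.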
.Ss Assembly Code
When
.Nm
is configured, the compiler will only emit instrumentation for C code.
Files containing assembly code are left un-instrumented.
In some cases this is handled by the sanitizer runtime, which defines
wrappers for subroutines implemented in assembly.
These wrappers are referred to as interceptors and handle updating
shadow state to reflect the operations performed by the original
subroutines.
In other cases, C code which calls assembly code or is called from
assembly code may need to use
.Fn kmsan_mark
to manually update shadow state.
This is typically only necessary in machine-dependent code.
.Pp
Inline assembly is instrumented by the compiler to update shadow state
based on the output operands of the code, and thus does not usually
require any special handling to avoid false positives.
.Ss Interrupts and Exceptions
In addition to the shadow maps, the sanitizer requires some thread-local
storage (TLS) to track initialization and origin state for function
parameters and return values.
The sanitizer instrumentation will automatically fetch, update and
verify this state.
In particular, this storage block has a layout defined by the sanitizer
ABI.
.Pp
Most kernel code runs in a context where interrupts or exceptions may
redirect the CPU to begin execution of unrelated code.
To ensure that thread-local sanitizer state remains consistent, the
runtime maintains a stack of TLS blocks for each thread.
When machine-dependent interrupt and exception handlers begin execution,
they push a new entry onto the stack before calling into any C code, and
pop the stack before resuming execution of the interrupted code.
These operations are performed by the
.Fn kmsan_intr_enter
and
.Fn kmsan_intr_leave
functions in the sanitizer runtime.
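.Pp
The push and pop discipline can be sketched in user space as follows.
Everything here is an assumption made for illustration: the
.Vt struct toy_tls
layout, the nesting limit and the choice to zero a freshly pushed block
are not taken from the real runtime.
.Bd -literal -offset indent
#include <stddef.h>
#include <string.h>

#define	TOY_TLS_DEPTH	4	/* assumed maximum nesting level */

/* Stand-in for the ABI-defined block; the real layout differs. */
struct toy_tls {
	unsigned char	param_shadow[64];
	unsigned char	retval_shadow[64];
};

struct toy_thread {
	struct toy_tls	tls[TOY_TLS_DEPTH];
	size_t		depth;		/* index of the active block */
};

/* Block used by instrumented code in the current context. */
static struct toy_tls *
toy_tls_current(struct toy_thread *td)
{
	return (&td->tls[td->depth]);
}

/* Handler entry: switch to a fresh block; -1 means overflow. */
static int
toy_intr_enter(struct toy_thread *td)
{
	if (td->depth + 1 >= TOY_TLS_DEPTH)
		return (-1);
	td->depth++;
	memset(toy_tls_current(td), 0, sizeof(struct toy_tls));
	return (0);
}

/* Handler exit: resume using the interrupted context's block. */
static void
toy_intr_leave(struct toy_thread *td)
{
	if (td->depth > 0)
		td->depth--;
}
.Ed
.Pp
Because the handler works on its own block, whatever parameter and return
value state the interrupted code had stored is still intact when it
resumes.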
.Sh EXAMPLES
The following contrived example demonstrates some of the types of bugs
that are automatically detected by
.Nm :
.Bd -literal -offset indent
int
f(size_t osz, void *uaddr)
{
	struct {
		uint32_t bar;
		uint16_t baz;
		/* A 2-byte hole is here. */
	} foo;
	char *buf;
	size_t sz;
	int error, i;

	/*
	 * This will raise a report since "sz" is uninitialized
	 * here.  If it is initialized, and "osz" was left uninitialized
	 * by the caller, a report would also be raised.
	 */
	if (sz < osz)
		return (1);

	buf = malloc(32, M_TEMP, M_WAITOK);

	/*
	 * This will raise a report since "buf" has not been
	 * initialized and contains whatever data is left over from the
	 * previous use of that memory.
	 */
	foo.bar = 0;
	for (i = 0; i < 32; i++)
		if (buf[i] != '\e0')
			foo.bar++;
	foo.baz = 0;

	/*
	 * This will raise a report since the pad bytes in "foo" have
	 * not been initialized, e.g., by memset(), and this call will
	 * thus copy uninitialized kernel stack memory into userspace.
	 */
	copyout(&foo, uaddr, sizeof(foo));

	/*
	 * This line itself will not raise a report, but may trigger
	 * a report in the caller depending on how the return value is
	 * used.
	 */
	return (error);
}
.Ed
.Sh SEE ALSO
.Xr build 7 ,
.Xr busdma 9 ,
.Xr copyout 9 ,
.Xr KASAN 9 ,
.Xr uma 9
.Rs
.%A Evgeniy Stepanov
.%A Konstantin Serebryany
.%T MemorySanitizer: fast detector of uninitialized memory use in C++
.%J 2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)
.%D 2015
.Re
.Sh HISTORY
.Nm
was ported from
.Nx
and first appeared in
.Fx 14.0 .
.Sh BUGS
Accesses to kernel memory outside of the kernel map are ignored by the
.Nm
runtime.
In particular, memory accesses via the direct map are not validated.
When memory is copied from outside the kernel map into the kernel map,
that region of the kernel map is marked as initialized.
When
.Nm
is configured, kernel memory allocators are configured to use the kernel map,
and filesystems are configured to always map data buffers into the kernel map,
so usage of the direct map is minimized.
However, some uses of the direct map remain.
This is a conservative policy which aims to avoid false positives, but it will
mask bugs in some kernel subsystems.
.Pp
On amd64, global variables and the physical page array
.Va vm_page_array
are not sanitized.
This is intentional, as it reduces memory usage by avoiding creating
shadows of large regions of the kernel map.
However, this can allow bugs to go undetected by
.Nm .
.Pp
Some kernel memory allocators provide type-stable objects, and code which uses
them frequently depends on object data being preserved across allocations.
Such allocations cannot be sanitized by
.Nm .
However, in some cases it may be possible to use
.Fn kmsan_mark
to manually annotate fields which are known to contain invalid data upon
allocation.
353