.. SPDX-License-Identifier: GPL-2.0

.. include:: ../disclaimer-zh_CN.rst

:Original: Documentation/dev-tools/kmsan.rst
:Translator: 刘浩阳 Haoyang Liu <tttturtleruss@hust.edu.cn>

===============================
Kernel Memory Sanitizer (KMSAN)
===============================

KMSAN is a dynamic error detector aimed at finding uses of uninitialized
values. It is based on compiler instrumentation and is similar to the
userspace `MemorySanitizer tool`_.

Note that KMSAN is not intended for production use, because it drastically
increases the kernel memory footprint and slows the whole system down.

Usage
=====

Building the kernel
-------------------

To build a kernel with KMSAN you need a recent Clang (14.0.6+).
Please refer to the `LLVM documentation`_ for instructions on how to build
Clang.

Now configure and build a kernel with CONFIG_KMSAN enabled.

Example report
--------------

Here is an example of a KMSAN report::

  =====================================================
  BUG: KMSAN: uninit-value in test_uninit_kmsan_check_memory+0x1be/0x380 [kmsan_test]
   test_uninit_kmsan_check_memory+0x1be/0x380 mm/kmsan/kmsan_test.c:273
   kunit_run_case_internal lib/kunit/test.c:333
   kunit_try_run_case+0x206/0x420 lib/kunit/test.c:374
   kunit_generic_run_threadfn_adapter+0x6d/0xc0 lib/kunit/try-catch.c:28
   kthread+0x721/0x850 kernel/kthread.c:327
   ret_from_fork+0x1f/0x30 ??:?

  Uninit was stored to memory at:
   do_uninit_local_array+0xfa/0x110 mm/kmsan/kmsan_test.c:260
   test_uninit_kmsan_check_memory+0x1a2/0x380 mm/kmsan/kmsan_test.c:271
   kunit_run_case_internal lib/kunit/test.c:333
   kunit_try_run_case+0x206/0x420 lib/kunit/test.c:374
   kunit_generic_run_threadfn_adapter+0x6d/0xc0 lib/kunit/try-catch.c:28
   kthread+0x721/0x850 kernel/kthread.c:327
   ret_from_fork+0x1f/0x30 ??:?

  Local variable uninit created at:
   do_uninit_local_array+0x4a/0x110 mm/kmsan/kmsan_test.c:256
   test_uninit_kmsan_check_memory+0x1a2/0x380 mm/kmsan/kmsan_test.c:271

  Bytes 4-7 of 8 are uninitialized
  Memory access of size 8 starts at ffff888083fe3da0

  CPU: 0 PID: 6731 Comm: kunit_try_catch Tainted: G B E 5.16.0-rc3+ #104
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
  =====================================================

The report says that the local variable ``uninit`` was created uninitialized
in ``do_uninit_local_array()``. The third stack trace corresponds to the place
where this variable was created.

The first stack trace shows where the uninitialized value was used (in
``test_uninit_kmsan_check_memory()``). The tool shows which bytes of the local
variable were left uninitialized, as well as the stack at the point where the
value was copied to another memory location before use.

KMSAN reports a use of an uninitialized value ``v`` in the following cases:

 - in a condition, e.g. ``if (v) { ... }``;
 - in an indexing or pointer dereference, e.g. ``array[v]`` or ``*v``;
 - when it is copied to userspace or hardware, e.g.
   ``copy_to_user(..., &v, ...)``;
 - when it is passed as an argument to a function, and
   ``CONFIG_KMSAN_CHECK_PARAM_RETVAL`` is enabled (see below).

These cases (apart from copying data to userspace or hardware, which is a
security issue) are considered undefined behavior under the C11 Standard.

Disabling the instrumentation
-----------------------------

A function can be marked with ``__no_kmsan_checks``. KMSAN then ignores
uninitialized values in that function and marks its output as initialized.
As a result, the user will not get KMSAN reports related to that function.

KMSAN also supports the ``__no_sanitize_memory`` function attribute. KMSAN
does not instrument functions carrying this attribute, which can be helpful
when we do not want the compiler to interfere with some low-level code (e.g.
code marked with ``noinstr``, which implicitly adds ``__no_sanitize_memory``).

This comes at a cost, however: stack allocations from such functions will have
incorrect shadow/origin values, likely leading to false positives. Functions
called from non-instrumented code may also receive incorrect metadata for
their parameters.

As a rule of thumb, avoid using ``__no_sanitize_memory`` explicitly.

KMSAN can also be disabled from the Makefile for a single file (e.g.
main.o)::

  KMSAN_SANITIZE_main.o := n

or for the whole directory::

  KMSAN_SANITIZE := n

This is equivalent to applying ``__no_sanitize_memory`` to every function in
the file or directory. Most users will not need KMSAN_SANITIZE, unless their
code gets broken by KMSAN (e.g. code that runs at early boot time).

KMSAN checks can also be temporarily disabled for the current task by calling
``kmsan_disable_current()`` and ``kmsan_enable_current()``. Each
``kmsan_enable_current()`` call must be preceded by a
``kmsan_disable_current()`` call; these call pairs may be nested. Take care to
keep the disabled regions short, and prefer other ways of disabling
instrumentation where possible.

Support
=======

In order to use KMSAN the kernel must be built with Clang, which so far is the
only compiler that has
KMSAN support. The kernel instrumentation pass is based on the userspace
`MemorySanitizer tool`_.

The runtime library only supports x86_64 at the moment.

How KMSAN works
===============

KMSAN shadow memory
-------------------

KMSAN associates a metadata byte (also called a shadow byte) with every byte
of kernel memory. A bit in the shadow byte is set if and only if the
corresponding bit of the kernel memory byte is uninitialized. Marking memory
as uninitialized (i.e. setting its shadow bytes to ``0xff``) is called
poisoning; marking it as initialized (setting the shadow bytes to ``0x00``) is
called unpoisoning.

When a new variable is allocated on the stack, it is poisoned by default by
instrumentation code inserted by the compiler (unless it is a stack variable
that is immediately initialized). Any heap allocation done without
``__GFP_ZERO`` is also poisoned.

Compiler instrumentation also tracks the shadow values as they are used
throughout the code. When needed, instrumentation code invokes the runtime
library in ``mm/kmsan/`` to persist shadow values.

The shadow value of a basic or compound type is an array of bytes of the same
length. When a constant value is written into memory, that memory is
unpoisoned. When a value is read from memory, its shadow memory is also
obtained and propagated into all the operations that use that value. For every
instruction that takes one or more values, the compiler generates code that
calculates the shadow of the result from those values and their shadows.

Example::

  int a = 0xff;  // i.e. 0x000000ff
  int b;
  int c = a | b;

In this case the shadow of ``a`` is ``0``, the shadow of ``b`` is
``0xffffffff``, and the shadow of ``c`` is ``0xffffff00``. This means that the
upper three bytes of ``c`` are uninitialized, while the lower byte is
initialized.

Origin tracking
---------------

Every four bytes of kernel memory also have a so-called origin mapped to them.
This origin describes the point in program execution at which the
uninitialized value was created. Every origin is associated with either the
full allocation stack (for heap-allocated memory) or the function containing
the uninitialized variable (for locals).

When an uninitialized variable is allocated on the stack or the heap, a new
origin value is created, and that variable's origin is filled with that value.
When a value is read from memory, its origin is also read and kept together
with the shadow. For every instruction that takes one or more values, the
origin of the result is one of the origins corresponding to any of the
uninitialized inputs. If a poisoned value is written into memory, its origin
is written to the corresponding storage as well.

Example 1::

  int a = 42;
  int b;
  int c = a + b;

In this case the origin of ``b`` is generated upon function entry, and is
stored to the origin of ``c`` right before the addition result is written into
memory.

Several variables may share the same origin address if they are stored in the
same four-byte chunk. In this case every write to either variable updates the
origin for all of them. We have to sacrifice precision here, because storing
origins for individual bits (or even bytes) would be too costly.

Example 2::

  int combine(short a, short b) {
    union ret_t {
      int i;
      short s[2];
    } ret;
    ret.s[0] = a;
    ret.s[1] = b;
    return ret.i;
  }

If ``a`` is initialized and ``b`` is not, the shadow of the result is
0xffff0000, and the origin of the result is the origin of ``b``.
``ret.s[0]`` has the same origin, but it is never used, because that variable
is initialized.

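The coarse origin granularity in Example 2 can be illustrated with a small
userspace model. This is only a sketch, not KMSAN code: ``struct tracked_int``,
``store_short()``, ``shadow_of()``, ``model_combine()`` and the ``ORIGIN_*``
ids are hypothetical names made up for illustration. It assumes one origin
slot per four-byte chunk, and it assumes that the slot is rewritten only when
a poisoned value is stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one four-byte chunk of memory and its metadata. */
struct tracked_int {
	uint8_t data[4];   /* the value bytes */
	uint8_t shadow[4]; /* 0xff = uninitialized byte, 0x00 = initialized */
	uint32_t origin;   /* one origin slot shared by all four bytes */
};

/* Made-up origin ids standing in for real origin values. */
enum { ORIGIN_A = 0xa, ORIGIN_B = 0xb, ORIGIN_RET = 0xc };

/*
 * Model a two-byte store at byte offset off. The shadow is copied per byte.
 * Assumption of this sketch: the shared origin slot is overwritten only
 * when the stored value is poisoned. Shadows here are all-zero or all-one
 * bits, so host byte order does not matter.
 */
static void store_short(struct tracked_int *t, size_t off, uint16_t val,
			uint16_t val_shadow, uint32_t val_origin)
{
	assert(off + sizeof(val) <= sizeof(t->data));
	memcpy(t->data + off, &val, sizeof(val));
	memcpy(t->shadow + off, &val_shadow, sizeof(val_shadow));
	if (val_shadow)
		t->origin = val_origin;
}

/* Combine the shadow bytes into one value, byte 0 being the lowest byte. */
static uint32_t shadow_of(const struct tracked_int *t)
{
	return (uint32_t)t->shadow[0] | (uint32_t)t->shadow[1] << 8 |
	       (uint32_t)t->shadow[2] << 16 | (uint32_t)t->shadow[3] << 24;
}

/* Model the body of combine(): ret.s[0] = a; ret.s[1] = b; */
static struct tracked_int model_combine(uint16_t a_shadow, uint16_t b_shadow)
{
	struct tracked_int ret;

	/* A fresh local starts out poisoned, with its own origin. */
	memset(ret.data, 0, sizeof(ret.data));
	memset(ret.shadow, 0xff, sizeof(ret.shadow));
	ret.origin = ORIGIN_RET;

	store_short(&ret, 0, 1, a_shadow, ORIGIN_A);
	store_short(&ret, 2, 2, b_shadow, ORIGIN_B);
	return ret;
}
```

In this model, ``model_combine(0x0000, 0xffff)`` (``a`` initialized, ``b``
not) yields a shadow of 0xffff0000 and the origin ``ORIGIN_B``. Calling it
with both shadows set to 0xffff leaves only ``ORIGIN_B`` in the shared slot,
because the second store overwrites the first.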

If both function arguments are uninitialized, only the origin of the second
argument is preserved.

Origin chaining
~~~~~~~~~~~~~~~

To ease debugging, KMSAN creates a new origin for every store of an
uninitialized value to memory. The new origin references both its creation
stack and the previous origin the value had. This may cause increased memory
consumption, so we limit the length of origin chains in the runtime.

Clang instrumentation API
-------------------------

The Clang instrumentation pass inserts calls to functions defined in
``mm/kmsan/instrumentation.c`` into the kernel code.


Shadow manipulation
~~~~~~~~~~~~~~~~~~~

For every memory access the compiler emits a call to a function that returns a
pair of pointers to the shadow and origin addresses of the given memory::

  typedef struct {
    void *shadow, *origin;
  } shadow_origin_ptr_t;

  shadow_origin_ptr_t __msan_metadata_ptr_for_load_{1,2,4,8}(void *addr)
  shadow_origin_ptr_t __msan_metadata_ptr_for_store_{1,2,4,8}(void *addr)
  shadow_origin_ptr_t __msan_metadata_ptr_for_load_n(void *addr, uintptr_t size)
  shadow_origin_ptr_t __msan_metadata_ptr_for_store_n(void *addr, uintptr_t size)

The function name depends on the memory access size.

The compiler makes sure that for every loaded value its shadow and origin
values are read from memory. When a value is stored to memory, its shadow and
origin are also stored using the metadata pointers.

Handling locals
~~~~~~~~~~~~~~~

A special function is used to create a new origin value for a local variable
and set the origin of that variable to that value::

  void __msan_poison_alloca(void *addr, uintptr_t size, char *descr)

Access to per-task data
~~~~~~~~~~~~~~~~~~~~~~~

At the beginning of every instrumented function KMSAN inserts a call to
``__msan_get_context_state()``::

  kmsan_context_state *__msan_get_context_state(void)

``kmsan_context_state`` is declared in ``include/linux/kmsan.h``::

  struct kmsan_context_state {
    char param_tls[KMSAN_PARAM_SIZE];
    char retval_tls[KMSAN_RETVAL_SIZE];
    char va_arg_tls[KMSAN_PARAM_SIZE];
    char va_arg_origin_tls[KMSAN_PARAM_SIZE];
    u64 va_arg_overflow_size_tls;
    char param_origin_tls[KMSAN_PARAM_SIZE];
    depot_stack_handle_t retval_origin_tls;
  };

KMSAN uses this structure to pass parameter shadows and origins between
instrumented functions (unless the parameters are checked immediately by
``CONFIG_KMSAN_CHECK_PARAM_RETVAL``).

Passing uninitialized values to functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Clang's MemorySanitizer instrumentation has an option,
``-fsanitize-memory-param-retval``, which makes the compiler check function
parameters passed by value, as well as function return values.

The option is controlled by ``CONFIG_KMSAN_CHECK_PARAM_RETVAL``, which is
enabled by default so that KMSAN reports uninitialized values earlier.
Please refer to the `LKML discussion`_ for more details.

Because of the way the checks are implemented in LLVM (they are only applied
to parameters marked as ``noundef``), not all parameters are guaranteed to be
checked, so we cannot give up the metadata storage in
``kmsan_context_state``.

String functions
~~~~~~~~~~~~~~~~

The compiler replaces calls to ``memcpy()``/``memmove()``/``memset()`` with the
following functions. These functions are also called when data structures are
initialized or copied, making sure that shadow and origin values are copied
together with the data::


  void *__msan_memcpy(void *dst, void *src, uintptr_t n)
  void *__msan_memmove(void *dst, void *src, uintptr_t n)
  void *__msan_memset(void *dst, int c, uintptr_t n)

Error reporting
~~~~~~~~~~~~~~~

For each use of a value the compiler emits a shadow check that calls
``__msan_warning()`` if that value is poisoned::

  void __msan_warning(u32 origin)

``__msan_warning()`` causes the KMSAN runtime to print an error report.

Inline assembly instrumentation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

KMSAN instruments every inline assembly output with a call to the following
function, which unpoisons the memory region::

  void __msan_instrument_asm_store(void *addr, uintptr_t size)

This approach may mask certain errors, but it also helps to avoid a lot of
false positives in bitwise operations, atomics etc.

Sometimes the pointers passed into inline assembly do not point to valid
memory. In such cases they are ignored at runtime.


Runtime library
---------------

The code is located in ``mm/kmsan/``.

Per-task KMSAN state
~~~~~~~~~~~~~~~~~~~~

Every task_struct has an associated KMSAN task state that holds the KMSAN
context (see above) and a per-task counter used to disable KMSAN reports::

  struct kmsan_context {
    ...
    unsigned int depth;
    struct kmsan_context_state cstate;
    ...
  };

  struct task_struct {
    ...
    struct kmsan_context kmsan;
    ...
  };

KMSAN contexts
~~~~~~~~~~~~~~

When running in a kernel task context, KMSAN uses ``current->kmsan.cstate`` to
hold the metadata for function parameters and return values.

But when the kernel is running in interrupt, softirq or NMI context, where
``current`` is unavailable, KMSAN switches to per-cpu interrupt state::

  DEFINE_PER_CPU(struct kmsan_ctx, kmsan_percpu_ctx);

Metadata allocation
~~~~~~~~~~~~~~~~~~~

There are several places in the kernel where the metadata is stored.

1. Each ``struct page`` instance contains two pointers to its shadow and
   origin pages::

     struct page {
       ...
       struct page *shadow, *origin;
       ...
     };

   At boot time, the kernel allocates shadow and origin pages for every
   available kernel page. This is done quite late, when the kernel address
   space is already fragmented, so normal data pages may arbitrarily
   interleave with the metadata pages.

   This means that, in general, the shadow/origin pages of two contiguous
   memory pages may not be contiguous. Consequently, if a memory access
   crosses the boundary of a memory block, accesses to shadow/origin memory
   may corrupt other pages or read incorrect values from them.

   In practice, contiguous memory pages returned by the same
   ``alloc_pages()`` call will have contiguous metadata, whereas if these
   pages belong to two different allocations their metadata pages can be
   fragmented.

   For the kernel data (``.data``, ``.bss`` etc.) and percpu memory regions
   there are also no guarantees of metadata contiguity.

   When ``__msan_metadata_ptr_for_XXX_YYY()`` hits the border between two
   pages with non-contiguous metadata, it returns pointers to fake
   shadow/origin regions::

     char dummy_load_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
     char dummy_store_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));

   ``dummy_load_page`` is zero-initialized, so reads from it always yield
   zeroes. All stores to ``dummy_store_page`` are ignored.

2. For vmalloc memory and modules, there is a direct mapping between the
   memory range, its shadow and origin. KMSAN reduces the vmalloc area by 3/4,
   making only the first quarter available to ``vmalloc()``. The second
   quarter of the vmalloc area contains shadow memory for the first quarter,
   and the third one holds the origins. A small part of the fourth quarter
   contains shadow and origins for the kernel modules. Please refer to
   ``arch/x86/include/asm/pgtable_64_types.h`` for more details.

   When an array of pages is mapped into a contiguous virtual memory space,
   their shadow and origin pages are similarly mapped into contiguous regions.

References
==========

E. Stepanov, K. Serebryany. `MemorySanitizer: fast detector of uninitialized
memory use in C++
<https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43308.pdf>`_.
In Proceedings of CGO 2015.

.. _MemorySanitizer tool: https://clang.llvm.org/docs/MemorySanitizer.html
.. _LLVM documentation: https://llvm.org/docs/GettingStarted.html
.. _LKML discussion: https://lore.kernel.org/all/20220614144853.3693273-1-glider@google.com/