===========================================
Fault injection capabilities infrastructure
===========================================

See also drivers/md/md-faulty.c and "every_nth" module option for scsi_debug.


Available fault injection capabilities
--------------------------------------

- failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

- fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

- fail_usercopy

  injects failures in user memory access functions. (copy_from_user(), get_user(), ...)

- fail_futex

  injects futex deadlock and uaddr fault errors.

- fail_sunrpc

  injects kernel RPC client and server failures.

- fail_make_request

  injects disk IO errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (submit_bio_noacct())

- fail_mmc_request

  injects MMC data errors on devices permitted by setting
  debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request

- fail_function

  injects error return on specific functions, which are marked by
  ALLOW_ERROR_INJECTION() macro, by setting debugfs entries
  under /sys/kernel/debug/fail_function. No boot option supported.

- NVMe fault injection

  inject NVMe status code and retry flag on devices permitted by setting
  debugfs entries under /sys/kernel/debug/nvme*/fault_inject. The default
  status code is NVME_SC_INVALID_OPCODE with no retry. The status code and
  retry flag can be set via the debugfs.

- Null test block driver fault injection

  inject IO timeouts by setting config items under
  /sys/kernel/config/nullb/<disk>/timeout_inject,
  inject requeue requests by setting config items under
  /sys/kernel/config/nullb/<disk>/requeue_inject, and
  inject init_hctx() errors by setting config items under
  /sys/kernel/config/nullb/<disk>/init_hctx_fault_inject.
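
Which of these capabilities are present depends on the kernel configuration.
As a rough check, the sketch below lists the fail* directories found under a
debugfs root. It is only an illustration: list_fault_caps is a hypothetical
helper name, not part of the kernel tree, and it assumes that debugfs is
mounted at /sys/kernel/debug unless another directory is passed in.

```shell
#!/bin/bash

# Hypothetical helper (not in the kernel tree): print the name of every
# fail* directory under a debugfs root. The default root is an assumption
# about where debugfs is mounted.
list_fault_caps()
{
    local root=${1:-/sys/kernel/debug}
    local dir

    for dir in "$root"/fail*
    do
        # An unmatched glob stays literal, so test for a real directory.
        [ -d "$dir" ] && basename "$dir"
    done
    return 0
}

# Prints nothing if debugfs is not mounted or not readable by this user.
list_fault_caps
```

debugfs is usually readable by root only, so an empty result as an
unprivileged user does not mean that fault injection is disabled.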

Configure fault-injection capabilities behavior
-----------------------------------------------

debugfs entries
^^^^^^^^^^^^^^^

fault-inject-debugfs kernel module provides some debugfs entries for runtime
configuration of fault-injection capabilities.

- /sys/kernel/debug/fail*/probability:

  likelihood of failure injection, in percent.

  Format: <percent>

  Note that one-failure-per-hundred is a very high error rate
  for some testcases. Consider setting probability=100 and configure
  /sys/kernel/debug/fail*/interval for such testcases.

- /sys/kernel/debug/fail*/interval:

  specifies the interval between failures, for calls to
  should_fail() that pass all the other tests.

  Note that if you enable this, by setting interval>1, you will
  probably want to set probability=100.

- /sys/kernel/debug/fail*/times:

  specifies how many times failures may happen at most. A value of -1
  means "no limit".

- /sys/kernel/debug/fail*/space:

  specifies an initial resource "budget", decremented by "size"
  on each call to should_fail(,size). Failure injection is
  suppressed until "space" reaches zero.

- /sys/kernel/debug/fail*/verbose

  Format: { 0 | 1 | 2 }

  specifies the verbosity of the messages when failure is
  injected. '0' means no messages; '1' will print only a single
  log line per failure; '2' will print a call trace too -- useful
  to debug the problems revealed by fault injection.

- /sys/kernel/debug/fail*/task-filter:

  Format: { 'Y' | 'N' }

  A value of 'N' disables filtering by process (default).
  Any positive value limits failures to only processes indicated by
  /proc/<pid>/make-it-fail==1.

- /sys/kernel/debug/fail*/require-start,
  /sys/kernel/debug/fail*/require-end,
  /sys/kernel/debug/fail*/reject-start,
  /sys/kernel/debug/fail*/reject-end:

  specifies the range of virtual addresses tested during
  stacktrace walking. Failure is injected only if some caller
  in the walked stacktrace lies within the required range, and
  none lies within the rejected range.
  Default required range is [0,ULONG_MAX) (whole of virtual address space).
  Default rejected range is [0,0).

- /sys/kernel/debug/fail*/stacktrace-depth:

  specifies the maximum stacktrace depth walked during search
  for a caller within [require-start,require-end) OR
  [reject-start,reject-end).

- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:

  Format: { 'Y' | 'N' }

  default is 'Y', setting it to 'N' will also inject failures into
  highmem/user allocations (__GFP_HIGHMEM allocations).

- /sys/kernel/debug/failslab/cache-filter:

  Format: { 'Y' | 'N' }

  default is 'N', setting it to 'Y' will only inject failures when
  objects are requested from certain caches.

  Select the cache by writing '1' to /sys/kernel/slab/<cache>/failslab.

- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:

  Format: { 'Y' | 'N' }

  default is 'Y', setting it to 'N' will also inject failures
  into allocations that can sleep (__GFP_DIRECT_RECLAIM allocations).

- /sys/kernel/debug/fail_page_alloc/min-order:

  specifies the minimum page allocation order into which failures
  are injected.

- /sys/kernel/debug/fail_futex/ignore-private:

  Format: { 'Y' | 'N' }

  default is 'N', setting it to 'Y' will disable failure injections
  when dealing with private (address space) futexes.

- /sys/kernel/debug/fail_sunrpc/ignore-client-disconnect:

  Format: { 'Y' | 'N' }

  default is 'N', setting it to 'Y' will disable disconnect
  injection on the RPC client.

- /sys/kernel/debug/fail_sunrpc/ignore-server-disconnect:

  Format: { 'Y' | 'N' }

  default is 'N', setting it to 'Y' will disable disconnect
  injection on the RPC server.

- /sys/kernel/debug/fail_sunrpc/ignore-cache-wait:

  Format: { 'Y' | 'N' }

  default is 'N', setting it to 'Y' will disable cache wait
  injection on the RPC server.

- /sys/kernel/debug/fail_function/inject:

  Format: { 'function-name' | '!function-name' | '' }

  specifies the target function of error injection by name.
  If the function name is prefixed with '!', the given function is
  removed from the injection list. If nothing is specified (''),
  the injection list is cleared.

- /sys/kernel/debug/fail_function/injectable:

  (read only) shows error injectable functions and what type of
  error values can be specified. The error type will be one of
  the following:

  - NULL: retval must be 0.
  - ERRNO: retval must be -1 to -MAX_ERRNO (-4095).
  - ERR_NULL: retval must be 0 or -1 to -MAX_ERRNO (-4095).

- /sys/kernel/debug/fail_function/<function-name>/retval:

  specifies the "error" return value to inject to the given function.
  This will be created when the user specifies a new injection entry.
  Note that this file only accepts unsigned values.
  So, if you want to use a negative errno, use 'printf' instead of
  'echo', e.g.::

    $ printf %#x -12 > retval

Boot option
^^^^^^^^^^^

In order to inject faults while debugfs is not available (early boot time),
use the boot option::

    failslab=
    fail_page_alloc=
    fail_usercopy=
    fail_make_request=
    fail_futex=
    mmc_core.fail_request=<interval>,<probability>,<space>,<times>

proc entries
^^^^^^^^^^^^

- /proc/<pid>/fail-nth,
  /proc/self/task/<tid>/fail-nth:

  Writing an integer N to this file makes the N-th call in the task fail.
  Reading from this file returns an integer value. A value of '0' indicates
  that the fault set up with a previous write to this file was injected.
  A positive integer N indicates that the fault wasn't yet injected.
  Note that this file enables all types of faults (slab, futex, etc).
  This setting takes precedence over all other generic debugfs settings
  like probability, interval, times, etc. But per-capability settings
  (e.g. fail_futex/ignore-private) take precedence over it.

  This feature is intended for systematic testing of faults in a single
  system call. See an example below.


Error Injectable Functions
--------------------------

This part is for kernel developers considering adding a function to the
ALLOW_ERROR_INJECTION() macro.

Requirements for the Error Injectable Functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Since the function-level error injection forcibly changes the code path
and returns an error even if the input and conditions are proper, this can
cause an unexpected kernel crash if you allow error injection on a function
which is NOT error injectable. Thus, you (and reviewers) must ensure:

- The function returns an error code if it fails, and the callers must check
  it correctly (need to recover from it).

- The function does not execute any code which can change any state before
  the first error return. The state includes global or local, or input
  variable. For example, clear output address storage (e.g. `*ret = NULL`),
  increments/decrements counter, set a flag, preempt/irq disable or get
  a lock (if those are recovered before returning error, that will be OK.)

The first requirement is important, and it means that release (free objects)
functions are usually harder to inject errors into than allocation
functions. If errors of such release functions are not correctly handled,
a memory leak follows easily (the caller cannot tell whether the object
has been released or corrupted.)

The second one is for callers which expect that the function always
does something. Thus if error injection skips the whole of the
function, the expectation is betrayed and an unexpected error results.

Type of the Error Injectable Functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Each error injectable function has the error type specified by the
ALLOW_ERROR_INJECTION() macro. You have to choose it carefully if you add
a new error injectable function. If the wrong error type is chosen, the
kernel may crash because it may not be able to handle the error.
There are 4 types of errors defined in include/asm-generic/error-injection.h.

EI_ETYPE_NULL
  This function will return `NULL` if it fails. e.g. return an allocated
  object address.

EI_ETYPE_ERRNO
  This function will return an `-errno` error code if it fails. e.g. return
  -EINVAL if the input is wrong. This will include the functions which will
  return an address which encodes `-errno` by ERR_PTR() macro.

EI_ETYPE_ERRNO_NULL
  This function will return an `-errno` or `NULL` if it fails.
  If the caller of this function checks the return value with the
  IS_ERR_OR_NULL() macro, this type will be appropriate.

EI_ETYPE_TRUE
  This function will return `true` (non-zero positive value) if it fails.

If you specify a wrong type, for example, EI_ETYPE_ERRNO for a function
which returns an allocated object, it may cause a problem because the returned
value is not an object address and the caller cannot access the address.


How to add new fault injection capability
-----------------------------------------

- #include <linux/fault-inject.h>

- define the fault attributes

  DECLARE_FAULT_ATTR(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

- provide a way to configure fault attributes

  - boot option

    If you need to enable the fault injection capability from boot time, you can
    provide boot option to configure it. There is a helper function for it:

      setup_fault_attr(attr, str);

  - debugfs entries

    failslab, fail_page_alloc, fail_usercopy, and fail_make_request use this way.
    Helper functions:

      fault_create_debugfs_attr(name, parent, attr);

  - module parameters

    If the scope of the fault injection capability is limited to a
    single kernel module, it is better to provide module parameters to
    configure the fault attributes.

- add a hook to insert failures

  Upon should_fail() returning true, client code should inject a failure:

    should_fail(attr, size);

Application Examples
--------------------

- Inject slab allocation failures into module init/exit code::

    #!/bin/bash

    FAILTYPE=failslab
    echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

    faulty_system()
    {
        bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
    }

    if [ $# -eq 0 ]
    then
        echo "Usage: $0 modulename [ modulename ... ]"
        exit 1
    fi

    for m in $*
    do
        echo inserting $m...
        faulty_system modprobe $m

        echo removing $m...
        faulty_system modprobe -r $m
    done

------------------------------------------------------------------------------

- Inject page allocation failures only for a specific module::

    #!/bin/bash

    FAILTYPE=fail_page_alloc
    module=$1

    if [ -z $module ]
    then
        echo "Usage: $0 <modulename>"
        exit 1
    fi

    modprobe $module

    if [ ! -d /sys/module/$module/sections ]
    then
        echo Module $module is not loaded
        exit 1
    fi

    cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
    cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end

    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
    echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

    trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

    echo "Injecting errors into the module $module... (interrupt to stop)"
    sleep 1000000

------------------------------------------------------------------------------

- Inject open_ctree error while btrfs mount::

    #!/bin/bash

    rm -f testfile.img
    dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
    DEVICE=$(losetup --show -f testfile.img)
    mkfs.btrfs -f $DEVICE
    mkdir -p tmpmnt

    FAILTYPE=fail_function
    FAILFUNC=open_ctree
    echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
    printf %#x -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 100 > /sys/kernel/debug/$FAILTYPE/probability
    echo 0 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 1 > /sys/kernel/debug/$FAILTYPE/verbose

    mount -t btrfs $DEVICE tmpmnt
    if [ $? -ne 0 ]
    then
        echo "SUCCESS!"
    else
        echo "FAILED!"
        umount tmpmnt
    fi

    echo > /sys/kernel/debug/$FAILTYPE/inject

    rmdir tmpmnt
    losetup -d $DEVICE
    rm testfile.img

------------------------------------------------------------------------------

- Inject only skbuff allocation failures ::

    # mark skbuff_head_cache as faulty
    echo 1 > /sys/kernel/slab/skbuff_head_cache/failslab
    # Turn on cache filter (off by default)
    echo 1 > /sys/kernel/debug/failslab/cache-filter
    # Turn on fault injection
    echo 1 > /sys/kernel/debug/failslab/times
    echo 1 > /sys/kernel/debug/failslab/probability


Tool to run command with failslab or fail_page_alloc
----------------------------------------------------
In order to make it easier to accomplish the tasks mentioned above, we can use
tools/testing/fault-injection/failcmd.sh. Please run a command
"./tools/testing/fault-injection/failcmd.sh --help" for more information and
see the following examples.

Examples:

Run a command "make -C tools/testing/selftests/ run_tests" with injecting slab
allocation failure::

    # ./tools/testing/fault-injection/failcmd.sh \
        -- make -C tools/testing/selftests/ run_tests

Same as above except to specify 100 times failures at most instead of one time
at most by default::

    # ./tools/testing/fault-injection/failcmd.sh --times=100 \
        -- make -C tools/testing/selftests/ run_tests

Same as above except to inject page allocation failure instead of slab
allocation failure::

    # env FAILCMD_TYPE=fail_page_alloc \
        ./tools/testing/fault-injection/failcmd.sh --times=100 \
        -- make -C tools/testing/selftests/ run_tests

Systematic faults using fail-nth
---------------------------------

The following code systematically faults the 1-st, 2-nd, 3-rd and so on
capabilities in the socketpair() system call::

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/socket.h>
    #include <sys/syscall.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <errno.h>

    int main()
    {
        int i, err, res, fail_nth, fds[2];
        char buf[128];

        system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
        sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
        fail_nth = open(buf, O_RDWR);
        for (i = 1;; i++) {
            sprintf(buf, "%d", i);
            write(fail_nth, buf, strlen(buf));
            res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
            err = errno;
            pread(fail_nth, buf, sizeof(buf), 0);
            if (res == 0) {
                close(fds[0]);
                close(fds[1]);
            }
            printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
                res, err);
            if (atoi(buf))
                break;
        }
        return 0;
    }

An example output::

    1-th fault Y: res=-1/23
    2-th fault Y: res=-1/23
    3-th fault Y: res=-1/12
    4-th fault Y: res=-1/12
    5-th fault Y: res=-1/23
    6-th fault Y: res=-1/23
    7-th fault Y: res=-1/23
    8-th fault Y: res=-1/12
    9-th fault Y: res=-1/12
    10-th fault Y: res=-1/12
    11-th fault Y: res=-1/12
    12-th fault Y: res=-1/12
    13-th fault Y: res=-1/12
    14-th fault Y: res=-1/12
    15-th fault Y: res=-1/12
    16-th fault N: res=0/12
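
To see at a glance which error codes the injected faults produced, the output
above can be post-processed. The following is a minimal sketch, not part of the
kernel tree: summarize_faults is a hypothetical helper name, and it assumes the
exact line format printed by the program above. On Linux, errno 12 is ENOMEM
and errno 23 is ENFILE.

```shell
#!/bin/bash

# Hypothetical helper (not in the kernel tree): read the test program's
# output on stdin and count how often each errno was seen among the
# injected ("Y") faults. Lines marked "N" (no fault injected) are skipped.
summarize_faults()
{
    awk '$3 == "Y:" { split($4, r, "/"); n[r[2]]++ }
         END { for (e in n) print "errno " e ": " n[e] }' | sort
}

# Small demonstration on three lines taken from the example output.
summarize_faults <<'EOF'
1-th fault Y: res=-1/23
3-th fault Y: res=-1/12
16-th fault N: res=0/12
EOF
# prints:
#   errno 12: 1
#   errno 23: 1
```

In practice the output of the compiled test program would be piped into
summarize_faults instead of the here-document.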