===========================================
Fault injection capabilities infrastructure
===========================================

See also drivers/md/md-faulty.c and "every_nth" module option for scsi_debug.


Available fault injection capabilities
--------------------------------------

- failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

- fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

- fail_usercopy

  injects failures in user memory access functions. (copy_from_user(), get_user(), ...)

- fail_futex

  injects futex deadlock and uaddr fault errors.

- fail_sunrpc

  injects kernel RPC client and server failures.

- fail_make_request

  injects disk IO errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail.
  (submit_bio_noacct())

- fail_mmc_request

  injects MMC data errors on devices permitted by setting
  debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request

- fail_function

  injects error return on specific functions, which are marked by
  ALLOW_ERROR_INJECTION() macro, by setting debugfs entries
  under /sys/kernel/debug/fail_function. No boot option supported.

- NVMe fault injection

  inject NVMe status code and retry flag on devices permitted by setting
  debugfs entries under /sys/kernel/debug/nvme*/fault_inject. The default
  status code is NVME_SC_INVALID_OPCODE with no retry. The status code and
  retry flag can be set via the debugfs.

- Null test block driver fault injection

  inject IO timeouts by setting config items under
  /sys/kernel/config/nullb/<disk>/timeout_inject,
  inject requeue requests by setting config items under
  /sys/kernel/config/nullb/<disk>/requeue_inject, and
  inject init_hctx() errors by setting config items under
  /sys/kernel/config/nullb/<disk>/init_hctx_fault_inject.
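
Which of the capabilities listed above a running kernel actually offers depends
on how it was configured. As a minimal sketch, assuming debugfs is mounted at
/sys/kernel/debug and that each generic capability appears there as a directory
of the same name, the helper below reports which of those directories exist.
The name ``list_fault_caps`` is hypothetical and not part of the kernel tree;
the device-specific capabilities (MMC, NVMe, null block) are left out because
their paths depend on the device.

```shell
#!/bin/bash

# Hypothetical helper, not part of the kernel tree.
# Prints the generic fault-injection capabilities whose debugfs
# directory exists below the given debugfs root (assumed default:
# /sys/kernel/debug).
list_fault_caps()
{
    local root=${1:-/sys/kernel/debug}
    local cap

    for cap in failslab fail_page_alloc fail_usercopy fail_futex \
               fail_sunrpc fail_make_request fail_function
    do
        if [ -d "$root/$cap" ]
        then
            echo "$cap"
        fi
    done
}
```

Calling ``list_fault_caps`` without an argument inspects the default debugfs
mount point, which normally requires root privileges.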

Configure fault-injection capabilities behavior
-----------------------------------------------

debugfs entries
^^^^^^^^^^^^^^^

fault-inject-debugfs kernel module provides some debugfs entries for runtime
configuration of fault-injection capabilities.

- /sys/kernel/debug/fail*/probability:

  likelihood of failure injection, in percent.

  Format: <percent>

  Note that one-failure-per-hundred is a very high error rate
  for some testcases. Consider setting probability=100 and configure
  /sys/kernel/debug/fail*/interval for such testcases.

- /sys/kernel/debug/fail*/interval:

  specifies the interval between failures, for calls to
  should_fail() that pass all the other tests.

  Note that if you enable this, by setting interval>1, you will
  probably want to set probability=100.

- /sys/kernel/debug/fail*/times:

  specifies how many times failures may happen at most. A value of -1
  means "no limit".

- /sys/kernel/debug/fail*/space:

  specifies an initial resource "budget", decremented by "size"
  on each call to should_fail(,size). Failure injection is
  suppressed until "space" reaches zero.

- /sys/kernel/debug/fail*/verbose

  Format: { 0 | 1 | 2 }

  specifies the verbosity of the messages when failure is
  injected. '0' means no messages; '1' will print only a single
  log line per failure; '2' will print a call trace too -- useful
  to debug the problems revealed by fault injection.

- /sys/kernel/debug/fail*/task-filter:

  Format: { 'Y' | 'N' }

  A value of 'N' disables filtering by process (default).
  Any positive value limits failures to only processes indicated by
  /proc/<pid>/make-it-fail==1.

- /sys/kernel/debug/fail*/require-start,
  /sys/kernel/debug/fail*/require-end,
  /sys/kernel/debug/fail*/reject-start,
  /sys/kernel/debug/fail*/reject-end:

  specifies the range of virtual addresses tested during
  stacktrace walking.
  Failure is injected only if some caller
  in the walked stacktrace lies within the required range, and
  none lies within the rejected range.
  Default required range is [0,ULONG_MAX) (whole of virtual address space).
  Default rejected range is [0,0).

- /sys/kernel/debug/fail*/stacktrace-depth:

  specifies the maximum stacktrace depth walked during search
  for a caller within [require-start,require-end) OR
  [reject-start,reject-end).

- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:

  Format: { 'Y' | 'N' }

  default is 'Y', setting it to 'N' will also inject failures into
  highmem/user allocations (__GFP_HIGHMEM allocations).

- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:

  Format: { 'Y' | 'N' }

  default is 'Y', setting it to 'N' will also inject failures
  into allocations that can sleep (__GFP_DIRECT_RECLAIM allocations).

- /sys/kernel/debug/fail_page_alloc/min-order:

  specifies the minimum page allocation order to be injected
  failures.

- /sys/kernel/debug/fail_futex/ignore-private:

  Format: { 'Y' | 'N' }

  default is 'N', setting it to 'Y' will disable failure injections
  when dealing with private (address space) futexes.

- /sys/kernel/debug/fail_sunrpc/ignore-client-disconnect:

  Format: { 'Y' | 'N' }

  default is 'N', setting it to 'Y' will disable disconnect
  injection on the RPC client.

- /sys/kernel/debug/fail_sunrpc/ignore-server-disconnect:

  Format: { 'Y' | 'N' }

  default is 'N', setting it to 'Y' will disable disconnect
  injection on the RPC server.

- /sys/kernel/debug/fail_sunrpc/ignore-cache-wait:

  Format: { 'Y' | 'N' }

  default is 'N', setting it to 'Y' will disable cache wait
  injection on the RPC server.

- /sys/kernel/debug/fail_function/inject:

  Format: { 'function-name' | '!function-name' | '' }

  specifies the target function of error injection by name.
  If the function name is prefixed with '!', the given function is
  removed from the injection list.
  If nothing is specified (''), the
  injection list is cleared.

- /sys/kernel/debug/fail_function/injectable:

  (read only) shows error injectable functions and what type of
  error values can be specified. The error type will be one of
  the following:

  - NULL: retval must be 0.
  - ERRNO: retval must be -1 to -MAX_ERRNO (-4096).
  - ERR_NULL: retval must be 0 or -1 to -MAX_ERRNO (-4096).

- /sys/kernel/debug/fail_function/<function-name>/retval:

  specifies the "error" return value to inject to the given function.
  This will be created when the user specifies a new injection entry.
  Note that this file only accepts unsigned values.
  So, if you want to
  use a negative errno, you should use 'printf' instead of 'echo', e.g.:
  $ printf %#x -12 > retval

Boot option
^^^^^^^^^^^

In order to inject faults while debugfs is not available (early boot time),
use the boot option::

    failslab=
    fail_page_alloc=
    fail_usercopy=
    fail_make_request=
    fail_futex=
    mmc_core.fail_request=<interval>,<probability>,<space>,<times>

proc entries
^^^^^^^^^^^^

- /proc/<pid>/fail-nth,
  /proc/self/task/<tid>/fail-nth:

  Writing an integer N to this file makes the N-th call in the task fail.
  Reading from this file returns an integer value. A value of '0' indicates
  that the fault set up with a previous write to this file was injected.
  A positive integer N indicates that the fault wasn't yet injected.
  Note that this file enables all types of faults (slab, futex, etc).
  This setting takes precedence over all other generic debugfs settings
  like probability, interval, times, etc. But per-capability settings
  (e.g. fail_futex/ignore-private) take precedence over it.

  This feature is intended for systematic testing of faults in a single
  system call. See an example below.


Error Injectable Functions
--------------------------

This part is for kernel developers who are considering marking a function
with the ALLOW_ERROR_INJECTION() macro.

Requirements for the Error Injectable Functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Since the function-level error injection forcibly changes the code path
and returns an error even if the input and conditions are proper, this can
cause an unexpected kernel crash if you allow error injection on a function
which is NOT error injectable. Thus, you (and reviewers) must ensure:

- The function returns an error code if it fails, and the callers must check
  it correctly (need to recover from it).

- The function does not execute any code which can change any state before
  the first error return. The state includes global or local, or input
  variable. For example, clear output address storage (e.g.
  `*ret = NULL`),
  increments/decrements counter, set a flag, preempt/irq disable or get
  a lock (if those are recovered before returning error, that will be OK.)

The first requirement is important, and it means that release (free objects)
functions are usually harder to inject errors into than allocation
functions. If errors from such release functions are not handled correctly,
a memory leak easily results (the caller will be confused about whether the
object has been released or corrupted.)

The second one is for callers which expect the function to always
do something. Thus if the error injection skips the whole of the
function, the expectation is betrayed and causes an unexpected error.

Type of the Error Injectable Functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Each error injectable function has an error type specified by the
ALLOW_ERROR_INJECTION() macro. You have to choose it carefully if you add
a new error injectable function. If the wrong error type is chosen, the
kernel may crash because it may not be able to handle the error.
There are 4 types of errors defined in include/asm-generic/error-injection.h:

EI_ETYPE_NULL
  This function will return `NULL` if it fails. e.g. return an allocated
  object address.

EI_ETYPE_ERRNO
  This function will return an `-errno` error code if it fails. e.g. return
  -EINVAL if the input is wrong. This will include the functions which will
  return an address which encodes `-errno` by ERR_PTR() macro.

EI_ETYPE_ERRNO_NULL
  This function will return an `-errno` or `NULL` if it fails. If the caller
  of this function checks the return value with IS_ERR_OR_NULL() macro, this
  type will be appropriate.

EI_ETYPE_TRUE
  This function will return `true` (non-zero positive value) if it fails.

If you specify a wrong type, for example, EI_ETYPE_ERRNO for a function
which returns an allocated object, it may cause a problem because the returned
value is not an object address and the caller cannot access the address.
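
When picking a type for a new function, it can help to look at how functions
that are already injectable are typed. The sketch below filters the
fail_function ``injectable`` file by error type. It assumes that each line of
that file holds a function name followed by whitespace and the type string
(NULL, ERRNO or ERR_NULL) as the last field; that layout is an assumption and
should be checked against the running kernel. The helper name
``injectable_by_type`` is hypothetical.

```shell
#!/bin/bash

# Hypothetical helper, not part of the kernel tree.
# Prints the names of injectable functions whose error type matches $1.
# Assumed line layout: "<function-name> ... <TYPE>", type in the last field.
injectable_by_type()
{
    local type=$1
    local list=${2:-/sys/kernel/debug/fail_function/injectable}

    awk -v type="$type" '$NF == type { print $1 }' "$list"
}
```

For example, ``injectable_by_type ERRNO`` would list the functions that accept
a negative errno as their injected return value.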


How to add new fault injection capability
-----------------------------------------

- #include <linux/fault-inject.h>

- define the fault attributes

  DECLARE_FAULT_ATTR(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

- provide a way to configure fault attributes

- boot option

  If you need to enable the fault injection capability from boot time, you can
  provide boot option to configure it. There is a helper function for it:

        setup_fault_attr(attr, str);

- debugfs entries

  failslab, fail_page_alloc, fail_usercopy, and fail_make_request use this way.
  Helper functions:

        fault_create_debugfs_attr(name, parent, attr);

- module parameters

  If the scope of the fault injection capability is limited to a
  single kernel module, it is better to provide module parameters to
  configure the fault attributes.

- add a hook to insert failures

  Upon should_fail() returning true, client code should inject a failure:

        should_fail(attr, size);

Application Examples
--------------------

- Inject slab allocation failures into module init/exit code::

    #!/bin/bash

    FAILTYPE=failslab
    echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

    faulty_system()
    {
        bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
    }

    if [ $# -eq 0 ]
    then
        echo "Usage: $0 modulename [ modulename ... ]"
        exit 1
    fi

    for m in $*
    do
        echo inserting $m...
        faulty_system modprobe $m

        echo removing $m...
        faulty_system modprobe -r $m
    done

------------------------------------------------------------------------------

- Inject page allocation failures only for a specific module::

    #!/bin/bash

    FAILTYPE=fail_page_alloc
    module=$1

    if [ -z $module ]
    then
        echo "Usage: $0 <modulename>"
        exit 1
    fi

    modprobe $module

    if [ ! -d /sys/module/$module/sections ]
    then
        echo Module $module is not loaded
        exit 1
    fi

    cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
    cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end

    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
    echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

    trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

    echo "Injecting errors into the module $module... (interrupt to stop)"
    sleep 1000000

------------------------------------------------------------------------------

- Inject open_ctree error while btrfs mount::

    #!/bin/bash

    rm -f testfile.img
    dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
    DEVICE=$(losetup --show -f testfile.img)
    mkfs.btrfs -f $DEVICE
    mkdir -p tmpmnt

    FAILTYPE=fail_function
    FAILFUNC=open_ctree
    echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
    printf %#x -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 100 > /sys/kernel/debug/$FAILTYPE/probability
    echo 0 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 1 > /sys/kernel/debug/$FAILTYPE/verbose

    mount -t btrfs $DEVICE tmpmnt
    if [ $? -ne 0 ]
    then
        echo "SUCCESS!"
    else
        echo "FAILED!"
        umount tmpmnt
    fi

    echo > /sys/kernel/debug/$FAILTYPE/inject

    rmdir tmpmnt
    losetup -d $DEVICE
    rm testfile.img


Tool to run command with failslab or fail_page_alloc
----------------------------------------------------

In order to make it easier to accomplish the tasks mentioned above, we can use
tools/testing/fault-injection/failcmd.sh. Please run a command
"./tools/testing/fault-injection/failcmd.sh --help" for more information and
see the following examples.

Examples:

Run a command "make -C tools/testing/selftests/ run_tests" with injecting slab
allocation failure::

    # ./tools/testing/fault-injection/failcmd.sh \
        -- make -C tools/testing/selftests/ run_tests

Same as above except to specify 100 times failures at most instead of one time
at most by default::

    # ./tools/testing/fault-injection/failcmd.sh --times=100 \
        -- make -C tools/testing/selftests/ run_tests

Same as above except to inject page allocation failure
instead of slab
allocation failure::

    # env FAILCMD_TYPE=fail_page_alloc \
        ./tools/testing/fault-injection/failcmd.sh --times=100 \
        -- make -C tools/testing/selftests/ run_tests

Systematic faults using fail-nth
---------------------------------

The following code systematically faults 0-th, 1-st, 2-nd and so on
capabilities in the socketpair() system call::

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/socket.h>
    #include <sys/syscall.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <errno.h>

    int main()
    {
        int i, err, res, fail_nth, fds[2];
        char buf[128];

        system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
        sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
        fail_nth = open(buf, O_RDWR);
        for (i = 1;; i++) {
            sprintf(buf, "%d", i);
            write(fail_nth, buf, strlen(buf));
            res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
            err = errno;
            pread(fail_nth, buf, sizeof(buf), 0);
            if (res == 0) {
                close(fds[0]);
                close(fds[1]);
            }
            printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
                res, err);
            if (atoi(buf))
                break;
        }
        return 0;
    }

An example output::

    1-th fault Y: res=-1/23
    2-th fault Y: res=-1/23
    3-th fault Y: res=-1/12
    4-th fault Y: res=-1/12
    5-th fault Y: res=-1/23
    6-th fault Y: res=-1/23
    7-th fault Y: res=-1/23
    8-th fault Y: res=-1/12
    9-th fault Y: res=-1/12
    10-th fault Y: res=-1/12
    11-th fault Y: res=-1/12
    12-th fault Y: res=-1/12
    13-th fault Y: res=-1/12
    14-th fault Y: res=-1/12
    15-th fault Y: res=-1/12
    16-th fault N: res=0/12
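
Output like this is easier to review when the injected faults are grouped by
the errno they produced. The following is a minimal sketch, assuming the exact
line format printed by the program above ("N-th fault Y: res=R/E"); the helper
name ``summarize_fail_nth`` is hypothetical. It reads the output on standard
input, counts only the lines where a fault was injected ('Y'), and prints one
"errno count" pair per line. On Linux, errno 12 is ENOMEM and 23 is ENFILE.

```shell
#!/bin/bash

# Hypothetical helper, not part of the kernel tree.
# Reads "N-th fault Y: res=R/E" lines on stdin and prints how many
# injected faults ('Y' lines) ended with each errno value E.
summarize_fail_nth()
{
    awk '$3 == "Y:" {
             split($4, res, "/")    # res[1] = "res=R", res[2] = errno
             count[res[2]]++
         }
         END {
             for (e in count)
                 print e, count[e]
         }' | sort -n
}
```

Fed the example output above, it prints ``12 10`` and ``23 5``: ten injected
faults surfaced as ENOMEM and five as ENFILE.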