.\"
.\" Copyright (c) 2009 Isilon Inc http://www.isilon.com/
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice(s), this list of conditions and the following disclaimer as
.\"    the first lines of this file unmodified other than the possible
.\"    addition of one or more copyright notices.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice(s), this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY
.\" EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
.\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
.\" DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY
.\" DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
.\" (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
.\" SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
.\" CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
.\" DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd March 15, 2016
.Dt FAIL 9
.Os
.Sh NAME
.Nm KFAIL_POINT_CODE ,
.Nm KFAIL_POINT_CODE_FLAGS ,
.Nm KFAIL_POINT_CODE_COND ,
.Nm KFAIL_POINT_RETURN ,
.Nm KFAIL_POINT_RETURN_VOID ,
.Nm KFAIL_POINT_ERROR ,
.Nm KFAIL_POINT_GOTO ,
.Nm KFAIL_POINT_SLEEP_CALLBACKS ,
.Nm fail_point ,
.Nm DEBUG_FP
.Nd fail points
.Sh SYNOPSIS
.In sys/fail.h
.Fn KFAIL_POINT_CODE "parent" "name" "code"
.Fn KFAIL_POINT_CODE_FLAGS "parent" "name" "flags" "code"
.Fn KFAIL_POINT_CODE_COND "parent" "name" "cond" "flags" "code"
.Fn KFAIL_POINT_RETURN "parent" "name"
.Fn KFAIL_POINT_RETURN_VOID "parent" "name"
.Fn KFAIL_POINT_ERROR "parent" "name" "error_var"
.Fn KFAIL_POINT_GOTO "parent" "name" "error_var" "label"
.Fn KFAIL_POINT_SLEEP_CALLBACKS "parent" "name" "pre_func" "pre_arg" "post_func" "post_arg" "code"
.Sh DESCRIPTION
Fail points are used to add code points where errors may be injected
in a user-controlled fashion.
Fail points provide a convenient wrapper around user-provided error
injection code, providing a
.Xr sysctl 9
MIB, and a parser for that MIB that describes how the error
injection code should fire.
.Pp
The base fail point macro is
.Fn KFAIL_POINT_CODE
where
.Fa parent
is a sysctl tree (frequently
.Sy DEBUG_FP
for kernel fail points, but various subsystems may wish to provide
their own fail point trees),
.Fa name
is the name of the MIB in that tree, and
.Fa code
is the error injection code.
The
.Fa code
argument does not require braces, but it is considered good style to
use braces for any multi-line code arguments.
Inside the
.Fa code
argument, the evaluation of
.Sy RETURN_VALUE
is derived from the
.Fn return
value set in the sysctl MIB.
.Pp
Additionally,
.Fn KFAIL_POINT_CODE_FLAGS
provides a
.Fa flags
argument which controls the fail point's behaviour.
This can be used, for example, to mark the fail point's context as
non-sleepable, which causes the
.Sy sleep
action to be coerced to a busy wait.
The supported flags are:
.Bl -ohang -offset indent
.It FAIL_POINT_USE_TIMEOUT_PATH
Rather than sleeping on a
.Fn sleep
call, just fire the post-sleep function after a timeout fires.
.It FAIL_POINT_NONSLEEPABLE
Mark the fail point as being in a non-sleepable context, which coerces
.Fn sleep
calls to
.Fn delay
calls.
.El
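.Pp
As an illustration of the flags argument, the following is a minimal sketch
of a fail point placed in a context that must not sleep.
The names bar_intr, bar_intr_fault and bar_injected are hypothetical, and the
userland stand-in macro is an assumption added only so that the fragment
builds outside the kernel; it is not part of the fail point API.
The kernel branch assumes that sys/fail.h needs only sys/param.h before it.

```c
#ifdef _KERNEL
#include <sys/param.h>
#include <sys/fail.h>
#else
/* Assumption: userland stand-in so the sketch builds outside the kernel. */
#define KFAIL_POINT_CODE_FLAGS(parent, name, flags, code)	((void)0)
#endif

/* Hypothetical counter of injected faults; not part of the API. */
unsigned int bar_injected;

/*
 * Hypothetical handler that runs where sleeping is not allowed.
 * FAIL_POINT_NONSLEEPABLE asks for a "sleep" action configured through
 * sysctl to be carried out as a busy wait instead.
 */
static int
bar_intr(int status)
{
	KFAIL_POINT_CODE_FLAGS(DEBUG_FP, bar_intr_fault,
	    FAIL_POINT_NONSLEEPABLE, {
		/* Runs only when a return() term fires. */
		bar_injected++;
		status = RETURN_VALUE;
	});
	return (status);
}
```

When no term is set on the fail point, bar_intr() simply hands back the
status it was given.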
.Pp
Likewise,
.Fn KFAIL_POINT_CODE_COND
supplies a
.Fa cond
argument, which allows you to set the condition under which the fail point's
code may fire.
This is equivalent to:
.Bd -literal
	if (cond)
		KFAIL_POINT_CODE_FLAGS(...);
.Ed
See
.Sx SYSCTL VARIABLES
below.
.Pp
The remaining
.Fn KFAIL_POINT_*
macros are wrappers around common error injection paths:
.Bl -inset
.It Fn KFAIL_POINT_RETURN parent name
is the equivalent of
.Sy KFAIL_POINT_CODE(..., return RETURN_VALUE)
.It Fn KFAIL_POINT_RETURN_VOID parent name
is the equivalent of
.Sy KFAIL_POINT_CODE(..., return)
.It Fn KFAIL_POINT_ERROR parent name error_var
is the equivalent of
.Sy KFAIL_POINT_CODE(..., error_var = RETURN_VALUE)
.It Fn KFAIL_POINT_GOTO parent name error_var label
is the equivalent of
.Sy KFAIL_POINT_CODE(..., { error_var = RETURN_VALUE; goto label; })
.El
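.Pp
The wrappers can be sketched in use as follows.
This is only an illustration: struct foo_softc, foo_request and the fail
point names foo_start and foo_finish are hypothetical, and the userland
stand-in macros are an assumption added so that the fragment builds outside
the kernel.
The kernel branch assumes that sys/fail.h needs only sys/param.h before it.

```c
#include <errno.h>

#ifdef _KERNEL
#include <sys/param.h>
#include <sys/fail.h>
#else
/* Assumption: userland stand-ins so the sketch builds outside the kernel. */
#define KFAIL_POINT_ERROR(parent, name, error_var)		((void)0)
#define KFAIL_POINT_GOTO(parent, name, error_var, label)	((void)0)
#endif

/* Hypothetical driver state; not part of the fail point API. */
struct foo_softc {
	int	sc_busy;	/* non-zero while a request is in flight */
};

/*
 * Hypothetical request path with two fail points, which would appear as
 * debug.fail_point.foo_start and debug.fail_point.foo_finish.
 */
static int
foo_request(struct foo_softc *sc, int len)
{
	int error = 0;

	/* On return(N), error becomes N and execution continues here. */
	KFAIL_POINT_ERROR(DEBUG_FP, foo_start, error);
	if (error != 0)
		return (error);

	if (sc->sc_busy)
		return (EBUSY);
	sc->sc_busy = 1;

	if (len == 0) {
		error = EINVAL;
		goto out;
	}

	/* On return(N), error becomes N and control jumps to the cleanup. */
	KFAIL_POINT_GOTO(DEBUG_FP, foo_finish, error, out);

	/* ... normal work would happen here ... */
out:
	sc->sc_busy = 0;
	return (error);
}
```

With such code in a kernel, setting debug.fail_point.foo_start to
"return(5)" would be expected to make foo_request() return 5 before it
touches the softc, while the same setting on foo_finish would exercise the
cleanup path.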
.Sh SYSCTL VARIABLES
The
.Fn KFAIL_POINT_*
macros add sysctl MIBs where specified.
Many base kernel MIBs can be found in the
.Sy debug.fail_point
tree (referenced in code by
.Sy DEBUG_FP ) .
.Pp
The sysctl variable is set to a string with the following syntax:
.Bd -literal
  [<pct>%][<cnt>*]<type>[(args...)][-><more terms>]
.Ed
.Pp
156.Pp
157The <type> argument specifies which action to take; it can be one of:
158.Bl -tag -width ".Dv return"
159.It Sy off
160Take no action (does not trigger fail point code)
161.It Sy return
162Trigger fail point code with specified argument
163.It Sy sleep
164Sleep the specified number of milliseconds
165.It Sy panic
166Panic
167.It Sy break
168Break into the debugger, or trap if there is no debugger support
169.It Sy print
170Print that the fail point executed
171.It Sy pause
172Threads sleep at the fail point until the fail point is set to
173.Sy off
174.It Sy yield
175Thread yields the cpu when the fail point is evaluated
176.It Sy delay
177Similar to sleep, but busy waits the cpu.
178(Useful in non-sleepable contexts.)
179.El
.Pp
The <pct>% and <cnt>* modifiers prior to <type> control when
<type> is executed.
The <pct>% form (e.g. "1.2%") can be used to specify a
probability that <type> will execute.
This is a decimal in the range (0, 100] which can specify up to
1/10,000% precision.
The <cnt>* form (e.g. "5*") can be used to specify the number of
times <type> should be executed before this <term> is disabled.
Only the last probability and the last count are used if multiple
are specified, i.e. "1.2%2%" is the same as "2%".
When both a probability and a count are specified, the probability
is evaluated before the count, i.e. "2%5*" means "2% of the time,
but only 5 times total".
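.Pp
The arithmetic behind these modifiers can be modelled in a few lines of
userland C.
This is a hedged sketch of the rules stated above, not the kernel parser:
the names UNITS_PER_PCT, PROB_MAX, struct term_model, parse_pct and
term_fires are all hypothetical, and rejecting more than four decimal places
is a choice made for this model.

```c
#include <ctype.h>

#define UNITS_PER_PCT	10000L			/* 1/10,000% precision */
#define PROB_MAX	(100L * UNITS_PER_PCT)	/* 100% */

/* Hypothetical model of one term's modifiers. */
struct term_model {
	long	prob;	/* 1..PROB_MAX, in units of 1/10,000% */
	long	count;	/* executions left, or -1 for no limit */
};

/*
 * Convert "<digits>[.<digits>]%" to units of 1/10,000%.
 * Returns -1 if the text is malformed or outside (0, 100].
 */
static long
parse_pct(const char *s)
{
	long whole = 0, frac = 0, scale = UNITS_PER_PCT, prob;

	if (!isdigit((unsigned char)*s))
		return (-1);
	while (isdigit((unsigned char)*s)) {
		whole = whole * 10 + (*s++ - '0');
		if (whole > 100)
			return (-1);
	}
	if (*s == '.') {
		s++;
		while (isdigit((unsigned char)*s)) {
			if (scale == 1)
				return (-1);	/* finer than 1/10,000% */
			scale /= 10;
			frac += (*s++ - '0') * scale;
		}
	}
	if (s[0] != '%' || s[1] != 0)
		return (-1);
	prob = whole * UNITS_PER_PCT + frac;
	if (prob <= 0 || prob > PROB_MAX)
		return (-1);
	return (prob);
}

/*
 * One evaluation of the term; roll is a random draw in [0, PROB_MAX).
 * The probability is checked first, so a failed roll does not consume
 * any of the count.
 */
static int
term_fires(struct term_model *t, long roll)
{
	if (roll >= t->prob)
		return (0);
	if (t->count == 0)
		return (0);	/* count used up: the term is disabled */
	if (t->count > 0)
		t->count--;
	return (1);
}
```

In this model "1.2%" becomes 12000 units out of 1000000, and a term built
from "2%5*" fires on at most five of the evaluations whose roll falls below
20000.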
.Pp
The operator -> can be used to express cascading terms.
If you specify <term1>-><term2>, it means that if <term1> does not
.Ql execute ,
<term2> is evaluated.
For the purpose of this operator, the return() and print() operators
are the only types that cascade.
A return() term only cascades if the code executes, and a print()
term only cascades when passed a non-zero argument.
.Pp
A pid can optionally be specified by appending "[pid <pid>]" to a term.
The fail point term is then only executed when invoked by a process with a
matching p_pid.
.Sh EXAMPLES
.Bl -tag -width Sy
.It Sy sysctl debug.fail_point.foobar="2.1%return(5)"
21/1000ths of the time, execute
.Fa code
with RETURN_VALUE set to 5.
.It Sy sysctl debug.fail_point.foobar="2%return(5)->5%return(22)"
2/100ths of the time, execute
.Fa code
with RETURN_VALUE set to 5.
If that does not happen, 5% of the time execute
.Fa code
with RETURN_VALUE set to 22.
.It Sy sysctl debug.fail_point.foobar="5*return(5)->0.1%return(22)"
For 5 times, return 5.
After that, 1/1000th of the time, return 22.
.It Sy sysctl debug.fail_point.foobar="0.1%5*return(5)"
Return 5 for 1 in 1000 executions, but only 5 times total.
.It Sy sysctl debug.fail_point.foobar="1%sleep(50)"
1/100th of the time, sleep 50ms.
.It Sy sysctl debug.fail_point.foobar="1*return(5)[pid 1234]"
Return 5 once, when pid 1234 executes the fail point.
.El
.Sh AUTHORS
.An -nosplit
This manual page was written by
.Pp
.An Matthew Bryan Aq Mt matthew.bryan@isilon.com
and
.Pp
.An Zach Loafman Aq Mt zml@FreeBSD.org .
.Sh CAVEATS
It is easy to shoot yourself in the foot by setting fail points too
aggressively or setting too many in combination.
For example, forcing
.Fn malloc
to fail consistently is potentially harmful to uptime.
.Pp
The
.Fn sleep
sysctl setting may not be appropriate in all situations.
Currently,
.Fn fail_point_eval
does not verify whether the context is appropriate for calling
.Fn msleep .
You can force it to evaluate a
.Sy sleep
action as a
.Sy delay
action by specifying the
.Sy FAIL_POINT_NONSLEEPABLE
flag at the point the fail point is declared.
258