.\"
.\" Copyright (c) 2009-2019 Dell EMC Isilon http://www.isilon.com/
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice(s), this list of conditions and the following disclaimer as
.\"    the first lines of this file unmodified other than the possible
.\"    addition of one or more copyright notices.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice(s), this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY
.\" EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
.\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
.\" DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY
.\" DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
.\" (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
.\" SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
.\" CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
.\" DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd June 6, 2019
.Dt FAIL 9
.Os
.Sh NAME
.Nm DEBUG_FP ,
.Nm KFAIL_POINT_CODE ,
.Nm KFAIL_POINT_CODE_FLAGS ,
.Nm KFAIL_POINT_CODE_COND ,
.Nm KFAIL_POINT_ERROR ,
.Nm KFAIL_POINT_EVAL ,
.Nm KFAIL_POINT_DECLARE ,
.Nm KFAIL_POINT_DEFINE ,
.Nm KFAIL_POINT_GOTO ,
.Nm KFAIL_POINT_RETURN ,
.Nm KFAIL_POINT_RETURN_VOID ,
.Nm KFAIL_POINT_SLEEP_CALLBACKS ,
.Nm fail_point
.Nd fail points
.Sh SYNOPSIS
.In sys/fail.h
.Fn KFAIL_POINT_CODE "parent" "name" "code"
.Fn KFAIL_POINT_CODE_FLAGS "parent" "name" "flags" "code"
.Fn KFAIL_POINT_CODE_COND "parent" "name" "cond" "flags" "code"
.Fn KFAIL_POINT_ERROR "parent" "name" "error_var"
.Fn KFAIL_POINT_EVAL "name" "code"
.Fn KFAIL_POINT_DECLARE "name"
.Fn KFAIL_POINT_DEFINE "parent" "name" "flags"
.Fn KFAIL_POINT_GOTO "parent" "name" "error_var" "label"
.Fn KFAIL_POINT_RETURN "parent" "name"
.Fn KFAIL_POINT_RETURN_VOID "parent" "name"
.Fn KFAIL_POINT_SLEEP_CALLBACKS "parent" "name" "pre_func" "pre_arg" "post_func" "post_arg" "code"
.Sh DESCRIPTION
Fail points are used to add code points where errors may be injected
in a user-controlled fashion.
Fail points provide a convenient wrapper around user-provided error
injection code, providing a
.Xr sysctl 9
MIB, and a parser for that MIB that describes how the error
injection code should fire.
.Pp
The base fail point macro is
.Fn KFAIL_POINT_CODE
where
.Fa parent
is a sysctl tree (frequently
.Sy DEBUG_FP
for kernel fail points, but various subsystems may wish to provide
their own fail point trees),
.Fa name
is the name of the MIB in that tree, and
.Fa code
is the error injection code.
The
.Fa code
argument does not require braces, but it is considered good style to
use braces for any multi-line code arguments.
Inside the
.Fa code
argument, the evaluation of
.Sy RETURN_VALUE
is derived from the
.Fn return
value set in the sysctl MIB.
.Pp
Additionally,
.Fn KFAIL_POINT_CODE_FLAGS
provides a
.Fa flags
argument which controls the fail point's behaviour.
This can be used, for example, to mark the fail point's context as
non-sleepable, which causes the
.Sy sleep
action to be coerced to a busy wait.
The supported flags are:
.Bl -ohang -offset indent
.It FAIL_POINT_USE_TIMEOUT_PATH
Rather than sleeping on a
.Fn sleep
call, just fire the post-sleep function after a timeout fires.
.It FAIL_POINT_NONSLEEPABLE
Mark the fail point as being in a non-sleepable context, which coerces
.Fn sleep
calls to
.Fn delay
calls.
.El
.Pp
Likewise,
.Fn KFAIL_POINT_CODE_COND
supplies a
.Fa cond
argument, which allows you to set the condition under which the fail point's
code may fire.
This is equivalent to:
.Bd -literal
	if (cond)
		KFAIL_POINT_CODE_FLAGS(...);
.Ed
See
.Sx SYSCTL VARIABLES
below.
.Pp
The remaining
.Fn KFAIL_POINT_*
macros are wrappers around common error injection paths:
.Bl -inset
.It Fn KFAIL_POINT_RETURN parent name
is the equivalent of
.Sy KFAIL_POINT_CODE(..., return RETURN_VALUE)
.It Fn KFAIL_POINT_RETURN_VOID parent name
is the equivalent of
.Sy KFAIL_POINT_CODE(..., return)
.It Fn KFAIL_POINT_ERROR parent name error_var
is the equivalent of
.Sy KFAIL_POINT_CODE(..., error_var = RETURN_VALUE)
.It Fn KFAIL_POINT_GOTO parent name error_var label
is the equivalent of
.Sy KFAIL_POINT_CODE(..., { error_var = RETURN_VALUE; goto label;})
.El
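.Pp
As a rough illustration of the control flow described above, the following
userspace sketch replaces the kernel macros with simplified stand-ins.
The stand-in definitions and the names
.Va mock_armed ,
.Va mock_value ,
.Fn read_block ,
.Fn write_block ,
.Fn open_dev
and
.Fn demo_wrappers
are assumptions made for this example; they are not the definitions from
.Pa sys/fail.h ,
and a real fail point fires according to its sysctl MIB rather than a
global flag.
.Bd -literal
```c
#include <assert.h>

/* Userspace stand-ins (assumptions), NOT the definitions in <sys/fail.h>. */
int mock_armed;			/* nonzero: the fail point fires */
int mock_value;			/* what RETURN_VALUE evaluates to */
#define RETURN_VALUE	mock_value
#define KFAIL_POINT_CODE(p, n, code) do { if (mock_armed) { code; } } while (0)
#define KFAIL_POINT_RETURN(p, n) KFAIL_POINT_CODE(p, n, return RETURN_VALUE)
#define KFAIL_POINT_ERROR(p, n, ev) KFAIL_POINT_CODE(p, n, ev = RETURN_VALUE)
#define KFAIL_POINT_GOTO(p, n, ev, l) KFAIL_POINT_CODE(p, n, { ev = RETURN_VALUE; goto l; })

/* Returns the injected value when armed, 0 otherwise. */
int
read_block(void)
{
	KFAIL_POINT_RETURN(DEBUG_FP, read_block);
	return (0);
}

/* Stores the injected value in error, then continues normally. */
int
write_block(void)
{
	int error = 0;

	KFAIL_POINT_ERROR(DEBUG_FP, write_block, error);
	return (error);
}

/* Jumps to the cleanup label with the injected value in error. */
int
open_dev(int *cleaned)
{
	int error = 0;

	*cleaned = 0;
	KFAIL_POINT_GOTO(DEBUG_FP, open_dev, error, out);
	return (0);
out:
	*cleaned = 1;
	return (error);
}

/* Exercise each wrapper once with the stand-in armed to inject 5. */
void
demo_wrappers(void)
{
	int cleaned;

	mock_armed = 1;
	mock_value = 5;
	assert(read_block() == 5);
	assert(write_block() == 5);
	assert(open_dev(&cleaned) == 5 && cleaned == 1);
	mock_armed = 0;
	assert(read_block() == 0);
}
```
.Ed
In this sketch the
.Fa parent
and
.Fa name
arguments are ignored; in the kernel they select the sysctl MIB that
controls the fail point.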
.Pp
You can also introduce fail points by separating the declaration,
definition, and evaluation portions.
.Bl -inset
.It Fn KFAIL_POINT_DECLARE name
is used to declare the
.Sy fail_point
struct.
.It Fn KFAIL_POINT_DEFINE parent name flags
defines and initializes the
.Sy fail_point
and sets up its
.Xr sysctl 9 .
.It Fn KFAIL_POINT_EVAL name code
is used at the point that the fail point is executed.
.El
.Sh SYSCTL VARIABLES
The
.Fn KFAIL_POINT_*
macros add sysctl MIBs where specified.
Many base kernel MIBs can be found in the
.Sy debug.fail_point
tree (referenced in code by
.Sy DEBUG_FP ) .
.Pp
The sysctl variable may be set in a number of ways:
.Bd -literal
  [<pct>%][<cnt>*]<type>[(args...)][-><more terms>]
.Ed
.Pp
The <type> argument specifies which action to take; it can be one of:
.Bl -tag -width ".Dv return"
.It Sy off
Take no action (does not trigger fail point code)
.It Sy return
Trigger fail point code with specified argument
.It Sy sleep
Sleep the specified number of milliseconds
.It Sy panic
Panic
.It Sy break
Break into the debugger, or trap if there is no debugger support
.It Sy print
Print that the fail point executed
.It Sy pause
Threads sleep at the fail point until the fail point is set to
.Sy off
.It Sy yield
Thread yields the cpu when the fail point is evaluated
.It Sy delay
Similar to sleep, but busy waits the cpu.
(Useful in non-sleepable contexts.)
.El
.Pp
The <pct>% and <cnt>* modifiers prior to <type> control when
<type> is executed.
The <pct>% form (e.g. "1.2%") can be used to specify a
probability that <type> will execute.
This is a decimal in the range (0, 100] which can specify up to
1/10,000% precision.
The <cnt>* form (e.g. "5*") can be used to specify the number of
times <type> should be executed before this <term> is disabled.
Only the last probability and the last count are used if multiple
are specified, i.e. "1.2%2%" is the same as "2%".
When both a probability and a count are specified, the probability
is evaluated before the count, i.e. "2%5*" means "2% of the time,
but only 5 times total".
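.Pp
The interaction of the two modifiers can be sketched in userspace C.
This is a simplified model under stated assumptions, not the kernel's parser:
.Fn parse_mods ,
.Fn term_fires ,
.Vt struct term_mods ,
.Dv PROB_MAX ,
.Dv NO_COUNT
and
.Fn demo_modifiers
are hypothetical names, and the caller supplies the random draw so that the
behaviour is repeatable.
.Bd -literal
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical model of the modifiers; not the kernel implementation. */
#define PROB_MAX	1000000L	/* 100%, in units of 1/10,000 percent */
#define NO_COUNT	(-1L)		/* no "<cnt>*" limit was given */

struct term_mods {
	long prob;	/* chance of firing, out of PROB_MAX */
	long count;	/* remaining firings, or NO_COUNT */
};

/*
 * Parse the "<pct>%" and "<cnt>*" modifiers in front of <type>.
 * Returns a pointer to <type>, or NULL on a syntax error.
 */
const char *
parse_mods(const char *p, struct term_mods *m)
{
	m->prob = PROB_MAX;
	m->count = NO_COUNT;
	while (isdigit((unsigned char)*p) || *p == '.') {
		long units = 0, frac = 0, scale = PROB_MAX / 100;
		int has_frac = 0;

		while (isdigit((unsigned char)*p)) {
			if (units > PROB_MAX)
				return (NULL);	/* absurdly large number */
			units = units * 10 + (*p++ - '0');
		}
		if (*p == '.') {
			has_frac = 1;
			p++;
			/* Digits past the fourth add nothing (scale is 0). */
			while (isdigit((unsigned char)*p)) {
				scale /= 10;
				frac += (*p++ - '0') * scale;
			}
		}
		if (*p == '%') {
			if (units > 100)
				units = 100;
			/* A later probability replaces an earlier one. */
			m->prob = units * (PROB_MAX / 100) + frac;
			if (m->prob > PROB_MAX)
				m->prob = PROB_MAX;
		} else if (*p == '*') {
			if (units == 0 || has_frac)
				return (NULL);
			m->count = units;	/* a later count replaces it */
		} else
			return (NULL);
		p++;
	}
	return (p);
}

/*
 * Decide whether the term fires for one evaluation.
 * roll stands in for a random draw in [0, PROB_MAX).
 */
int
term_fires(struct term_mods *m, long roll)
{
	if (roll >= m->prob)
		return (0);		/* probability is checked first */
	if (m->count == NO_COUNT)
		return (1);
	if (m->count == 0)
		return (0);		/* count exhausted */
	m->count--;
	return (1);
}

/* Walk through "2%5*return(5)": 2% of the time, at most 5 times. */
void
demo_modifiers(void)
{
	struct term_mods m;
	const char *type = parse_mods("2%5*return(5)", &m);
	int i;

	assert(type != NULL && *type == 'r');
	assert(m.prob == 20000 && m.count == 5);
	/* A failed probability check does not use up the count. */
	assert(term_fires(&m, 999999) == 0 && m.count == 5);
	for (i = 0; i < 5; i++)
		assert(term_fires(&m, 0) == 1);
	assert(term_fires(&m, 0) == 0);
}
```
.Ed
Because the probability check comes first in this model, an evaluation that
loses the random draw does not consume one of the five permitted firings.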
.Pp
The operator -> can be used to express cascading terms.
If you specify <term1>-><term2>, it means that if <term1> does not
.Ql execute ,
<term2> is evaluated.
For the purpose of this operator, the return() and print() operators
are the only types that cascade.
A return() term only cascades if the code executes, and a print()
term only cascades when passed a non-zero argument.
.Pp
A pid can optionally be specified by appending [pid <pid>] to a term.
The fail point term is then only executed when invoked by a process with a
matching p_pid.
.Sh EXAMPLES
.Bl -tag -width Sy
.It Sy sysctl debug.fail_point.foobar="2.1%return(5)"
21/1000ths of the time, execute
.Fa code
with RETURN_VALUE set to 5.
.It Sy sysctl debug.fail_point.foobar="2%return(5)->5%return(22)"
2/100ths of the time, execute
.Fa code
with RETURN_VALUE set to 5.
If that does not happen, 5% of the time execute
.Fa code
with RETURN_VALUE set to 22.
.It Sy sysctl debug.fail_point.foobar="5*return(5)->0.1%return(22)"
For 5 times, return 5.
After that, 1/1000th of the time, return 22.
.It Sy sysctl debug.fail_point.foobar="0.1%5*return(5)"
Return 5 for 1 in 1000 executions, but only 5 times total.
.It Sy sysctl debug.fail_point.foobar="1%sleep(50)"
1/100th of the time, sleep 50ms.
.It Sy sysctl debug.fail_point.foobar="1*return(5)[pid 1234]"
Return 5 once, when pid 1234 executes the fail point.
.El
.Sh AUTHORS
.An -nosplit
This manual page was written by
.Pp
.An Matthew Bryan Aq Mt matthew.bryan@isilon.com
and
.Pp
.An Zach Loafman Aq Mt zml@FreeBSD.org .
.Sh CAVEATS
It is easy to shoot yourself in the foot by setting fail points too
aggressively or setting too many in combination.
For example, forcing
.Fn malloc
to fail consistently is potentially harmful to uptime.
.Pp
The
.Fn sleep
sysctl setting may not be appropriate in all situations.
Currently,
.Fn fail_point_eval
does not verify whether the context is appropriate for calling
.Fn msleep .
You can force it to evaluate a
.Sy sleep
action as a
.Sy delay
action by specifying the
.Sy FAIL_POINT_NONSLEEPABLE
flag at the point the fail point is declared.
280