.\" Copyright (c) 2013  iXsystems.com,
.\"                     author: Alfred Perlstein <alfred@freebsd.org>
.\" Copyright (c) 2004  Poul-Henning Kamp <phk@FreeBSD.org>
.\" Copyright (c) 2003  Sean M. Kelly <smkelly@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd July 27, 2013
.Dt WATCHDOGD 8
.Os
.Sh NAME
.Nm watchdogd
.Nd watchdog daemon
.Sh SYNOPSIS
.Nm
.Op Fl dnSw
.Op Fl -debug
.Op Fl -softtimeout
.Op Fl -softtimeout-action Ar action
.Op Fl -pretimeout Ar timeout
.Op Fl -pretimeout-action Ar action
.Op Fl e Ar cmd
.Op Fl I Ar file
.Op Fl s Ar sleep
.Op Fl t Ar timeout
.Op Fl T Ar script_timeout
.Sh DESCRIPTION
The
.Nm
utility interfaces with the kernel's watchdog facility to ensure
that the system is in a working state.
If
.Nm
is unable to interface with the kernel within a specific timeout,
the kernel will take actions to assist in debugging or restarting the computer.
.Pp
If
.Fl e Ar cmd
is specified,
.Nm
will attempt to execute this command with
.Xr system 3 ,
and only if the command returns with a zero exit code will the
watchdog be reset.
If
.Fl e Ar cmd
is not specified, the daemon will perform a trivial file system
check instead.
.Pp
The
.Fl n
.Pq dry-run
option causes
.Nm
not to arm the system watchdog and
instead only run the watchdog function and report on failures.
This is useful for developing new
.Nm
scripts, as the system will not
reboot if there are problems with the script.
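.Pp
As a minimal sketch, a dry run of a check command kept in the foreground
might look like the following; the script path is a placeholder, not a file
shipped with the system:
.Bd -literal -offset indent
watchdogd -n -d -e '/path/to/your/script'
.Ed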
.Pp
The
.Fl s Ar sleep
argument can be used to control the sleep period between each execution
of the check and defaults to one second.
.Pp
The
.Fl t Ar timeout
argument specifies the desired timeout period in seconds.
The default timeout is 16 seconds.
.Pp
One possible circumstance which will cause a watchdog timeout is an interrupt
storm.
If this occurs,
.Nm
will no longer execute and thus the kernel's watchdog routines will take
action after a configurable timeout.
.Pp
The
.Fl T Ar script_timeout
argument specifies the threshold (in seconds) at which
.Nm
will complain that its script has run for too long.
If unset,
.Ar script_timeout
defaults to the value specified by the
.Fl s Ar sleep
option.
.Pp
Upon receiving the
.Dv SIGTERM
or
.Dv SIGINT
signals,
.Nm
will first instruct the kernel to no longer perform watchdog checks and then
will terminate.
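.Pp
For illustration, assuming the daemon wrote its process ID to the file
listed under
.Sx FILES ,
it could be stopped cleanly from a shell along these lines:
.Bd -literal -offset indent
kill -TERM "$(cat /var/run/watchdogd.pid)"
.Ed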
.Pp
The
.Nm
utility recognizes the following runtime options:
.Bl -tag -width 30m
.It Fl I Ar file
Write the process ID of the
.Nm
utility in the specified file.
.It Fl d , Fl -debug
Do not fork.
When this option is specified,
.Nm
will not fork into the background at startup.
.Pp
.It Fl S
Do not send a message to the system logger when the watchdog command takes
longer than expected to execute.
The default behaviour is to log a warning via the system logger with the
.Dv LOG_DAEMON
facility, and to output a warning to standard error.
.Pp
.It Fl w
Complain when the watchdog script takes too long.
This flag will cause
.Nm
to complain when the amount of time to
execute the watchdog script exceeds the threshold set by the
.Fl T Ar script_timeout
option, which defaults to the value of the
.Fl s Ar sleep
option.
.Pp
.It Fl -pretimeout Ar timeout
Set a
.Dq pretimeout
watchdog.
At
.Ar timeout
seconds before the watchdog fires, attempt an action.
The action is set by the
.Fl -pretimeout-action
flag.
The default is just to log a message
.Pq Dv WD_SOFT_LOG
via
.Xr log 9 .
.Pp
.It Fl -pretimeout-action Ar action
Set the timeout action for the pretimeout.
See the section
.Sx Timeout Actions .
.Pp
.It Fl -softtimeout
Instead of arming the various hardware watchdogs, only use a basic software
watchdog.
The default action is just to
.Xr log 9
a message
.Pq Dv WD_SOFT_LOG .
.Pp
.It Fl -softtimeout-action Ar action
Set the timeout action for the softtimeout.
See the section
.Sx Timeout Actions .
.Pp
.El
.Sh Timeout Actions
The following timeout actions are available via the
.Fl -pretimeout-action
and
.Fl -softtimeout-action
flags:
.Bl -tag -width ".Ar printf  "
.It Ar panic
Call
.Xr panic 9
when the timeout is reached.
.Pp
.It Ar ddb
Enter the kernel debugger via
.Xr kdb_enter 9
when the timeout is reached.
.Pp
.It Ar log
Log a message using
.Xr log 9
when the timeout is reached.
.Pp
.It Ar printf
Call the kernel
.Xr printf 9
to display a message on the console and in the
.Xr dmesg 8
buffer.
.Pp
.El
Actions can be combined in a comma-separated list, for example
.Ar log,printf ,
which calls both
.Xr printf 9
and
.Xr log 9 ,
sending messages both to
.Xr dmesg 8
and to the kernel
.Xr log 4
device for
.Xr syslogd 8 .
.Sh FILES
.Bl -tag -width ".Pa /var/run/watchdogd.pid" -compact
.It Pa /var/run/watchdogd.pid
.El
.Sh EXAMPLES
.Ss Debugging watchdogd and/or your watchdog script
This is a useful recipe for debugging
.Nm
and your watchdog script.
.Pp
(Note that ^C works oddly because
.Nm
calls
.Xr system 3 ,
so the first ^C will terminate the
.Dq sleep
command.)
.Pp
Explanation of options used:
.Bl -enum -offset indent -compact
.It
Enable debug mode (--debug).
.It
Set the watchdog to trip at 30 seconds (-t 30).
.It
Use of a softtimeout:
.Bl -enum -offset indent -compact -nested
.It
Use a softtimeout (do not arm the hardware watchdog).
(--softtimeout)
.It
Set the softtimeout action to do both kernel
.Xr printf 9
and
.Xr log 9
when it trips.
(--softtimeout-action log,printf)
.El
.It
Use of a pre-timeout:
.Bl -enum -offset indent -compact -nested
.It
Set a pre-timeout of 15 seconds (this will later trigger a panic/dump).
(--pretimeout 15)
.It
Set the pre-timeout action to also do both kernel
.Xr printf 9
and
.Xr log 9
when it trips.
(--pretimeout-action log,printf)
.El
.It
Use of a script:
.Bl -enum -offset indent -compact -nested
.It
Run "sleep 60" as a shell command that acts as the watchdog (-e 'sleep 60').
.It
Warn us when the script takes longer than 1 second to run (-w).
.El
.El
.Bd -literal
watchdogd --debug -t 30 \\
  --softtimeout --softtimeout-action log,printf \\
  --pretimeout 15 --pretimeout-action log,printf \\
  -e 'sleep 60' -w
.Ed
.Ss Example of production use
.Bl -enum -offset indent -compact
.It
Set the hard timeout to 120 seconds (-t 120).
.It
Set a panic to happen at 60 seconds (to trigger a
.Xr crash 8
for dump analysis):
.Bl -enum -offset indent -compact -nested
.It
Use of pre-timeout (--pretimeout 60)
.It
Specify pre-timeout action (--pretimeout-action log,printf,panic)
.El
.It
Use of a script:
.Bl -enum -offset indent -compact -nested
.It
Run your script (-e '/path/to/your/script 60')
.It
Log if your script takes longer than 15 seconds to run (-w -T 15).
.El
.El
.Bd -literal
watchdogd -t 120 \\
  --pretimeout 60 --pretimeout-action log,printf,panic \\
  -e '/path/to/your/script 60' -w -T 15
.Ed
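.Ss Adjusting the check interval
The following sketch combines the
.Fl s ,
.Fl t ,
and
.Fl T
options from the DESCRIPTION; the values and the script path are
illustrative assumptions, not recommendations.
It sleeps 5 seconds between checks, requests a 60 second watchdog timeout,
and complains if the script runs for more than 10 seconds:
.Bd -literal
watchdogd -t 60 -s 5 \\
  -e '/path/to/your/script' -w -T 10
.Ed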
.Sh SEE ALSO
.Xr watchdog 4 ,
.Xr watchdog 8 ,
.Xr watchdog 9
.Sh HISTORY
The
.Nm
utility appeared in
.Fx 5.1 .
.Sh AUTHORS
.An -nosplit
The
.Nm
utility and manual page were written by
.An Sean Kelly Aq smkelly@FreeBSD.org
and
.An Poul-Henning Kamp Aq phk@FreeBSD.org .
.Pp
Some contributions were made by
.An Jeff Roberson Aq jeff@FreeBSD.org .
.Pp
The pretimeout and softtimeout action system was added by
.An Alfred Perlstein Aq alfred@freebsd.org .
324