.\" Copyright (c) 2013  iXsystems.com,
.\"                     author: Alfred Perlstein <alfred@freebsd.org>
.\" Copyright (c) 2004  Poul-Henning Kamp <phk@FreeBSD.org>
.\" Copyright (c) 2003  Sean M. Kelly <smkelly@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd November 16, 2014
.Dt WATCHDOGD 8
.Os
.Sh NAME
.Nm watchdogd
.Nd watchdog daemon
.Sh SYNOPSIS
.Nm
.Op Fl dnSw
.Op Fl -debug
.Op Fl -softtimeout
.Op Fl -softtimeout-action Ar action
.Op Fl -pretimeout Ar timeout
.Op Fl -pretimeout-action Ar action
.Op Fl e Ar cmd
.Op Fl I Ar file
.Op Fl s Ar sleep
.Op Fl t Ar timeout
.Op Fl T Ar script_timeout
.Sh DESCRIPTION
The
.Nm
utility interfaces with the kernel's watchdog facility to ensure
that the system is in a working state.
If
.Nm
is unable to interface with the kernel within the configured timeout,
the kernel will take actions to assist in debugging or restarting the computer.
.Pp
If
.Fl e Ar cmd
is specified,
.Nm
will attempt to execute this command with
.Xr system 3 ,
and the watchdog will be reset only if the command returns a zero
exit code.
If
.Fl e Ar cmd
is not specified, the daemon will perform a trivial file system
check instead.
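.Pp
As a minimal sketch of such a command, the following invocation resets the
watchdog only while a process named
.Dq sshd
exists.
The choice of
.Xr pgrep 1
and of
.Dq sshd
is an illustrative assumption, not a recommended check; any command whose
exit status reflects system health could be substituted.
.Bd -literal -offset indent
watchdogd -e 'pgrep sshd > /dev/null'
.Ed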
.Pp
The
.Fl n
.Pq dry-run
argument causes
.Nm
not to arm the system watchdog and
instead only run the watchdog function and report on failures.
This is useful for developing new
.Nm
scripts, as the system will not
reboot if there are problems with the script.
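.Pp
For instance, a script under development might be exercised as follows,
reusing the placeholder path from the
.Sx EXAMPLES
section;
.Fl d
is added here on the assumption that keeping
.Nm
in the foreground makes failure reports easier to watch.
.Bd -literal -offset indent
watchdogd -n -d -e '/path/to/your/script'
.Ed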
.Pp
The
.Fl s Ar sleep
argument can be used to control the sleep period between each execution
of the check and defaults to 10 seconds.
.Pp
The
.Fl t Ar timeout
argument specifies the desired timeout period in seconds.
The default timeout is 128 seconds.
.Pp
One possible circumstance which will cause a watchdog timeout is an interrupt
storm.
If this occurs,
.Nm
will no longer execute and thus the kernel's watchdog routines will take
action after a configurable timeout.
.Pp
The
.Fl T Ar script_timeout
argument specifies the threshold (in seconds) at which
.Nm
will complain that its script has run for too long.
If unset,
.Ar script_timeout
defaults to the value specified by the
.Fl s Ar sleep
option.
.Pp
Upon receiving the
.Dv SIGTERM
or
.Dv SIGINT
signal,
.Nm
will first instruct the kernel to no longer perform watchdog checks and then
terminate.
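.Pp
A running daemon can therefore be stopped cleanly from the shell.
The sketch below assumes that the process ID was written to
.Pa /var/run/watchdogd.pid ,
the file listed under
.Sx FILES ;
substitute the path given to
.Fl I
if one was used.
.Bd -literal -offset indent
kill -TERM "$(cat /var/run/watchdogd.pid)"
.Ed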
.Pp
The
.Nm
utility recognizes the following runtime options:
.Bl -tag -width 30m
.It Fl I Ar file
Write the process ID of the
.Nm
utility in the specified file.
.It Fl d Fl -debug
Do not fork.
When this option is specified,
.Nm
will not fork into the background at startup.
.It Fl S
Do not send a message to the system logger when the watchdog command takes
longer than expected to execute.
The default behaviour is to log a warning via the system logger with the
LOG_DAEMON facility, and to output a warning to standard error.
.It Fl w
Complain when the watchdog script takes too long.
This flag will cause
.Nm
to complain when the time taken to execute the watchdog script exceeds the
threshold set by the
.Fl T Ar script_timeout
option, or the
.Fl s Ar sleep
period if
.Fl T
is not given.
.It Fl -pretimeout Ar timeout
Set a
.Dq pretimeout
watchdog.
At
.Ar timeout
seconds before the watchdog fires, attempt an action.
The action is set by the
.Fl -pretimeout-action
flag.
The default is just to log a message (WD_SOFT_LOG) via
.Xr log 9 .
.It Fl -pretimeout-action Ar action
Set the timeout action for the pretimeout.
See the section
.Sx Timeout Actions .
.It Fl -softtimeout
Instead of arming the various hardware watchdogs, only use a basic software
watchdog.
The default action is just to
.Xr log 9
a message (WD_SOFT_LOG).
.It Fl -softtimeout-action Ar action
Set the timeout action for the softtimeout.
See the section
.Sx Timeout Actions .
.El
.Sh Timeout Actions
The following timeout actions are available via the
.Fl -pretimeout-action
and
.Fl -softtimeout-action
flags:
.Bl -tag -width ".Ar printf  "
.It Ar panic
Call
.Xr panic 9
when the timeout is reached.
.It Ar ddb
Enter the kernel debugger via
.Xr kdb_enter 9
when the timeout is reached.
.It Ar log
Log a message using
.Xr log 9
when the timeout is reached.
.It Ar printf
Call the kernel
.Xr printf 9
to display a message to the console and
.Xr dmesg 8
buffer.
.El
.Pp
Actions can be combined in a comma-separated list, for example
.Ar log,printf ,
which calls both
.Xr printf 9
and
.Xr log 9 ,
sending messages to both the
.Xr dmesg 8
buffer and the kernel
.Xr log 4
device for
.Xr syslogd 8 .
.Sh FILES
.Bl -tag -width ".Pa /var/run/watchdogd.pid" -compact
.It Pa /var/run/watchdogd.pid
.El
.Sh EXAMPLES
.Ss Debugging watchdogd and your watchdog script
This is a useful recipe for debugging
.Nm
and your watchdog script.
.Pp
(Note that ^C works oddly because
.Nm
calls
.Xr system 3 ,
so the
first ^C will terminate the "sleep" command.)
.Pp
Explanation of options used:
.Bl -enum -offset indent -compact
.It
Turn debugging on (--debug).
.It
Set the watchdog to trip at 30 seconds (-t 30).
.It
Use of a softtimeout:
.Bl -enum -offset indent -compact -nested
.It
Use a softtimeout (do not arm the hardware watchdog).
(--softtimeout)
.It
Set the softtimeout action to do both kernel
.Xr printf 9
and
.Xr log 9
when it trips.
(--softtimeout-action log,printf)
.El
.It
Use of a pre-timeout:
.Bl -enum -offset indent -compact -nested
.It
Set a pre-timeout of 15 seconds (this will later trigger a panic/dump).
(--pretimeout 15)
.It
Set the pre-timeout action to also do kernel
.Xr printf 9
and
.Xr log 9
when it trips.
(--pretimeout-action log,printf)
.El
.It
Use of a script:
.Bl -enum -offset indent -compact -nested
.It
Run "sleep 60" as a shell command that acts as the watchdog (-e 'sleep 60').
.It
Warn us when the script takes longer than the sleep period, 10 seconds by
default, to run (-w).
.El
.El
.Bd -literal
watchdogd --debug -t 30 \\
  --softtimeout --softtimeout-action log,printf \\
  --pretimeout 15 --pretimeout-action log,printf \\
  -e 'sleep 60' -w
.Ed
.Ss Example of production use
.Bl -enum -offset indent -compact
.It
Set the hard timeout to 120 seconds (-t 120).
.It
Set a panic to happen at 60 seconds (to trigger a
.Xr crash 8
for dump analysis):
.Bl -enum -offset indent -compact -nested
.It
Use of pre-timeout (--pretimeout 60)
.It
Specify pre-timeout action (--pretimeout-action log,printf,panic)
.El
.It
Use of a script:
.Bl -enum -offset indent -compact -nested
.It
Run your script (-e '/path/to/your/script 60')
.It
Log if your script takes longer than 15 seconds to run (-w -T 15).
.El
.El
.Bd -literal
watchdogd -t 120 \\
  --pretimeout 60 --pretimeout-action log,printf,panic \\
  -e '/path/to/your/script 60' -w -T 15
.Ed
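.Pp
To have the daemon started at boot with similar options, a fragment along
these lines could be placed in
.Xr rc.conf 5 .
This is a sketch that assumes the stock
.Va watchdogd_enable
and
.Va watchdogd_flags
variables; the
.Fl e
script is left out here because quoting inside the flags string needs care.
.Bd -literal
watchdogd_enable="YES"
watchdogd_flags="-t 120 --pretimeout 60 --pretimeout-action log,printf,panic"
.Ed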
.Sh SEE ALSO
.Xr watchdog 4 ,
.Xr watchdog 8 ,
.Xr watchdog 9
.Sh HISTORY
The
.Nm
utility appeared in
.Fx 5.1 .
.Sh AUTHORS
.An -nosplit
The
.Nm
utility and manual page were written by
.An Sean Kelly Aq Mt smkelly@FreeBSD.org
and
.An Poul-Henning Kamp Aq Mt phk@FreeBSD.org .
.Pp
Some contributions made by
.An Jeff Roberson Aq Mt jeff@FreeBSD.org .
.Pp
The pretimeout and softtimeout action system was added by
.An Alfred Perlstein Aq Mt alfred@freebsd.org .