.\" Copyright (c) 2013 iXsystems.com,
.\" author: Alfred Perlstein <alfred@freebsd.org>
.\" Copyright (c) 2004 Poul-Henning Kamp <phk@FreeBSD.org>
.\" Copyright (c) 2003 Sean M. Kelly <smkelly@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd May 11, 2015
.Dt WATCHDOGD 8
.Os
.Sh NAME
.Nm watchdogd
.Nd watchdog daemon
.Sh SYNOPSIS
.Nm
.Op Fl dnSw
.Op Fl -debug
.Op Fl -softtimeout
.Op Fl -softtimeout-action Ar action
.Op Fl -pretimeout Ar timeout
.Op Fl -pretimeout-action Ar action
.Op Fl e Ar cmd
.Op Fl I Ar file
.Op Fl s Ar sleep
.Op Fl t Ar timeout
.Op Fl T Ar script_timeout
.Op Fl x Ar exit_timeout
.Sh DESCRIPTION
The
.Nm
utility interfaces with the kernel's watchdog facility to ensure
that the system is in a working state.
If
.Nm
is unable to interface with the kernel within a specific timeout,
the kernel will take actions to assist in debugging or restarting the computer.
.Pp
If
.Fl e Ar cmd
is specified,
.Nm
will attempt to execute this command with
.Xr system 3 ,
and only if the command returns with a zero exit code will the
watchdog be reset.
If
.Fl e Ar cmd
is not specified, the daemon will perform a trivial file system
check instead.
.Pp
The
.Fl n
argument
.Pq dry run
causes
.Nm
not to arm the system watchdog and
instead only run the watchdog function and report on failures.
This is useful for developing new
.Nm
scripts, as the system will not
reboot if there are problems with the script.
.Pp
The
.Fl s Ar sleep
argument can be used to control the sleep period between each execution
of the check and defaults to 10 seconds.
.Pp
The
.Fl t Ar timeout
argument specifies the desired timeout period in seconds.
The default timeout is 128 seconds.
.Pp
One possible circumstance which will cause a watchdog timeout is an interrupt
storm.
If this occurs,
.Nm
will no longer execute and thus the kernel's watchdog routines will take
action after a configurable timeout.
.Pp
The
.Fl T Ar script_timeout
argument specifies the threshold (in seconds) at which
.Nm
will complain
that its script has run for too long.
If unset,
.Ar script_timeout
defaults to the value specified by the
.Fl s Ar sleep
option.
.Pp
The
.Fl x Ar exit_timeout
argument is the timeout period (in seconds) to leave in effect when the
program exits.
Using
.Fl x
with a non-zero value protects against lockup during a reboot by
triggering a hardware reset if the software reboot does not complete
before the given timeout expires.
.Pp
Upon receiving the
.Dv SIGTERM
or
.Dv SIGINT
signals,
.Nm
will terminate, after first instructing the kernel to either disable the
timeout or reset it to the value given by
.Fl x Ar exit_timeout .
.Pp
The
.Nm
utility recognizes the following runtime options:
.Bl -tag -width 30m
.It Fl I Ar file
Write the process ID of the
.Nm
utility in the specified file.
.It Fl d Fl -debug
Do not fork.
When this option is specified,
.Nm
will not fork into the background at startup.
.It Fl S
Do not send a message to the system logger when the watchdog command takes
longer than expected to execute.
The default behaviour is to log a warning via the system logger with the
LOG_DAEMON facility, and to output a warning to standard error.
.It Fl w
Complain when the watchdog script takes too long.
This flag causes
.Nm
to complain when the time taken to
execute the watchdog script exceeds the threshold set by
.Fl T Ar script_timeout ,
which defaults to the value of the
.Fl s Ar sleep
option.
.It Fl -pretimeout Ar timeout
Set a
.Dq pretimeout
watchdog.
At
.Ar timeout
seconds before the watchdog fires, attempt an action.
The action is set by the
.Fl -pretimeout-action
flag.
The default is just to log a message (WD_SOFT_LOG) via
.Xr log 9 .
.It Fl -pretimeout-action Ar action
Set the timeout action for the pretimeout.
See the section
.Sx Timeout Actions .
.It Fl -softtimeout
Instead of arming the various hardware watchdogs, only use a basic software
watchdog.
The default action is just to
.Xr log 9
a message (WD_SOFT_LOG).
.It Fl -softtimeout-action Ar action
Set the timeout action for the softtimeout.
See the section
.Sx Timeout Actions .
.El
.Sh Timeout Actions
The following timeout actions are available via the
.Fl -pretimeout-action
and
.Fl -softtimeout-action
flags:
.Bl -tag -width ".Ar printf "
.It Ar panic
Call
.Xr panic 9
when the timeout is reached.
.It Ar ddb
Enter the kernel debugger via
.Xr kdb_enter 9
when the timeout is reached.
.It Ar log
Log a message using
.Xr log 9
when the timeout is reached.
.It Ar printf
Call the kernel
.Xr printf 9
to display a message to the console and
.Xr dmesg 8
buffer.
.El
.Pp
Actions can be combined in a comma-separated list, for example
.Ar log,printf ,
which calls both
.Xr printf 9
and
.Xr log 9 ,
sending messages to both the
.Xr dmesg 8
buffer and the kernel
.Xr log 4
device for
.Xr syslogd 8 .
.Sh FILES
.Bl -tag -width ".Pa /var/run/watchdogd.pid" -compact
.It Pa /var/run/watchdogd.pid
.El
.Sh EXAMPLES
.Ss Debugging watchdogd and/or your watchdog script
This is a useful recipe for debugging
.Nm
and your watchdog script.
.Pp
(Note that ^C works oddly because
.Nm
calls
.Xr system 3
so the
first ^C will terminate the "sleep" command.)
.Pp
Explanation of options used:
.Bl -enum -offset indent -compact
.It
Enable debugging (--debug)
.It
Set the watchdog to trip at 30 seconds (-t 30)
.It
Use of a softtimeout:
.Bl -enum -offset indent -compact -nested
.It
Use a softtimeout (do not arm the hardware watchdog).
(--softtimeout)
.It
Set the softtimeout action to do both kernel
.Xr printf 9
and
.Xr log 9
when it trips.
(--softtimeout-action log,printf)
.El
.It
Use of a pre-timeout:
.Bl -enum -offset indent -compact -nested
.It
Set a pre-timeout of 15 seconds (this will later trigger a panic/dump).
(--pretimeout 15)
.It
Set the pre-timeout action to also call kernel
.Xr printf 9
and
.Xr log 9
when it trips.
(--pretimeout-action log,printf)
.El
.It
Use of a script:
.Bl -enum -offset indent -compact -nested
.It
Run "sleep 60" as a shell command that acts as the watchdog (-e 'sleep 60')
.It
Warn us when the script takes longer than 1 second to run (-w)
.El
.El
.Bd -literal
watchdogd --debug -t 30 \\
    --softtimeout --softtimeout-action log,printf \\
    --pretimeout 15 --pretimeout-action log,printf \\
    -e 'sleep 60' -w
.Ed
.Ss Production use example
.Bl -enum -offset indent -compact
.It
Set hard timeout to 120 seconds (-t 120)
.It
Set a panic to happen at 60 seconds (to trigger a
.Xr crash 8
for dump analysis):
.Bl -enum -offset indent -compact -nested
.It
Use of pre-timeout (--pretimeout 60)
.It
Specify pre-timeout action (--pretimeout-action log,printf,panic)
.El
.It
Use of a script:
.Bl -enum -offset indent -compact -nested
.It
Run your script (-e '/path/to/your/script 60')
.It
Log if your script takes longer than 15 seconds to run (-w -T 15)
.El
.El
.Bd -literal
watchdogd -t 120 \\
    --pretimeout 60 --pretimeout-action log,printf,panic \\
    -e '/path/to/your/script 60' -w -T 15
.Ed
.Sh SEE ALSO
.Xr watchdog 4 ,
.Xr watchdog 8 ,
.Xr watchdog 9
.Sh HISTORY
The
.Nm
utility appeared in
.Fx 5.1 .
.Sh AUTHORS
.An -nosplit
The
.Nm
utility and manual page were written by
.An Sean Kelly Aq Mt smkelly@FreeBSD.org
and
.An Poul-Henning Kamp Aq Mt phk@FreeBSD.org .
.Pp
Some contributions made by
.An Jeff Roberson Aq Mt jeff@FreeBSD.org .
.Pp
The pretimeout and softtimeout action system was added by
.An Alfred Perlstein Aq Mt alfred@freebsd.org .