.\" Copyright (c) 2010 The FreeBSD Foundation
.\" Copyright (c) 2010-2011 Pawel Jakub Dawidek <pawel@dawidek.net>
.\" All rights reserved.
.\"
.\" This software was developed by Pawel Jakub Dawidek under sponsorship from
.\" the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd May 20, 2011
.Dt HAST.CONF 5
.Os
.Sh NAME
.Nm hast.conf
.Nd configuration file for the
.Xr hastd 8
daemon and the
.Xr hastctl 8
utility.
.Sh DESCRIPTION
The
.Nm
file is used by both
.Xr hastd 8
daemon
and
.Xr hastctl 8
control utility.
The configuration file is designed so that exactly the same file can be
(and should be) used on both HAST nodes.
Every line starting with # is treated as a comment and ignored.
.Sh CONFIGURATION FILE SYNTAX
The general syntax of the
.Nm
file is as follows:
.Bd -literal -offset indent
# Global section
control <addr>
listen <addr>
replication <mode>
checksum <algorithm>
compression <algorithm>
timeout <seconds>
exec <path>
metaflush "on" | "off"

on <node> {
	# Node section
	control <addr>
	listen <addr>
}

on <node> {
	# Node section
	control <addr>
	listen <addr>
}

resource <name> {
	# Resource section
	replication <mode>
	checksum <algorithm>
	compression <algorithm>
	name <name>
	local <path>
	timeout <seconds>
	exec <path>
	metaflush "on" | "off"

	on <node> {
		# Resource-node section
		name <name>
		# Required
		local <path>
		metaflush "on" | "off"
		# Required
		remote <addr>
		source <addr>
	}
	on <node> {
		# Resource-node section
		name <name>
		# Required
		local <path>
		metaflush "on" | "off"
		# Required
		remote <addr>
		source <addr>
	}
}
.Ed
.Pp
Most of the available configuration parameters are optional.
If a parameter is not defined in a particular section, it is
inherited from the parent section.
For example, if the
.Ic listen
parameter is not defined in the node section, it is inherited from
the global section.
If the global section does not define the
.Ic listen
parameter at all, the default value is used.
.Sh CONFIGURATION FILE DESCRIPTION
The
.Aq node
argument can be replaced by a full hostname as obtained by
.Xr gethostname 3 ,
by only the first part of the hostname, or by the node's UUID as found in the
.Va kern.hostuuid
.Xr sysctl 8
variable.
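.Pp
As an illustration only, assume a node whose
.Xr gethostname 3
result is the hypothetical name
.Dq hasta.example.org .
The three alternative spellings below would then refer to the same node;
the UUID is a made-up placeholder standing for whatever
.Va kern.hostuuid
reports on that machine, and only one spelling would normally be used
in a given file:
.Bd -literal -offset indent
# Alternative 1: full hostname (hypothetical)
on hasta.example.org {
	listen tcp://10.0.0.1
}

# Alternative 2: first part of the hostname only
on hasta {
	listen tcp://10.0.0.1
}

# Alternative 3: node UUID (placeholder value)
on 01234567-89ab-cdef-0123-456789abcdef {
	listen tcp://10.0.0.1
}
.Ed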
.Pp
The following statements are available:
.Bl -tag -width ".Ic xxxx"
.It Ic control Aq addr
.Pp
Address for communication with
.Xr hastctl 8 .
Each of the following examples defines the same control address:
.Bd -literal -offset indent
uds:///var/run/hastctl
unix:///var/run/hastctl
/var/run/hastctl
.Ed
.Pp
The default value is
.Pa uds:///var/run/hastctl .
.It Ic listen Aq addr
.Pp
Address to listen on, in the form:
.Bd -literal -offset indent
protocol://protocol-specific-address
.Ed
.Pp
Each of the following examples defines the same listen address:
.Bd -literal -offset indent
0.0.0.0
0.0.0.0:8457
tcp://0.0.0.0
tcp://0.0.0.0:8457
tcp4://0.0.0.0
tcp4://0.0.0.0:8457
.Ed
.Pp
Multiple listen addresses can be specified.
By default
.Nm hastd
listens on
.Pa tcp4://0.0.0.0:8457
and
.Pa tcp6://[::]:8457
if the kernel supports IPv4 and IPv6 respectively.
.It Ic replication Aq mode
.Pp
Replication mode should be one of the following:
.Bl -tag -width ".Ic xxxx"
.It Ic memsync
.Pp
Report the write operation as completed when the local write completes and
the remote node acknowledges receipt of the data, but before it
has actually stored the data.
The data on the remote node will be stored directly after the
acknowledgement is sent.
This mode is intended to reduce latency while still providing very good
reliability.
The only situation where some small amount of data could be lost is when
the data is stored on the primary node and sent to the secondary.
The secondary node then acknowledges data receipt and the primary reports
success to the application.
However, it may happen that the secondary goes down before the received
data is really stored locally.
Before the secondary node returns, the primary node dies entirely.
When the secondary node comes back to life it becomes the new primary.
Unfortunately, some small amount of data that was confirmed to the
application as stored has been lost.
The risk of such a situation is very small.
The
.Ic memsync
replication mode is currently not implemented.
.It Ic fullsync
.Pp
Mark the write operation as completed when the local as well as the remote
write completes.
This is the safest and the slowest replication mode.
The
.Ic fullsync
replication mode is the default.
.It Ic async
.Pp
The write operation is reported as complete right after the local write
completes.
This is the fastest and the most dangerous replication mode.
This mode should be used when replicating to a distant node where
latency is too high for other modes.
The
.Ic async
replication mode is currently not implemented.
.El
.It Ic checksum Aq algorithm
.Pp
Checksum algorithm should be one of the following:
.Bl -tag -width ".Ic sha256"
.It Ic none
No checksum will be calculated for the data being sent over the network.
This is the default setting.
.It Ic crc32
CRC32 checksum will be calculated.
.It Ic sha256
SHA256 checksum will be calculated.
.El
.It Ic compression Aq algorithm
.Pp
Compression algorithm should be one of the following:
.Bl -tag -width ".Ic none"
.It Ic none
Data sent over the network will not be compressed.
.It Ic hole
Only blocks that contain all zeros will be compressed.
This is very useful for initial synchronization where potentially many blocks
are still all zeros.
There should be no measurable performance overhead when this algorithm is being
used.
This is the default setting.
.It Ic lzf
The LZF algorithm by Marc Alexander Lehmann will be used to compress the data
sent over the network.
LZF is a very fast, general-purpose compression algorithm.
.El
.It Ic timeout Aq seconds
.Pp
Connection timeout in seconds.
The default value is
.Va 20 .
.It Ic exec Aq path
.Pp
Execute the given program on various HAST events.
Below is the list of currently implemented events and the arguments the given
program is executed with:
.Bl -tag -width ".Ic xxxx"
.It Ic "<path> role <resource> <oldrole> <newrole>"
.Pp
Executed on both primary and secondary nodes when the resource role is changed.
.Pp
.It Ic "<path> connect <resource>"
.Pp
Executed on both primary and secondary nodes when the connection for the given
resource between the nodes is established.
.Pp
.It Ic "<path> disconnect <resource>"
.Pp
Executed on both primary and secondary nodes when the connection for the given
resource between the nodes is lost.
.Pp
.It Ic "<path> syncstart <resource>"
.Pp
Executed on the primary node when the synchronization process of the secondary
node is started.
.Pp
.It Ic "<path> syncdone <resource>"
.Pp
Executed on the primary node when the synchronization process of the secondary
node is completed successfully.
.Pp
.It Ic "<path> syncintr <resource>"
.Pp
Executed on the primary node when the synchronization process of the secondary
node is interrupted, most likely due to a secondary node outage or a connection
failure between the nodes.
.Pp
.It Ic "<path> split-brain <resource>"
.Pp
Executed on both primary and secondary nodes when a split-brain condition is
detected.
.Pp
.El
The
.Aq path
argument should contain the full path to an executable program.
If the given program exits with a code other than
.Va 0 ,
.Nm hastd
will log it as an error.
.Pp
The
.Aq resource
argument is the resource name from the configuration file.
.Pp
The
.Aq oldrole
argument is the previous resource role (before the change).
It can be one of:
.Ar init ,
.Ar secondary ,
.Ar primary .
.Pp
The
.Aq newrole
argument is the current resource role (after the change).
It can be one of:
.Ar init ,
.Ar secondary ,
.Ar primary .
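.Pp
As a sketch only, assume a hypothetical handler program installed at
.Pa /usr/local/sbin/hast-event
and a resource named
.Dq shared
(other statements the resource needs are omitted here).
The comment lines show how the arguments described above would be laid out
for a few events; they are illustrations, not captured output:
.Bd -literal -offset indent
resource shared {
	exec /usr/local/sbin/hast-event
}

# Role change from init to primary:
#   /usr/local/sbin/hast-event role shared init primary
# Connection between the nodes established:
#   /usr/local/sbin/hast-event connect shared
# Split-brain condition detected:
#   /usr/local/sbin/hast-event split-brain shared
.Ed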
.Pp
.It Ic metaflush on | off
.Pp
When set to
.Va on ,
flush the write cache of the local provider after every metadata (activemap)
update.
Flushing the write cache ensures that the provider will not reorder writes and
that metadata will be properly updated before real data is stored.
If the local provider does not support flushing the write cache (it returns
.Er EOPNOTSUPP
on the
.Cm BIO_FLUSH
request),
.Nm hastd
will disable
.Ic metaflush
automatically.
The default value is
.Va on .
.Pp
.It Ic name Aq name
.Pp
GEOM provider name that will appear as
.Pa /dev/hast/<name> .
If name is not defined, the resource name will be used as the provider name.
.It Ic local Aq path
.Pp
Path to the local component which will be used as the backend provider for
the resource.
This can be either a GEOM provider or a regular file.
.It Ic remote Aq addr
.Pp
Address of the remote
.Nm hastd
daemon.
Format is the same as for the
.Ic listen
statement.
When operating as a primary node this address will be used to connect to
the secondary node.
When operating as a secondary node only connections from this address
will be accepted.
.Pp
A special value of
.Va none
can be used when the remote address is not yet known (e.g. the other node is not
set up yet).
.It Ic source Aq addr
.Pp
Local address to bind to before connecting to the remote
.Nm hastd
daemon.
Format is the same as for the
.Ic listen
statement.
.El
.Sh FILES
.Bl -tag -width ".Pa /var/run/hastctl" -compact
.It Pa /etc/hast.conf
The default
.Nm
configuration file.
.It Pa /var/run/hastctl
Control socket used by the
.Xr hastctl 8
control utility to communicate with the
.Xr hastd 8
daemon.
.El
.Sh EXAMPLES
The example configuration file can look as follows:
.Bd -literal -offset indent
listen tcp://0.0.0.0

on hasta {
	listen tcp://2001:db8::1/64
}
on hastb {
	listen tcp://2001:db8::2/64
}

resource shared {
	local /dev/da0

	on hasta {
		remote tcp://10.0.0.2
	}
	on hastb {
		remote tcp://10.0.0.1
	}
}
resource tank {
	on hasta {
		local /dev/mirror/tanka
		source tcp://10.0.0.1
		remote tcp://10.0.0.2
	}
	on hastb {
		local /dev/mirror/tankb
		source tcp://10.0.0.2
		remote tcp://10.0.0.1
	}
}
.Ed
.Sh SEE ALSO
.Xr gethostname 3 ,
.Xr geom 4 ,
.Xr hastctl 8 ,
.Xr hastd 8 .
.Sh AUTHORS
The
.Nm
was written by
.An Pawel Jakub Dawidek Aq pjd@FreeBSD.org
under sponsorship of the FreeBSD Foundation.