.\" Copyright (c) 2010 The FreeBSD Foundation
.\" Copyright (c) 2010-2012 Pawel Jakub Dawidek <pawel@dawidek.net>
.\" All rights reserved.
.\"
.\" This documentation was written by Pawel Jakub Dawidek under sponsorship from
.\" the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd January 25, 2012
.Dt HAST.CONF 5
.Os
.Sh NAME
.Nm hast.conf
.Nd configuration file for the
.Xr hastd 8
daemon and the
.Xr hastctl 8
utility
.Sh DESCRIPTION
The
.Nm
file is used by both
.Xr hastd 8
daemon
and
.Xr hastctl 8
control utility.
The configuration file is designed so that exactly the same file can be
(and should be) used on both HAST nodes.
Every line starting with # is treated as a comment and ignored.
.Sh CONFIGURATION FILE SYNTAX
The general syntax of the
.Nm
file is as follows:
.Bd -literal -offset indent
# Global section
control <addr>
listen <addr>
replication <mode>
checksum <algorithm>
compression <algorithm>
timeout <seconds>
exec <path>
metaflush on | off
pidfile <path>

on <node> {
	# Node section
	control <addr>
	listen <addr>
	pidfile <path>
}

on <node> {
	# Node section
	control <addr>
	listen <addr>
	pidfile <path>
}

resource <name> {
	# Resource section
	replication <mode>
	checksum <algorithm>
	compression <algorithm>
	name <name>
	local <path>
	timeout <seconds>
	exec <path>
	metaflush on | off

	on <node> {
		# Resource-node section
		name <name>
		# Required
		local <path>
		metaflush on | off
		# Required
		remote <addr>
		source <addr>
	}
	on <node> {
		# Resource-node section
		name <name>
		# Required
		local <path>
		metaflush on | off
		# Required
		remote <addr>
		source <addr>
	}
}
.Ed
.Pp
Most of the available configuration parameters are optional.
If a parameter is not defined in a particular section, it is
inherited from the parent section.
For example, if the
.Ic listen
parameter is not defined in the node section, it is inherited from
the global section.
If the global section does not define the
.Ic listen
parameter either, the default value is used.
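.Pp
The inheritance rules can be illustrated with a minimal sketch.
The node names, resource name, device path and addresses below are
placeholders chosen for illustration only.
.Bd -literal -offset indent
# Global section: used by any section that does not override it.
listen tcp://0.0.0.0
compression lzf

on hasta {
	# Overrides the global listen address on this node.
	listen tcp://10.0.0.1
}
on hastb {
	# No listen statement here, so the global one applies.
	pidfile /var/run/hastd.pid
}

resource shared {
	# No compression statement here, so lzf is inherited.
	local /dev/da0

	on hasta {
		remote tcp://10.0.0.2
	}
	on hastb {
		remote tcp://10.0.0.1
	}
}
.Ed
.Pp
Parameters that appear nowhere in the file, such as
.Ic timeout
in this sketch, fall back to their default values.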
.Sh CONFIGURATION FILE DESCRIPTION
The
.Aq node
argument can be replaced by the full hostname as obtained by
.Xr gethostname 3 ,
by only the first part of the hostname, by the node's UUID as found in the
.Va kern.hostuuid
.Xr sysctl 8
variable
or by the node's hostid as found in the
.Va kern.hostid
.Xr sysctl 8
variable.
.Pp
The following statements are available:
.Bl -tag -width ".Ic xxxx"
.It Ic control Aq addr
.Pp
Address for communication with
.Xr hastctl 8 .
Each of the following examples defines the same control address:
.Bd -literal -offset indent
uds:///var/run/hastctl
unix:///var/run/hastctl
/var/run/hastctl
.Ed
.Pp
The default value is
.Pa uds:///var/run/hastctl .
.It Ic pidfile Aq path
.Pp
File in which to store the process ID of the main
.Xr hastd 8
process.
.Pp
The default value is
.Pa /var/run/hastd.pid .
.It Ic listen Aq addr
.Pp
Address to listen on, in the form:
.Bd -literal -offset indent
protocol://protocol-specific-address
.Ed
.Pp
Each of the following examples defines the same listen address:
.Bd -literal -offset indent
0.0.0.0
0.0.0.0:8457
tcp://0.0.0.0
tcp://0.0.0.0:8457
tcp4://0.0.0.0
tcp4://0.0.0.0:8457
.Ed
.Pp
Multiple listen addresses can be specified.
By default
.Nm hastd
listens on
.Pa tcp4://0.0.0.0:8457
and
.Pa tcp6://[::]:8457
if the kernel supports IPv4 and IPv6 respectively.
.It Ic replication Aq mode
.Pp
Replication mode should be one of the following:
.Bl -tag -width ".Ic xxxx"
.It Ic memsync
.Pp
Report the write operation as completed when the local write completes and
the remote node acknowledges the data receipt, but before it
actually stores the data.
The data on the remote node will be stored directly after sending
the acknowledgement.
This mode is intended to reduce latency, but still provides very good
reliability.
The only scenario in which a small amount of data could be lost is the
following.
The data is stored on the primary node and sent to the secondary.
The secondary node acknowledges data receipt and the primary reports
success to the application.
The secondary then goes down before the received data is actually stored
locally, and before it returns, the primary node dies entirely.
When the secondary node comes back to life, it becomes the new primary.
The small amount of data that was reported to the application as stored
is lost.
The risk of such a situation is very small.
The
.Ic memsync
replication mode is the default.
.It Ic fullsync
.Pp
Mark the write operation as completed when the local as well as the remote
write completes.
This is the safest and the slowest replication mode.
.It Ic async
.Pp
The write operation is reported as complete right after the local write
completes.
This is the fastest and the most dangerous replication mode.
This mode should be used when replicating to a distant node where
latency is too high for other modes.
.El
.It Ic checksum Aq algorithm
.Pp
Checksum algorithm should be one of the following:
.Bl -tag -width ".Ic sha256"
.It Ic none
No checksum will be calculated for the data being sent over the network.
This is the default setting.
.It Ic crc32
CRC32 checksum will be calculated.
.It Ic sha256
SHA256 checksum will be calculated.
.El
.It Ic compression Aq algorithm
.Pp
Compression algorithm should be one of the following:
.Bl -tag -width ".Ic none"
.It Ic none
Data sent over the network will not be compressed.
.It Ic hole
Only blocks that contain all zeros will be compressed.
This is very useful for initial synchronization where potentially many blocks
are still all zeros.
There should be no measurable performance overhead when this algorithm is being
used.
This is the default setting.
.It Ic lzf
The LZF algorithm by Marc Alexander Lehmann will be used to compress the data
sent over the network.
LZF is a very fast, general-purpose compression algorithm.
.El
.It Ic timeout Aq seconds
.Pp
Connection timeout in seconds.
The default value is
.Va 20 .
.It Ic exec Aq path
.Pp
Execute the given program on various HAST events.
Below is the list of currently implemented events and the arguments the given
program is executed with:
.Bl -tag -width ".Ic xxxx"
.It Ic "<path> role <resource> <oldrole> <newrole>"
.Pp
Executed on both primary and secondary nodes when the resource role is changed.
.It Ic "<path> connect <resource>"
.Pp
Executed on both primary and secondary nodes when the connection for the given
resource between the nodes is established.
.It Ic "<path> disconnect <resource>"
.Pp
Executed on both primary and secondary nodes when the connection for the given
resource between the nodes is lost.
.It Ic "<path> syncstart <resource>"
.Pp
Executed on the primary node when synchronization of the secondary node is
started.
.It Ic "<path> syncdone <resource>"
.Pp
Executed on the primary node when synchronization of the secondary node is
completed successfully.
.It Ic "<path> syncintr <resource>"
.Pp
Executed on the primary node when synchronization of the secondary node is
interrupted, most likely due to a secondary node outage or a connection failure
between the nodes.
.It Ic "<path> split-brain <resource>"
.Pp
Executed on both primary and secondary nodes when a split-brain condition is
detected.
.El
.Pp
The
.Aq path
argument should contain the full path to an executable program.
If the given program exits with a code other than
.Va 0 ,
.Nm hastd
will log it as an error.
.Pp
The
.Aq resource
argument is the resource name from the configuration file.
.Pp
The
.Aq oldrole
argument is the previous resource role (before the change).
It can be one of:
.Ar init ,
.Ar secondary ,
.Ar primary .
.Pp
The
.Aq newrole
argument is the current resource role (after the change).
It can be one of:
.Ar init ,
.Ar secondary ,
.Ar primary .
.It Ic metaflush on | off
.Pp
When set to
.Va on ,
flush the write cache of the local provider after every metadata (activemap)
update.
Flushing the write cache ensures that the provider will not reorder writes and
that metadata will be properly updated before real data is stored.
If the local provider does not support flushing the write cache (it returns
.Er EOPNOTSUPP
on the
.Cm BIO_FLUSH
request),
.Nm hastd
will disable
.Ic metaflush
automatically.
The default value is
.Va on .
.It Ic name Aq name
.Pp
GEOM provider name that will appear as
.Pa /dev/hast/<name> .
If name is not defined, the resource name will be used as the provider name.
.It Ic local Aq path
.Pp
Path to the local component which will be used as the backend provider for
the resource.
This can be either a GEOM provider or a regular file.
.It Ic remote Aq addr
.Pp
Address of the remote
.Nm hastd
daemon.
Format is the same as for the
.Ic listen
statement.
When operating as a primary node this address will be used to connect to
the secondary node.
When operating as a secondary node only connections from this address
will be accepted.
.Pp
A special value of
.Va none
can be used when the remote address is not yet known (e.g., the other node is
not set up yet).
.It Ic source Aq addr
.Pp
Local address to bind to before connecting to the remote
.Nm hastd
daemon.
Format is the same as for the
.Ic listen
statement.
.El
.Sh FILES
.Bl -tag -width ".Pa /var/run/hastctl" -compact
.It Pa /etc/hast.conf
The default
.Xr hastctl 8
and
.Xr hastd 8
configuration file.
.It Pa /var/run/hastctl
Control socket used by the
.Xr hastctl 8
control utility to communicate with the
.Xr hastd 8
daemon.
.El
.Sh EXAMPLES
The example configuration file can look as follows:
.Bd -literal -offset indent
listen tcp://0.0.0.0

on hasta {
	listen tcp://2001:db8::1/64
}
on hastb {
	listen tcp://2001:db8::2/64
}

resource shared {
	local /dev/da0

	on hasta {
		remote tcp://10.0.0.2
	}
	on hastb {
		remote tcp://10.0.0.1
	}
}
resource tank {
	on hasta {
		local /dev/mirror/tanka
		source tcp://10.0.0.1
		remote tcp://10.0.0.2
	}
	on hastb {
		local /dev/mirror/tankb
		source tcp://10.0.0.2
		remote tcp://10.0.0.1
	}
}
.Ed
.Sh SEE ALSO
.Xr gethostname 3 ,
.Xr geom 4 ,
.Xr hastctl 8 ,
.Xr hastd 8
.Sh AUTHORS
The
.Nm
was written by
.An Pawel Jakub Dawidek Aq Mt pjd@FreeBSD.org
under sponsorship of the FreeBSD Foundation.