.\" Copyright (c) 2010 The FreeBSD Foundation
.\" Copyright (c) 2010-2012 Pawel Jakub Dawidek <pawel@dawidek.net>
.\" All rights reserved.
.\"
.\" This documentation was written by Pawel Jakub Dawidek under sponsorship from
.\" the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd January 25, 2012
.Dt HAST.CONF 5
.Os
.Sh NAME
.Nm hast.conf
.Nd configuration file for the
.Xr hastd 8
daemon and the
.Xr hastctl 8
utility
.Sh DESCRIPTION
The
.Nm
file is used by both the
.Xr hastd 8
daemon and the
.Xr hastctl 8
control utility.
The configuration file is designed so that exactly the same file can be
(and should be) used on both HAST nodes.
Every line starting with # is treated as a comment and ignored.
.Sh CONFIGURATION FILE SYNTAX
The general syntax of the
.Nm
file is as follows:
.Bd -literal -offset indent
# Global section
control <addr>
listen <addr>
replication <mode>
checksum <algorithm>
compression <algorithm>
timeout <seconds>
exec <path>
metaflush on | off
pidfile <path>

on <node> {
	# Node section
	control <addr>
	listen <addr>
	pidfile <path>
}

on <node> {
	# Node section
	control <addr>
	listen <addr>
	pidfile <path>
}

resource <name> {
	# Resource section
	replication <mode>
	checksum <algorithm>
	compression <algorithm>
	name <name>
	local <path>
	timeout <seconds>
	exec <path>
	metaflush on | off

	on <node> {
		# Resource-node section
		name <name>
		# Required
		local <path>
		metaflush on | off
		# Required
		remote <addr>
		source <addr>
	}
	on <node> {
		# Resource-node section
		name <name>
		# Required
		local <path>
		metaflush on | off
		# Required
		remote <addr>
		source <addr>
	}
}
.Ed
.Pp
Most configuration parameters are optional.
If a parameter is not defined in a particular section, it is
inherited from the parent section.
For example, if the
.Ic listen
parameter is not defined in the node section, it is inherited from
the global section.
If the global section does not define the
.Ic listen
parameter either, the default value is used.
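.Pp
As a minimal sketch of this inheritance (the node names
.Ar hasta
and
.Ar hastb
and the addresses are illustrative assumptions), the following fragment sets
.Ic listen
globally and overrides it on one node only:
.Bd -literal -offset indent
# Global section: applies to every node that does not override it.
listen tcp4://0.0.0.0:8457

on hasta {
	# Node section: overrides the global value on hasta only.
	listen tcp4://10.0.0.1:8457
}

on hastb {
	# No listen statement here, so hastb inherits the global value.
	pidfile /var/run/hastd.pid
}
.Ed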
.Sh CONFIGURATION FILE DESCRIPTION
The
.Aq node
argument can be replaced by the full hostname as obtained by
.Xr gethostname 3 ,
by only the first part of the hostname, by the node's UUID as found in the
.Va kern.hostuuid
.Xr sysctl 8
variable,
or by the node's hostid as found in the
.Va kern.hostid
.Xr sysctl 8
variable.
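.Pp
For illustration, assuming a node whose full hostname is
.Ar hasta.example.org
(a hypothetical name), either of the following section headers would
match that node:
.Bd -literal -offset indent
# Full hostname, as returned by gethostname(3).
on hasta.example.org {
	listen tcp4://10.0.0.1:8457
}

# Only the first part of the hostname.
on hasta {
	listen tcp4://10.0.0.1:8457
}
.Ed
.Pp
A configuration would normally use only one of these forms per node.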
.Pp
The following statements are available:
.Bl -tag -width ".Ic xxxx"
.It Ic control Aq addr
.Pp
Address for communication with
.Xr hastctl 8 .
Each of the following examples defines the same control address:
.Bd -literal -offset indent
uds:///var/run/hastctl
unix:///var/run/hastctl
/var/run/hastctl
.Ed
.Pp
The default value is
.Pa uds:///var/run/hastctl .
.It Ic pidfile Aq path
.Pp
File in which to store the process ID of the main
.Xr hastd 8
process.
.Pp
The default value is
.Pa /var/run/hastd.pid .
.It Ic listen Aq addr
.Pp
Address to listen on, in the form:
.Bd -literal -offset indent
protocol://protocol-specific-address
.Ed
.Pp
Each of the following examples defines the same listen address:
.Bd -literal -offset indent
0.0.0.0
0.0.0.0:8457
tcp://0.0.0.0
tcp://0.0.0.0:8457
tcp4://0.0.0.0
tcp4://0.0.0.0:8457
.Ed
.Pp
Multiple listen addresses can be specified.
By default
.Nm hastd
listens on
.Pa tcp4://0.0.0.0:8457
and
.Pa tcp6://[::]:8457
if the kernel supports IPv4 and IPv6 respectively.
.It Ic replication Aq mode
.Pp
Replication mode should be one of the following:
.Bl -tag -width ".Ic xxxx"
.It Ic memsync
.Pp
Report the write operation as completed when the local write completes and
the remote node acknowledges receipt of the data, but before the remote
node actually stores the data.
The remote node stores the data directly after sending the
acknowledgement.
This mode is intended to reduce latency while still providing very good
reliability.
The only scenario in which a small amount of data could be lost is the
following.
The data is stored on the primary node and sent to the secondary.
The secondary node acknowledges receipt of the data and the primary reports
success to the application.
The secondary then goes down before the received data is actually stored
locally.
Before the secondary node returns, the primary node dies entirely.
When the secondary node comes back to life, it becomes the new primary.
A small amount of data that was reported to the application as stored
has then been lost.
The risk of such a situation is very small.
The
.Ic memsync
replication mode is the default.
.It Ic fullsync
.Pp
Mark the write operation as completed when both the local and the remote
write complete.
This is the safest and the slowest replication mode.
.It Ic async
.Pp
The write operation is reported as complete right after the local write
completes.
This is the fastest and the most dangerous replication mode.
This mode should be used when replicating to a distant node where
latency is too high for other modes.
.El
.It Ic checksum Aq algorithm
.Pp
Checksum algorithm should be one of the following:
.Bl -tag -width ".Ic sha256"
.It Ic none
No checksum will be calculated for the data being sent over the network.
This is the default setting.
.It Ic crc32
CRC32 checksum will be calculated.
.It Ic sha256
SHA256 checksum will be calculated.
.El
.It Ic compression Aq algorithm
.Pp
Compression algorithm should be one of the following:
.Bl -tag -width ".Ic none"
.It Ic none
Data sent over the network will not be compressed.
.It Ic hole
Only blocks that contain all zeros will be compressed.
This is very useful for initial synchronization, where potentially many blocks
are still all zeros.
There should be no measurable performance overhead when this algorithm is
used.
This is the default setting.
.It Ic lzf
The LZF algorithm by Marc Alexander Lehmann will be used to compress the data
sent over the network.
LZF is a very fast, general-purpose compression algorithm.
.El
.It Ic timeout Aq seconds
.Pp
Connection timeout in seconds.
The default value is
.Va 20 .
.It Ic exec Aq path
.Pp
Execute the given program on various HAST events.
Below is the list of currently implemented events and the arguments the
program is executed with:
.Bl -tag -width ".Ic xxxx"
.It Ic "<path> role <resource> <oldrole> <newrole>"
.Pp
Executed on both primary and secondary nodes when the resource role changes.
.Pp
.It Ic "<path> connect <resource>"
.Pp
Executed on both primary and secondary nodes when the connection for the given
resource between the nodes is established.
.Pp
.It Ic "<path> disconnect <resource>"
.Pp
Executed on both primary and secondary nodes when the connection for the given
resource between the nodes is lost.
.Pp
.It Ic "<path> syncstart <resource>"
.Pp
Executed on the primary node when synchronization of the secondary node
starts.
.Pp
.It Ic "<path> syncdone <resource>"
.Pp
Executed on the primary node when synchronization of the secondary node
completes successfully.
.Pp
.It Ic "<path> syncintr <resource>"
.Pp
Executed on the primary node when synchronization of the secondary node is
interrupted, most likely due to a secondary node outage or a connection failure
between the nodes.
.Pp
.It Ic "<path> split-brain <resource>"
.Pp
Executed on both primary and secondary nodes when a split-brain condition is
detected.
.Pp
.El
The
.Aq path
argument should contain the full path to an executable program.
If the program exits with a code other than
.Va 0 ,
.Nm hastd
will log it as an error.
.Pp
The
.Aq resource
argument is the resource name from the configuration file.
.Pp
The
.Aq oldrole
argument is the previous resource role (before the change).
It can be one of:
.Ar init ,
.Ar secondary ,
.Ar primary .
.Pp
The
.Aq newrole
argument is the current resource role (after the change).
It can be one of:
.Ar init ,
.Ar secondary ,
.Ar primary .
.Pp
.It Ic metaflush on | off
.Pp
When set to
.Va on ,
flush the write cache of the local provider after every metadata (activemap)
update.
Flushing the write cache ensures that the provider will not reorder writes
and that metadata will be properly updated before real data is stored.
If the local provider does not support flushing the write cache (it returns
.Er EOPNOTSUPP
on the
.Cm BIO_FLUSH
request),
.Nm hastd
will disable
.Ic metaflush
automatically.
The default value is
.Va on .
.Pp
.It Ic name Aq name
.Pp
GEOM provider name that will appear as
.Pa /dev/hast/<name> .
If name is not defined, the resource name will be used as the provider name.
.It Ic local Aq path
.Pp
Path to the local component which will be used as the backend provider for
the resource.
This can be either a GEOM provider or a regular file.
.It Ic remote Aq addr
.Pp
Address of the remote
.Nm hastd
daemon.
The format is the same as for the
.Ic listen
statement.
When operating as a primary node, this address will be used to connect to
the secondary node.
When operating as a secondary node, only connections from this address
will be accepted.
.Pp
A special value of
.Va none
can be used when the remote address is not yet known (e.g., the other node is
not set up yet).
.It Ic source Aq addr
.Pp
Local address to bind to before connecting to the remote
.Nm hastd
daemon.
The format is the same as for the
.Ic listen
statement.
.El
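.Pp
As a hedged sketch of how several of the statements above combine, the
following fragment overrides the defaults for a single resource.
The resource name
.Ar data ,
the provider name
.Ar data0 ,
the node names, the device path, the addresses, and the script path
.Pa /usr/local/sbin/hast-event
are illustrative assumptions, not values defined by
.Xr hastd 8 :
.Bd -literal -offset indent
resource data {
	# Wait for the remote write as well; safest, but slowest.
	replication fullsync
	# Checksum data sent over the network (default is none).
	checksum crc32
	# Compress data sent over the network with LZF (default is hole).
	compression lzf
	# Connection timeout in seconds (default is 20).
	timeout 30
	# Hypothetical event handler; must be given as a full path.
	exec /usr/local/sbin/hast-event
	# Provider appears as /dev/hast/data0 instead of /dev/hast/data.
	name data0
	local /dev/da1

	on hasta {
		remote tcp4://10.0.0.2
	}
	on hastb {
		remote tcp4://10.0.0.1
	}
}
.Ed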
.Sh FILES
.Bl -tag -width ".Pa /var/run/hastctl" -compact
.It Pa /etc/hast.conf
The default
.Xr hastctl 8
and
.Xr hastd 8
configuration file.
.It Pa /var/run/hastctl
Control socket used by the
.Xr hastctl 8
control utility to communicate with the
.Xr hastd 8
daemon.
.El
.Sh EXAMPLES
An example configuration file can look as follows:
.Bd -literal -offset indent
listen tcp://0.0.0.0

on hasta {
	listen tcp://2001:db8::1/64
}
on hastb {
	listen tcp://2001:db8::2/64
}

resource shared {
	local /dev/da0

	on hasta {
		remote tcp://10.0.0.2
	}
	on hastb {
		remote tcp://10.0.0.1
	}
}
resource tank {
	on hasta {
		local /dev/mirror/tanka
		source tcp://10.0.0.1
		remote tcp://10.0.0.2
	}
	on hastb {
		local /dev/mirror/tankb
		source tcp://10.0.0.2
		remote tcp://10.0.0.1
	}
}
.Ed
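.Pp
While the address of the other node is not yet known, the
.Ic remote
statement can be given the special value
.Va none .
A minimal sketch, reusing the node names above and assuming a resource named
.Ar scratch
on a device
.Pa /dev/da2
(both hypothetical):
.Bd -literal -offset indent
resource scratch {
	local /dev/da2

	on hasta {
		# Address of hastb is not known yet.
		remote none
	}
	on hastb {
		remote tcp://10.0.0.1
	}
}
.Ed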
.Sh SEE ALSO
.Xr gethostname 3 ,
.Xr geom 4 ,
.Xr hastctl 8 ,
.Xr hastd 8
.Sh AUTHORS
The
.Nm
manual page was written by
.An Pawel Jakub Dawidek Aq pjd@FreeBSD.org
under sponsorship of the FreeBSD Foundation.