.\"-
.\" Copyright (c) 2004-2016 Maxim Sobolev <sobomax@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd August 9, 2019
.Dt MKUZIP 8
.Os
.Sh NAME
.Nm mkuzip
.Nd compress disk image for use with
.Xr geom_uzip 4
class
.Sh SYNOPSIS
.Nm
.Op Fl dSvZ
.Op Fl A Ar compression_algorithm
.Op Fl C Ar compression_level
.Op Fl j Ar compression_jobs
.Op Fl o Ar outfile
.Op Fl s Ar cluster_size
.Ar infile
.Sh DESCRIPTION
The
.Nm
utility compresses a disk image file so that the
.Xr geom_uzip 4
class will be able to decompress the resulting image at run-time.
This allows for a significant reduction in the size of the disk image at
the expense of some CPU time required to decompress the data each
time it is read.
The
.Nm
utility
works in two phases:
.Bl -enum
.It
An
.Ar infile
image is split into clusters; each cluster is compressed.
.It
The resulting set of compressed clusters is written to the output file.
In addition, a
.Dq table of contents
header is written which allows for efficient seeking.
.El
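.Pp
As a rough illustration of the two phases, the following Python sketch splits
an image into clusters, compresses each one with zlib, and builds a table of
offsets.
It is a simplified model and not the on-disk format read by
.Xr geom_uzip 4 ;
the function names, the in-memory table layout, and the zero padding of the
last cluster are assumptions made for this example.
.Bd -literal
```python
import zlib


def compress_clusters(image, cluster_size=16384, level=9):
    # Phase 1: split the image into fixed-size clusters and compress each.
    # Assumption of this sketch: a short final cluster is zero-padded.
    blobs = []
    for start in range(0, len(image), cluster_size):
        cluster = image[start:start + cluster_size]
        cluster = cluster.ljust(cluster_size, bytes(1))
        blobs.append(zlib.compress(cluster, level))

    # Phase 2: record where each compressed cluster starts in the payload.
    # toc[i] is the start of cluster i; toc[-1] is the end of the payload.
    toc = [0]
    for blob in blobs:
        toc.append(toc[-1] + len(blob))
    return toc, b"".join(blobs)


def read_cluster(toc, payload, index):
    # Two adjacent table entries locate one compressed cluster.
    return zlib.decompress(payload[toc[index]:toc[index + 1]])
```
.Ed
.Pp
In this model, reading cluster N needs only table entries N and N+1, so a
reader can seek without scanning the preceding clusters.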
.Pp
The options are:
.Bl -tag -width indent
.It Fl A Op Ar lzma | Ar zlib | Ar zstd
Select a specific compression algorithm.
If this option is not provided, the default is
.Ar zlib .
.Pp
The
.Ar lzma
algorithm provides noticeably better compression ratios than zlib on the same
data set.
It has vastly slower compression speed and moderately slower decompression
speed.
.Pp
The
.Ar zstd
algorithm provides better compression ratios than zlib on the same data set.
It also has faster compression and decompression speed than zlib.
In the very high compression
.Dq level
settings, it does not offer quite as high a compression ratio as
.Ar lzma .
However, its decompression speed does not suffer at high compression
.Dq levels .
.It Fl C Ar compression_level
Select the integer compression level used to parameterize the chosen
compression algorithm.
.Pp
For any given algorithm, a lesser number selects a faster compression mode.
A greater number selects a slower compression mode.
Typically, for the same algorithm, a greater
.Ar compression_level
provides better final compression ratio.
.Pp
For
.Ar lzma ,
the range of valid compression levels is
.Va 0-9 .
The
.Nm
default for lzma is
.Va 6 .
.Pp
For
.Ar zlib ,
the range of valid compression levels is
.Va 1-9 .
The
.Nm
default for zlib is
.Va 9 .
.Pp
For
.Ar zstd ,
the range of valid compression levels is currently
.Va 1-19 .
The
.Nm
default for zstd is
.Va 9 .
.It Fl d
Enable de-duplication.
When this option is enabled,
.Nm
detects identical blocks in the input and replaces each subsequent occurrence
of such a block with a pointer to the very first one in the output.
Setting this option results in a moderate decrease in compressed image size,
typically around 3-5% of the final size of the compressed image.
.It Fl j Ar compression_jobs
Specify the number of compression jobs that
.Nm
runs in parallel to speed up compression.
When this option is not specified, the number of jobs is set equal
to the value of the
.Va hw.ncpu
.Xr sysctl 8
variable.
.It Fl L
Legacy flag that indicates the same thing as
.Dq Fl A Ar lzma .
.It Fl o Ar outfile
Write the output to
.Ar outfile .
The default is to use the input name with the suffix
.Pa .uzip
for
.Xr zlib 3
compression or
.Pa .ulzma
for
.Xr lzma 3
compression.
.It Fl S
Print a summary of the compression ratio and the output
file size after the file has been processed.
.It Fl s Ar cluster_size
Split the image into clusters of
.Ar cluster_size
bytes, 16384 bytes by default.
The
.Ar cluster_size
should be a multiple of 512 bytes.
.It Fl v
Display verbose messages.
.It Fl Z
Disable zero-block detection and elimination.
When this option is set,
.Nm
compresses blocks of zero bytes just as it would any other block.
When the option is not set,
.Nm
detects and compresses zero blocks in a space-efficient way.
Setting
.Fl Z
increases compressed image sizes slightly, typically less than 0.1%.
.El
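.Pp
The effect of
.Fl d
and
.Fl Z
can be pictured with the following Python sketch.
It is only a model of the idea: the use of SHA-256 to find identical clusters,
the zero-length entry used to mark an all-zero cluster, and the
.Fn pack
function itself are assumptions of this example rather than a description of
how
.Nm
is implemented.
.Bd -literal
```python
import hashlib
import zlib


def pack(clusters, dedup=True, detect_zero=True):
    # Returns (entries, payload); entries[i] is (offset, length) of
    # the compressed data for cluster i inside payload.
    entries, chunks, seen, offset = [], [], {}, 0
    for cluster in clusters:
        if detect_zero and cluster.count(0) == len(cluster):
            # All-zero cluster: store no data, only a zero-length entry.
            entries.append((offset, 0))
            continue
        digest = hashlib.sha256(cluster).digest()
        if dedup and digest in seen:
            # Same content as an earlier cluster: point at the first copy.
            entries.append(seen[digest])
            continue
        blob = zlib.compress(cluster, 9)
        entry = (offset, len(blob))
        seen[digest] = entry
        entries.append(entry)
        chunks.append(blob)
        offset += len(blob)
    return entries, b"".join(chunks)
```
.Ed
.Pp
With both features enabled, repeated and all-zero clusters add table entries
but no compressed data, which is where the size savings described above come
from.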
.Sh IMPLEMENTATION NOTES
The compression ratio largely depends on the compression algorithm, level, and
cluster size used.
For large cluster sizes (16 kB and higher), typical overall image compression
ratios with
.Xr zlib 3
are only 1-2% less than those achieved with
.Xr gzip 1
over the entire image.
However, it should be kept in mind that larger cluster sizes lead to higher
overhead in the
.Xr geom_uzip 4
class, as the class has to decompress the whole cluster even if
only a few bytes from that cluster have to be read.
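.Pp
The tradeoff can be explored with a small Python sketch such as the one below.
Both helper functions are hypothetical and exist only for this example; they
model independently compressed zlib clusters and ignore the header and table
of contents.
.Bd -literal
```python
import zlib


def compressed_size(image, cluster_size, level=9):
    # Total size of the image when every cluster is compressed on its own.
    return sum(
        len(zlib.compress(image[i:i + cluster_size], level))
        for i in range(0, len(image), cluster_size)
    )


def bytes_decompressed(offset, length, cluster_size):
    # A read must decompress every cluster it touches in full.
    first = offset // cluster_size
    last = (offset + length - 1) // cluster_size
    return (last - first + 1) * cluster_size
```
.Ed
.Pp
Comparing
.Fn compressed_size
for several cluster sizes on a representative image, together with
.Fn bytes_decompressed
for typical read sizes, gives a first estimate of how much space a larger
cluster saves and how much extra decompression work each read incurs.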
.Pp
Additionally, the threshold at 16-32 kB where a larger cluster size does not
benefit overall compression ratio is an artifact of the
.Xr zlib 3
algorithm in particular.
.Ar Lzma
and
.Ar Zstd
will continue to provide better compression ratios as cluster sizes
are increased, at high enough compression levels.
The same tradeoff continues to apply: reads in
.Xr geom_uzip 4
become more expensive the greater the cluster size.
.Pp
The
.Nm
utility
inserts a short shell script at the beginning of the generated image,
which makes it possible to
.Dq run
the image just like any other shell script.
The script tries to load the
.Xr geom_uzip 4
class if it is not loaded, configure the image as an
.Xr md 4
disk device using
.Xr mdconfig 8 ,
and automatically mount it using
.Xr mount_cd9660 8
on the mount point provided as the first argument to the script.
.Pp
De-duplication is a
.Fx
specific feature.
While it does not require any changes to the on-disk
compressed image format, it did require matching changes to
.Xr geom_uzip 4
to handle the resulting images correctly.
.Pp
To make use of
.Ar zstd
.Nm
images, the kernel must be configured with
.Cd ZSTDIO .
It is enabled by default in many
.Cd GENERIC
kernels provided as binary distributions by
.Fx .
The status on any particular system can be verified by checking
.Xr sysctl 8
.Dv kern.features.geom_uzip_zstd
for
.Dq 1 .
.Sh EXIT STATUS
.Ex -std
.Sh SEE ALSO
.Xr gzip 1 ,
.Xr xz 1 ,
.Xr zstd 1 ,
.Xr zlib 3 ,
.Xr geom 4 ,
.Xr geom_uzip 4 ,
.Xr md 4 ,
.Xr mdconfig 8 ,
.Xr mount_cd9660 8
.Sh AUTHORS
.An Maxim Sobolev Aq Mt sobomax@FreeBSD.org