.\" Copyright (c) 1986, 1990, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" James A. Woods, derived from original work by Spencer Thomas
.\" and Joseph Orost.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd March 4, 2021
.Dt COMPRESS 1
.Os
.Sh NAME
.Nm compress ,
.Nm uncompress
.Nd compress and expand data
.Sh SYNOPSIS
.Nm
.Op Fl fv
.Op Fl b Ar bits
.Op Ar
.Nm
.Fl c
.Op Fl b Ar bits
.Op Ar file
.Nm uncompress
.Op Fl f
.Op Ar
.Nm uncompress
.Fl c
.Op Ar file
.Sh DESCRIPTION
The
.Nm
utility reduces the size of files using adaptive Lempel-Ziv coding.
Each
.Ar file
is renamed to the same name plus the extension
.Pa .Z .
A
.Ar file
argument with a
.Pa .Z
extension will be ignored except it will cause an
error exit after other arguments are processed.
If compression would not reduce the size of a
.Ar file ,
the file is ignored.
.Pp
The
.Nm uncompress
utility restores compressed files to their original form, renaming the
files by deleting the
.Pa .Z
extensions.
A file specification need not include the file's
.Pa .Z
extension.
If a file's name in its file system does not have a
.Pa .Z
extension, it will not be uncompressed and it will cause
an error exit after other arguments are processed.
.Pp
If renaming the files would cause files to be overwritten and the standard
input device is a terminal, the user is prompted (on the standard error
output) for confirmation.
If prompting is not possible or confirmation is not received, the files
are not overwritten.
.Pp
As many of the modification time, access time, file flags, file mode,
user ID, and group ID as allowed by permissions are retained in the
new file.
.Pp
If no files are specified or a
.Ar file
argument is a single dash
.Pq Sq Fl ,
the standard input is compressed or uncompressed to the standard output.
If either the input or the output file is not a regular file, the checks for
reduction in size and file overwriting are not performed, the input file is
not removed, and the attributes of the input file are not retained
in the output file.
.Pp
The options are as follows:
.Bl -tag -width ".Fl b Ar bits"
.It Fl b Ar bits
The code size (see below) is limited to
.Ar bits ,
which must be in the range 9..16.
The default is 16.
.It Fl c
Compressed or uncompressed output is written to the standard output.
No files are modified.
The
.Fl v
option is ignored.
Compression is attempted even if the results will be larger than the
original.
.It Fl f
Files are overwritten without prompting for confirmation.
Also, for
.Nm compress ,
files are compressed even if they are not actually reduced in size.
.It Fl v
Print the percentage reduction of each file.
Ignored by
.Nm uncompress
or if the
.Fl c
option is also used.
.El
.Pp
The
.Nm
utility uses a modified Lempel-Ziv algorithm.
Common substrings in the file are first replaced by 9-bit codes 257 and up.
When code 512 is reached, the algorithm switches to 10-bit codes and
continues to use more bits until the
limit specified by the
.Fl b
option or its default is reached.
.Pp
After the limit is reached,
.Nm
periodically checks the compression ratio.
If it is increasing,
.Nm
continues to use the existing code dictionary.
However, if the compression ratio decreases,
.Nm
discards the table of substrings and rebuilds it from scratch.
This allows
the algorithm to adapt to the next "block" of the file.
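.Pp
As a rough illustration of the code sizes described above, the following
.Xr sh 1
fragment prints, for each width permitted by
.Fl b ,
how many distinct codes fit in that many bits.
It is only a sketch of the arithmetic and not the utility's implementation;
the helper name
.Ql codes_for_width
is hypothetical.
.Bd -literal -offset indent
# Number of distinct codes representable in a given number of bits.
codes_for_width() {
    echo $((1 << $1))
}

# Print one line per permitted code width (9 through 16 bits).
bits=9
while [ "$bits" -le 16 ]; do
    echo "$bits bits: $(codes_for_width "$bits") codes"
    bits=$((bits + 1))
done
.Ed
.Pp
Nine bits can express only values below 512, which is why the switch to
10-bit codes takes place when code 512 is reached.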
.Pp
The
.Fl b
option is unavailable for
.Nm uncompress
since the
.Ar bits
parameter specified during compression
is encoded within the output, along with
a magic number to ensure that neither decompression of random data nor
recompression of compressed data is attempted.
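.Pp
The header described above can be examined with
.Xr od 1 .
The sketch below assumes the commonly documented
.Pa .Z
layout, which this page does not itself spell out: two magic bytes
(hexadecimal 1f 9d) followed by one byte whose low five bits hold the
.Ar bits
value and whose high bit marks block mode.
The helper name
.Ql decode_z_header
is hypothetical.
.Bd -literal -offset indent
# Decode the first three bytes of a .Z file, given as hex strings.
decode_z_header() {
    if [ "$1$2" != "1f9d" ]; then
        echo "not in compressed format"
        return 1
    fi
    flags=$((0x$3))
    # Low five bits: the code size limit chosen at compression time.
    echo "bits: $((flags & 31))"
    # High bit: whether the dictionary may be reset between blocks.
    if [ $((flags & 128)) -ne 0 ]; then
        echo "block mode: yes"
    else
        echo "block mode: no"
    fi
}

# Typical use on an existing file:
#   decode_z_header $(od -An -tx1 -N3 test_file.Z)
decode_z_header 1f 9d 90
.Ed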
.Pp
The amount of compression obtained depends on the size of the
input, the number of
.Ar bits
per code, and the distribution of common substrings.
Typically, text such as source code or English is reduced by 50\-60%.
Compression is generally much better than that achieved by Huffman
coding (as used in the historical command pack), or adaptive Huffman
coding (as used in the historical command compact), and takes less
time to compute.
.Pp
If
.Ar file
is a soft or hard link,
.Nm
will replace it with a compressed copy of the file pointed to by the link.
The link's target file is left uncompressed.
.Sh EXIT STATUS
.Ex -std compress uncompress
.Pp
The
.Nm compress
utility exits 2 if attempting to compress a file would not reduce its size,
the
.Fl f
option was not specified, and no other error occurs.
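.Pp
A calling script may wish to distinguish exit status 2 from other failures.
The following
.Xr sh 1
function is a minimal sketch of one way to do so; the name
.Ql try_compress
and the messages it prints are hypothetical.
.Bd -literal -offset indent
# Run compress and report status 2 separately from real errors.
try_compress() {
    compress "$@"
    status=$?
    case "$status" in
    0) echo "compressed" ;;
    2) echo "not compressed: no reduction in size" ;;
    *) echo "error (status $status)" >&2 ;;
    esac
    return "$status"
}

# Typical use:
#   try_compress test_file
.Ed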
.Sh EXAMPLES
Create a file
.Pa test_file
with a single line of text:
.Bd -literal -offset indent
echo "This is a test" > test_file
.Ed
.Pp
Try to reduce the size of the file using a 10-bit code and show the exit status:
.Bd -literal -offset indent
$ compress -b 10 test_file
$ echo $?
2
.Ed
.Pp
Try to compress the file and show compression percentage:
.Bd -literal -offset indent
$ compress -v test_file
test_file: file would grow; left unmodified
.Ed
.Pp
Same as above but forcing compression:
.Bd -literal -offset indent
$ compress -f -v test_file
test_file.Z: 79% expansion
.Ed
.Pp
Compress and uncompress the string
.Ql hello
on the fly:
.Bd -literal -offset indent
$ echo "hello" | compress | uncompress
hello
.Ed
.Sh SEE ALSO
.Xr gunzip 1 ,
.Xr gzexe 1 ,
.Xr gzip 1 ,
.Xr zcat 1 ,
.Xr zmore 1 ,
.Xr znew 1
.Rs
.%A Welch, Terry A.
.%D June, 1984
.%T "A Technique for High Performance Data Compression"
.%J "IEEE Computer"
.%V 17:6
.%P pp. 8-19
.Re
.Sh STANDARDS
The
.Nm compress
and
.Nm uncompress
utilities conform to
.St -p1003.1-2001 .
.Sh HISTORY
The
.Nm
command appeared in
.Bx 4.3 .
.Sh BUGS
The program does not handle links well and has no link-handling options.
.Pp
Some of these might be considered otherwise-undocumented features.
.Pp
.Nm compress :
If the utility does not compress a file because doing so would not
reduce its size, and a file of the same name except with an
.Pa .Z
extension exists, the named file is not really ignored as stated above;
it causes a prompt to confirm the overwriting of the file with the extension.
If the operation is confirmed, that file is deleted.
.Pp
.Nm uncompress :
If an empty file is compressed (using
.Fl f ) ,
the resulting
.Pa .Z
file is also empty.
That seems right, but if
.Nm uncompress
is then used on that file, an error will occur.
.Pp
Both utilities: If a
.Sq Fl
argument is used and the utility prompts the user, the standard input
is taken as the user's reply to the prompt.
.Pp
Both utilities:
If the specified file does not exist, but a similarly-named one with (for
.Nm compress )
or without (for
.Nm uncompress )
a
.Pa .Z
extension does exist, the utility will waste the user's time by not
immediately emitting an error message about the missing file and
continuing.
Instead, it first asks for confirmation to overwrite
the existing file and then does not overwrite it.