XZ Utils
========

    0. Overview
    1. Documentation
       1.1. Overall documentation
       1.2. Documentation for command-line tools
       1.3. Documentation for liblzma
    2. Version numbering
    3. Reporting bugs
    4. Translations
    5. Other implementations of the .xz format
    6. Contact information


0. Overview
-----------

    XZ Utils provides a general-purpose data-compression library plus
    command-line tools. The native file format is the .xz format, but
    the legacy .lzma format is also supported. The .xz format supports
    multiple compression algorithms, which are called "filters" in the
    context of XZ Utils. The primary filter is currently LZMA2. With
    typical files, XZ Utils creates about 30 % smaller files than gzip.
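
    As a minimal sketch of the two container formats, the following
    uses Python's standard lzma module (a binding to liblzma, not part
    of XZ Utils itself) to write the same data as .xz and as legacy
    .lzma. The sample payload and the variable names are made up for
    illustration.

```python
import lzma

# Made-up sample data; any bytes will do.
payload = b"XZ Utils example payload\n" * 200

# Native .xz container (the module's default format).
xz_data = lzma.compress(payload, format=lzma.FORMAT_XZ)

# Legacy .lzma container, which the lzma module calls FORMAT_ALONE.
lzma_data = lzma.compress(payload, format=lzma.FORMAT_ALONE)

# Every .xz stream starts with the same six magic bytes.
XZ_MAGIC = b"\xfd7zXZ\x00"
is_xz = xz_data.startswith(XZ_MAGIC)

# FORMAT_AUTO (the default when decompressing) detects either container.
from_xz = lzma.decompress(xz_data)
from_lzma = lzma.decompress(lzma_data)

print(is_xz, from_xz == payload, from_lzma == payload)
```

    The legacy .lzma container has no such magic bytes, which is one
    reason automatic detection is left to the library here.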

    To ease adding support for the .xz format to existing applications
    and scripts, the API of liblzma is somewhat similar to the API of the
    popular zlib library. For the same reason, the command-line tool xz
    has a command-line syntax similar to that of gzip.
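
    One way to see the resemblance without writing C is through
    Python's standard zlib and lzma modules, which wrap zlib and
    liblzma respectively. This only illustrates the shared
    compress-then-flush pattern in those bindings; it does not show
    the C API itself. The helper name stream_compress and the sample
    chunks are made up for this sketch.

```python
import lzma
import zlib

# Made-up input, delivered in pieces as a streaming API would see it.
chunks = [b"first chunk, ", b"second chunk, ", b"third chunk"]


def stream_compress(compressor, parts):
    """Feed parts one at a time, then flush; works for both libraries."""
    out = [compressor.compress(part) for part in parts]
    out.append(compressor.flush())
    return b"".join(out)


# The same helper drives a zlib compressor and an .xz compressor.
deflated = stream_compress(zlib.compressobj(), chunks)
xz_data = stream_compress(lzma.LZMACompressor(), chunks)

original = b"".join(chunks)
print(zlib.decompress(deflated) == original)
print(lzma.decompress(xz_data) == original)
```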

    When aiming for the highest compression ratio, the LZMA2 encoder uses
    a lot of CPU time and may use, depending on the settings, even
    hundreds of megabytes of RAM. However, in fast modes, the LZMA2 encoder
    competes with bzip2 in compression speed, RAM usage, and compression
    ratio.

    LZMA2 is reasonably fast to decompress. It is a little slower than
    gzip, but a lot faster than bzip2. Being fast to decompress means
    that the .xz format is especially nice when the same file will be
    decompressed many times (usually on different computers), which
    is the case e.g. when distributing software packages. In such
    situations, it's not too bad if the compression takes some time,
    since that needs to be done only once to benefit many people.

    With some file types, combining (or "chaining") LZMA2 with an
    additional filter can improve the compression ratio. A filter chain may
    contain up to four filters, although usually only one or two are used.
    For example, putting a BCJ (Branch/Call/Jump) filter before LZMA2
    in the filter chain can improve the compression ratio of executable
    files.
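
    A two-filter chain of this kind can be sketched with Python's
    standard lzma module, which passes the chain to liblzma. The input
    below is stand-in bytes, not a real executable, so the sketch
    demonstrates only how the chain is specified and makes no claim
    about the resulting ratio.

```python
import lzma

# Stand-in bytes; a real experiment would read an x86 executable instead.
data = bytes(range(256)) * 64

# Filter chain: the x86 BCJ filter first, LZMA2 last.
chain = [
    {"id": lzma.FILTER_X86},
    {"id": lzma.FILTER_LZMA2, "preset": 6},
]

compressed = lzma.compress(data, format=lzma.FORMAT_XZ, filters=chain)

# The .xz headers record the chain, so decompression needs no filter list.
restored = lzma.decompress(compressed)
print(restored == data, len(data), len(compressed))
```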

    Since the .xz format allows adding new filter IDs, it is possible that
    some day there will be a filter that is, for example, much faster to
    compress than LZMA2 (but probably with a worse compression ratio).
    Similarly, it is possible that some day there will be a filter that
    compresses better than LZMA2.

    XZ Utils supports multithreaded compression. XZ Utils doesn't support
    multithreaded decompression yet. It has been planned, though, and was
    taken into account when designing the .xz file format. In the future,
    files that were created in threaded mode can be decompressed in
    threaded mode too.


1. Documentation
----------------

1.1. Overall documentation

    README                This file

    INSTALL.generic       Generic install instructions for those not
                          familiar with packages using GNU Autotools
    INSTALL               Installation instructions specific to XZ Utils
    PACKAGERS             Information for packagers of XZ Utils

    COPYING               XZ Utils copyright and license information
    COPYING.0BSD          BSD Zero Clause License
    COPYING.GPLv2         GNU General Public License version 2
    COPYING.GPLv3         GNU General Public License version 3
    COPYING.LGPLv2.1      GNU Lesser General Public License version 2.1

    AUTHORS               The main authors of XZ Utils
    THANKS                Incomplete list of people who have helped make
                          this software
    NEWS                  User-visible changes between XZ Utils releases
    ChangeLog             Detailed list of changes (commit log)
    TODO                  Known bugs and some sort of to-do list

    Note that only some of the above files are included in binary
    packages.


1.2. Documentation for command-line tools

    The command-line tools are documented as man pages. In source code
    releases (and possibly also in some binary packages), the man pages
    are also provided in plain text (ASCII only) format in the directory
    "doc/man" to make the man pages more accessible to those whose
    operating system doesn't provide an easy way to view man pages.


1.3. Documentation for liblzma

    The liblzma API headers include short docs about each function
    and data type as Doxygen tags. These docs should be quite OK as
    a quick reference.

    There are a few example/tutorial programs that should help in
    getting started with liblzma. In the source package the examples
    are in "doc/examples" and in binary packages they may be under
    "examples" in the same directory as this README.

    Since the liblzma API has similarities to the zlib API, some people
    may find it useful to read the zlib docs and tutorial too:

        https://zlib.net/manual.html
        https://zlib.net/zlib_how.html


2. Version numbering
--------------------

    The version number format of XZ Utils is X.Y.ZS:

      - X is the major version. When this is incremented, the library
        API and ABI break.

      - Y is the minor version. It is incremented when new features
        are added without breaking the existing API or ABI. An even Y
        indicates a stable release and an odd Y indicates an unstable
        release (alpha or beta).

      - Z is the revision. This has a different meaning for stable and
        unstable releases:

          * Stable: Z is incremented when bugs get fixed without adding
            any new features. This is intended to be convenient for
            downstream distributors that want bug fixes but, to minimize
            the risk of introducing new bugs, don't want any new features.

          * Unstable: Z is just a counter. API or ABI of features added
            in earlier unstable releases having the same X.Y may break.

      - S indicates stability of the release. It is missing from the
        stable releases, where Y is an even number. When Y is odd, S
        is either "alpha" or "beta" to make it very clear that such
        versions are not stable releases. The same X.Y.Z combination is
        not used for more than one stability level, i.e. after X.Y.Zalpha,
        the next version can be X.Y.(Z+1)beta but not X.Y.Zbeta.
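
    The rules above can be sketched as a small parser. This is an
    illustration only: the names XzVersion and parse_version are
    hypothetical and not part of XZ Utils, and the version strings
    are just sample inputs.

```python
import re
from typing import NamedTuple


class XzVersion(NamedTuple):
    major: int
    minor: int
    revision: int
    stability: str  # "stable", "alpha", or "beta"


_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(alpha|beta)?")


def parse_version(text: str) -> XzVersion:
    """Parse X.Y.ZS and check that S agrees with the parity of Y."""
    match = _VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not an X.Y.ZS version: {text!r}")
    major, minor, revision = (int(g) for g in match.group(1, 2, 3))
    suffix = match.group(4)
    if minor % 2 == 0 and suffix is not None:
        raise ValueError("even Y means stable, so S must be missing")
    if minor % 2 == 1 and suffix is None:
        raise ValueError('odd Y means unstable, so S must be "alpha" or "beta"')
    return XzVersion(major, minor, revision, suffix or "stable")


print(parse_version("5.6.2"))
print(parse_version("5.5.1alpha"))
```

    The sketch checks only the form of a single version string; it
    does not enforce the rule that the same X.Y.Z is never reused for
    a second stability level, since that needs the release history.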


3. Reporting bugs
-----------------

    Naturally it is easiest for me if you already know what causes the
    unexpected behavior. Even better if you have a patch to propose.
    However, quite often the reason for unexpected behavior is unknown,
    so here are a few things to do before sending a bug report:

      1. Try to create a small example that reproduces the issue.

      2. Compile XZ Utils with debugging code using configure switches
         --enable-debug and, if possible, --disable-shared. If you are
         using GCC, use CFLAGS='-O0 -ggdb3'. Don't strip the resulting
         binaries.

      3. Turn on core dumps. The exact command depends on your shell;
         for example in GNU bash it is done with "ulimit -c unlimited",
         and in tcsh with "limit coredumpsize unlimited".

      4. Try to reproduce the suspected bug. If you get an "assertion
         failed" message, be sure to include the complete message in
         your bug report. If the application leaves a core dump, get a
         backtrace using gdb:
           $ gdb /path/to/app-binary   # Load the app to the debugger.
           (gdb) core core   # Open the core dump.
           (gdb) bt   # Print the backtrace. Copy & paste to bug report.
           (gdb) quit   # Quit gdb.

    Report your bug via email or IRC (see Contact information below).
    Don't send core dump files or any executables. If you have small
    example files (total size less than 256 KiB), please include them
    as attachments. If you have bigger test files, put them online
    somewhere and include a URL to the files in the bug report.

    Always include the exact version number of XZ Utils in the bug report.
    If you are using a snapshot from the git repository, use "git describe"
    to get the exact snapshot version. If you are using XZ Utils shipped
    in an operating system distribution, mention the distribution name,
    distribution version, and exact xz package version; if you cannot
    repeat the bug with the code compiled from unpatched source code,
    you probably need to report a bug to your distribution's bug tracking
    system.


4. Translations
---------------

    The xz command-line tool and all man pages can be translated.
    The translations are handled via the Translation Project. If you
    wish to help translate xz, please join the Translation Project:

        https://translationproject.org/html/translators.html

    Below are notes and testing instructions specific to xz
    translations.

    Testing can be done by installing xz into a temporary directory:

        ./configure --disable-shared --prefix=/tmp/xz-test
        # <Edit the .po file in the po directory.>
        make -C po update-po
        make install
        bash debug/translation.bash | less
        bash debug/translation.bash | less -S  # For --list outputs

    Repeat the above as needed (no need to re-run configure though).

    Note especially the following:

      - The output of --help and --long-help must look nice on
        an 80-column terminal. It's OK to add extra lines if needed.

      - In contrast, don't add extra lines to error messages and such.
        They are often preceded by e.g. a filename on the same line,
        so you have no way to predict where to put a \n. Let the terminal
        do the wrapping even if it looks ugly. Adding line breaks will
        look even uglier in the general case even if it looks nice in
        a few limited examples.

      - Be careful with column alignment in tables and table-like output
        (--list, --list --verbose --verbose, --info-memory, --help, and
        --long-help):

          * All descriptions of options in --help should start in the
            same column (but it doesn't need to be the same column as
            in the English messages; just be consistent if you change it).
            Check that both --help and --long-help look OK, since they
            share several strings.

          * --list --verbose and --info-memory print lines that have
            the format "Description:   %s". If you need a longer
            description, you can put extra space between the colon
            and %s. Then you may need to add extra space to other
            strings too so that the result as a whole looks good (all
            values start at the same column).

          * The columns of the actual tables in --list --verbose --verbose
            should be aligned properly. Abbreviate if necessary. It might
            be good to keep at least 2 or 3 spaces between column headings
            and avoid spaces in the headings so that the columns stand out
            better, but this is a matter of opinion. Do what you think
            looks best.

      - Be careful to put a period at the end of a sentence when the
        original version has it, and don't put it when the original
        doesn't have it. Similarly, be careful with \n characters
        at the beginning and end of the strings.

      - Read the TRANSLATORS comments that have been extracted from the
        source code and included in xz.pot. Some comments suggest
        testing with a specific command which needs an .xz file. You
        may use e.g. any tests/files/good-*.xz. However, these test
        commands are included in the debug/translation.bash output, so
        reading that output carefully can be enough.

      - If you find language problems in the original English strings,
        feel free to suggest improvements. Ask if something is unclear.

      - The translated messages should be understandable (sometimes this
        may be a problem with the original English messages too). Don't
        make a direct word-by-word translation from English especially if
        the result doesn't sound good in your language.
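
    The 80-column note above can be checked mechanically. The sketch
    below is not part of XZ Utils; the function names are hypothetical
    and the width calculation is only an approximation (East Asian
    wide characters count as two columns, combining marks as zero),
    so a visual check in a real terminal is still worthwhile.

```python
import unicodedata


def display_width(line: str) -> int:
    """Approximate terminal columns: wide chars count 2, combining marks 0."""
    width = 0
    for ch in line:
        if unicodedata.combining(ch):
            continue
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def overlong_lines(text: str, limit: int = 80) -> list[tuple[int, int]]:
    """Return (line number, width) for every line wider than limit."""
    return [
        (number, display_width(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if display_width(line) > limit
    ]


# Made-up sample standing in for translated --help output.
sample = "  -z, --compress      force compression\n" + "x" * 81
print(overlong_lines(sample))
```

    In practice the text passed to overlong_lines would be the --help
    and --long-help output captured from the test installation under
    the translated locale.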

    Thanks for your help!


5. Other implementations of the .xz format
------------------------------------------

    7-Zip and the p7zip port of 7-Zip support the .xz format starting
    with version 9.00alpha.

        https://7-zip.org/
        https://p7zip.sourceforge.net/

    XZ Embedded is a limited implementation written for use in the Linux
    kernel, but it is also suitable for other embedded use.

        https://tukaani.org/xz/embedded.html

    XZ for Java is a complete implementation written in pure Java.

        https://tukaani.org/xz/java.html


6. Contact information
----------------------

    XZ Utils in general:
      - Home page: https://tukaani.org/xz/
      - Email to maintainer(s): xz@tukaani.org
      - IRC: #tukaani on Libera Chat
      - GitHub: https://github.com/tukaani-project/xz

    Lead maintainer:
      - Email: Lasse Collin <lasse.collin@tukaani.org>
      - IRC: Larhzu on Libera Chat