XZ Utils
========

    0. Overview
    1. Documentation
       1.1. Overall documentation
       1.2. Documentation for command-line tools
       1.3. Documentation for liblzma
    2. Version numbering
    3. Reporting bugs
    4. Translations
    5. Other implementations of the .xz format
    6. Contact information


0. Overview
-----------

    XZ Utils provide a general-purpose data-compression library plus
    command-line tools. The native file format is the .xz format, but
    also the legacy .lzma format is supported. The .xz format supports
    multiple compression algorithms, which are called "filters" in the
    context of XZ Utils. The primary filter is currently LZMA2. With
    typical files, XZ Utils create about 30 % smaller files than gzip.

    To ease adapting support for the .xz format into existing applications
    and scripts, the API of liblzma is somewhat similar to the API of the
    popular zlib library. For the same reason, the command-line tool xz
    has a command-line syntax similar to that of gzip.

    When aiming for the highest compression ratio, the LZMA2 encoder uses
    a lot of CPU time and may use, depending on the settings, even
    hundreds of megabytes of RAM.
    However, in fast modes, the LZMA2 encoder
    competes with bzip2 in compression speed, RAM usage, and compression
    ratio.

    LZMA2 is reasonably fast to decompress. It is a little slower than
    gzip, but a lot faster than bzip2. Being fast to decompress means
    that the .xz format is especially nice when the same file will be
    decompressed very many times (usually on different computers), which
    is the case e.g. when distributing software packages. In such
    situations, it's not too bad if the compression takes some time,
    since that needs to be done only once to benefit many people.

    With some file types, combining (or "chaining") LZMA2 with an
    additional filter can improve the compression ratio. A filter chain may
    contain up to four filters, although usually only one or two are used.
    For example, putting a BCJ (Branch/Call/Jump) filter before LZMA2
    in the filter chain can improve compression ratio of executable files.

    Since the .xz format allows adding new filter IDs, it is possible that
    some day there will be a filter that is, for example, much faster to
    compress than LZMA2 (but probably with worse compression ratio).
    Similarly, it is possible that some day there is a filter that will
    compress better than LZMA2.

    XZ Utils supports multithreaded compression. XZ Utils doesn't support
    multithreaded decompression yet.
    It has been planned though and taken
    into account when designing the .xz file format. In the future, files
    that were created in threaded mode can be decompressed in threaded
    mode too.


1. Documentation
----------------

1.1. Overall documentation

    README              This file

    INSTALL.generic     Generic install instructions for those not
                        familiar with packages using GNU Autotools
    INSTALL             Installation instructions specific to XZ Utils
    PACKAGERS           Information for packagers of XZ Utils

    COPYING             XZ Utils copyright and license information
    COPYING.0BSD        BSD Zero Clause License
    COPYING.GPLv2       GNU General Public License version 2
    COPYING.GPLv3       GNU General Public License version 3
    COPYING.LGPLv2.1    GNU Lesser General Public License version 2.1

    AUTHORS             The main authors of XZ Utils
    THANKS              Incomplete list of people who have helped make
                        this software
    NEWS                User-visible changes between XZ Utils releases
    ChangeLog           Detailed list of changes (commit log)
    TODO                Known bugs and some sort of to-do list

    Note that only some of the above files are included in binary
    packages.


1.2. Documentation for command-line tools

    The command-line tools are documented as man pages. In source code
    releases (and possibly also in some binary packages), the man pages
    are also provided in plain text (ASCII only) format in the directory
    "doc/man" to make the man pages more accessible to those whose
    operating system doesn't provide an easy way to view man pages.


1.3. Documentation for liblzma

    The liblzma API headers include short docs about each function
    and data type as Doxygen tags. These docs should be quite OK as
    a quick reference.

    There are a few example/tutorial programs that should help in
    getting started with liblzma. In the source package the examples
    are in "doc/examples" and in binary packages they may be under
    "examples" in the same directory as this README.

    Since the liblzma API has similarities to the zlib API, some people
    may find it useful to read the zlib docs and tutorial too:

        https://zlib.net/manual.html
        https://zlib.net/zlib_how.html


2. Version numbering
--------------------

    The version number format of XZ Utils is X.Y.ZS:

      - X is the major version. When this is incremented, the library
        API and ABI break.

      - Y is the minor version. It is incremented when new features
        are added without breaking the existing API or ABI. An even Y
        indicates a stable release and an odd Y indicates unstable
        (alpha or beta version).

      - Z is the revision. This has a different meaning for stable and
        unstable releases:

          * Stable: Z is incremented when bugs get fixed without adding
            any new features. This is intended to be convenient for
            downstream distributors that want bug fixes but don't want
            any new features to minimize the risk of introducing new bugs.

          * Unstable: Z is just a counter. API or ABI of features added
            in earlier unstable releases having the same X.Y may break.

      - S indicates stability of the release. It is missing from the
        stable releases, where Y is an even number. When Y is odd, S
        is either "alpha" or "beta" to make it very clear that such
        versions are not stable releases. The same X.Y.Z combination is
        not used for more than one stability level, i.e. after X.Y.Zalpha,
        the next version can be X.Y.(Z+1)beta but not X.Y.Zbeta.


3. Reporting bugs
-----------------

    Naturally it is easiest for me if you already know what causes the
    unexpected behavior.
    Even better if you have a patch to propose.
    However, quite often the reason for unexpected behavior is unknown,
    so here are a few things to do before sending a bug report:

      1. Try to create a small example that reproduces the issue.

      2. Compile XZ Utils with debugging code using configure switches
         --enable-debug and, if possible, --disable-shared. If you are
         using GCC, use CFLAGS='-O0 -ggdb3'. Don't strip the resulting
         binaries.

      3. Turn on core dumps. The exact command depends on your shell;
         for example, in GNU bash it is done with "ulimit -c unlimited",
         and in tcsh with "limit coredumpsize unlimited".

      4. Try to reproduce the suspected bug. If you get an "assertion
         failed" message, be sure to include the complete message in your
         bug report. If the application leaves a coredump, get a backtrace
         using gdb:
           $ gdb /path/to/app-binary   # Load the app to the debugger.
           (gdb) core core   # Open the coredump.
           (gdb) bt   # Print the backtrace. Copy & paste to bug report.
           (gdb) quit   # Quit gdb.

    Report your bug via email or IRC (see Contact information below).
    Don't send core dump files or any executables. If you have a small
    example file(s) (total size less than 256 KiB), please include
    it/them as an attachment.
    If you have bigger test files, put them
    online somewhere and include a URL to the file(s) in the bug report.

    Always include the exact version number of XZ Utils in the bug report.
    If you are using a snapshot from the git repository, use "git describe"
    to get the exact snapshot version. If you are using XZ Utils shipped
    in an operating system distribution, mention the distribution name,
    distribution version, and exact xz package version; if you cannot
    repeat the bug with the code compiled from unpatched source code,
    you probably need to report a bug to your distribution's bug tracking
    system.


4. Translations
---------------

    The xz command-line tool and all man pages can be translated.
    The translations are handled via the Translation Project. If you
    wish to help translate xz, please join the Translation Project:

        https://translationproject.org/html/translators.html

    Below are notes and testing instructions specific to xz
    translations.

    Testing can be done by installing xz into a temporary directory:

        ./configure --disable-shared --prefix=/tmp/xz-test
        # <Edit the .po file in the po directory.>
        make -C po update-po
        make install
        bash debug/translation.bash | less
        bash debug/translation.bash | less -S  # For --list outputs

    Repeat the above as needed (no need to re-run configure though).

    Note especially the following:

      - The output of --help and --long-help must look nice on
        an 80-column terminal. It's OK to add extra lines if needed.

      - In contrast, don't add extra lines to error messages and such.
        They are often preceded with e.g. a filename on the same line,
        so you have no way to predict where to put a \n. Let the terminal
        do the wrapping even if it looks ugly. Adding new lines will be
        even uglier in the generic case even if it looks nice in a few
        limited examples.

      - Be careful with column alignment in tables and table-like output
        (--list, --list --verbose --verbose, --info-memory, --help, and
        --long-help):

          * All descriptions of options in --help should start in the
            same column (but it doesn't need to be the same column as
            in the English messages; just be consistent if you change it).
            Check that both --help and --long-help look OK, since they
            share several strings.

          * --list --verbose and --info-memory print lines that have
            the format "Description:   %s". If you need a longer
            description, you can put extra space between the colon
            and %s. Then you may need to add extra space to other
            strings too so that the result as a whole looks good (all
            values start at the same column).

          * The columns of the actual tables in --list --verbose --verbose
            should be aligned properly. Abbreviate if necessary. It might
            be good to keep at least 2 or 3 spaces between column headings
            and avoid spaces in the headings so that the columns stand out
            better, but this is a matter of opinion. Do what you think
            looks best.

      - Be careful to put a period at the end of a sentence when the
        original version has it, and don't put it when the original
        doesn't have it. Similarly, be careful with \n characters
        at the beginning and end of the strings.

      - Read the TRANSLATORS comments that have been extracted from the
        source code and included in xz.pot. Some comments suggest
        testing with a specific command which needs an .xz file. You
        may use e.g. any tests/files/good-*.xz. However, these test
        commands are included in translation.bash output, so reading
        translation.bash output carefully can be enough.

      - If you find language problems in the original English strings,
        feel free to suggest improvements. Ask if something is unclear.

      - The translated messages should be understandable (sometimes this
        may be a problem with the original English messages too). Don't
        make a direct word-by-word translation from English especially if
        the result doesn't sound good in your language.

    Thanks for your help!


5. Other implementations of the .xz format
------------------------------------------

    7-Zip and the p7zip port of 7-Zip support the .xz format starting
    from the version 9.00alpha.

        https://7-zip.org/
        https://p7zip.sourceforge.net/

    XZ Embedded is a limited implementation written for use in the Linux
    kernel, but it is also suitable for other embedded use.

        https://tukaani.org/xz/embedded.html

    XZ for Java is a complete implementation written in pure Java.

        https://tukaani.org/xz/java.html


6. Contact information
----------------------

    XZ Utils in general:
      - Home page: https://tukaani.org/xz/
      - Email to maintainer(s): xz@tukaani.org
      - IRC: #tukaani on Libera Chat
      - GitHub: https://github.com/tukaani-project/xz

    Lead maintainer:
      - Email: Lasse Collin <lasse.collin@tukaani.org>
      - IRC: Larhzu on Libera Chat
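

    As an illustration of the X.Y.ZS scheme described in section 2
    (Version numbering), the following Python sketch splits a version
    string into its parts and derives the stability level from the
    rules given there. It is not part of XZ Utils: the function name
    parse_xz_version and the regular expression are assumptions made
    for this example only, and the sketch covers only the format as it
    is described in this README.

```python
import re

# Hypothetical pattern for the X.Y.ZS format from section 2.
# S is optional and, per the README, is either "alpha" or "beta".
_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(alpha|beta)?")


def parse_xz_version(text):
    """Split an X.Y.ZS string into (major, minor, revision, stability)."""
    match = _VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not an X.Y.ZS version string: {text!r}")

    major, minor, revision = (int(match.group(i)) for i in (1, 2, 3))
    suffix = match.group(4)

    if minor % 2 == 0:
        # An even Y indicates a stable release, which carries no S suffix.
        if suffix is not None:
            raise ValueError(f"even minor version with suffix: {text!r}")
        stability = "stable"
    else:
        # An odd Y indicates an unstable release; S must say which kind.
        if suffix is None:
            raise ValueError(f"odd minor version without suffix: {text!r}")
        stability = suffix

    return major, minor, revision, stability


if __name__ == "__main__":
    for example in ("5.6.2", "5.5.1alpha", "5.7.1beta"):
        print(example, "->", parse_xz_version(example))
```

    Strings that contradict the rules in section 2, such as an even Y
    combined with an "alpha" or "beta" suffix, are rejected with
    ValueError in this sketch rather than guessed at.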