XZ Utils
========

    0. Overview
    1. Documentation
       1.1. Overall documentation
       1.2. Documentation for command-line tools
       1.3. Documentation for liblzma
    2. Version numbering
    3. Reporting bugs
    4. Translations
       4.1. Testing translations
    5. Other implementations of the .xz format
    6. Contact information


0. Overview
-----------

    XZ Utils provide a general-purpose data-compression library plus
    command-line tools. The native file format is the .xz format, but
    also the legacy .lzma format is supported. The .xz format supports
    multiple compression algorithms, which are called "filters" in the
    context of XZ Utils. The primary filter is currently LZMA2. With
    typical files, XZ Utils create about 30 % smaller files than gzip.

    To ease adapting support for the .xz format into existing applications
    and scripts, the API of liblzma is somewhat similar to the API of the
    popular zlib library. For the same reason, the command-line tool xz
    has a command-line syntax similar to that of gzip.
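
    The zlib-like shape of the API can be illustrated with a minimal
    sketch. It uses Python's standard "lzma" module, which is a binding
    to liblzma and not part of XZ Utils itself. The helper names
    xz_compress_chunks and zlib_compress_chunks are made up for this
    example; the C API of liblzma differs in its details.

```python
import lzma
import zlib


def xz_compress_chunks(chunks, preset=6):
    """Compress an iterable of byte chunks into one .xz stream."""
    comp = lzma.LZMACompressor(format=lzma.FORMAT_XZ, preset=preset)
    out = [comp.compress(chunk) for chunk in chunks]
    out.append(comp.flush())  # Finish the stream, as with zlib.
    return b"".join(out)


def zlib_compress_chunks(chunks, level=6):
    """The same loop written against zlib, for comparison."""
    comp = zlib.compressobj(level)
    out = [comp.compress(chunk) for chunk in chunks]
    out.append(comp.flush())
    return b"".join(out)


data = [b"hello, xz\n"] * 1000
xz_blob = xz_compress_chunks(data)

# FORMAT_AUTO (the default) accepts both .xz and legacy .lzma input.
assert lzma.decompress(xz_blob) == b"".join(data)

# The legacy .lzma container is selected with FORMAT_ALONE.
lzma_blob = lzma.compress(b"".join(data), format=lzma.FORMAT_ALONE)
assert lzma.decompress(lzma_blob) == b"".join(data)
```

    Both helpers follow the same create/compress/flush pattern, which
    is the similarity the paragraph above refers to.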

    When aiming for the highest compression ratio, the LZMA2 encoder uses
    a lot of CPU time and may use, depending on the settings, even
    hundreds of megabytes of RAM. However, in fast modes, the LZMA2 encoder
    competes with bzip2 in compression speed, RAM usage, and compression
    ratio.

    LZMA2 is reasonably fast to decompress. It is a little slower than
    gzip, but a lot faster than bzip2. Being fast to decompress means
    that the .xz format is especially nice when the same file will be
    decompressed very many times (usually on different computers), which
    is the case e.g. when distributing software packages. In such
    situations, it's not too bad if the compression takes some time,
    since that needs to be done only once to benefit many people.

    With some file types, combining (or "chaining") LZMA2 with an
    additional filter can improve the compression ratio. A filter chain may
    contain up to four filters, although usually only one or two are used.
    For example, putting a BCJ (Branch/Call/Jump) filter before LZMA2
    in the filter chain can improve compression ratio of executable files.

    Since the .xz format allows adding new filter IDs, it is possible that
    some day there will be a filter that is, for example, much faster to
    compress than LZMA2 (but probably with worse compression ratio).
    Similarly, it is possible that some day there will be a filter that
    compresses better than LZMA2.

    XZ Utils supports multithreaded compression. XZ Utils doesn't support
    multithreaded decompression yet. It has been planned though and taken
    into account when designing the .xz file format. In the future, files
    that were created in threaded mode can be decompressed in threaded
    mode too.


1. Documentation
----------------

1.1. Overall documentation

    README              This file

    INSTALL.generic     Generic install instructions for those not
                        familiar with packages using GNU Autotools
    INSTALL             Installation instructions specific to XZ Utils
    PACKAGERS           Information to packagers of XZ Utils

    COPYING             XZ Utils copyright and license information
    COPYING.0BSD        BSD Zero Clause License
    COPYING.GPLv2       GNU General Public License version 2
    COPYING.GPLv3       GNU General Public License version 3
    COPYING.LGPLv2.1    GNU Lesser General Public License version 2.1

    AUTHORS             The main authors of XZ Utils
    THANKS              Incomplete list of people who have helped make
                        this software
    NEWS                User-visible changes between XZ Utils releases
    ChangeLog           Detailed list of changes (commit log)
    TODO                Known bugs and some sort of to-do list

    Note that only some of the above files are included in binary
    packages.


1.2. Documentation for command-line tools

    The command-line tools are documented as man pages. In source code
    releases (and possibly also in some binary packages), the man pages
    are also provided in plain text (ASCII only) format in the directory
    "doc/man" to make the man pages more accessible to those whose
    operating system doesn't provide an easy way to view man pages.


1.3. Documentation for liblzma

    The liblzma API headers include short docs about each function
    and data type as Doxygen tags. These docs should be quite OK as
    a quick reference.

    There are a few example/tutorial programs that should help in
    getting started with liblzma. In the source package the examples
    are in "doc/examples" and in binary packages they may be under
    "examples" in the same directory as this README.

    Since the liblzma API has similarities to the zlib API, some people
    may find it useful to read the zlib docs and tutorial too:

        https://zlib.net/manual.html
        https://zlib.net/zlib_how.html


2. Version numbering
--------------------

    The version number format of XZ Utils is X.Y.ZS:

      - X is the major version. When this is incremented, the library
        API and ABI break.

      - Y is the minor version. It is incremented when new features
        are added without breaking the existing API or ABI. An even Y
        indicates a stable release and an odd Y indicates unstable
        (alpha or beta version).

      - Z is the revision. This has a different meaning for stable and
        unstable releases:

          * Stable: Z is incremented when bugs get fixed without adding
            any new features. This is intended to be convenient for
            downstream distributors that want bug fixes but don't want
            any new features to minimize the risk of introducing new bugs.

          * Unstable: Z is just a counter. API or ABI of features added
            in earlier unstable releases having the same X.Y may break.

      - S indicates stability of the release. It is missing from the
        stable releases, where Y is an even number. When Y is odd, S
        is either "alpha" or "beta" to make it very clear that such
        versions are not stable releases. The same X.Y.Z combination is
        not used for more than one stability level, i.e. after X.Y.Zalpha,
        the next version can be X.Y.(Z+1)beta but not X.Y.Zbeta.
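
    The rules above can be sketched as a small parser. This is only an
    illustration in Python; the names XZVersion and parse_xz_version
    are hypothetical and are not provided by XZ Utils or liblzma.

```python
import re
from typing import NamedTuple

# X.Y.Z followed by an optional "alpha" or "beta" suffix (the S part).
_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(alpha|beta)?")


class XZVersion(NamedTuple):
    major: int
    minor: int
    revision: int
    stability: str  # "stable", "alpha", or "beta"


def parse_xz_version(text: str) -> XZVersion:
    """Parse an X.Y.ZS string and check it against the rules above."""
    match = _VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not an X.Y.ZS version string: {text!r}")
    major, minor, revision = (int(g) for g in match.group(1, 2, 3))
    suffix = match.group(4)
    if minor % 2 == 0:
        # Even Y means a stable release, so S must be missing.
        if suffix is not None:
            raise ValueError(f"even Y with {suffix!r} suffix: {text!r}")
        return XZVersion(major, minor, revision, "stable")
    # Odd Y means an unstable release, so S must be present.
    if suffix is None:
        raise ValueError(f"odd Y without alpha/beta suffix: {text!r}")
    return XZVersion(major, minor, revision, suffix)


print(parse_xz_version("5.4.6"))
print(parse_xz_version("5.5.1alpha"))
```

    The version strings used here are only sample inputs that fit the
    X.Y.ZS pattern.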


3. Reporting bugs
-----------------

    Naturally it is easiest for me if you already know what causes the
    unexpected behavior. Even better if you have a patch to propose.
    However, quite often the reason for unexpected behavior is unknown,
    so here are a few things to do before sending a bug report:

      1. Try to create a small example that reproduces the issue.

      2. Compile XZ Utils with debugging code using configure switches
         --enable-debug and, if possible, --disable-shared. If you are
         using GCC, use CFLAGS='-O0 -ggdb3'. Don't strip the resulting
         binaries.

      3. Turn on core dumps. The exact command depends on your shell;
         for example in GNU bash it is done with "ulimit -c unlimited",
         and in tcsh with "limit coredumpsize unlimited".

      4. Try to reproduce the suspected bug. If you get an "assertion
         failed" message, be sure to include the complete message in
         your bug report. If the application leaves a coredump, get
         a backtrace using gdb:
           $ gdb /path/to/app-binary   # Load the app to the debugger.
           (gdb) core core   # Open the coredump.
           (gdb) bt   # Print the backtrace. Copy & paste to bug report.
           (gdb) quit   # Quit gdb.
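
    Steps 3 and 4 can also be scripted. The following is a rough
    sketch for POSIX systems using only the Python standard library.
    The function names are made up for this example, and the limit is
    raised only as far as the hard limit allows, which is not always
    the same as "ulimit -c unlimited".

```python
import resource
import signal
import subprocess
import sys


def enable_core_dumps():
    """Raise the soft core-size limit up to the current hard limit."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))


def run_and_describe(argv):
    """Run argv with core dumps enabled and describe how it ended."""
    # preexec_fn runs in the child just before the program starts, so
    # the changed limit applies only to the program being tested.
    proc = subprocess.run(argv, preexec_fn=enable_core_dumps)
    if proc.returncode < 0:
        # A negative return code means the child died from a signal.
        sig = signal.Signals(-proc.returncode)
        return f"killed by {sig.name}; look for a core file"
    return f"exited with status {proc.returncode}"


if __name__ == "__main__" and len(sys.argv) > 1:
    print(run_and_describe(sys.argv[1:]))
```

    Whether and where a core file is written still depends on the
    operating system configuration.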

    Report your bug via email or IRC (see Contact information below).
    Don't send core dump files or any executables. If you have a small
    example file(s) (total size less than 256 KiB), please include
    it/them as an attachment. If you have bigger test files, put them
    online somewhere and include a URL to the file(s) in the bug report.

    Always include the exact version number of XZ Utils in the bug report.
    If you are using a snapshot from the git repository, use "git describe"
    to get the exact snapshot version. If you are using XZ Utils shipped
    in an operating system distribution, mention the distribution name,
    distribution version, and exact xz package version; if you cannot
    repeat the bug with the code compiled from unpatched source code,
    you probably need to report a bug to your distribution's bug tracking
    system.


4. Translations
---------------

    The xz command line tool and all man pages can be translated.
    The translations are handled via the Translation Project.
    If you wish to help translate xz, please join the Translation
    Project:

        https://translationproject.org/html/translators.html

    Updates to translations won't be accepted by methods that bypass
    the Translation Project because there is a risk of duplicate work:
    translation updates made in the xz repository aren't seen by the
    translators in the Translation Project. If you have found bugs in
    a translation, please report them to the Language-Team address
    which can be found near the beginning of the PO file.

    If you find language problems in the original English strings,
    feel free to suggest improvements. Ask if something is unclear.


4.1. Testing translations

    Testing can be done by installing xz into a temporary directory.

    If building from the Git repository (not a tarball), generate the
    Autotools files:

        ./autogen.sh

    Create a subdirectory for the build files. The tmp-build directory
    can be deleted after testing.

        mkdir tmp-build
        cd tmp-build
        ../configure --disable-shared --enable-debug --prefix=$PWD/inst

    Edit the .po file in the po directory. Then build and install to
    the "tmp-build/inst" directory, and use translation.bash to see
    how some of the messages look.
    Repeat these steps if needed:

        make -C po update-po
        make -j"$(nproc)" install
        bash ../debug/translation.bash | less
        bash ../debug/translation.bash | less -S  # For --list outputs

    To test other languages, set the LANGUAGE environment variable
    before running translation.bash. The value should match the PO file
    name without the .po suffix. Example:

        export LANGUAGE=fi


5. Other implementations of the .xz format
------------------------------------------

    7-Zip and the p7zip port of 7-Zip support the .xz format starting
    from the version 9.00alpha.

        https://7-zip.org/
        https://p7zip.sourceforge.net/

    XZ Embedded is a limited implementation written for use in the Linux
    kernel, but it is also suitable for other embedded use.

        https://tukaani.org/xz/embedded.html

    XZ for Java is a complete implementation written in pure Java.

        https://tukaani.org/xz/java.html


6. Contact information
----------------------

    XZ Utils in general:
      - Home page: https://tukaani.org/xz/
      - Email to maintainer(s): xz@tukaani.org
      - IRC: #tukaani on Libera Chat
      - GitHub: https://github.com/tukaani-project/xz

    Lead maintainer:
      - Email: Lasse Collin <lasse.collin@tukaani.org>
      - IRC: Larhzu on Libera Chat