XZ Utils
========

    0. Overview
    1. Documentation
       1.1. Overall documentation
       1.2. Documentation for command-line tools
       1.3. Documentation for liblzma
    2. Version numbering
    3. Reporting bugs
    4. Translations
    5. Other implementations of the .xz format
    6. Contact information


0. Overview
-----------

    XZ Utils provide a general-purpose data-compression library plus
    command-line tools. The native file format is the .xz format, but
    also the legacy .lzma format is supported. The .xz format supports
    multiple compression algorithms, which are called "filters" in the
    context of XZ Utils. The primary filter is currently LZMA2. With
    typical files, XZ Utils create about 30 % smaller files than gzip.

    To ease adapting support for the .xz format into existing applications
    and scripts, the API of liblzma is somewhat similar to the API of the
    popular zlib library. For the same reason, the command-line tool xz
    has a command-line syntax similar to that of gzip.

    When aiming for the highest compression ratio, the LZMA2 encoder uses
    a lot of CPU time and may use, depending on the settings, even
    hundreds of megabytes of RAM.
    However, in fast modes, the LZMA2 encoder
    competes with bzip2 in compression speed, RAM usage, and compression
    ratio.

    LZMA2 is reasonably fast to decompress. It is a little slower than
    gzip, but a lot faster than bzip2. Being fast to decompress means
    that the .xz format is especially nice when the same file will be
    decompressed very many times (usually on different computers), which
    is the case e.g. when distributing software packages. In such
    situations, it's not too bad if the compression takes some time,
    since that needs to be done only once to benefit many people.

    With some file types, combining (or "chaining") LZMA2 with an
    additional filter can improve the compression ratio. A filter chain may
    contain up to four filters, although usually only one or two are used.
    For example, putting a BCJ (Branch/Call/Jump) filter before LZMA2
    in the filter chain can improve compression ratio of executable files.

    Since the .xz format allows adding new filter IDs, it is possible that
    some day there will be a filter that is, for example, much faster to
    compress than LZMA2 (but probably with worse compression ratio).
    Similarly, it is possible that some day there is a filter that will
    compress better than LZMA2.

    XZ Utils supports multithreaded compression. XZ Utils doesn't support
    multithreaded decompression yet.
    It has been planned though and taken
    into account when designing the .xz file format. In the future, files
    that were created in threaded mode can be decompressed in threaded
    mode too.


1. Documentation
----------------

1.1. Overall documentation

    README              This file

    INSTALL.generic     Generic install instructions for those not familiar
                        with packages using GNU Autotools
    INSTALL             Installation instructions specific to XZ Utils
    PACKAGERS           Information for packagers of XZ Utils

    COPYING             XZ Utils copyright and license information
    COPYING.GPLv2       GNU General Public License version 2
    COPYING.GPLv3       GNU General Public License version 3
    COPYING.LGPLv2.1    GNU Lesser General Public License version 2.1

    AUTHORS             The main authors of XZ Utils
    THANKS              Incomplete list of people who have helped make
                        this software
    NEWS                User-visible changes between XZ Utils releases
    ChangeLog           Detailed list of changes (commit log)
    TODO                Known bugs and some sort of to-do list

    Note that only some of the above files are included in binary
    packages.


1.2. Documentation for command-line tools

    The command-line tools are documented as man pages. In source code
    releases (and possibly also in some binary packages), the man pages
    are also provided in plain text (ASCII only) and PDF formats in the
    directory "doc/man" to make the man pages more accessible to those
    whose operating system doesn't provide an easy way to view man pages.


1.3. Documentation for liblzma

    The liblzma API headers include short docs about each function
    and data type as Doxygen tags. These docs should be quite OK as
    a quick reference.

    There are a few example/tutorial programs that should help in
    getting started with liblzma. In the source package the examples
    are in "doc/examples" and in binary packages they may be under
    "examples" in the same directory as this README.

    Since the liblzma API has similarities to the zlib API, some people
    may find it useful to read the zlib docs and tutorial too:

        https://zlib.net/manual.html
        https://zlib.net/zlib_how.html


2. Version numbering
--------------------

    The version number format of XZ Utils is X.Y.ZS:

      - X is the major version. When this is incremented, the library
        API and ABI break.

      - Y is the minor version. It is incremented when new features
        are added without breaking the existing API or ABI. An even Y
        indicates a stable release and an odd Y indicates unstable
        (alpha or beta version).

      - Z is the revision. This has a different meaning for stable and
        unstable releases:

          * Stable: Z is incremented when bugs get fixed without adding
            any new features. This is intended to be convenient for
            downstream distributors that want bug fixes but don't want
            any new features to minimize the risk of introducing new bugs.

          * Unstable: Z is just a counter. API or ABI of features added
            in earlier unstable releases having the same X.Y may break.

      - S indicates stability of the release. It is missing from the
        stable releases, where Y is an even number. When Y is odd, S
        is either "alpha" or "beta" to make it very clear that such
        versions are not stable releases. The same X.Y.Z combination is
        not used for more than one stability level, i.e. after X.Y.Zalpha,
        the next version can be X.Y.(Z+1)beta but not X.Y.Zbeta.


3. Reporting bugs
-----------------

    Naturally it is easiest for me if you already know what causes the
    unexpected behavior.
    Even better if you have a patch to propose.
    However, quite often the reason for unexpected behavior is unknown,
    so here are a few things to do before sending a bug report:

      1. Try to create a small example that reproduces the issue.

      2. Compile XZ Utils with debugging code using configure switches
         --enable-debug and, if possible, --disable-shared. If you are
         using GCC, use CFLAGS='-O0 -ggdb3'. Don't strip the resulting
         binaries.

      3. Turn on core dumps. The exact command depends on your shell;
         for example in GNU bash it is done with "ulimit -c unlimited",
         and in tcsh with "limit coredumpsize unlimited".

      4. Try to reproduce the suspected bug. If you get an "assertion
         failed" message, be sure to include the complete message in
         your bug report. If the application leaves a coredump, get a
         backtrace using gdb:
           $ gdb /path/to/app-binary   # Load the app to the debugger.
           (gdb) core core   # Open the coredump.
           (gdb) bt   # Print the backtrace. Copy & paste to bug report.
           (gdb) quit   # Quit gdb.

    Report your bug via email or IRC (see Contact information below).
    Don't send core dump files or any executables. If you have a small
    example file(s) (total size less than 256 KiB), please include
    it/them as an attachment.
    If you have bigger test files, put them
    online somewhere and include a URL to the file(s) in the bug report.

    Always include the exact version number of XZ Utils in the bug report.
    If you are using a snapshot from the git repository, use "git describe"
    to get the exact snapshot version. If you are using XZ Utils shipped
    in an operating system distribution, mention the distribution name,
    distribution version, and exact xz package version; if you cannot
    repeat the bug with the code compiled from unpatched source code,
    you probably need to report a bug to your distribution's bug tracking
    system.


4. Translations
---------------

    The xz command line tool and all man pages can be translated.
    The translations are handled via the Translation Project. If you
    wish to help translate xz, please join the Translation Project:

        https://translationproject.org/html/translators.html

    Below are notes and testing instructions specific to xz
    translations.

    Testing can be done by installing xz into a temporary directory:

        ./configure --disable-shared --prefix=/tmp/xz-test
        # <Edit the .po file in the po directory.>
        make -C po update-po
        make install
        bash debug/translation.bash | less
        bash debug/translation.bash | less -S  # For --list outputs

    Repeat the above as needed (no need to re-run configure though).

    Note especially the following:

      - The output of --help and --long-help must look nice on
        an 80-column terminal. It's OK to add extra lines if needed.

      - In contrast, don't add extra lines to error messages and such.
        They are often preceded with e.g. a filename on the same line,
        so you have no way to predict where to put a \n. Let the terminal
        do the wrapping even if it looks ugly. Adding new lines will be
        even uglier in the generic case even if it looks nice in a few
        limited examples.

      - Be careful with column alignment in tables and table-like output
        (--list, --list --verbose --verbose, --info-memory, --help, and
        --long-help):

          * All descriptions of options in --help should start in the
            same column (but it doesn't need to be the same column as
            in the English messages; just be consistent if you change it).
            Check that both --help and --long-help look OK, since they
            share several strings.

          * --list --verbose and --info-memory print lines that have
            the format "Description: %s". If you need a longer
            description, you can put extra space between the colon
            and %s. Then you may need to add extra space to other
            strings too so that the result as a whole looks good (all
            values start at the same column).

          * The columns of the actual tables in --list --verbose --verbose
            should be aligned properly. Abbreviate if necessary. It might
            be good to keep at least 2 or 3 spaces between column headings
            and avoid spaces in the headings so that the columns stand out
            better, but this is a matter of opinion. Do what you think
            looks best.

      - Be careful to put a period at the end of a sentence when the
        original version has it, and don't put it when the original
        doesn't have it. Similarly, be careful with \n characters
        at the beginning and end of the strings.

      - Read the TRANSLATORS comments that have been extracted from the
        source code and included in xz.pot. Some comments suggest
        testing with a specific command which needs an .xz file. You
        may use e.g. any tests/files/good-*.xz. However, these test
        commands are included in translation.bash output, so reading
        translation.bash output carefully can be enough.

      - If you find language problems in the original English strings,
        feel free to suggest improvements. Ask if something is unclear.

      - The translated messages should be understandable (sometimes this
        may be a problem with the original English messages too). Don't
        make a direct word-by-word translation from English especially if
        the result doesn't sound good in your language.

    Thanks for your help!


5. Other implementations of the .xz format
------------------------------------------

    7-Zip and the p7zip port of 7-Zip support the .xz format starting
    from version 9.00alpha.

        https://7-zip.org/
        https://p7zip.sourceforge.net/

    XZ Embedded is a limited implementation written for use in the Linux
    kernel, but it is also suitable for other embedded use.

        https://tukaani.org/xz/embedded.html

    XZ for Java is a complete implementation written in pure Java.

        https://tukaani.org/xz/java.html


6. Contact information
----------------------

    If you have questions, bug reports, patches etc. related to XZ Utils,
    the project maintainers Lasse Collin and Jia Tan can be reached via
    <xz@tukaani.org>.

    You might also find Lasse in #tukaani on Libera Chat (IRC).
    The nick is Larhzu.
    The channel tends to be pretty quiet,
    so just ask your question and someone might wake up.
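
    When reporting the exact version number, it can help to know
    whether it names a stable or an unstable release. The rules in
    section 2 (Version numbering) can be sketched as a small POSIX
    shell function. This is only an illustration: the function name
    xz_version_kind is hypothetical and not part of XZ Utils, and the
    sketch checks nothing beyond the even/odd rule for Y and the
    "alpha"/"beta" suffix S.

```shell
# Hypothetical helper (not shipped with XZ Utils): classify a version
# string of the form X.Y.ZS as "stable", "alpha", "beta", or "invalid".
xz_version_kind() {
    version=$1

    # Split off the optional stability suffix S.
    case $version in
        *alpha) suffix=alpha ;;
        *beta)  suffix=beta ;;
        *)      suffix= ;;
    esac
    numbers=${version%"$suffix"}

    # Require exactly three non-empty, dot-separated decimal fields.
    case $numbers in
        *[!0-9.]* | *.*.*.* | .* | *. | *..*)
            echo invalid; return 1 ;;
        *.*.*)
            ;;
        *)
            echo invalid; return 1 ;;
    esac

    # Extract Y, the middle field.
    minor=${numbers#*.}
    minor=${minor%%.*}

    # Even Y means stable (no suffix allowed); odd Y means unstable
    # (a suffix is required). Checking the last digit avoids shell
    # arithmetic and its octal handling of leading zeros.
    case $minor in
        *[02468])
            if [ -z "$suffix" ]; then
                echo stable
            else
                echo invalid; return 1
            fi
            ;;
        *)
            if [ -n "$suffix" ]; then
                echo "$suffix"
            else
                echo invalid; return 1
            fi
            ;;
    esac
}
```

    For example, "xz_version_kind 5.4.1" prints "stable" and
    "xz_version_kind 5.3.1alpha" prints "alpha", while a string such
    as "5.4.1beta" is reported as "invalid" because S must be missing
    when Y is even.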