XZ Embedded
===========

    XZ Embedded is a relatively small, limited implementation of the .xz
    file format. Currently only decoding is implemented.

    XZ Embedded was written for use in the Linux kernel, but the code can
    be easily used in other environments too, including regular userspace
    applications. See userspace/xzminidec.c for an example program.

    This README contains information that is useful only when the copy
    of XZ Embedded isn't part of the Linux kernel tree. You should also
    read linux/Documentation/xz.txt even if you aren't using XZ Embedded
    as part of Linux; information in that file is not repeated in this
    README.

Compiling the Linux kernel module

    The xz_dec module depends on the crc32 module, so make sure that you
    have it enabled (CONFIG_CRC32).

    Building the xz_dec and xz_dec_test modules without support for BCJ
    filters:

        cd linux/lib/xz
        make -C /path/to/kernel/source \
                KCPPFLAGS=-I"$(pwd)/../../include" M="$(pwd)" \
                CONFIG_XZ_DEC=m CONFIG_XZ_DEC_TEST=m

    Building the xz_dec and xz_dec_test modules with support for BCJ
    filters:

        cd linux/lib/xz
        make -C /path/to/kernel/source \
                KCPPFLAGS=-I"$(pwd)/../../include" M="$(pwd)" \
                CONFIG_XZ_DEC=m CONFIG_XZ_DEC_TEST=m CONFIG_XZ_DEC_BCJ=y \
                CONFIG_XZ_DEC_X86=y CONFIG_XZ_DEC_POWERPC=y \
                CONFIG_XZ_DEC_IA64=y CONFIG_XZ_DEC_ARM=y \
                CONFIG_XZ_DEC_ARMTHUMB=y CONFIG_XZ_DEC_SPARC=y

    If you want only one or a few of the BCJ filters, omit the variables
    of the filters you don't need. CONFIG_XZ_DEC_BCJ=y is always required
    to build the support code shared between all BCJ filters.

    Most people don't need the xz_dec_test module. You can skip building
    it by omitting CONFIG_XZ_DEC_TEST=m from the make command line.

Compiler requirements

    XZ Embedded should compile as either GNU-C89 (used in the Linux
    kernel) or with any C99 compiler.
    Getting the code to compile with a non-GNU C89 compiler or a C++
    compiler should be quite easy as long as there is a data type for
    an unsigned 64-bit integer (or the code is modified not to support
    large files, which needs some more care than just using a 32-bit
    integer instead of a 64-bit one).

    If you use GCC, try to use a recent version. For example, on x86-32,
    xz_dec_lzma2.c compiled with GCC 3.3.6 is 15-25% slower than when
    compiled with GCC 4.3.3.

Embedding into userspace applications

    To embed the XZ decoder, copy the following files into a single
    directory in your source code tree:

        linux/include/linux/xz.h
        linux/lib/xz/xz_crc32.c
        linux/lib/xz/xz_dec_lzma2.c
        linux/lib/xz/xz_dec_stream.c
        linux/lib/xz/xz_lzma2.h
        linux/lib/xz/xz_private.h
        linux/lib/xz/xz_stream.h
        userspace/xz_config.h

    Alternatively, xz.h may be placed into a different directory but then
    that directory must be in the compiler include path when compiling
    the .c files.

    Your code should use only the functions declared in xz.h. The rest of
    the .h files are meant only for internal use in XZ Embedded.

    You may want to modify xz_config.h to be more suitable for your build
    environment.
    You should probably at least skim through it even if the default
    file works as is.

Supporting concatenated .xz files

    Regular .xz files can be concatenated as is and the xz command line
    tool will decompress all streams from a concatenated file (a few
    other popular formats and tools support this too). Such .xz files
    aren't as uncommon as one might think because pxz, an early threaded
    XZ compressor, created them.

    The xz_dec_run() function will stop after decompressing one stream.
    This is good when XZ data is stored inside some other file format.
    However, if one is decompressing regular standalone .xz files, one
    will want to decompress all streams in the file. This is easy with
    xz_dec_catrun(). To include support for xz_dec_catrun(), you need
    to #define XZ_DEC_CONCATENATED in xz_config.h or in compiler flags.

Integrity check support

    XZ Embedded always supports the integrity check types None and
    CRC32. Support for CRC64 is optional. SHA-256 is currently not
    supported in XZ Embedded although the .xz format does support it.
    The xz tool from XZ Utils uses CRC64 by default, but CRC32 is usually
    enough in embedded systems and keeps the code size smaller.

    If you want support for CRC64, you need to copy linux/lib/xz/xz_crc64.c
    into your application, and #define XZ_USE_CRC64 in xz_config.h or in
    compiler flags.
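
    For orientation, the checksum that CRC64 support adds can be sketched
    in a few lines. The following is a minimal bit-by-bit sketch, not the
    table-driven code from xz_crc64.c. It assumes the CRC64 variant of the
    .xz format (ECMA-182 polynomial in reflected form, with the value
    inverted on entry and on exit). The names sketch_crc64() and
    SKETCH_CRC64_POLY are made up for this example and aren't part of
    XZ Embedded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ECMA-182 polynomial, bit-reversed for LSB-first processing. */
#define SKETCH_CRC64_POLY UINT64_C(0xC96C5795D7870F42)

/*
 * sketch_crc64() is a hypothetical name used only in this example.
 * Pass crc = 0 for the first chunk and the previous return value
 * for each following chunk.
 */
uint64_t sketch_crc64(const uint8_t *buf, size_t size, uint64_t crc)
{
	size_t i;
	int bit;

	assert(buf != NULL || size == 0);

	/* Undo the final inversion of the previous call (or start at ~0). */
	crc = ~crc;

	for (i = 0; i < size; ++i) {
		crc ^= buf[i];
		for (bit = 0; bit < 8; ++bit) {
			/* Shift one bit out; apply the polynomial if it was set. */
			if (crc & 1)
				crc = (crc >> 1) ^ SKETCH_CRC64_POLY;
			else
				crc >>= 1;
		}
	}

	return ~crc;
}
```

    Feeding the nine ASCII bytes "123456789" with crc = 0 should give
    0x995DC9BBDF1939FA, the commonly published check value for this CRC
    variant. Splitting the input into chunks and chaining the calls gives
    the same result.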

    When using the internal CRC32 or CRC64, their lookup tables need to be
    initialized with xz_crc32_init() and xz_crc64_init(), respectively.
    See xz.h for details.

    To use external CRC32 or CRC64 code instead of the code from
    xz_crc32.c or xz_crc64.c, the following #defines may be used
    in xz_config.h or in compiler flags:

        #define XZ_INTERNAL_CRC32 0
        #define XZ_INTERNAL_CRC64 0

    Then it is up to you to provide compatible xz_crc32() or xz_crc64()
    functions.

    If the .xz file being decompressed uses an integrity check type that
    isn't supported by XZ Embedded, it is treated as an error and the
    file cannot be decompressed. In multi-call mode, this behavior can be
    changed by #defining XZ_DEC_ANY_CHECK. Then xz_dec_run() will return
    XZ_UNSUPPORTED_CHECK when an unsupported check type is detected.
    After that, decompression can be continued normally except that the
    integrity check won't be verified. In single-call mode there's
    no way to continue decoding, so XZ_DEC_ANY_CHECK is almost useless
    in single-call mode.

BCJ filter support

    If you want support for one or more BCJ filters, you also need to copy
    linux/lib/xz/xz_dec_bcj.c into your application, and use appropriate
    #defines in xz_config.h or in compiler flags. You don't need these
    #defines in the code that just uses XZ Embedded via xz.h, but having
    them always #defined doesn't hurt either.

        #define             Instruction set     BCJ filter endianness
        XZ_DEC_X86          x86-32 or x86-64    Little endian only
        XZ_DEC_POWERPC      PowerPC             Big endian only
        XZ_DEC_IA64         Itanium (IA-64)     Big or little endian
        XZ_DEC_ARM          ARM                 Little endian only
        XZ_DEC_ARMTHUMB     ARM-Thumb           Little endian only
        XZ_DEC_SPARC        SPARC               Big or little endian

    While some architectures are (partially) bi-endian, the endianness
    setting doesn't change the endianness of the instructions on all
    architectures. That's why Itanium and SPARC filters work for both big
    and little endian executables (Itanium has little endian instructions
    and SPARC has big endian instructions).

    There currently is no filter for little endian PowerPC or big endian
    ARM or ARM-Thumb. Implementing filters for them can be considered if
    there is a need for such filters in real-world applications.

Notes about shared libraries

    If you are including XZ Embedded into a shared library, you very
    probably should rename the xz_* functions to prevent symbol
    conflicts in case your library is linked against some other library
    or application that also has XZ Embedded in it (which may even be
    a different version of XZ Embedded). TODO: Provide an easy way
    to do this.
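
    Until such a mechanism exists, one possible approach is to rename the
    symbols with preprocessor macros that are seen before xz.h, both when
    compiling the XZ Embedded .c files and when compiling the code that
    calls them. The following is a minimal sketch, not a supported
    interface. The mylib_ prefix is a placeholder, the list of names is
    probably incomplete (check the object files with a tool like nm), and
    last_symbol, rename_took_effect() and the function body are stand-ins
    so that the example compiles without XZ Embedded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical rename block. It has to be seen before xz.h is included,
 * for example at the top of xz_config.h and of the calling code
 * (an assumption about the include order; verify it in your tree).
 */
#define xz_dec_init   mylib_xz_dec_init
#define xz_dec_run    mylib_xz_dec_run
#define xz_dec_reset  mylib_xz_dec_reset
#define xz_dec_end    mylib_xz_dec_end
#define xz_crc32_init mylib_xz_crc32_init
#define xz_crc32      mylib_xz_crc32

/* Records the real name of the function that ran last (demo only). */
static const char *last_symbol = NULL;

/*
 * Stand-in with the same name as a function declared in xz.h. The
 * preprocessor rewrites the name, so the symbol defined here is
 * mylib_xz_crc32_init. __func__ needs C99 or GNU C.
 */
void xz_crc32_init(void)
{
	last_symbol = __func__;
}

/* Returns nonzero if the stand-in was compiled under the prefixed name. */
int rename_took_effect(void)
{
	xz_crc32_init();	/* expands to mylib_xz_crc32_init() */
	assert(last_symbol != NULL);
	return strcmp(last_symbol, "mylib_xz_crc32_init") == 0;
}
```

    Callers keep writing xz_crc32_init() in their source code; only the
    name seen by the linker changes. Functions that are shared between
    the XZ Embedded .c files but not declared in xz.h would need the same
    treatment.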

    Please don't create a shared library of XZ Embedded itself unless
    it is fine to rebuild everything depending on that shared library
    every time you upgrade to a newer version of XZ Embedded. There are
    no API or ABI stability guarantees between different versions of
    XZ Embedded.