XZ Embedded
===========

    XZ Embedded is a relatively small, limited implementation of the .xz
    file format. Currently only decoding is implemented.

    XZ Embedded was written for use in the Linux kernel, but the code can
    be easily used in other environments too, including regular userspace
    applications. See userspace/xzminidec.c for an example program.

    This README contains information that is useful only when the copy
    of XZ Embedded isn't part of the Linux kernel tree. You should also
    read linux/Documentation/xz.txt even if you aren't using XZ Embedded
    as part of Linux; information in that file is not repeated in this
    README.
    
Compiling the Linux kernel module

    The xz_dec module depends on the crc32 module, so make sure that you
    have it enabled (CONFIG_CRC32).

    Building the xz_dec and xz_dec_test modules without support for BCJ
    filters:

        cd linux/lib/xz
        make -C /path/to/kernel/source \
                KCPPFLAGS=-I"$(pwd)/../../include" M="$(pwd)" \
                CONFIG_XZ_DEC=m CONFIG_XZ_DEC_TEST=m

    Building the xz_dec and xz_dec_test modules with support for BCJ
    filters:

        cd linux/lib/xz
        make -C /path/to/kernel/source \
                KCPPFLAGS=-I"$(pwd)/../../include" M="$(pwd)" \
                CONFIG_XZ_DEC=m CONFIG_XZ_DEC_TEST=m CONFIG_XZ_DEC_BCJ=y \
                CONFIG_XZ_DEC_X86=y CONFIG_XZ_DEC_POWERPC=y \
                CONFIG_XZ_DEC_IA64=y CONFIG_XZ_DEC_ARM=y \
                CONFIG_XZ_DEC_ARMTHUMB=y CONFIG_XZ_DEC_SPARC=y

    If you want only some of the BCJ filters, omit the variables for the
    others. CONFIG_XZ_DEC_BCJ=y is always required to build the support
    code shared between all BCJ filters.

    Most people don't need the xz_dec_test module. You can skip building
    it by omitting CONFIG_XZ_DEC_TEST=m from the make command line.
    
Compiler requirements

    XZ Embedded should compile either as GNU-C89 (used in the Linux
    kernel) or with any C99 compiler. Getting the code to compile with
    a non-GNU C89 compiler or a C++ compiler should be quite easy as
    long as there is a data type for an unsigned 64-bit integer (or the
    code is modified not to support large files, which needs some more
    care than just using a 32-bit integer instead of a 64-bit one).

    If you use GCC, try to use a recent version. For example, on x86-32,
    xz_dec_lzma2.c compiled with GCC 3.3.6 is 15-25% slower than when
    compiled with GCC 4.3.3.
    
Embedding into userspace applications

    To embed the XZ decoder, copy the following files into a single
    directory in your source code tree:

        linux/include/linux/xz.h
        linux/lib/xz/xz_crc32.c
        linux/lib/xz/xz_dec_lzma2.c
        linux/lib/xz/xz_dec_stream.c
        linux/lib/xz/xz_lzma2.h
        linux/lib/xz/xz_private.h
        linux/lib/xz/xz_stream.h
        userspace/xz_config.h

    Alternatively, xz.h may be placed into a different directory, but
    then that directory must be in the compiler include path when
    compiling the .c files.

    Your code should use only the functions declared in xz.h. The rest of
    the .h files are meant only for internal use in XZ Embedded.

    You may want to modify xz_config.h to be more suitable for your build
    environment. You should probably at least skim through it even if the
    default file works as is.
    
Supporting concatenated .xz files

    Regular .xz files can be concatenated as is, and the xz command line
    tool will decompress all streams from a concatenated file (a few
    other popular formats and tools support this too). Such files aren't
    as uncommon as one might think, because pxz, an early threaded XZ
    compressor, created them.

    The xz_dec_run() function will stop after decompressing one stream.
    This is good when XZ data is stored inside some other file format.
    However, if one is decompressing regular standalone .xz files, one
    will want to decompress all streams in the file. This is easy with
    xz_dec_catrun(). To include support for xz_dec_catrun(), you need
    to #define XZ_DEC_CONCATENATED in xz_config.h or in compiler flags.
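    As a rough sketch of how xz_dec_catrun() might be driven in
    multi-call mode, the program below decompresses every stream from
    stdin to stdout. It assumes that the files listed earlier are part
    of the build, that XZ_DEC_CONCATENATED is defined, and that the
    third argument of xz_dec_catrun() is a "finish" flag as declared in
    xz.h. The buffer sizes and the 64 MiB dictionary limit are arbitrary
    choices; check xz.h in your copy for the authoritative prototypes.

```c
#include <stdint.h>
#include <stdio.h>
#include "xz.h"

int main(void)
{
    static uint8_t in[4096];
    static uint8_t out[4096];
    struct xz_buf b = { .in = in, .out = out, .out_size = sizeof(out) };
    struct xz_dec *s;
    enum xz_ret ret;
    int finish = 0;

    /* Needed for the internal CRC32; add xz_crc64_init() for CRC64. */
    xz_crc32_init();

    /* Multi-call mode; the 64 MiB dictionary limit is an arbitrary choice. */
    s = xz_dec_init(XZ_DYNALLOC, (uint32_t)1 << 26);
    if (s == NULL)
        return 1;

    do {
        /* Refill the input buffer; a short read means EOF or an error. */
        if (b.in_pos == b.in_size && !finish) {
            b.in_size = fread(in, 1, sizeof(in), stdin);
            b.in_pos = 0;
            finish = b.in_size < sizeof(in);
        }

        /* A non-zero finish tells the decoder that no more input follows. */
        ret = xz_dec_catrun(s, &b, finish);

        /* Flush when the output buffer is full or decoding has stopped. */
        if (b.out_pos == sizeof(out) || ret != XZ_OK) {
            if (fwrite(out, 1, b.out_pos, stdout) != b.out_pos) {
                xz_dec_end(s);
                return 1;
            }
            b.out_pos = 0;
        }
    } while (ret == XZ_OK);

    xz_dec_end(s);

    /* Success requires XZ_STREAM_END and no read error on stdin. */
    return (ret == XZ_STREAM_END && !ferror(stdin)) ? 0 : 1;
}
```

    See userspace/xzminidec.c for the example program that ships with
    XZ Embedded; it reports the individual error codes instead of
    collapsing them into a single exit status.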
    
Integrity check support

    XZ Embedded always supports the integrity check types None and
    CRC32. Support for CRC64 is optional. SHA-256 is currently not
    supported in XZ Embedded although the .xz format does support it.
    The xz tool from XZ Utils uses CRC64 by default, but CRC32 is usually
    enough in embedded systems, and using it keeps the code size smaller.
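    To find out ahead of time which integrity check a file uses, the
    check ID can be read from the Stream Header. The sketch below relies
    on the .xz file format specification rather than on anything in
    XZ Embedded: the header is 12 bytes long, starts with the magic
    bytes FD 37 7A 58 5A 00, and the low four bits of the second flag
    byte hold the check ID. The function name xz_header_check_id() and
    the CHECK_* constants are made up for this example, and the header's
    own CRC32 is not verified here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Check IDs as defined by the .xz file format specification. */
enum {
    CHECK_NONE   = 0x00,
    CHECK_CRC32  = 0x01,
    CHECK_CRC64  = 0x04,
    CHECK_SHA256 = 0x0A
};

/*
 * Hypothetical helper (not part of XZ Embedded): return the check ID
 * stored in a 12-byte Stream Header, or -1 if the buffer is too short,
 * the magic bytes don't match, or reserved flag bits are set.
 */
int xz_header_check_id(const uint8_t *buf, size_t size)
{
    static const uint8_t magic[6] = { 0xFD, '7', 'z', 'X', 'Z', 0x00 };

    if (size < 12 || memcmp(buf, magic, sizeof(magic)) != 0)
        return -1;

    /* The first flag byte and the high nibble of the second are reserved. */
    if (buf[6] != 0x00 || (buf[7] & 0xF0) != 0)
        return -1;

    return buf[7] & 0x0F;
}
```

    A caller could use the result to reject SHA-256 files early, or to
    reject CRC64 files when XZ_USE_CRC64 isn't defined.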
    
    If you want support for CRC64, you need to copy linux/lib/xz/xz_crc64.c
    into your application, and #define XZ_USE_CRC64 in xz_config.h or in
    compiler flags.

    When using the internal CRC32 or CRC64, their lookup tables need to be
    initialized with xz_crc32_init() and xz_crc64_init(), respectively.
    See xz.h for details.

    To use external CRC32 or CRC64 code instead of the code from
    xz_crc32.c or xz_crc64.c, the following #defines may be used
    in xz_config.h or in compiler flags:

        #define XZ_INTERNAL_CRC32 0
        #define XZ_INTERNAL_CRC64 0

    Then it is up to you to provide compatible xz_crc32() or xz_crc64()
    functions.
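    As a sketch of what "compatible" could look like, the functions
    below follow the prototypes that xz.h is believed to declare for the
    internal versions: a buffer, its size, and the CRC of the data
    processed so far (0 for the first call). They compute the CRCs bit
    by bit without lookup tables, so no init call is needed. The
    polynomials are the usual reflected CRC32 (0xEDB88320) and the
    reflected ECMA-182 CRC64 (0xC96C5795D7870F42) used by the .xz
    format. Verify the prototypes against your copy of xz.h; in practice
    you would more likely wrap an optimized routine that your platform
    already provides.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32; pass crc = 0 for the first chunk, then chain the result. */
uint32_t xz_crc32(const uint8_t *buf, size_t size, uint32_t crc)
{
    size_t i;
    int bit;

    crc = ~crc;
    for (i = 0; i < size; ++i) {
        crc ^= buf[i];
        for (bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (UINT32_C(0xEDB88320) & (UINT32_C(0) - (crc & 1)));
    }
    return ~crc;
}

/* Bitwise CRC64 with the same calling convention as xz_crc32() above. */
uint64_t xz_crc64(const uint8_t *buf, size_t size, uint64_t crc)
{
    size_t i;
    int bit;

    crc = ~crc;
    for (i = 0; i < size; ++i) {
        crc ^= buf[i];
        for (bit = 0; bit < 8; ++bit)
            crc = (crc >> 1)
                ^ (UINT64_C(0xC96C5795D7870F42) & (UINT64_C(0) - (crc & 1)));
    }
    return ~crc;
}
```

    Because the running CRC is passed in and returned, data can be fed
    in several chunks: the result of one call becomes the crc argument
    of the next.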
    
    If the .xz file being decompressed uses an integrity check type that
    isn't supported by XZ Embedded, it is treated as an error and the
    file cannot be decompressed. For multi-call mode, this can be modified
    by #defining XZ_DEC_ANY_CHECK. Then xz_dec_run() will return
    XZ_UNSUPPORTED_CHECK when an unsupported check type is detected.
    After that, decompression can be continued normally except that the
    integrity check won't be verified. In single-call mode there's
    no way to continue decoding, so XZ_DEC_ANY_CHECK is almost useless
    in single-call mode.

BCJ filter support

    If you want support for one or more BCJ filters, you also need to
    copy linux/lib/xz/xz_dec_bcj.c into your application and use the
    appropriate #defines in xz_config.h or in compiler flags. You don't
    need these #defines in the code that just uses XZ Embedded via xz.h,
    but having them always #defined doesn't hurt either.

        #define             Instruction set     BCJ filter endianness
        XZ_DEC_X86          x86-32 or x86-64    Little endian only
        XZ_DEC_POWERPC      PowerPC             Big endian only
        XZ_DEC_IA64         Itanium (IA-64)     Big or little endian
        XZ_DEC_ARM          ARM                 Little endian only
        XZ_DEC_ARMTHUMB     ARM-Thumb           Little endian only
        XZ_DEC_SPARC        SPARC               Big or little endian

    While some architectures are (partially) bi-endian, the endianness
    setting doesn't change the endianness of the instructions on all
    architectures. That's why Itanium and SPARC filters work for both big
    and little endian executables (Itanium has little endian instructions
    and SPARC has big endian instructions).

    There currently is no filter for little endian PowerPC or big endian
    ARM or ARM-Thumb. Implementing filters for them can be considered if
    there is a need for such filters in real-world applications.
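    For example, a project that only needs to unpack x86 and little
    endian ARM executables might add something like the following to
    xz_config.h. The choice of filters is purely illustrative; passing
    -DXZ_DEC_X86 -DXZ_DEC_ARM in compiler flags should have the same
    effect. Remember that xz_dec_bcj.c has to be compiled as well.

```c
/* xz_config.h fragment: enable only the BCJ filters this project needs. */
#define XZ_DEC_X86      /* x86-32 and x86-64 */
#define XZ_DEC_ARM      /* little endian ARM */
```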
    
Notes about shared libraries

    If you are including XZ Embedded in a shared library, you should
    very probably rename the xz_* functions to prevent symbol conflicts
    in case your library is linked against some other library or
    application that also has XZ Embedded in it (which may even be a
    different version of XZ Embedded). TODO: Provide an easy way to do
    this.
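    Until such a mechanism exists, one possible workaround is to rename
    the functions with preprocessor macros. The sketch below uses a
    made-up prefix, mylib_, and a made-up header name, xz_rename.h. The
    list covers the functions that xz.h is expected to declare, so
    compare it against your copy and extend it to any non-static
    internal helpers (such as the xz_dec_lzma2_* and xz_dec_bcj_*
    functions) and the CRC lookup tables if they are visible in your
    build. The header has to be seen before xz.h in every file that is
    compiled, both in XZ Embedded itself and in the code calling it.

```c
/* xz_rename.h (hypothetical): give the xz_* symbols a private prefix. */
#ifndef XZ_RENAME_H
#define XZ_RENAME_H

#define xz_dec_init     mylib_xz_dec_init
#define xz_dec_run      mylib_xz_dec_run
#define xz_dec_catrun   mylib_xz_dec_catrun
#define xz_dec_reset    mylib_xz_dec_reset
#define xz_dec_end      mylib_xz_dec_end
#define xz_crc32_init   mylib_xz_crc32_init
#define xz_crc32        mylib_xz_crc32
#define xz_crc64_init   mylib_xz_crc64_init
#define xz_crc64        mylib_xz_crc64

#endif
```

    After building, listing the exported symbols of the library (for
    example with nm) is a quick way to check that no plain xz_* names
    are left.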
    
    Please don't create a shared library of XZ Embedded itself unless
    it is fine to rebuild everything depending on that shared library
    every time you upgrade to a newer version of XZ Embedded. There are
    no API or ABI stability guarantees between different versions of
    XZ Embedded.