Zstandard wrapper for zlib
================================

The main objective of creating a zstd wrapper for [zlib](http://zlib.net/) is to allow a quick and smooth transition to zstd for projects already using zlib.

#### Required files

To build the zstd wrapper for zlib the following files are required:
- zlib.h
- a static or dynamic zlib library
- zlibWrapper/zstd_zlibwrapper.h
- zlibWrapper/zstd_zlibwrapper.c
- zlibWrapper/gz*.c files (gzclose.c, gzlib.c, gzread.c, gzwrite.c)
- zlibWrapper/gz*.h files (gzcompatibility.h, gzguts.h)
- a static or dynamic zstd library

The first two items are required by all projects using zlib and are not included with the zstd distribution.
The remaining files are supplied with the zstd distribution.


#### Embedding the zstd wrapper within your project

Let's assume that your project that uses zlib is compiled with:
```gcc project.o -lz```

To compile the zstd wrapper with your project you have to do the following:
- change every `#include "zlib.h"` to `#include "zstd_zlibwrapper.h"`
- compile your project with `zstd_zlibwrapper.c`, `gz*.c` and a static or dynamic zstd library

The linking should be changed to:
```gcc project.o zstd_zlibwrapper.o gz*.c -lz -lzstd```


#### Enabling zstd compression within your project

After embedding the zstd wrapper within your project, zstd compression is turned off by default.
Your project should work as before with zlib. There are two options to enable zstd compression:
- compilation with `-DZWRAP_USE_ZSTD=1` (or using `#define ZWRAP_USE_ZSTD 1` before `#include "zstd_zlibwrapper.h"`)
- using the `void ZWRAP_useZSTDcompression(int turn_on)` function (declared in `zstd_zlibwrapper.h`)

During decompression zlib and zstd streams are automatically detected and decompressed using the proper library.
This behavior can be changed using `ZWRAP_setDecompressionType(ZWRAP_FORCE_ZLIB)`, which makes zlib decompression slightly faster.


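To illustrate how automatic detection of this kind can work, the sketch below classifies a buffer by its first bytes. It relies on two facts from the format specifications: a zstd frame starts with the magic number `0xFD2FB528` stored in little-endian order, and a zlib stream starts with a two-byte header whose low nibble is 8 (deflate) and which is a multiple of 31 when read as a big-endian 16-bit value. The names `guess_stream_kind` and `stream_kind` are hypothetical and are not part of the wrapper; the wrapper's own detection logic may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type for this sketch (not part of the wrapper API). */
typedef enum { STREAM_UNKNOWN, STREAM_ZLIB, STREAM_ZSTD } stream_kind;

/* Guess the stream format from the first bytes of a buffer.
 * Assumption: this only mirrors the idea of auto-detection; it is not
 * the code used by zstd_zlibwrapper.c. */
stream_kind guess_stream_kind(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    /* zstd frame magic number 0xFD2FB528, stored little-endian. */
    if (len >= 4 && buf[0] == 0x28 && buf[1] == 0xB5 &&
        buf[2] == 0x2F && buf[3] == 0xFD)
        return STREAM_ZSTD;

    /* zlib header (RFC 1950): CM == 8, CINFO <= 7,
     * and (CMF * 256 + FLG) divisible by 31. */
    if (len >= 2 && (buf[0] & 0x0F) == 8 && (buf[0] >> 4) <= 7 &&
        ((unsigned)buf[0] * 256u + buf[1]) % 31u == 0)
        return STREAM_ZLIB;

    return STREAM_UNKNOWN;
}
```

For example, a buffer starting with the common zlib header bytes `0x78 0x9C` is classified as `STREAM_ZLIB`, while one starting with `0x28 0xB5 0x2F 0xFD` is classified as `STREAM_ZSTD`. Forcing zlib decompression skips a check of this kind, which is consistent with the small speed gain mentioned above.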
#### Example
We took the file `test/example.c` from [the zlib library distribution](http://zlib.net/) and copied it to [zlibWrapper/examples/example.c](examples/example.c).
After compilation and execution it shows the following results:
```
zlib version 1.2.8 = 0x1280, compile flags = 0x65
uncompress(): hello, hello!
gzread(): hello, hello!
gzgets() after gzseek:  hello!
inflate(): hello, hello!
large_inflate(): OK
after inflateSync(): hello, hello!
inflate with dictionary: hello, hello!
```
We then changed `#include "zlib.h"` to `#include "zstd_zlibwrapper.h"`, compiled the [example.c](examples/example.c) file
with `-DZWRAP_USE_ZSTD=1` and linked with the additional `zstd_zlibwrapper.o gz*.c -lzstd`.
We had to turn off the functions `test_flush` and `test_sync`, which use currently unsupported features.
After running it shows the following results:
```
zlib version 1.2.8 = 0x1280, compile flags = 0x65
uncompress(): hello, hello!
gzread(): hello, hello!
gzgets() after gzseek:  hello!
inflate(): hello, hello!
large_inflate(): OK
inflate with dictionary: hello, hello!
```
The script used for compilation can be found at [zlibWrapper/Makefile](Makefile).


#### Measuring the performance of the Zstandard wrapper for zlib

The zstd distribution contains a tool called `zwrapbench` which can measure the speed and ratio of zlib, zstd, and the wrapper.
The benchmark runs on the given files, or on synthetic data if no filenames are provided.
The files are read into memory and processed independently.
This makes the benchmark more precise because it eliminates I/O overhead.
Multiple filenames can be supplied as separate parameters or with wildcards; directory names can be used as parameters with the `-r` option.
The range of compression levels is selected with `-b` (first level) and `-e` (last level). The `-i` parameter sets the minimal time spent on each tested level.
With the `-B` option bigger files can be divided into smaller, independently compressed blocks.
The benchmark tool can be compiled with `make zwrapbench` using [zlibWrapper/Makefile](Makefile).


#### Improving speed of streaming compression

During streaming compression the compressor does not know in advance how much data it will have to compress.
Zstandard compression can be improved by providing the size of the source data to the compressor. By default the streaming compressor assumes that the data is bigger than 256 KB, which can hurt compression speed on smaller data.
The zstd wrapper provides the `ZWRAP_setPledgedSrcSize()` function, which sets the pledged source size for a given compression stream.
The function changes the zstd compression parameters, which may improve compression speed and/or ratio.
It should be called just after `deflateInit()` or `deflateReset()` and before `deflate()` or `deflateSetDictionary()`. The function is only helpful when data is compressed in blocks. It makes no difference when `deflateInit()` or `deflateReset()` is immediately followed by `deflate(strm, Z_FINISH)`, as this case is automatically detected.


#### Reusing contexts

The ordinary zlib compression of two files/streams allocates two contexts:
- for the 1st file calls `deflateInit`, `deflate`, `...`, `deflate`, `deflateEnd`
- for the 2nd file calls `deflateInit`, `deflate`, `...`, `deflate`, `deflateEnd`

Compression speed can be improved by reusing a single context with the following steps:
- initialize the context with `deflateInit`
- for the 1st file call `deflate`, `...`, `deflate`
- for the 2nd file call `deflateReset`, `deflate`, `...`, `deflate`
- free the context with `deflateEnd`

To check the difference we ran experiments using `zwrapbench -ri6b6` with zstd and zlib compression (both at level 6).
The input data was the decompressed git repository downloaded from https://github.com/git/git/archive/master.zip, which contains 2979 files.
The table below shows that reusing contexts has a minor influence on zlib but gives an improvement for zstd.
In our example (the last 2 lines) it gives roughly 4.7% better compression speed and 5.1% better decompression speed.

| Compression type                                  | Compression | Decompression | Compressed size | Ratio |
| ------------------------------------------------- | ------------| ------------- | --------------- | ----- |
| zlib 1.2.8                                        |  30.51 MB/s |    219.3 MB/s |         6819783 | 3.459 |
| zlib 1.2.8 not reusing a context                  |  30.22 MB/s |    218.1 MB/s |         6819783 | 3.459 |
| zlib 1.2.8 with zlibWrapper and reusing a context |  30.40 MB/s |    218.9 MB/s |         6819783 | 3.459 |
| zlib 1.2.8 with zlibWrapper not reusing a context |  30.28 MB/s |    218.1 MB/s |         6819783 | 3.459 |
| zstd 1.1.0 using ZSTD_CCtx                        |  68.35 MB/s |    430.9 MB/s |         6868521 | 3.435 |
| zstd 1.1.0 using ZSTD_CStream                     |  66.63 MB/s |    422.3 MB/s |         6868521 | 3.435 |
| zstd 1.1.0 with zlibWrapper and reusing a context |  54.01 MB/s |    403.2 MB/s |         6763482 | 3.488 |
| zstd 1.1.0 with zlibWrapper not reusing a context |  51.59 MB/s |    383.7 MB/s |         6763482 | 3.488 |


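The gain from reusing a context can be recomputed from any pair of rows in the table above as the relative difference between the two speeds. The helper below is a minimal sketch of that arithmetic; `relative_gain_percent` is a hypothetical name used only for illustration.

```c
#include <assert.h>

/* Hypothetical helper: percentage by which the speed measured with a
 * reused context exceeds the speed measured with a fresh context. */
double relative_gain_percent(double reused_speed, double fresh_speed)
{
    assert(fresh_speed > 0.0);
    return (reused_speed - fresh_speed) / fresh_speed * 100.0;
}
```

With the two zstd wrapper rows, `relative_gain_percent(54.01, 51.59)` is about 4.7 and `relative_gain_percent(403.2, 383.7)` is about 5.1. The same calculation on the zlib rows stays under 1%, which matches the observation that context reuse matters little for zlib.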
#### Compatibility issues
After enabling zstd compression not all native zlib functions are supported. Unsupported methods put an error message into `strm->msg` and return `Z_STREAM_ERROR`.

Supported methods:
- deflateInit
- deflate (with exception of Z_FULL_FLUSH, Z_BLOCK, and Z_TREES)
- deflateSetDictionary
- deflateEnd
- deflateReset
- deflateBound
- inflateInit
- inflate
- inflateSetDictionary
- inflateReset
- inflateReset2
- compress
- compress2
- compressBound
- uncompress
- gzip file access functions

Ignored methods (they do nothing):
- deflateParams

Unsupported methods:
- deflateCopy
- deflateTune
- deflatePending
- deflatePrime
- deflateSetHeader
- inflateGetDictionary
- inflateCopy
- inflateSync
- inflatePrime
- inflateMark
- inflateGetHeader
- inflateBackInit
- inflateBack
- inflateBackEnd