Network Working Group                                         P. Deutsch
Request for Comments: 1950                           Aladdin Enterprises
Category: Informational                                      J-L. Gailly
                                                                Info-ZIP
                                                                May 1996


         ZLIB Compressed Data Format Specification version 3.3

Status of This Memo

   This memo provides information for the Internet community.  This memo
   does not specify an Internet standard of any kind.  Distribution of
   this memo is unlimited.

IESG Note:

   The IESG takes no position on the validity of any Intellectual
   Property Rights statements contained in this document.

Notices

   Copyright (c) 1996 L. Peter Deutsch and Jean-Loup Gailly

   Permission is granted to copy and distribute this document for any
   purpose and without charge, including translations into other
   languages and incorporation into compilations, provided that the
   copyright notice and this notice are preserved, and that any
   substantive changes or deletions from the original are clearly
   marked.

   A pointer to the latest version of this and related documentation in
   HTML format can be found at the URL
   <ftp://ftp.uu.net/graphics/png/documents/zlib/zdoc-index.html>.

Abstract

   This specification defines a lossless compressed data format.  The
   data can be produced or consumed, even for an arbitrarily long
   sequentially presented input data stream, using only an a priori
   bounded amount of intermediate storage.  The format presently uses
   the DEFLATE compression method but can be easily extended to use
   other compression methods.  It can be implemented readily in a manner
   not covered by patents.  This specification also defines the ADLER-32
   checksum (an extension and improvement of the Fletcher checksum),
   used for detection of data corruption, and provides an algorithm for
   computing it.

Deutsch & Gailly             Informational                      [Page 1]

RFC 1950       ZLIB Compressed Data Format Specification        May 1996

Table of Contents

   1. Introduction
      1.1. Purpose
      1.2. Intended audience
      1.3. Scope
      1.4. Compliance
      1.5. Definitions of terms and conventions used
      1.6. Changes from previous versions
   2. Detailed specification
      2.1. Overall conventions
      2.2. Data format
      2.3. Compliance
   3. References
   4. Source code
   5. Security Considerations
   6. Acknowledgements
   7. Authors' Addresses
   8. Appendix: Rationale
   9. Appendix: Sample code

1. Introduction

   1.1. Purpose

      The purpose of this specification is to define a lossless
      compressed data format that:

          * Is independent of CPU type, operating system, file system,
            and character set, and hence can be used for interchange;

          * Can be produced or consumed, even for an arbitrarily long
            sequentially presented input data stream, using only an a
            priori bounded amount of intermediate storage, and hence can
            be used in data communications or similar structures such as
            Unix filters;

          * Can use a number of different compression methods;

          * Can be implemented readily in a manner not covered by
            patents, and hence can be practiced freely.

      The data format defined by this specification does not attempt to
      allow random access to compressed data.

   1.2. Intended audience

      This specification is intended for use by implementors of software
      to compress data into zlib format and/or decompress data from zlib
      format.

      The text of the specification assumes a basic background in
      programming at the level of bits and other primitive data
      representations.

   1.3. Scope

      This specification defines a compressed data format that can be
      used for in-memory compression of a sequence of arbitrary bytes.

   1.4. Compliance

      Unless otherwise indicated below, a compliant decompressor must be
      able to accept and decompress any data set that conforms to all
      the specifications presented here; a compliant compressor must
      produce data sets that conform to all the specifications presented
      here.

   1.5. Definitions of terms and conventions used

      byte: 8 bits stored or transmitted as a unit (same as an octet).
      (For this specification, a byte is exactly 8 bits, even on
      machines which store a character on a number of bits different
      from 8.) See below for the numbering of bits within a byte.

   1.6. Changes from previous versions

      Version 3.1 was the first public release of this specification.
      In version 3.2, some terminology was changed and the Adler-32
      sample code was rewritten for clarity.  In version 3.3, the
      support for a preset dictionary was introduced, and the
      specification was converted to RFC style.

2. Detailed specification

   2.1. Overall conventions

      In the diagrams below, a box like this:

         +---+
         |   | <-- the vertical bars might be missing
         +---+

      represents one byte; a box like this:

         +==============+
         |              |
         +==============+

      represents a variable number of bytes.

      Bytes stored within a computer do not have a "bit order", since
      they are always treated as a unit.  However, a byte considered as
      an integer between 0 and 255 does have a most- and least-
      significant bit, and since we write numbers with the most-
      significant digit on the left, we also write bytes with the most-
      significant bit on the left.  In the diagrams below, we number the
      bits of a byte so that bit 0 is the least-significant bit, i.e.,
      the bits are numbered:

         +--------+
         |76543210|
         +--------+

      Within a computer, a number may occupy multiple bytes.  All
      multi-byte numbers in the format described here are stored with
      the MOST-significant byte first (at the lower memory address).
      For example, the decimal number 520 is stored as:

             0     1
         +--------+--------+
         |00000010|00001000|
         +--------+--------+
          ^        ^
          |        |
          |        + less significant byte = 8
          + more significant byte = 2 x 256
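
      The byte order described above can be sketched in C as follows.
      This is an illustrative sketch only; the helper names read_be16,
      read_be32 and write_be32 are hypothetical and are not part of
      this specification or of any library.

```c
#include <stdint.h>

/* Read a 16-bit value stored most-significant byte first. */
uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
}

/* Read a 32-bit value stored most-significant byte first. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Store a 32-bit value most-significant byte first. */
void write_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)((v >> 16) & 0xff);
    p[2] = (unsigned char)((v >> 8) & 0xff);
    p[3] = (unsigned char)(v & 0xff);
}
```

      With the two bytes 0x02 0x08 from the diagram, read_be16 yields
      2 x 256 + 8 = 520.  Building values from individual bytes in this
      way gives the same result regardless of the byte order of the
      host CPU.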

   2.2. Data format

      A zlib stream has the following structure:

           0   1
         +---+---+
         |CMF|FLG|   (more-->)
         +---+---+

      (if FLG.FDICT set)

           0   1   2   3
         +---+---+---+---+
         |     DICTID    |   (more-->)
         +---+---+---+---+

         +=====================+---+---+---+---+
         |...compressed data...|    ADLER32    |
         +=====================+---+---+---+---+

      Any data which may appear after ADLER32 are not part of the zlib
      stream.
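
      To make the layout concrete, the following sketch assembles a
      complete zlib stream around a single "stored" (uncompressed)
      deflate block.  The stored-block layout (one byte holding BFINAL
      and BTYPE, then LEN and its one's complement NLEN, least-
      significant byte first) is taken from the DEFLATE specification,
      not from this document.  The names sketch_adler32 and
      zlib_wrap_stored are hypothetical, and the header bytes 0x78 0x01
      are one valid choice among several.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Adler-32 of buf[0..len-1], written for clarity rather than speed. */
uint32_t sketch_adler32(const unsigned char *buf, size_t len)
{
    uint32_t s1 = 1, s2 = 0;
    size_t n;

    for (n = 0; n < len; n++) {
        s1 = (s1 + buf[n]) % 65521u;
        s2 = (s2 + s1) % 65521u;
    }
    return (s2 << 16) | s1;
}

/*
 * Wrap up to 65535 input bytes in a zlib stream holding one stored
 * deflate block.  Returns the stream length, or 0 if the input is too
 * long for one stored block or the output buffer is too small.
 */
size_t zlib_wrap_stored(const unsigned char *in, size_t len,
                        unsigned char *out, size_t cap)
{
    size_t total = 2 + 5 + len + 4;  /* header, block header, data, ADLER32 */
    uint32_t a;

    if (len > 65535u || cap < total)
        return 0;

    out[0] = 0x78;                   /* CMF: CM = 8, CINFO = 7 */
    out[1] = 0x01;                   /* FLG: FLEVEL = 0, FDICT = 0, FCHECK = 1 */
    out[2] = 0x01;                   /* deflate: BFINAL = 1, BTYPE = 00 */
    out[3] = (unsigned char)(len & 0xff);          /* LEN, LSB first */
    out[4] = (unsigned char)((len >> 8) & 0xff);
    out[5] = (unsigned char)(~len & 0xff);         /* NLEN = ~LEN */
    out[6] = (unsigned char)((~len >> 8) & 0xff);
    if (len > 0)
        memcpy(out + 7, in, len);

    a = sketch_adler32(in, len);     /* checksum of the uncompressed data */
    out[7 + len]  = (unsigned char)(a >> 24);      /* ADLER32, MSB first */
    out[8 + len]  = (unsigned char)((a >> 16) & 0xff);
    out[9 + len]  = (unsigned char)((a >> 8) & 0xff);
    out[10 + len] = (unsigned char)(a & 0xff);
    return total;
}
```

      For the one-byte input "a" this produces the 12 bytes
      78 01 01 01 00 FE FF 61 00 62 00 62.  Note the contrast between
      the two byte orders: LEN and NLEN belong to the deflate data and
      are least-significant byte first, while ADLER32 belongs to the
      zlib wrapper and is most-significant byte first.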

      CMF (Compression Method and flags)
         This byte is divided into a 4-bit compression method and a 4-
         bit information field depending on the compression method.

            bits 0 to 3  CM     Compression method
            bits 4 to 7  CINFO  Compression info

      CM (Compression method)
         This identifies the compression method used in the file. CM = 8
         denotes the "deflate" compression method with a window size up
         to 32K.  This is the method used by gzip and PNG (see
         references [1] and [2] in Chapter 3, below, for the reference
         documents).  CM = 15 is reserved.  It might be used in a future
         version of this specification to indicate the presence of an
         extra field before the compressed data.

      CINFO (Compression info)
         For CM = 8, CINFO is the base-2 logarithm of the LZ77 window
         size, minus eight (CINFO=7 indicates a 32K window size). Values
         of CINFO above 7 are not allowed in this version of the
         specification.  CINFO is not defined in this specification for
         CM not equal to 8.
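
      As an illustration of the CM and CINFO fields, the following
      sketch splits a CMF byte and derives the LZ77 window size.  The
      function name zlib_window_size is hypothetical, and returning 0
      for anything this version of the specification does not define is
      a choice made for this example.

```c
/*
 * Return the LZ77 window size in bytes encoded by a CMF byte, or 0 if
 * CM is not 8 or CINFO is above 7.
 */
unsigned long zlib_window_size(unsigned char cmf)
{
    unsigned cm    = cmf & 0x0fu;         /* bits 0 to 3 */
    unsigned cinfo = (cmf >> 4) & 0x0fu;  /* bits 4 to 7 */

    if (cm != 8 || cinfo > 7)
        return 0;
    return 1UL << (cinfo + 8);            /* log2(window) = CINFO + 8 */
}
```

      For the commonly seen CMF value 0x78, CM is 8 and CINFO is 7, so
      the window size is 2^15 = 32768 bytes.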

      FLG (FLaGs)
         This flag byte is divided as follows:

            bits 0 to 4  FCHECK  (check bits for CMF and FLG)
            bit  5       FDICT   (preset dictionary)
            bits 6 to 7  FLEVEL  (compression level)

         The FCHECK value must be such that CMF and FLG, when viewed as
         a 16-bit unsigned integer stored in MSB order (CMF*256 + FLG),
         is a multiple of 31.
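
      The FCHECK rule can be sketched as follows: a compressor fills in
      the low five bits of FLG so that CMF*256 + FLG becomes a multiple
      of 31, and a decompressor tests the same property.  The function
      names zlib_make_flg and zlib_fcheck_ok are hypothetical.

```c
/*
 * Build a FLG byte from FLEVEL (0..3) and FDICT (zero or nonzero),
 * choosing FCHECK so that CMF*256 + FLG is a multiple of 31.
 */
unsigned char zlib_make_flg(unsigned char cmf, unsigned flevel, int fdict)
{
    unsigned flg = ((flevel & 3u) << 6) | (fdict ? 0x20u : 0u);
    unsigned rem = ((unsigned)cmf * 256u + flg) % 31u;

    if (rem != 0)
        flg += 31u - rem;     /* 1..30, always fits in bits 0 to 4 */
    return (unsigned char)flg;
}

/* Nonzero if the CMF/FLG pair satisfies the FCHECK rule. */
int zlib_fcheck_ok(unsigned char cmf, unsigned char flg)
{
    return ((unsigned)cmf * 256u + flg) % 31u == 0;
}
```

      For CMF = 0x78 with FLEVEL = 2 and FDICT clear, the flag bits
      alone give 0x7880 = 30848, which leaves a remainder of 3 when
      divided by 31.  FCHECK is therefore 28, FLG becomes 0x9C, and
      0x789C = 30876 = 31 x 996.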

      FDICT (Preset dictionary)
         If FDICT is set, a DICTID dictionary identifier is present
         immediately after the FLG byte. The dictionary is a sequence of
         bytes which are initially fed to the compressor without
         producing any compressed output. DICTID is the Adler-32
         checksum of this sequence of bytes (see the definition of
         ADLER32 below).  The decompressor can use this identifier to
         determine which dictionary has been used by the compressor.
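
      A decompressor therefore has to look at FDICT before it knows
      where the compressed data begins.  The following sketch shows one
      way to do that; the function name zlib_header_length and its
      calling convention are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Inspect the start of a zlib stream s[0..len-1].  Returns the offset
 * of the first byte of compressed data (2 without a dictionary
 * identifier, 6 with one), or 0 if too few bytes are available.
 * *has_dict receives the FDICT flag; *dictid receives the identifier
 * stored after FLG, or 0 when FDICT is clear.
 */
size_t zlib_header_length(const unsigned char *s, size_t len,
                          int *has_dict, uint32_t *dictid)
{
    if (len < 2)
        return 0;

    *has_dict = (s[1] & 0x20) != 0;   /* FDICT is bit 5 of FLG */
    *dictid = 0;
    if (!*has_dict)
        return 2;

    if (len < 6)
        return 0;
    /* The identifier is stored most-significant byte first. */
    *dictid = ((uint32_t)s[2] << 24) | ((uint32_t)s[3] << 16) |
              ((uint32_t)s[4] << 8)  |  (uint32_t)s[5];
    return 6;
}
```

      This sketch only locates the fields.  It does not validate CM,
      CINFO or FCHECK, which a compliant decompressor must also check.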

      FLEVEL (Compression level)
         These flags are available for use by specific compression
         methods.  The "deflate" method (CM = 8) sets these flags as
         follows:

            0 - compressor used fastest algorithm
            1 - compressor used fast algorithm
            2 - compressor used default algorithm
            3 - compressor used maximum compression, slowest algorithm

         The information in FLEVEL is not needed for decompression; it
         is there to indicate if recompression might be worthwhile.
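
      Extracting FLEVEL is a two-bit shift and mask, as the following
      sketch shows.  The function names zlib_flevel and
      zlib_flevel_name are hypothetical, and the short labels are
      abbreviations of the table above.

```c
/* Return FLEVEL (bits 6 to 7 of FLG) as a number from 0 to 3. */
unsigned zlib_flevel(unsigned char flg)
{
    return (flg >> 6) & 3u;
}

/* Return a short label for the deflate meaning of FLEVEL. */
const char *zlib_flevel_name(unsigned char flg)
{
    static const char *const names[4] = {
        "fastest", "fast", "default", "maximum"
    };
    return names[zlib_flevel(flg)];
}
```

      A tool that considers recompressing a stream might, for example,
      skip streams whose FLEVEL is already 3.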

      compressed data
         For compression method 8, the compressed data is stored in the
         deflate compressed data format as described in the document
         "DEFLATE Compressed Data Format Specification" by L. Peter
         Deutsch. (See reference [3] in Chapter 3, below.)

         Other compressed data formats are not specified in this version
         of the zlib specification.

      ADLER32 (Adler-32 checksum)
         This contains a checksum value of the uncompressed data
         (excluding any dictionary data) computed according to the
         Adler-32 algorithm. This algorithm is a 32-bit extension and
         improvement of the Fletcher algorithm, used in the ITU-T X.224
         / ISO 8073 standard. (See references [4] and [5] in Chapter 3,
         below.)

         Adler-32 is composed of two sums accumulated per byte: s1 is
         the sum of all bytes, s2 is the sum of all s1 values. Both sums
         are done modulo 65521. s1 is initialized to 1, s2 to zero.  The
         Adler-32 checksum is stored as s2*65536 + s1 in most-
         significant-byte first (network) order.
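
      As a worked example, the nine ASCII bytes of "Wikipedia" give
      s1 = 920 (0x0398) and s2 = 4582 (0x11E6), so the four bytes
      stored at the end of the stream are 11 E6 03 98.  The following
      sketch checks a stored ADLER32 field against decompressed data.
      The names adler32_simple and zlib_trailer_matches are
      hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Adler-32 of buf[0..len-1], written for clarity rather than speed. */
uint32_t adler32_simple(const unsigned char *buf, size_t len)
{
    uint32_t s1 = 1, s2 = 0;
    size_t n;

    for (n = 0; n < len; n++) {
        s1 = (s1 + buf[n]) % 65521u;
        s2 = (s2 + s1) % 65521u;
    }
    return (s2 << 16) | s1;
}

/*
 * Return 1 if the four ADLER32 bytes at the end of a zlib stream match
 * the checksum of the uncompressed data, and 0 otherwise.
 */
int zlib_trailer_matches(const unsigned char *data, size_t len,
                         const unsigned char trailer[4])
{
    /* The stored value is most-significant byte first. */
    uint32_t stored = ((uint32_t)trailer[0] << 24) |
                      ((uint32_t)trailer[1] << 16) |
                      ((uint32_t)trailer[2] << 8)  |
                       (uint32_t)trailer[3];

    return stored == adler32_simple(data, len);
}
```

      The checksum covers the uncompressed bytes only, so it can be
      verified only after decompression has produced them.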

   2.3. Compliance

      A compliant compressor must produce streams with correct CMF, FLG
      and ADLER32, but need not support preset dictionaries.  When the
      zlib data format is used as part of another standard data format,
      the compressor may use only preset dictionaries that are specified
      by this other data format.  If this other format does not use the
      preset dictionary feature, the compressor must not set the FDICT
      flag.

      A compliant decompressor must check CMF, FLG, and ADLER32, and
      provide an error indication if any of these have incorrect values.
      A compliant decompressor must give an error indication if CM is
      not one of the values defined in this specification (only the
      value 8 is permitted in this version), since another value could
      indicate the presence of new features that would cause subsequent
      data to be interpreted incorrectly.  A compliant decompressor must
      give an error indication if FDICT is set and DICTID is not the
      identifier of a known preset dictionary.  A decompressor may
      ignore FLEVEL and still be compliant.  When the zlib data format
      is being used as a part of another standard format, a compliant
      decompressor must support all the preset dictionaries specified by
      the other format. When the other format does not use the preset
      dictionary feature, a compliant decompressor must reject any
      stream in which the FDICT flag is set.
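
      The header checks required of a decompressor can be sketched as a
      single function.  The status names, the function name
      zlib_validate_header and the dicts_allowed parameter are all
      hypothetical; the order in which the checks are made is a choice
      of this example, not something this specification prescribes.

```c
enum zlib_hdr_status {
    ZHDR_OK = 0,
    ZHDR_BAD_CHECK,         /* CMF*256 + FLG is not a multiple of 31 */
    ZHDR_BAD_METHOD,        /* CM is not 8 */
    ZHDR_BAD_WINDOW,        /* CINFO is above 7 */
    ZHDR_DICT_NOT_ALLOWED   /* FDICT set, but no dictionaries are in use */
};

/*
 * Validate the first two bytes of a zlib stream.  dicts_allowed is
 * nonzero when the enclosing format uses preset dictionaries.  FLEVEL
 * is deliberately ignored.
 */
enum zlib_hdr_status zlib_validate_header(unsigned char cmf,
                                          unsigned char flg,
                                          int dicts_allowed)
{
    if (((unsigned)cmf * 256u + flg) % 31u != 0)
        return ZHDR_BAD_CHECK;
    if ((cmf & 0x0fu) != 8)
        return ZHDR_BAD_METHOD;
    if (((cmf >> 4) & 0x0fu) > 7)
        return ZHDR_BAD_WINDOW;
    if ((flg & 0x20u) != 0 && !dicts_allowed)
        return ZHDR_DICT_NOT_ALLOWED;
    return ZHDR_OK;
}
```

      When dictionaries are allowed and FDICT is set, the caller still
      has to read DICTID and reject the stream if it does not identify
      a known dictionary.  The ADLER32 check happens separately, after
      decompression.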

3. References

   [1] Deutsch, L.P., "GZIP Compressed Data Format Specification",
       available in ftp://ftp.uu.net/pub/archiving/zip/doc/

   [2] Boutell, T., "PNG (Portable Network Graphics) specification",
       available in ftp://ftp.uu.net/graphics/png/documents/

   [3] Deutsch, L.P., "DEFLATE Compressed Data Format Specification",
       available in ftp://ftp.uu.net/pub/archiving/zip/doc/

   [4] Fletcher, J. G., "An Arithmetic Checksum for Serial
       Transmissions," IEEE Transactions on Communications, Vol. COM-30,
       No. 1, January 1982, pp. 247-252.

   [5] ITU-T Recommendation X.224, Annex D, "Checksum Algorithms,"
       November, 1993, pp. 144, 145. (Available from
       gopher://info.itu.ch). ITU-T X.224 is also the same as ISO 8073.

4. Source code

   Source code for a C language implementation of a "zlib" compliant
   library is available at ftp://ftp.uu.net/pub/archiving/zip/zlib/.

5. Security Considerations

   A decoder that fails to check the ADLER32 checksum value may be
   subject to undetected data corruption.

6. Acknowledgements

   Trademarks cited in this document are the property of their
   respective owners.

   Jean-Loup Gailly and Mark Adler designed the zlib format and wrote
   the related software described in this specification.  Glenn
   Randers-Pehrson converted this document to RFC and HTML format.

7. Authors' Addresses

   L. Peter Deutsch
   Aladdin Enterprises
   203 Santa Margarita Ave.
   Menlo Park, CA 94025

   Phone: (415) 322-0103 (AM only)
   FAX:   (415) 322-1734
   EMail: <ghost@aladdin.com>


   Jean-Loup Gailly

   EMail: <gzip@prep.ai.mit.edu>

   Questions about the technical content of this specification can be
   sent by email to

   Jean-Loup Gailly <gzip@prep.ai.mit.edu> and
   Mark Adler <madler@alumni.caltech.edu>

   Editorial comments on this specification can be sent by email to

   L. Peter Deutsch <ghost@aladdin.com> and
   Glenn Randers-Pehrson <randeg@alumni.rpi.edu>

8. Appendix: Rationale

   8.1. Preset dictionaries

      A preset dictionary is especially useful to compress short input
      sequences. The compressor can take advantage of the dictionary
      context to encode the input in a more compact manner. The
      decompressor can be initialized with the appropriate context by
      virtually decompressing a compressed version of the dictionary
      without producing any output. However, for certain compression
      algorithms such as the deflate algorithm, this operation can be
      achieved without actually performing any decompression.

      The compressor and the decompressor must use exactly the same
      dictionary. The dictionary may be fixed or may be chosen among a
      certain number of predefined dictionaries, according to the kind
      of input data. The decompressor can determine which dictionary has
      been chosen by the compressor by checking the dictionary
      identifier. This document does not specify the contents of
      predefined dictionaries, since the optimal dictionaries are
      application specific. Standard data formats using this feature of
      the zlib specification must precisely define the allowed
      dictionaries.
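
      Selecting a dictionary by its identifier might look like the
      following sketch.  The struct preset_dict type and the functions
      preset_dict_id and find_preset_dict are hypothetical.  The table
      of dictionaries is assumed to be supplied by the application,
      since this specification does not define any dictionary contents.

```c
#include <stddef.h>
#include <stdint.h>

/* One application-defined dictionary: a plain sequence of bytes. */
struct preset_dict {
    const unsigned char *bytes;
    size_t len;
};

/* The identifier of a dictionary is the Adler-32 of its bytes. */
uint32_t preset_dict_id(const struct preset_dict *d)
{
    uint32_t s1 = 1, s2 = 0;
    size_t n;

    for (n = 0; n < d->len; n++) {
        s1 = (s1 + d->bytes[n]) % 65521u;
        s2 = (s2 + s1) % 65521u;
    }
    return (s2 << 16) | s1;
}

/*
 * Return the dictionary in table[0..count-1] whose identifier equals
 * dictid, or NULL if none matches.  A decompressor must report an
 * error in the NULL case.
 */
const struct preset_dict *find_preset_dict(const struct preset_dict *table,
                                           size_t count, uint32_t dictid)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (preset_dict_id(&table[i]) == dictid)
            return &table[i];
    }
    return NULL;
}
```

      Recomputing the identifier on every lookup keeps the sketch
      short.  An application with many or large dictionaries would more
      likely compute each identifier once and store it beside the
      dictionary.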

   8.2. The Adler-32 algorithm

      The Adler-32 algorithm is much faster than the CRC32 algorithm yet
      still provides an extremely low probability of undetected errors.

      The modulo on unsigned long accumulators can be delayed for 5552
      bytes, so the modulo operation time is negligible.  If the bytes
      are a, b, c, the second sum is 3a + 2b + c + 3, and so is position
      and order sensitive, unlike the first sum, which is just a
      checksum.  That 65521 is prime is important to avoid a possible
      large class of two-byte errors that leave the check unchanged.
      (The Fletcher checksum uses 255, which is not prime and which also
      makes the Fletcher check insensitive to single byte changes 0 <->
      255.)

      The sum s1 is initialized to 1 instead of zero to make the length
      of the sequence part of s2, so that the length does not have to be
      checked separately. (Any sequence of zeroes has a Fletcher
      checksum of zero.)
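
      The delayed modulo mentioned above can be sketched as follows.
      The names ADLER_BASE, ADLER_NMAX and adler32_deferred are
      hypothetical.  The sketch assumes that the incoming checksum is a
      valid running Adler-32 value, that is, both of its 16-bit halves
      are below 65521.  The bound of 5552 bytes then keeps the 32-bit
      accumulators from overflowing.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u  /* largest prime smaller than 65536 */
#define ADLER_NMAX 5552u   /* bytes that may be summed between reductions */

/*
 * Update a running Adler-32 checksum with buf[0..len-1], reducing
 * modulo ADLER_BASE once per chunk instead of once per byte.
 */
uint32_t adler32_deferred(uint32_t adler, const unsigned char *buf,
                          size_t len)
{
    uint32_t s1 = adler & 0xffffu;
    uint32_t s2 = (adler >> 16) & 0xffffu;

    while (len > 0) {
        size_t chunk = len < ADLER_NMAX ? len : ADLER_NMAX;

        len -= chunk;
        while (chunk-- > 0) {
            s1 += *buf++;   /* plain additions, no modulo here */
            s2 += s1;
        }
        s1 %= ADLER_BASE;   /* one reduction per chunk */
        s2 %= ADLER_BASE;
    }
    return (s2 << 16) | s1;
}
```

      For the three bytes a = 1, b = 2, c = 3 this gives
      s1 = 1 + 2 + 3 + 1 = 7 and s2 = 3a + 2b + c + 3 = 13, matching
      the formula above.  Swapping two of the bytes leaves s1 unchanged
      but alters s2.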

9. Appendix: Sample code

   The following C code computes the Adler-32 checksum of a data buffer.
   It is written for clarity, not for speed.  The sample code is in the
   ANSI C programming language. Non-C users may find it easier to read
   with these hints:

      &      Bitwise AND operator.
      >>     Bitwise right shift operator. When applied to an
             unsigned quantity, as here, right shift inserts zero bit(s)
             at the left.
      <<     Bitwise left shift operator. Left shift inserts zero
             bit(s) at the right.
      ++     "n++" increments the variable n.
      %      modulo operator: a % b is the remainder of a divided by b.

      #define BASE 65521 /* largest prime smaller than 65536 */

      /*
         Update a running Adler-32 checksum with the bytes buf[0..len-1]
         and return the updated checksum. The Adler-32 checksum should be
         initialized to 1.

         Usage example:

           unsigned long adler = 1L;

           while (read_buffer(buffer, length) != EOF) {
             adler = update_adler32(adler, buffer, length);
           }
           if (adler != original_adler) error();
      */
      unsigned long update_adler32(unsigned long adler,
         unsigned char *buf, int len)
      {
        unsigned long s1 = adler & 0xffff;
        unsigned long s2 = (adler >> 16) & 0xffff;
        int n;

        for (n = 0; n < len; n++) {
          s1 = (s1 + buf[n]) % BASE;
          s2 = (s2 + s1)     % BASE;
        }
        return (s2 << 16) + s1;
      }

      /* Return the adler32 of the bytes buf[0..len-1] */

      unsigned long adler32(unsigned char *buf, int len)
      {
        return update_adler32(1L, buf, len);
      }