.\" Copyright (c) 2007 Tim Kientzle
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd December 23, 2011
.Dt CPIO 5
.Os
.Sh NAME
.Nm cpio
.Nd format of cpio archive files
.Sh DESCRIPTION
The
.Nm
archive format collects any number of files, directories, and other
file system objects (symbolic links, device nodes, etc.) into a single
stream of bytes.
.Ss General Format
Each file system object in a
.Nm
archive comprises a header record with basic numeric metadata
followed by the full pathname of the entry and the file data.
The header record stores a series of integer values that generally
follow the fields in
.Va struct stat .
(See
.Xr stat 2
for details.)
The variants differ primarily in how they store those integers
(binary, octal, or hexadecimal).
The header is followed by the pathname of the
entry (the length of the pathname is stored in the header)
and any file data.
The end of the archive is indicated by a special record with
the pathname
.Dq TRAILER!!! .
.Ss PWB Format
The PWB binary
.Nm
format is the original format, dating from the introduction of cpio as part of the
Programmer's Work Bench system, a variant of 6th Edition UNIX.
It stores numbers as 2-byte and 4-byte binary values.
Each entry begins with a header in the following format:
.Pp
.Bd -literal -offset indent
struct header_pwb_cpio {
        short   h_magic;
        short   h_dev;
        short   h_ino;
        short   h_mode;
        short   h_uid;
        short   h_gid;
        short   h_nlink;
        short   h_majmin;
        long    h_mtime;
        short   h_namesize;
        long    h_filesize;
};
.Ed
.Pp
The
.Va short
fields here are 16-bit integer values, while the
.Va long
fields are 32-bit integers.
Since PWB UNIX, like the 6th Edition UNIX
it was based on, ran only on PDP-11 computers, they
are in PDP-endian format: each 16-bit word is stored little-endian,
but the two words of a long are stored most significant word first.
That is, the long integer whose hexadecimal
representation is 0x12345678 would be stored in four successive bytes
as 0x34, 0x12, 0x78, 0x56.
The fields are as follows:
.Bl -tag -width indent
.It Va h_magic
The integer value octal 070707.
.It Va h_dev , Va h_ino
The device and inode numbers from the disk.
These are used by programs that read
.Nm
archives to determine when two entries refer to the same file.
Programs that synthesize
.Nm
archives should be careful to set these to distinct values for each entry.
.It Va h_mode
The mode specifies both the regular permissions and the file type, and
it also holds two flag bits that are irrelevant to the cpio format,
because the field is actually a raw copy of the mode field in the inode
representing the file.
These are the IALLOC flag, which shows that
the inode entry is in use, and the ILARG flag, which shows that the
file it represents is large enough to have indirect block pointers in
the inode.
The mode is decoded as follows:
.Pp
.Bl -tag -width "MMMMMMM" -compact
.It 0100000
IALLOC flag - irrelevant to cpio.
.It 0060000
This masks the file type bits.
.It 0040000
File type value for directories.
.It 0020000
File type value for character special devices.
.It 0060000
File type value for block special devices.
.It 0010000
ILARG flag - irrelevant to cpio.
.It 0004000
SUID bit.
.It 0002000
SGID bit.
.It 0001000
Sticky bit.
.It 0000777
The lower 9 bits specify read/write/execute permissions
for world, group, and user following standard POSIX conventions.
.El
.It Va h_uid , Va h_gid
The numeric user id and group id of the owner.
.It Va h_nlink
The number of links to this file.
Directories always have a value of at least two here.
Note that hardlinked files include file data with every copy in the archive.
.It Va h_majmin
For block special and character special entries,
this field contains the associated device number, with the major
number in the high byte, and the minor number in the low byte.
For all other entry types, it should be set to zero by writers
and ignored by readers.
.It Va h_mtime
Modification time of the file, indicated as the number
of seconds since the start of the epoch,
00:00:00 UTC January 1, 1970.
.It Va h_namesize
The number of bytes in the pathname that follows the header.
This count includes the trailing NUL byte.
.It Va h_filesize
The size of the file.
Note that this archive format is limited to 16
megabyte file sizes, because PWB UNIX, like 6th Edition, only used
an unsigned 24-bit integer for the file size internally.
.El
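.Pp
As an illustration of the PDP-endian layout described above, the
following is a minimal sketch of how a reader might decode the
.Va short
and
.Va long
fields from a raw PWB header.
The function names are hypothetical and are not part of any existing
library; the sketch assumes the header has been read into a byte buffer.
```c
#include <stdint.h>

/* Decode a 16-bit value stored low byte first, as on the PDP-11. */
static uint16_t
pwb_get16(const unsigned char *p)
{
        return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Decode a 32-bit PDP-endian value: the most significant 16-bit word
 * comes first, and each word is itself stored low byte first.
 */
static uint32_t
pwb_get32(const unsigned char *p)
{
        return ((uint32_t)pwb_get16(p) << 16) | pwb_get16(p + 2);
}
```
.Pp
Under these assumptions, the bytes 0x34, 0x12, 0x78, 0x56 decode to
0x12345678, matching the example given for the PWB format.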
.Pp
The pathname immediately follows the fixed header.
If
.Va h_namesize
is odd, an additional NUL byte is added after the pathname.
The file data is then appended, again with an additional NUL
appended if needed to get the next header at an even offset.
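.Pp
The padding rule above can be sketched as follows.
This is only an illustration: the helper names are hypothetical, and the
26-byte header size is derived from the struct shown earlier (eight
16-bit fields, a 32-bit field, a 16-bit field, and a 32-bit field).
```c
#include <stddef.h>

#define CPIO_BIN_HEADER_SIZE 26

/* Bytes occupied by an n-byte field once padded to an even length. */
static size_t
cpio_bin_padded(size_t n)
{
        return n + (n & 1);
}

/* Offset of the next header, given the (even) offset of this one. */
static size_t
cpio_bin_next_header(size_t hdr_off, size_t namesize, size_t filesize)
{
        return hdr_off + CPIO_BIN_HEADER_SIZE +
            cpio_bin_padded(namesize) + cpio_bin_padded(filesize);
}
```
.Pp
For example, an entry at offset 0 with a 5-byte name (including the NUL)
and 3 bytes of data would place the next header at offset 36.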
.Pp
Hardlinked files are not given special treatment;
the full file contents are included with each copy of the
file.
.Ss New Binary Format
The new binary
.Nm
format appeared when cpio was adopted into late 7th Edition UNIX.
It is exactly like the PWB binary format, described above, except for
three changes:
.Pp
First, UNIX now ran on more than one hardware type, so the endianness
of 16-bit integers must be determined by observing the magic number at
the start of the header.
The 32-bit integers are still always stored
with the most significant word first, though, so each of the two
.Va long
fields in the struct shown above was stored as an array of two 16-bit
integers, in the traditional order.
Those 16-bit integers, like all the others
in the struct, were accessed using a macro that byte swapped them if
necessary.
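.Pp
One way a reader might apply this rule is sketched below.
The names are hypothetical; the sketch assumes the first two bytes of the
header are available and relies on the magic value octal 070707 being
0x71c7 in hexadecimal.
```c
#include <stdint.h>

enum cpio_bin_order { CPIO_BIN_LE, CPIO_BIN_BE, CPIO_BIN_UNKNOWN };

/* Infer the byte order of 16-bit fields from the magic number. */
static enum cpio_bin_order
cpio_bin_detect(const unsigned char *p)
{
        if (p[0] == 0xc7 && p[1] == 0x71)
                return CPIO_BIN_LE;
        if (p[0] == 0x71 && p[1] == 0xc7)
                return CPIO_BIN_BE;
        return CPIO_BIN_UNKNOWN;
}

/* Read one 16-bit field in the detected byte order. */
static uint16_t
cpio_bin_get16(const unsigned char *p, enum cpio_bin_order order)
{
        if (order == CPIO_BIN_BE)
                return (uint16_t)((p[0] << 8) | p[1]);
        return (uint16_t)(p[0] | (p[1] << 8));
}

/* 32-bit fields: most significant word first in either byte order. */
static uint32_t
cpio_bin_get32(const unsigned char *p, enum cpio_bin_order order)
{
        return ((uint32_t)cpio_bin_get16(p, order) << 16) |
            cpio_bin_get16(p + 2, order);
}
```
.Pp
Only the order of bytes within each 16-bit word changes; the word order of
the 32-bit fields is the same in both cases.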
.Pp
Next, 7th Edition had more file types to store, and the IALLOC and ILARG
flag bits were re-purposed to accommodate these.
The revised use of the
various bits is as follows:
.Pp
.Bl -tag -width "MMMMMMM" -compact
.It 0170000
This masks the file type bits.
.It 0140000
File type value for sockets.
.It 0120000
File type value for symbolic links.
For symbolic links, the link body is stored as file data.
.It 0100000
File type value for regular files.
.It 0060000
File type value for block special devices.
.It 0040000
File type value for directories.
.It 0020000
File type value for character special devices.
.It 0010000
File type value for named pipes or FIFOs.
.It 0004000
SUID bit.
.It 0002000
SGID bit.
.It 0001000
Sticky bit.
.It 0000777
The lower 9 bits specify read/write/execute permissions
for world, group, and user following standard POSIX conventions.
.El
.Pp
Finally, the file size field now represents a signed 32-bit integer in
the underlying file system, so the maximum file size has increased to
2 gigabytes.
.Pp
Note that there is no obvious way to tell which of the two binary
formats an archive uses, other than to see which one makes more
sense.
The typical error scenario is that a PWB format archive
unpacked as if it were in the new format will create named sockets
instead of directories, and then fail to unpack files that should
go in those directories.
Running
.Va bsdcpio -itv
on an unknown archive will make it obvious which it is: if it's
PWB format, directories will be listed with an 's' instead of
a 'd' as the first character of the mode string, and the larger
files will have a '?' in that position.
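.Pp
The confusion follows directly from the two mode tables.
The sketch below, using hypothetical helper names, maps a mode value to
the first character of an ls-style mode string under each interpretation.
It is an illustration of the bit layouts given in this manual, not a
description of how any particular program is implemented.
```c
/* Type character under the new binary interpretation (mask 0170000). */
static char
cpio_new_type_char(unsigned mode)
{
        switch (mode & 0170000) {
        case 0140000: return 's';
        case 0120000: return 'l';
        case 0100000: return '-';
        case 0060000: return 'b';
        case 0040000: return 'd';
        case 0020000: return 'c';
        case 0010000: return 'p';
        default:      return '?';
        }
}

/* Type character under the PWB interpretation (mask 0060000). */
static char
cpio_pwb_type_char(unsigned mode)
{
        switch (mode & 0060000) {
        case 0040000: return 'd';
        case 0020000: return 'c';
        case 0060000: return 'b';
        default:      return '-';
        }
}
```
.Pp
A PWB directory carries IALLOC plus the directory type, for example
0140755, which the new interpretation reads as a socket.
A large PWB regular file carries IALLOC plus ILARG, for example 0110644,
which matches no file type in the new table.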
.Ss Portable ASCII Format
.St -susv2
standardized an ASCII variant that is portable across all
platforms.
It is commonly known as the
.Dq old character
format or as the
.Dq odc
format.
It stores the same numeric fields as the binary formats, but
represents them as 6-character or 11-character octal values.
.Pp
.Bd -literal -offset indent
struct cpio_odc_header {
        char    c_magic[6];
        char    c_dev[6];
        char    c_ino[6];
        char    c_mode[6];
        char    c_uid[6];
        char    c_gid[6];
        char    c_nlink[6];
        char    c_rdev[6];
        char    c_mtime[11];
        char    c_namesize[6];
        char    c_filesize[11];
};
.Ed
.Pp
The fields are identical to those in the new binary format.
The name and file body follow the fixed header.
Unlike the binary formats, there is no additional padding
after the pathname or file contents.
If the files being archived are themselves entirely ASCII, then
the resulting archive will be entirely ASCII, except for the
NUL byte that terminates the name field.
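.Pp
Since the fields are fixed-width and not NUL-terminated, a reader has to
convert them by length.
A minimal sketch, with a hypothetical function name and assuming every
character of a field is an octal digit, might look like this:
```c
#include <stddef.h>
#include <stdint.h>

/*
 * Convert n ASCII octal digits to an integer.
 * Returns -1 if a character outside '0'..'7' is found.
 */
static int64_t
cpio_odc_field(const char *p, size_t n)
{
        int64_t v = 0;
        size_t i;

        for (i = 0; i < n; i++) {
                if (p[i] < '0' || p[i] > '7')
                        return -1;
                v = (v << 3) | (p[i] - '0');
        }
        return v;
}
```
.Pp
A 6-digit field holds at most 18 bits and an 11-digit field at most
33 bits, so a 64-bit result is wide enough for every field in the header.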
.Ss New ASCII Format
The "new" ASCII format uses 8-byte hexadecimal fields for
all numbers and splits device numbers into separate fields
for major and minor numbers.
.Pp
.Bd -literal -offset indent
struct cpio_newc_header {
        char    c_magic[6];
        char    c_ino[8];
        char    c_mode[8];
        char    c_uid[8];
        char    c_gid[8];
        char    c_nlink[8];
        char    c_mtime[8];
        char    c_filesize[8];
        char    c_devmajor[8];
        char    c_devminor[8];
        char    c_rdevmajor[8];
        char    c_rdevminor[8];
        char    c_namesize[8];
        char    c_check[8];
};
.Ed
.Pp
Except as specified below, the fields here match those specified
for the new binary format above.
.Bl -tag -width indent
.It Va c_magic
The string
.Dq 070701 .
.It Va c_check
This field is always set to zero by writers and ignored by readers.
See the next section for more details.
.El
.Pp
The pathname is followed by NUL bytes so that the total size
of the fixed header plus pathname is a multiple of four.
Likewise, the file data is padded to a multiple of four bytes.
Note that this format supports only 4 gigabyte files (unlike the
older ASCII format, which supports 8 gigabyte files).
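.Pp
The field encoding and the alignment rule can be sketched as follows.
The names are hypothetical; the 110-byte header size follows from the
struct above (a 6-byte magic plus thirteen 8-byte fields), and the sketch
assumes both uppercase and lowercase hexadecimal digits should be accepted.
```c
#include <stddef.h>
#include <stdint.h>

#define CPIO_NEWC_HEADER_SIZE 110

/* Convert one 8-digit hexadecimal field; returns 0 on success, -1 on error. */
static int
cpio_newc_field(const char *p, uint32_t *out)
{
        uint32_t v = 0;
        int i;

        for (i = 0; i < 8; i++) {
                char c = p[i];
                unsigned d;

                if (c >= '0' && c <= '9')
                        d = (unsigned)(c - '0');
                else if (c >= 'a' && c <= 'f')
                        d = (unsigned)(c - 'a') + 10;
                else if (c >= 'A' && c <= 'F')
                        d = (unsigned)(c - 'A') + 10;
                else
                        return -1;
                v = (v << 4) | d;
        }
        *out = v;
        return 0;
}

/* Round n up to the next multiple of four. */
static size_t
cpio_newc_pad4(size_t n)
{
        return (n + 3) & ~(size_t)3;
}

/* Offset of the file data, measured from the start of the header. */
static size_t
cpio_newc_data_offset(size_t namesize)
{
        return cpio_newc_pad4(CPIO_NEWC_HEADER_SIZE + namesize);
}
```
.Pp
Note that the pathname padding depends on the header and pathname
together, not on the pathname length alone: a 2-byte name needs no
padding because 110 + 2 is already a multiple of four.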
.Pp
In this format, hardlinked files are handled by setting the
filesize to zero for each entry except the first one that
appears in the archive.
.Ss New CRC Format
The CRC format is identical to the new ASCII format described
in the previous section except that the magic field is set
to
.Dq 070702
and the
.Va c_check
field is set to the sum of all bytes in the file data.
This sum is computed treating all bytes as unsigned values
and using unsigned arithmetic.
Only the least-significant 32 bits of the sum are stored.
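.Pp
A minimal sketch of this checksum, with a hypothetical function name and
assuming the file data is available in memory, is:
```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sum all bytes as unsigned values.
 * Arithmetic on uint32_t wraps modulo 2^32, which keeps only the
 * least-significant 32 bits of the sum.
 */
static uint32_t
cpio_crc_check(const unsigned char *data, size_t len)
{
        uint32_t sum = 0;
        size_t i;

        for (i = 0; i < len; i++)
                sum += data[i];
        return sum;
}
```
.Pp
For data read in blocks, the same running sum can be carried from one
block to the next, since addition does not depend on how the data is split.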
.Ss HP Variants
The
.Nm cpio
implementation distributed with HPUX used XXXX but stored
device numbers differently XXX.
.Ss Other Extensions and Variants
Sun Solaris uses additional file types to store extended file
data, including ACLs and extended attributes, as special
entries in cpio archives.
.Pp
XXX Others? XXX
.Sh SEE ALSO
.Xr cpio 1 ,
.Xr tar 5
.Sh STANDARDS
The
.Nm cpio
utility is no longer a part of POSIX or the Single UNIX Specification.
It last appeared in
.St -susv2 .
It has been supplanted in subsequent standards by
.Xr pax 1 .
The portable ASCII format is currently part of the specification for the
.Xr pax 1
utility.
.Sh HISTORY
The original cpio utility was written by Dick Haight
while working in AT&T's Unix Support Group.
It appeared in 1977 as part of PWB/UNIX 1.0, the
.Dq Programmer's Work Bench
derived from
.At v6
that was used internally at AT&T.
Both the new binary and old character formats were in use
by 1980, according to the System III source released
by SCO under their
.Dq Ancient Unix
license.
The character format was adopted as part of
.St -p1003.1-88 .
XXX when did "newc" appear?  Who invented it?  When did HP come out with their variant?  When did Sun introduce ACLs and extended attributes? XXX
.Sh BUGS
The
.Dq CRC
format is mis-named, as it uses a simple checksum and
not a cyclic redundancy check.
.Pp
The binary formats are limited to 16 bits for user id, group id,
device, and inode numbers.
They are limited to 16 megabyte and 2
gigabyte file sizes for the older and newer variants, respectively.
.Pp
The old ASCII format is limited to 18 bits for
the user id, group id, device, and inode numbers.
It is limited to 8 gigabyte file sizes.
.Pp
The new ASCII format is limited to 4 gigabyte file sizes.
.Pp
None of the cpio formats store user or group names,
which are essential when moving files between systems with
dissimilar user or group numbering.
.Pp
Especially when writing older cpio variants, it may be necessary
to map actual device/inode values to synthesized values that
fit the available fields.
With very large filesystems, this may be necessary even for
the newer formats.