# Welcome to libarchive!

The libarchive project develops a portable, efficient C library that
can read and write streaming archives in a variety of formats.  It
also includes implementations of the common `tar`, `cpio`, and `zcat`
command-line tools that use the libarchive library.

## Questions?  Issues?

* http://www.libarchive.org is the home for ongoing
  libarchive development, including documentation,
  and links to the libarchive mailing lists.
* To report an issue, use the issue tracker at
  https://github.com/libarchive/libarchive/issues
* To submit an enhancement to libarchive, please
  submit a pull request via GitHub: https://github.com/libarchive/libarchive/pulls

## Contents of the Distribution

This distribution bundle includes the following major components:

* **libarchive**: a library for reading and writing streaming archives
* **tar**: the 'bsdtar' program is a full-featured 'tar' implementation built on libarchive
* **cpio**: the 'bsdcpio' program is a different interface to essentially the same functionality
* **cat**: the 'bsdcat' program is a simple replacement tool for zcat, bzcat, xzcat, and such
* **examples**: Some small example programs that you may find useful.
* **examples/minitar**: a compact sample demonstrating use of libarchive.
* **contrib**:  Various items sent to me by third parties; please contact the authors with any questions.

The top-level directory contains the following information files:

* **NEWS** - highlights of recent changes
* **COPYING** - what you can do with this
* **INSTALL** - installation instructions
* **README** - this file
* **CMakeLists.txt** - input for "cmake" build tool, see INSTALL
* **configure** - configuration script, see INSTALL for details.  If your copy of the source lacks a `configure` script, you can try to construct it by running the script in `build/autogen.sh` (or use `cmake`).

The following files in the top-level directory are used by the 'configure' script:
* `Makefile.am`, `aclocal.m4`, `configure.ac` - used to build this distribution, only needed by maintainers
* `Makefile.in`, `config.h.in` - templates used by configure script

## Documentation

In addition to the informational articles and documentation
in the online [libarchive Wiki](https://github.com/libarchive/libarchive/wiki),
the distribution also includes a number of manual pages:

  * bsdtar.1 explains the use of the bsdtar program
  * bsdcpio.1 explains the use of the bsdcpio program
  * bsdcat.1 explains the use of the bsdcat program
  * libarchive.3 gives an overview of the library as a whole
  * archive_read.3, archive_write.3, archive_write_disk.3, and
    archive_read_disk.3 provide detailed calling sequences for the read
    and write APIs
  * archive_entry.3 details the "struct archive_entry" utility class
  * archive_internals.3 provides some insight into libarchive's
    internal structure and operation.
  * libarchive-formats.5 documents the file formats supported by the library
  * cpio.5, mtree.5, and tar.5 provide detailed information about these
    popular archive formats, including hard-to-find details about
    modern cpio and tar variants.

The manual pages above are provided in the 'doc' directory in
a number of different formats.

You should also read the copious comments in `archive.h` and the
source code for the sample programs for more details.  Please let us
know about any errors or omissions you find.

## Supported Formats

Currently, the library automatically detects and reads the following formats:
  * Old V7 tar archives
  * POSIX ustar
  * GNU tar format (including GNU long filenames, long link names, and sparse files)
  * Solaris 9 extended tar format (including ACLs)
  * POSIX pax interchange format
  * POSIX octet-oriented cpio
  * SVR4 ASCII cpio
  * Binary cpio (big-endian or little-endian)
  * PWB binary cpio
  * ISO9660 CD-ROM images (with optional Rockridge or Joliet extensions)
  * ZIP archives (with uncompressed or "deflate" compressed entries, including support for encrypted Zip archives)
  * ZIPX archives (with support for bzip2, ppmd8, lzma and xz compressed entries)
  * GNU and BSD 'ar' archives
  * 'mtree' format
  * 7-Zip archives
  * Microsoft CAB format
  * LHA and LZH archives
  * RAR and RAR 5.0 archives (with some limitations due to RAR's proprietary status)
  * XAR archives

The library also detects and handles any of the following before evaluating the archive:
  * uuencoded files
  * files with RPM wrapper
  * gzip compression
  * bzip2 compression
  * compress/LZW compression
  * lzma, lzip, and xz compression
  * lz4 compression
  * lzop compression
  * zstandard compression

The library can create archives in any of the following formats:
  * POSIX ustar
  * POSIX pax interchange format
  * "restricted" pax format, which will create ustar archives except for
    entries that require pax extensions (for long filenames, ACLs, etc).
  * Old GNU tar format
  * Old V7 tar format
  * POSIX octet-oriented cpio
  * SVR4 "newc" cpio
  * Binary cpio (little-endian)
  * PWB binary cpio
  * shar archives
  * ZIP archives (with uncompressed or "deflate" compressed entries)
  * GNU and BSD 'ar' archives
  * 'mtree' format
  * ISO9660 format
  * 7-Zip archives
  * XAR archives

When creating archives, the result can be filtered with any of the following:
  * uuencode
  * gzip compression
  * bzip2 compression
  * compress/LZW compression
  * lzma, lzip, and xz compression
  * lz4 compression
  * lzop compression
  * zstandard compression

## Notes about the Library Design

The following notes address many of the most common
questions we are asked about libarchive:

* This is a heavily stream-oriented system.  That means that
  it is optimized to read or write the archive in a single
  pass from beginning to end.  For example, this allows
  libarchive to process archives too large to store on disk
  by processing them on-the-fly as they are read from or
  written to a network or tape drive.  This also makes
  libarchive useful for tools that need to produce
  archives on-the-fly (such as webservers that provide
  archived contents of a user's account).

* In-place modification and random access to the contents
  of an archive are not directly supported.  For some formats,
  this is not an issue: For example, tar.gz archives are not
  designed for random access.  In some other cases, libarchive
  can re-open an archive and scan it from the beginning quickly
  enough to provide the needed abilities even without true
  random access.  Of course, some applications do require true
  random access; those applications should consider alternatives
  to libarchive.

* The library is designed to be extended with new compression and
  archive formats.
  The only requirement is that the format be
  readable or writable as a stream and that each archive entry be
  independent.  There are articles on the libarchive Wiki explaining
  how to extend libarchive.

* On read, compression and format are always detected automatically.

* The same API is used for all formats; it should be very
  easy for software using libarchive to transparently handle
  any of libarchive's archiving formats.

* Libarchive's automatic support for decompression can be used
  without archiving by explicitly selecting the "raw" and "empty"
  formats.

* I've attempted to minimize static link pollution.  If you don't
  explicitly invoke a particular feature (such as support for a
  particular compression or format), it won't get pulled into
  statically-linked programs.  In particular, if you don't explicitly
  enable a particular compression or decompression support, you won't
  need to link against the corresponding compression or decompression
  libraries.  This also reduces the size of statically-linked
  binaries in environments where that matters.

* The library is generally _thread safe_ depending on the platform:
  it does not define any global variables of its own.  However, some
  platforms do not provide fully thread-safe versions of key C library
  functions.  On those platforms, libarchive will use the non-thread-safe
  functions.  Patches to improve this are of great interest to us.

* In particular, libarchive's modules to read or write a directory
  tree do use `chdir()` to optimize the directory traversals.  This
  can cause problems for programs that expect to do disk access from
  multiple threads.  Of course, those modules are completely
  optional and you can use the rest of libarchive without them.

* The library is _not_ thread aware, however.  It does no locking
  or thread management of any kind.
  If you create a libarchive
  object and need to access it from multiple threads, you will
  need to provide your own locking.

* On read, the library accepts whatever blocks you hand it.
  Your read callback is free to pass the library a byte at a time
  or mmap the entire archive and give it to the library at once.
  On write, the library always produces correctly-blocked output.

* The object-style approach allows you to have multiple archive streams
  open at once.  bsdtar uses this in its "@archive" extension.

* The archive itself is read/written using callback functions.
  You can read an archive directly from an in-memory buffer or
  write it to a socket, if you wish.  There are some utility
  functions to provide easy-to-use "open file," etc, capabilities.

* The read/write APIs are designed to allow individual entries
  to be read or written to any data source:  You can create
  a block of data in memory and add it to a tar archive without
  first writing a temporary file.  You can also read an entry from
  an archive and write the data directly to a socket.  If you want
  to read/write entries to disk, there are convenience functions to
  make this especially easy.

* Note: The "pax interchange format" is a POSIX standard extended tar
  format that should be used when the older _ustar_ format is not
  appropriate.  It has many advantages over other tar formats
  (including the legacy GNU tar format) and is widely supported by
  current tar implementations.