# Welcome to libarchive!

The libarchive project develops a portable, efficient C library that
can read and write streaming archives in a variety of formats. It
also includes implementations of the common `tar`, `cpio`, and `zcat`
command-line tools that use the libarchive library.

## Questions? Issues?

* http://www.libarchive.org is the home for ongoing
  libarchive development, including documentation,
  and links to the libarchive mailing lists.
* To report an issue, use the issue tracker at
  https://github.com/libarchive/libarchive/issues
* To submit an enhancement to libarchive, please
  submit a pull request via GitHub: https://github.com/libarchive/libarchive/pulls

## Contents of the Distribution

This distribution bundle includes the following major components:

* **libarchive**: a library for reading and writing streaming archives
* **tar**: the 'bsdtar' program is a full-featured 'tar' implementation built on libarchive
* **cpio**: the 'bsdcpio' program is a different interface to essentially the same functionality
* **cat**: the 'bsdcat' program is a simple replacement tool for zcat, bzcat, xzcat, and such
* **examples**: Some small example programs that you may find useful.
* **examples/minitar**: a compact sample demonstrating use of libarchive.
* **contrib**: Various items sent to me by third parties; please contact the authors with any questions.

The top-level directory contains the following information files:

* **NEWS** - highlights of recent changes
* **COPYING** - what you can do with this
* **INSTALL** - installation instructions
* **README** - this file
* **CMakeLists.txt** - input for "cmake" build tool, see INSTALL
* **configure** - configuration script, see INSTALL for details. If your copy of the source lacks a `configure` script, you can try to construct it by running the script in `build/autogen.sh` (or use `cmake`).

The following files in the top-level directory are used by the 'configure' script:

* `Makefile.am`, `aclocal.m4`, `configure.ac` - used to build this distribution, only needed by maintainers
* `Makefile.in`, `config.h.in` - templates used by configure script

## Documentation

In addition to the informational articles and documentation
in the online [libarchive Wiki](https://github.com/libarchive/libarchive/wiki),
the distribution also includes a number of manual pages:

* bsdtar.1 explains the use of the bsdtar program
* bsdcpio.1 explains the use of the bsdcpio program
* bsdcat.1 explains the use of the bsdcat program
* libarchive.3 gives an overview of the library as a whole
* archive_read.3, archive_write.3, archive_write_disk.3, and
  archive_read_disk.3 provide detailed calling sequences for the read
  and write APIs
* archive_entry.3 details the "struct archive_entry" utility class
* archive_internals.3 provides some insight into libarchive's
  internal structure and operation.
* libarchive-formats.5 documents the file formats supported by the library
* cpio.5, mtree.5, and tar.5 provide detailed information about these
  popular archive formats, including hard-to-find details about
  modern cpio and tar variants.

The manual pages above are provided in the 'doc' directory in
a number of different formats.

You should also read the copious comments in `archive.h` and the
source code for the sample programs for more details. Please let us
know about any errors or omissions you find.

## Supported Formats

Currently, the library automatically detects and reads the following formats:

* Old V7 tar archives
* POSIX ustar
* GNU tar format (including GNU long filenames, long link names, and sparse files)
* Solaris 9 extended tar format (including ACLs)
* POSIX pax interchange format
* POSIX octet-oriented cpio
* SVR4 ASCII cpio
* Binary cpio (big-endian or little-endian)
* PWB binary cpio
* ISO9660 CD-ROM images (with optional Rockridge or Joliet extensions)
* ZIP archives (with uncompressed or "deflate" compressed entries, including support for encrypted Zip archives)
* ZIPX archives (with support for bzip2, ppmd8, lzma and xz compressed entries)
* GNU and BSD 'ar' archives
* 'mtree' format
* 7-Zip archives
* Microsoft CAB format
* LHA and LZH archives
* RAR and RAR 5.0 archives (with some limitations due to RAR's proprietary status)
* XAR archives

The library also detects and handles any of the following before evaluating the archive:

* uuencoded files
* files with RPM wrapper
* gzip compression
* bzip2 compression
* compress/LZW compression
* lzma, lzip, and xz compression
* lz4 compression
* lzop compression
* zstandard compression

The library can create archives in any of the following formats:

* POSIX ustar
* POSIX pax interchange format
* "restricted" pax format, which will create ustar archives except for
  entries that require pax extensions (for long filenames, ACLs, etc).
* Old GNU tar format
* Old V7 tar format
* POSIX octet-oriented cpio
* SVR4 "newc" cpio
* Binary cpio (little-endian)
* PWB binary cpio
* shar archives
* ZIP archives (with uncompressed or "deflate" compressed entries)
* GNU and BSD 'ar' archives
* 'mtree' format
* ISO9660 format
* 7-Zip archives
* XAR archives

When creating archives, the result can be filtered with any of the following:

* uuencode
* gzip compression
* bzip2 compression
* compress/LZW compression
* lzma, lzip, and xz compression
* lz4 compression
* lzop compression
* zstandard compression

## Notes about the Library Design

The following notes address many of the most common
questions we are asked about libarchive:

* This is a heavily stream-oriented system. That means that
  it is optimized to read or write the archive in a single
  pass from beginning to end. For example, this allows
  libarchive to process archives too large to store on disk
  by processing them on-the-fly as they are read from or
  written to a network or tape drive. This also makes
  libarchive useful for tools that need to produce
  archives on-the-fly (such as webservers that provide
  archived contents of a user's account).

* In-place modification and random access to the contents
  of an archive are not directly supported. For some formats,
  this is not an issue: For example, tar.gz archives are not
  designed for random access. In some other cases, libarchive
  can re-open an archive and scan it from the beginning quickly
  enough to provide the needed abilities even without true
  random access. Of course, some applications do require true
  random access; those applications should consider alternatives
  to libarchive.

* The library is designed to be extended with new compression and
  archive formats.
  The only requirement is that the format be
  readable or writable as a stream and that each archive entry be
  independent. There are articles on the libarchive Wiki explaining
  how to extend libarchive.

* On read, compression and format are always detected automatically.

* The same API is used for all formats; it should be very
  easy for software using libarchive to transparently handle
  any of libarchive's archiving formats.

* Libarchive's automatic support for decompression can be used
  without archiving by explicitly selecting the "raw" and "empty"
  formats.

* I've attempted to minimize static link pollution. If you don't
  explicitly invoke a particular feature (such as support for a
  particular compression or format), it won't get pulled in to
  statically-linked programs. In particular, if you don't explicitly
  enable a particular compression or decompression support, you won't
  need to link against the corresponding compression or decompression
  libraries. This also reduces the size of statically-linked
  binaries in environments where that matters.

* The library is generally _thread safe_ depending on the platform:
  it does not define any global variables of its own. However, some
  platforms do not provide fully thread-safe versions of key C library
  functions. On those platforms, libarchive will use the non-thread-safe
  functions. Patches to improve this are of great interest to us.

* In particular, libarchive's modules to read or write a directory
  tree do use `chdir()` to optimize the directory traversals. This
  can cause problems for programs that expect to do disk access from
  multiple threads. Of course, those modules are completely
  optional and you can use the rest of libarchive without them.

* The library is _not_ thread aware, however. It does no locking
  or thread management of any kind.
  If you create a libarchive
  object and need to access it from multiple threads, you will
  need to provide your own locking.

* On read, the library accepts whatever blocks you hand it.
  Your read callback is free to pass the library a byte at a time
  or mmap the entire archive and give it to the library at once.
  On write, the library always produces correctly-blocked output.

* The object-style approach allows you to have multiple archive streams
  open at once. bsdtar uses this in its "@archive" extension.

* The archive itself is read/written using callback functions.
  You can read an archive directly from an in-memory buffer or
  write it to a socket, if you wish. There are some utility
  functions to provide easy-to-use "open file," etc, capabilities.

* The read/write APIs are designed to allow individual entries
  to be read or written to any data source: You can create
  a block of data in memory and add it to a tar archive without
  first writing a temporary file. You can also read an entry from
  an archive and write the data directly to a socket. If you want
  to read/write entries to disk, there are convenience functions to
  make this especially easy.

* Note: The "pax interchange format" is a POSIX standard extended tar
  format that should be used when the older _ustar_ format is not
  appropriate. It has many advantages over other tar formats
  (including the legacy GNU tar format) and is widely supported by
  current tar implementations.