## README for file(1) Command and the libmagic(3) library ##

    @(#) $File: README.md,v 1.5 2023/05/28 13:59:47 christos Exp $

- Bug Tracker: <https://bugs.astron.com/>
- Build Status: <https://travis-ci.org/file/file>
- Download link: <ftp://ftp.astron.com/pub/file/>
- E-mail: <christos@astron.com>
- Fuzzing link: <https://bugs.chromium.org/p/oss-fuzz/issues/list?sort=-opened&can=1&q=proj:file>
- Home page: <https://www.darwinsys.com/file/>
- Mailing List archives: <https://mailman.astron.com/pipermail/file/>
- Mailing List: <file@astron.com>
- Public repo: <https://github.com/file/file>
- Test framework: <https://github.com/file/file-tests>

Phone: Do not even think of telephoning me about this program. Send
cash first!

This is Release 5.x of Ian Darwin's (copyright but distributable)
file(1) command, an implementation of the Unix File(1) command.
It knows the 'magic number' of several thousand file types.
This version is the standard "file" command for Linux, *BSD, and
other systems. (See "patchlevel.h" for the exact release number.)

The major changes for 5.x are CDF file parsing, indirect magic,
name/use (recursion), and an overhaul of MIME and ASCII encoding
handling.

The major feature of 4.x is the refactoring of the code into a
library, and the rewrite of the file command in terms of that
library. The library itself, libmagic, can be used by third-party
programs that wish to identify file types without having to fork()
and exec() file. The prime contributor for 4.0 was Mans Rullgard.

UNIX is a trademark of UNIX System Laboratories.

The prime contributor to Release 3.8 was Guy Harris, who put in
megachanges including byte-order independence.

The prime contributor to Release 3.0 was Christos Zoulas, who put
in hundreds of lines of source code changes, including his own
ANSIfication of the code (I liked my own ANSIfication better, but
his (__P()) is the "Berkeley standard" way of doing it, and I wanted
UCB to include the code...), his HP-like "indirection" (a feature
of the HP file command, I think), and his mods that finally got
the uncompress (-z) mode finished and working.

This release has compiled in numerous environments; see PORTING
for a list and problems.

This fine freeware file(1) follows the USG (System V) model of the
file command, rather than the Research (V7) version or the V7-derived
4.[23] Berkeley one. That is, the file /etc/magic contains much of
the ritual information that is the source of this program's power.
My version knows a little more magic (including tar archives) than
System V; the /etc/magic parsing seems to be compatible with the
(poorly documented) System V /etc/magic format (with one exception;
see the man page).

In addition, the /etc/magic file is built from a subdirectory
for easier(?) maintenance. I will act as a clearinghouse for
magic numbers assigned to all sorts of data files that
are in reasonable circulation. Send your magic numbers,
in magic(5) format please, to the maintainer, Christos Zoulas.

* `COPYING` - read this first.
* `README` - read this second (you are currently reading this file).
* `INSTALL` - read on how to install
* `src/apprentice.c` - parses /etc/magic to learn magic
* `src/apptype.c` - used for OS/2 specific application type magic
* `src/ascmagic.c` - third & last set of tests, based on hardwired assumptions.
* `src/asctime_r.c` - replacement for OS's that don't have it.
* `src/asprintf.c` - replacement for OS's that don't have it.
* `src/buffer.c` - buffer handling functions.
* `src/cdf.[ch]` - parser for Microsoft Compound Document Files
* `src/cdf_time.c` - time converter for CDF.
* `src/compress.c` - handles decompressing files to look inside.
* `src/ctime_r.c` - replacement for OS's that don't have it.
* `src/der.[ch]` - parser for Distinguished Encoding Rules
* `src/dprintf.c` - replacement for OS's that don't have it.
* `src/elfclass.h` - common code for ELF 32/64.
* `src/encoding.c` - handles Unicode encodings
* `src/file.c` - the main program
* `src/file.h` - header file
* `src/file_opts.h` - list of options
* `src/fmtcheck.c` - replacement for OS's that don't have it.
* `src/fsmagic.c` - first set of tests the program runs, based on filesystem info
* `src/funcs.c` - utility functions
* `src/getline.c` - replacement for OS's that don't have it.
* `src/getopt_long.c` - replacement for OS's that don't have it.
* `src/gmtime_r.c` - replacement for OS's that don't have it.
* `src/is_csv.c` - knows about Comma Separated Value file format (RFC 4180).
* `src/is_json.c` - knows about JavaScript Object Notation format (RFC 8259).
* `src/is_simh.c` - knows about SIMH tape file format.
* `src/is_tar.c, tar.h` - knows about Tape ARchive format (courtesy John Gilmore).
* `src/localtime_r.c` - replacement for OS's that don't have it.
* `src/magic.h.in` - source file for magic.h
* `src/mygetopt.h` - replacement for OS's that don't have it.
* `src/magic.c` - the libmagic API
* `src/names.h` - header file for ascmagic.c
* `src/pread.c` - replacement for OS's that don't have it.
* `src/print.c` - print results, errors, warnings.
* `src/readcdf.c` - CDF wrapper.
* `src/readelf.[ch]` - stand-alone ELF parsing code.
* `src/softmagic.c` - second set of tests, based on /etc/magic
* `src/strcasestr.c` - replacement for OS's that don't have it.
* `src/strlcat.c` - replacement for OS's that don't have it.
* `src/strlcpy.c` - replacement for OS's that don't have it.
* `src/strndup.c` - replacement for OS's that don't have it.
* `src/tar.h` - tar file definitions
* `src/vasprintf.c` - replacement for OS's that don't have it.
* `doc/file.man` - man page for the command
* `doc/magic.man` - man page for the magic file, courtesy Guy Harris.
  Install as magic.4 on USG and magic.5 on V7 or Berkeley; cf Makefile.

Magdir - directory of /etc/magic pieces
------------------------------------------------------------------------------

If you submit a new magic entry, please make sure you read the following
guidelines:

- The initial match is preferably at least 32 bits long, and is a _unique_ match
- If this is not feasible, use an additional check
- Matches of <= 16 bits are not accepted
- Delay printing a string as much as possible; don't print output too early
- Avoid printing arbitrary bytes as a string with printf, which can be a
  source of crashes and buffer overflows

- Provide complete information with the entry:
  * One-line short summary
  * Optional long description
  * File extension, if applicable
  * Full name and contact method (for discussion when the entry has a problem)
  * Further reference, such as documentation of the format

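To make the matching guidelines above concrete, here is a minimal sketch in C
of the same idea: a 32-bit initial match, one additional check, and a
description that is chosen only after every check has passed. The "DEMO"
format, its version byte, and the function `identify_demo` are invented for
illustration; this is not how libmagic itself evaluates entries, and the
magic(5) lines in the comment are only a rough equivalent.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical format, invented for this example: a file starts with the
 * four bytes "DEMO", followed by a one-byte version that must be 1 or 2.
 * A magic(5) entry along the same lines might look like:
 *
 *	0	string	DEMO
 *	>4	byte	1	Demo data, version 1
 *	>4	byte	2	Demo data, version 2
 *
 * Returns a description, or NULL if the buffer does not match.
 */
const char *
identify_demo(const unsigned char *buf, size_t len)
{
	/* Initial match: 4 bytes (32 bits) at offset 0. */
	if (buf == NULL || len < 5 || memcmp(buf, "DEMO", 4) != 0)
		return NULL;

	/* Additional check; nothing is reported until it has passed. */
	switch (buf[4]) {
	case 1:
		return "Demo data, version 1";
	case 2:
		return "Demo data, version 2";
	default:
		return NULL;
	}
}
```
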
gpg for dummies:
------------------------------------------------------------------------------

```
$ gpg --verify file-X.YY.tar.gz.asc file-X.YY.tar.gz
gpg: assuming signed data in `file-X.YY.tar.gz'
gpg: Signature made WWW MMM DD HH:MM:SS YYYY ZZZ using DSA key ID KKKKKKKK
```

To download the key:

```
$ gpg --keyserver hkp://keys.gnupg.net --recv-keys KKKKKKKK
```
------------------------------------------------------------------------------


Parts of this software were developed at SoftQuad Inc., developers
of SGML/HTML/XML publishing software, in Toronto, Canada.
SoftQuad was swallowed up by Corel in 2002 and does not exist any longer.