## README for file(1) Command and the libmagic(3) library ##

    @(#) $File: README.md,v 1.4 2021/10/21 01:51:31 christos Exp $

- Bug Tracker: <https://bugs.astron.com/>
- Build Status: <https://travis-ci.org/file/file>
- Download link: <ftp://ftp.astron.com/pub/file/>
- E-mail: <christos@astron.com>
- Fuzzing link: <https://bugs.chromium.org/p/oss-fuzz/issues/list?sort=-opened&can=1&q=proj:file>
- Home page: <https://www.darwinsys.com/file/>
- Mailing List archives: <https://mailman.astron.com/pipermail/file/>
- Mailing List: <file@astron.com>
- Public repo: <https://github.com/file/file>
- Test framework: <https://github.com/file/file-tests>

Phone: Do not even think of telephoning me about this program. Send
cash first!

This is Release 5.x of Ian Darwin's (copyright but distributable)
file(1) command, an implementation of the Unix File(1) command.
It knows the 'magic number' of several thousand file types.
This version is the standard "file" command for Linux, *BSD, and
other systems. (See "patchlevel.h" for the exact release number.)

The major changes for 5.x are CDF file parsing, indirect magic,
name/use (recursion), and an overhaul of MIME and ASCII encoding
handling.

The major feature of 4.x is the refactoring of the code into a
library, and the rewrite of the file command in terms of that
library.
The library itself, libmagic, can be used by third-party
programs that wish to identify file types without having to fork()
and exec() file. The prime contributor for 4.0 was Mans Rullgard.

UNIX is a trademark of UNIX System Laboratories.

The prime contributor to Release 3.8 was Guy Harris, who put in
megachanges including byte-order independence.

The prime contributor to Release 3.0 was Christos Zoulas, who put
in hundreds of lines of source code changes, including his own
ANSIfication of the code (I liked my own ANSIfication better, but
his (__P()) is the "Berkeley standard" way of doing it, and I wanted
UCB to include the code...), his HP-like "indirection" (a feature
of the HP file command, I think), and his mods that finally got
the uncompress (-z) mode finished and working.

This release has compiled in numerous environments; see PORTING
for a list of them and any problems.

This fine freeware file(1) follows the USG (System V) model of the
file command, rather than the Research (V7) version or the V7-derived
4.[23] Berkeley one. That is, the file /etc/magic contains much of
the ritual information that is the source of this program's power.
My version knows a little more magic (including tar archives) than
System V; the /etc/magic parsing seems to be compatible with the
(poorly documented) System V /etc/magic format (with one exception;
see the man page).

In addition, the /etc/magic file is built from a subdirectory
for easier(?)
maintenance. I will act as a clearinghouse for
magic numbers assigned to all sorts of data files that
are in reasonable circulation. Send your magic numbers,
in magic(5) format please, to the maintainer, Christos Zoulas.

* `COPYING` - read this first.
* `README` - read this second (you are currently reading this file).
* `INSTALL` - read on how to install
* `src/apprentice.c` - parses /etc/magic to learn magic
* `src/apptype.c` - used for OS/2 specific application type magic
* `src/ascmagic.c` - third & last set of tests, based on hardwired assumptions.
* `src/asctime_r.c` - replacement for OS's that don't have it.
* `src/asprintf.c` - replacement for OS's that don't have it.
* `src/buffer.c` - buffer handling functions.
* `src/cdf.[ch]` - parser for Microsoft Compound Document Files
* `src/cdf_time.c` - time converter for CDF.
* `src/compress.c` - handles decompressing files to look inside.
* `src/ctime_r.c` - replacement for OS's that don't have it.
* `src/der.[ch]` - parser for Distinguished Encoding Rules
* `src/dprintf.c` - replacement for OS's that don't have it.
* `src/elfclass.h` - common code for elf 32/64.
* `src/encoding.c` - handles unicode encodings
* `src/file.c` - the main program
* `src/file.h` - header file
* `src/file_opts.h` - list of options
* `src/fmtcheck.c` - replacement for OS's that don't have it.
* `src/fsmagic.c` - first set of tests the program runs, based on filesystem info
* `src/funcs.c` - utility functions
* `src/getline.c` - replacement for OS's that don't have it.
* `src/getopt_long.c` - replacement for OS's that don't have it.
* `src/gmtime_r.c` - replacement for OS's that don't have it.
* `src/is_csv.c` - knows about Comma Separated Value file format (RFC 4180).
* `src/is_json.c` - knows about JavaScript Object Notation format (RFC 8259).
* `src/is_tar.c, tar.h` - knows about Tape ARchive format (courtesy John Gilmore).
* `src/localtime_r.c` - replacement for OS's that don't have it.
* `src/magic.h.in` - source file for magic.h
* `src/mygetopt.h` - replacement for OS's that don't have it.
* `src/magic.c` - the libmagic api
* `src/names.h` - header file for ascmagic.c
* `src/pread.c` - replacement for OS's that don't have it.
* `src/print.c` - print results, errors, warnings.
* `src/readcdf.c` - CDF wrapper.
* `src/readelf.[ch]` - Stand-alone elf parsing code.
* `src/softmagic.c` - 2nd set of tests, based on /etc/magic
* `src/strcasestr.c` - replacement for OS's that don't have it.
* `src/strlcat.c` - replacement for OS's that don't have it.
* `src/strlcpy.c` - replacement for OS's that don't have it.
* `src/strndup.c` - replacement for OS's that don't have it.
* `src/tar.h` - tar file definitions
* `src/vasprintf.c` - for systems that don't have it.
* `doc/file.man` - man page for the command
* `doc/magic.man` - man page for the magic file, courtesy Guy Harris.
  Install as magic.4 on USG and magic.5 on V7 or Berkeley; cf. Makefile.

Magdir - directory of /etc/magic pieces
------------------------------------------------------------------------------

If you submit a new magic entry please make sure you read the following
guidelines:

- Initial match is preferably at least 32 bits long, and is a _unique_ match
- If this is not feasible, use additional checks
- Matches of <= 16 bits are not accepted
- Delay printing strings as much as possible; don't print output too early
- Avoid printing arbitrary bytes as a string with printf, which can be a
  source of crashes and buffer overflows

- Provide complete information with entry:
  * One line short summary
  * Optional long description
  * File extension, if applicable
  * Full name and contact method (for discussion when entry has problem)
  * Further reference, such as documentation of format

gpg for dummies:
------------------------------------------------------------------------------

```
$ gpg --verify file-X.YY.tar.gz.asc file-X.YY.tar.gz
gpg: assuming signed data in `file-X.YY.tar.gz'
gpg: Signature made WWW MMM DD HH:MM:SS YYYY ZZZ using DSA key ID KKKKKKKK
```

To download the key:

```
$ gpg --keyserver hkp://keys.gnupg.net --recv-keys KKKKKKKK
```
------------------------------------------------------------------------------


Parts of this software were developed at SoftQuad Inc., developers
of SGML/HTML/XML publishing software, in Toronto, Canada.
SoftQuad was swallowed up by Corel in 2002 and does not exist any longer.
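
As an illustration of the Magdir guideline that an initial match should
cover at least 32 bits, here is a minimal sketch in C of a 32-bit
big-endian comparison at a fixed offset. It is not taken from the file(1)
sources; the helper names `read_be32` and `looks_like_elf` are hypothetical
and exist only for this example. The ELF signature (0x7f 'E' 'L' 'F') is
used because it is a well-known 4-byte magic number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: read a 32-bit big-endian value from buf at
 * offset off. Returns 0 and stores the value in *out on success,
 * or -1 if fewer than 4 bytes are available at that offset.
 */
static int
read_be32(const unsigned char *buf, size_t len, size_t off, uint32_t *out)
{
	assert(buf != NULL && out != NULL);
	if (len < 4 || off > len - 4)
		return -1;
	*out = ((uint32_t)buf[off] << 24) | ((uint32_t)buf[off + 1] << 16) |
	    ((uint32_t)buf[off + 2] << 8) | (uint32_t)buf[off + 3];
	return 0;
}

/*
 * Hypothetical helper: returns 1 if buf begins with the 4-byte ELF
 * signature, 0 otherwise (including when buf is shorter than 4 bytes).
 */
static int
looks_like_elf(const unsigned char *buf, size_t len)
{
	uint32_t v;

	return read_be32(buf, len, 0, &v) == 0 && v == 0x7f454c46u;
}
```

A full 32-bit comparison like this matches by accident far less often
than a 16-bit one, which is why short matches need additional checks
before anything is printed.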