.\" $File: file.man,v 1.151 2024/04/07 21:27:35 christos Exp $
.Dd April 7, 2024
.Dt FILE __CSECTION__
.Os
.Sh NAME
.Nm file
.Nd determine file type
.Sh SYNOPSIS
.Nm
.Bk -words
.Op Fl bcdEhiklLNnprsSvzZ0
.Op Fl Fl apple
.Op Fl Fl exclude-quiet
.Op Fl Fl extension
.Op Fl Fl mime-encoding
.Op Fl Fl mime-type
.Op Fl e Ar testname
.Op Fl F Ar separator
.Op Fl f Ar namefile
.Op Fl m Ar magicfiles
.Op Fl P Ar name=value
.Ar
.Ek
.Nm
.Fl C
.Op Fl m Ar magicfiles
.Nm
.Op Fl Fl help
.Sh DESCRIPTION
This manual page documents version __VERSION__ of the
.Nm
command.
.Pp
.Nm
tests each argument in an attempt to classify it.
There are three sets of tests, performed in this order:
filesystem tests, magic tests, and language tests.
The
.Em first
test that succeeds causes the file type to be printed.
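.Pp
The ordering described above can be pictured with a small shell sketch.
It is only an illustration and not how
.Nm
is implemented: the function names
.Li fs_test ,
.Li magic_test ,
.Li text_test
and
.Li classify
are hypothetical, the magic stage knows a single signature, and the
last stage merely separates printable bytes from everything else.
.Bd -literal -offset indent
#!/bin/sh
# Hypothetical three-stage classifier: the first stage that prints wins.

fs_test() {       # stage 1: answers that need no file contents
    if [ -d "$1" ]; then echo "directory"
    elif [ -f "$1" ] && [ ! -s "$1" ]; then echo "empty"
    else return 1
    fi
}

magic_test() {    # stage 2: fixed bytes at a fixed offset
    sig=$(dd if="$1" bs=1 count=4 2>/dev/null)
    [ "$sig" = "%PDF" ] && echo "PDF document"
}

text_test() {     # stage 3: only printable bytes and white space?
    rest=$(LC_ALL=C tr -d '[:print:][:space:]' < "$1" | wc -c)
    # $rest is left unquoted so that padding added by wc is dropped
    [ $rest -eq 0 ] && echo "text"
}

classify() {
    fs_test "$1" || magic_test "$1" || text_test "$1" || echo "data"
}
.Ed
.Pp
Chaining the stages with
.Li ||
means that a later, less reliable stage runs only when every earlier
stage has declined to answer, with
.Dq data
as the fallback.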
.Pp
The type printed will usually contain one of the words
.Em text
(the file contains only
printing characters and a few common control
characters and is probably safe to read on an
.Dv ASCII
terminal),
.Em executable
(the file contains the result of compiling a program
in a form understandable to some
.Tn UNIX
kernel or another),
or
.Em data
meaning anything else (data is usually
.Dq binary
or non-printable).
Exceptions are well-known file formats (core files, tar archives)
that are known to contain binary data.
When modifying magic files or the program itself, make sure to
.Em preserve these keywords .
Users depend on knowing that all the readable files in a directory
have the word
.Dq text
printed.
Don't do as Berkeley did and change
.Dq shell commands text
to
.Dq shell script .
.Pp
The filesystem tests are based on examining the return from a
.Xr stat 2
system call.
The program checks to see if the file is empty,
or if it's some sort of special file.
Any known file types appropriate to the system you are running on
(sockets, symbolic links, or named pipes (FIFOs) on those systems that
implement them)
are intuited if they are defined in the system header file
.In sys/stat.h .
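.Pp
A rough shell approximation of the filesystem tests is shown below.
It relies on the
.Xr test 1
file operators rather than on
.Xr stat 2
directly, and the function name
.Li fs_class
is hypothetical; it sketches the idea and does not reproduce the exact
strings printed by
.Nm .
.Bd -literal -offset indent
#!/bin/sh
# Hypothetical helper: classify by file metadata only, never by contents.
fs_class() {
    if   [ -L "$1" ]; then echo "symbolic link"   # checked before following
    elif [ ! -e "$1" ]; then echo "cannot stat"; return 1
    elif [ -d "$1" ]; then echo "directory"
    elif [ -p "$1" ]; then echo "fifo (named pipe)"
    elif [ -S "$1" ]; then echo "socket"
    elif [ -b "$1" ]; then echo "block special"
    elif [ -c "$1" ]; then echo "character special"
    elif [ ! -s "$1" ]; then echo "empty"
    else echo "regular file"                      # contents must be read
    fi
}
.Ed
.Pp
The symbolic link check comes first because the remaining operators
follow links; only the final branch would require reading the file.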
.Pp
The magic tests are used to check for files with data in
particular fixed formats.
The canonical example of this is a binary executable (compiled program)
.Dv a.out
file, whose format is defined in
.In elf.h ,
.In a.out.h
and possibly
.In exec.h
in the standard include directory.
These files have a
.Dq magic number
stored in a particular place
near the beginning of the file that tells the
.Tn UNIX
operating system
that the file is a binary executable, and which of several types thereof.
The concept of a
.Dq magic number
has been applied by extension to data files.
Any file with some invariant identifier at a small fixed
offset into the file can usually be described in this way.
The information identifying these files is read from the compiled
magic file
.Pa __MAGIC__.mgc ,
or the files in the directory
.Pa __MAGIC__
if the compiled file does not exist.
In addition, if
.Pa $HOME/.magic.mgc
or
.Pa $HOME/.magic
exists, it will be used in preference to the system magic files.
.Pp
If a file does not match any of the entries in the magic file,
it is examined to see if it seems to be a text file.
ASCII, ISO-8859-x, non-ISO 8-bit extended-ASCII character sets
(such as those used on Macintosh and IBM PC systems),
UTF-8-encoded Unicode, UTF-16-encoded Unicode, and EBCDIC
character sets can be distinguished by the different
ranges and sequences of bytes that constitute printable text
in each set.
If a file passes any of these tests, its character set is reported.
ASCII, ISO-8859-x, UTF-8, and extended-ASCII files are identified
as
.Dq text
because they will be mostly readable on nearly any terminal;
UTF-16 and EBCDIC are only
.Dq character data
because, while
they contain text, it is text that will require translation
before it can be read.
In addition,
.Nm
will attempt to determine other characteristics of text-type files.
If the lines of a file are terminated by CR, CRLF, or NEL, instead
of the Unix-standard LF, this will be reported.
Files that contain embedded escape sequences or overstriking
will also be identified.
.Pp
Once
.Nm
has determined the character set used in a text-type file,
it will
attempt to determine in what language the file is written.
The language tests look for particular strings (cf.
.In names.h )
that can appear anywhere in the first few blocks of a file.
For example, the keyword
.Em .br
indicates that the file is most likely a
.Xr troff 1
input file, just as the keyword
.Em struct
indicates a C program.
These tests are less reliable than the previous
two groups, so they are performed last.
The language test routines also test for some miscellany
(such as
.Xr tar 1
archives, JSON files).
.Pp
Any file that cannot be identified as having been written
in any of the character sets listed above is simply said to be
.Dq data .
.Sh OPTIONS
.Bl -tag -width indent
.It Fl Fl apple
Causes the
.Nm
command to output the file type and creator code as
used by older MacOS versions.
The code consists of eight letters,
the first describing the file type, the latter the creator.
This option works properly only for file formats that have the
apple-style output defined.
.It Fl b , Fl Fl brief
Do not prepend filenames to output lines (brief mode).
.It Fl C , Fl Fl compile
Write a
.Pa magic.mgc
output file that contains a pre-parsed version of the magic file or directory.
.It Fl c , Fl Fl checking-printout
Cause a checking printout of the parsed form of the magic file.
This is usually used in conjunction with the
.Fl m
option to debug a new magic file before installing it.
.It Fl d
Prints internal debugging information to stderr.
.It Fl E
On filesystem errors (file not found, etc.), instead of handling the error
as regular output as POSIX mandates and continuing, issue an error message
and exit.
.It Fl e , Fl Fl exclude Ar testname
Exclude the test named in
.Ar testname
from the list of tests made to determine the file type.
Valid test names are:
.Bl -tag -width compress
.It apptype
.Dv EMX
application type (only on EMX).
.It ascii
Various types of text files (this test will try to guess the text
encoding, irrespective of the setting of the
.Sq encoding
option).
.It encoding
Different text encodings for soft magic tests.
.It tokens
Ignored for backwards compatibility.
.It cdf
Prints details of Compound Document Files.
.It compress
Checks for, and looks inside, compressed files.
.It csv
Checks Comma Separated Value files.
.It elf
Prints ELF file details, provided soft magic tests are enabled and the
elf magic is found.
.It json
Examines JSON (RFC-7159) files by parsing them for compliance.
.It soft
Consults magic files.
.It simh
Examines SIMH tape files.
.It tar
Examines tar files by verifying the checksum of the 512-byte tar header.
Excluding this test can provide more detailed content description by using
the soft magic method.
.It text
A synonym for
.Sq ascii .
.El
.It Fl Fl exclude-quiet
Like
.Fl Fl exclude
but ignore tests that
.Nm
does not know about.
This is intended for compatibility with older versions of
.Nm .
.It Fl Fl extension
Print a slash-separated list of valid extensions for the file type found.
.It Fl F , Fl Fl separator Ar separator
Use the specified string as the separator between the filename and the
file result returned.
Defaults to
.Sq \&: .
.It Fl f , Fl Fl files-from Ar namefile
Read the names of the files to be examined from
.Ar namefile
(one per line)
before the argument list.
Either
.Ar namefile
or at least one filename argument must be present;
to test the standard input, use
.Sq -
as a filename argument.
Please note that
.Ar namefile
is unwrapped and the enclosed filenames are processed when this option is
encountered and before any further options processing is done.
This allows one to process multiple lists of files with different command line
arguments on the same
.Nm
invocation.
Thus if you want to set the delimiter, you need to do it before you specify
the list of files, like:
.Dq Fl F Ar @ Fl f Ar namefile ,
instead of:
.Dq Fl f Ar namefile Fl F Ar @ .
.It Fl h , Fl Fl no-dereference
This option causes symlinks not to be followed
(on systems that support symbolic links).
This is the default if the environment variable
.Ev POSIXLY_CORRECT
is not defined.
.It Fl i , Fl Fl mime
Causes the
.Nm
command to output mime type strings rather than the more
traditional human readable ones.
Thus it may say
.Sq text/plain; charset=us-ascii
rather than
.Dq ASCII text .
.It Fl Fl mime-type , Fl Fl mime-encoding
Like
.Fl i ,
but print only the specified element(s).
.It Fl k , Fl Fl keep-going
Don't stop at the first match, keep going.
Subsequent matches will
have the string
.Sq "\[rs]012\- "
prepended.
(If you want a newline, see the
.Fl r
option.)
The magic pattern with the highest strength (see the
.Fl l
option) comes first.
.It Fl l , Fl Fl list
Shows a list of patterns and their strength sorted descending by
.Xr magic __FSECTION__
strength
which is used for the matching (see also the
.Fl k
option).
.It Fl L , Fl Fl dereference
This option causes symlinks to be followed, as the like-named option in
.Xr ls 1
(on systems that support symbolic links).
This is the default if the environment variable
.Ev POSIXLY_CORRECT
is defined.
.It Fl m , Fl Fl magic-file Ar magicfiles
Specify an alternate list of files and directories containing magic.
This can be a single item, or a colon-separated list.
If a compiled magic file is found alongside a file or directory,
it will be used instead.
.It Fl N , Fl Fl no-pad
Don't pad filenames so that they align in the output.
.It Fl n , Fl Fl no-buffer
Force stdout to be flushed after checking each file.
This is only useful if checking a list of files.
It is intended to be used by programs that want filetype output from a pipe.
.It Fl p , Fl Fl preserve-date
On systems that support
.Xr utime 3
or
.Xr utimes 2 ,
attempt to preserve the access time of files analyzed, to pretend that
.Nm
never read them.
.It Fl P , Fl Fl parameter Ar name=value
Set various parameter limits.
.Bl -column "elf_phnum" "Default" "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
.It Sy "Name" Ta Sy "Default" Ta Sy "Explanation"
.It Li bytes Ta 1M Ta max number of bytes to read from file
.It Li elf_notes Ta 256 Ta max ELF notes processed
.It Li elf_phnum Ta 2K Ta max ELF program sections processed
.It Li elf_shnum Ta 32K Ta max ELF sections processed
.It Li elf_shsize Ta 128MB Ta max ELF section size processed
.It Li encoding Ta 65K Ta max number of bytes to determine encoding
.It Li indir Ta 50 Ta recursion limit for indirect magic
.It Li name Ta 100 Ta use count limit for name/use magic
.It Li regex Ta 8K Ta length limit for regex searches
.El
.It Fl r , Fl Fl raw
Don't translate unprintable characters to \eooo.
Normally
.Nm
translates unprintable characters to their octal representation.
.It Fl s , Fl Fl special-files
Normally,
.Nm
only attempts to read and determine the type of argument files which
.Xr stat 2
reports are ordinary files.
This prevents problems, because reading special files may have peculiar
consequences.
Specifying the
.Fl s
option causes
.Nm
to also read argument files which are block or character special files.
This is useful for determining the filesystem types of the data in raw
disk partitions, which are block special files.
This option also causes
.Nm
to disregard the file size as reported by
.Xr stat 2
since on some systems it reports a zero size for raw disk partitions.
.It Fl S , Fl Fl no-sandbox
On systems where libseccomp
.Pa ( https://github.com/seccomp/libseccomp )
is available, the
.Fl S
option disables sandboxing, which is enabled by default.
This option is needed for
.Nm
to execute external decompressing programs,
i.e. when the
.Fl z
option is specified and the built-in decompressors are not available.
On systems where sandboxing is not available, this option has no effect.
.It Fl v , Fl Fl version
Print the version of the program and exit.
.It Fl z , Fl Fl uncompress
Try to look inside compressed files.
.It Fl Z , Fl Fl uncompress-noreport
Try to look inside compressed files, but report information about the contents
only, not the compression.
.It Fl 0 , Fl Fl print0
Output a null character
.Sq \e0
after the end of the filename.
This makes it easy to
.Xr cut 1
the output.
This does not affect the separator, which is still printed.
.Pp
If this option is given more than once, then
.Nm
prints just the filename followed by a NUL followed by the description
(or ERROR: text) followed by a second NUL for each entry.
.It Fl Fl help
Print a help message and exit.
.El
.Sh ENVIRONMENT
The environment variable
.Ev MAGIC
can be used to set the default magic file name.
If that variable is set, then
.Nm
will not attempt to open
.Pa $HOME/.magic .
.Nm
adds
.Dq Pa .mgc
to the value of this variable as appropriate.
The environment variable
.Ev POSIXLY_CORRECT
controls (on systems that support symbolic links) whether
.Nm
will attempt to follow symlinks or not.
If set, then
.Nm
follows symlinks; otherwise it does not.
This is also controlled by the
.Fl L
and
.Fl h
options.
.Sh FILES
.Bl -tag -width __MAGIC__.mgc -compact
.It Pa __MAGIC__.mgc
Default compiled list of magic.
.It Pa __MAGIC__
Directory containing default magic files.
.El
.Sh EXIT STATUS
.Nm
will exit with
.Dv 0
if the operation was successful or
.Dv >0
if an error was encountered.
The following errors cause diagnostic messages, but don't affect the program
exit code (as POSIX requires), unless
.Fl E
is specified:
.Bl -bullet -compact -offset indent
.It
A file cannot be found
.It
There is no permission to read a file
.It
The file type cannot be determined
.El
.Sh EXAMPLES
.Bd -literal -offset indent
$ file file.c file /dev/{wd0a,hda}
file.c:    C program text
file:      ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
           dynamically linked (uses shared libs), stripped
/dev/wd0a: block special (0/0)
/dev/hda:  block special (3/0)

$ file -s /dev/wd0{b,d}
/dev/wd0b: data
/dev/wd0d: x86 boot sector

$ file -s /dev/hda{,1,2,3,4,5,6,7,8,9,10}
/dev/hda:   x86 boot sector
/dev/hda1:  Linux/i386 ext2 filesystem
/dev/hda2:  x86 boot sector
/dev/hda3:  x86 boot sector, extended partition table
/dev/hda4:  Linux/i386 ext2 filesystem
/dev/hda5:  Linux/i386 swap file
/dev/hda6:  Linux/i386 swap file
/dev/hda7:  Linux/i386 swap file
/dev/hda8:  Linux/i386 swap file
/dev/hda9:  empty
/dev/hda10: empty

$ file -i file.c file /dev/{wd0a,hda}
file.c:    text/x-c
file:      application/x-executable
/dev/hda:  application/x-not-regular-file
/dev/wd0a: application/x-not-regular-file
.Ed
.Sh SEE ALSO
.Xr hexdump 1 ,
.Xr od 1 ,
.Xr strings 1 ,
.Xr magic __FSECTION__ ,
.Xr fstyp 8
.Sh STANDARDS CONFORMANCE
This program is believed to exceed the System V Interface Definition
of FILE(CMD), as near as one can determine from the vague language
contained therein.
Its behavior is mostly compatible with the System V program of the same name.
This version knows more magic, however, so it will produce
different (albeit more accurate) output in many cases.
.\" URL: http://www.opengroup.org/onlinepubs/009695399/utilities/file.html
.Pp
The one significant difference
between this version and System V
is that this version treats any white space
as a delimiter, so that spaces in pattern strings must be escaped.
For example,
.Bd -literal -offset indent
\*[Gt]10	string	language impress\ 	(imPRESS data)
.Ed
.Pp
in an existing magic file would have to be changed to
.Bd -literal -offset indent
\*[Gt]10	string	language\e impress	(imPRESS data)
.Ed
.Pp
In addition, in this version, if a pattern string contains a backslash,
it must be escaped.
For example,
.Bd -literal -offset indent
0	string		\ebegindata	Andrew Toolkit document
.Ed
.Pp
in an existing magic file would have to be changed to
.Bd -literal -offset indent
0	string		\e\ebegindata	Andrew Toolkit document
.Ed
.Pp
SunOS releases 3.2 and later from Sun Microsystems include a
.Nm
command derived from the System V one, but with some extensions.
This version differs from Sun's only in minor ways.
It includes the extension of the
.Sq \*[Am]
operator, used as,
for example,
.Bd -literal -offset indent
\*[Gt]16	long\*[Am]0x7fffffff	\*[Gt]0		not stripped
.Ed
.Sh SECURITY
On systems where libseccomp
.Pa ( https://github.com/seccomp/libseccomp )
is available,
.Nm
enforces limiting system calls to only the ones necessary for the
operation of the program.
This enforcement does not provide any security benefit when
.Nm
is asked to decompress input files by running external programs with
the
.Fl z
option.
To enable execution of external decompressors, one needs to disable
sandboxing using the
.Fl S
option.
.Sh MAGIC DIRECTORY
The magic file entries have been collected from various sources,
mainly USENET, and contributed by various authors.
Christos Zoulas (address below) will collect additional
or corrected magic file entries.
A consolidation of magic file entries
will be distributed periodically.
.Pp
The order of entries in the magic file is significant.
Depending on what system you are using, the order in which
they are put together may be incorrect.
If your old
.Nm
command uses a magic file,
keep the old magic file around for comparison purposes
(rename it to
.Pa __MAGIC__.orig ) .
.Sh HISTORY
There has been a
.Nm
command in every
.Dv UNIX
since at least Research Version 4
(man page dated November, 1973).
The System V version introduced one significant change:
the external list of magic types.
This slowed the program down slightly but made it a lot more flexible.
.Pp
This program, based on the System V version,
was written by Ian Darwin
.Aq ian@darwinsys.com
without looking at anybody else's source code.
.Pp
John Gilmore revised the code extensively, making it better than
the first version.
Geoff Collyer found several inadequacies
and provided some magic file entries.
The
.Sq \*[Am]
operator was contributed by Rob McMahon,
.Aq cudcv@warwick.ac.uk ,
in 1989.
.Pp
Guy Harris,
.Aq guy@netapp.com ,
made many changes from 1993 to the present.
.Pp
Primary development and maintenance from 1990 to the present by
Christos Zoulas
.Aq christos@astron.com .
.Pp
Altered by Chris Lowth
.Aq chris@lowth.com ,
2000: handle the
.Fl i
option to output mime type strings, using an alternative
magic file and internal logic.
.Pp
Altered by Eric Fischer
.Aq enf@pobox.com ,
July, 2000,
to identify character codes and attempt to identify the languages
of non-ASCII files.
.Pp
Altered by Reuben Thomas
.Aq rrt@sc3d.org ,
2007-2011, to improve MIME support, merge MIME and non-MIME magic,
support directories as well as files of magic, apply many bug fixes,
update and fix a lot of magic, improve the build system, improve the
documentation, and rewrite the Python bindings in pure Python.
.Pp
The list of contributors to the
.Sq magic
directory (magic files)
is too long to include here.
You know who you are; thank you.
Many contributors are listed in the source files.
.Sh LEGAL NOTICE
Copyright (c) Ian F. Darwin, Toronto, Canada, 1986-1999.
Covered by the standard Berkeley Software Distribution copyright; see the file
COPYING in the source distribution.
.Pp
The files
.Pa tar.h
and
.Pa is_tar.c
were written by John Gilmore from his public-domain
.Xr tar 1
program, and are not covered by the above license.
.Sh BUGS
Please report bugs and send patches to the bug tracker at
.Pa https://bugs.astron.com/
or the mailing list at
.Aq file@astron.com
(visit
.Pa https://mailman.astron.com/mailman/listinfo/file
first to subscribe).
.Sh TODO
Fix output so that tests for MIME and APPLE flags are not needed all
over the place, and actual output is only done in one place.
This needs a design.
Suggestion: push possible outputs onto a list, then pick the
last-pushed (most specific, one hopes) value at the end, or
use a default if the list is empty.
This should not slow down evaluation.
.Pp
The handling of
.Dv MAGIC_CONTINUE
and printing \e012- between entries is clumsy and complicated; refactor
and centralize.
.Pp
Some of the encoding logic is hard-coded in encoding.c and could be moved
to the magic files if we had a !:charset annotation.
.Pp
Continue to squash all magic bugs.
See Debian BTS for a good source.
.Pp
Store arbitrarily long strings, for example for %s patterns, so that
they can be printed out.
Fixes Debian bug #271672.
This can be done by allocating strings in a string pool, storing the
string pool at the end of the magic file and converting all the string
pointers to relative offsets from the string pool.
.Pp
Add syntax for relative offsets after current level (Debian bug #466037).
.Pp
Make file -ki work, i.e. give multiple MIME types.
.Pp
Add a zip library so we can peek inside Office2007 documents to
print more details about their contents.
.Pp
Add an option to print URLs for the sources of the file descriptions.
.Pp
Combine script searches and add a way to map executable names to MIME
types (e.g. have a magic value for !:mime which causes the resulting
string to be looked up in a table).
This would avoid adding the same magic repeatedly for each new
hash-bang interpreter.
.Pp
When a file descriptor is available, we can skip and adjust the buffer
instead of the hacky buffer management we do now.
.Pp
Fix
.Dq name
and
.Dq use
to check for consistency at compile time (duplicate
.Dq name ,
.Dq use
pointing to undefined
.Dq name ) .
Make
.Dq name
/
.Dq use
more efficient by keeping a sorted list of names.
Special-case ^ to flip endianness in the parser so that it does not
have to be escaped, and document it.
.Pp
If the offsets specified internally in the file exceed the buffer size
(the
.Dv HOWMANY
variable in file.h), then we don't seek to that offset, but we give up.
It would be better if buffer management were done when the file descriptor
is available so we can seek around the file.
One must be careful though because this has performance and thus security
considerations, because one can slow down things by repeatedly seeking.
.Pp
There is support now for keeping separate buffers and having offsets from
the end of the file, but the internal buffer management still needs an
overhaul.
.Sh AVAILABILITY
You can obtain the original author's latest version by anonymous FTP
on
.Pa ftp.astron.com
in the directory
.Pa /pub/file/file-X.YZ.tar.gz .