.\" $File: file.man,v 1.151 2024/04/07 21:27:35 christos Exp $
.Dd April 7, 2024
.Dt FILE __CSECTION__
.Os
.Sh NAME
.Nm file
.Nd determine file type
.Sh SYNOPSIS
.Nm
.Bk -words
.Op Fl bcdEhiklLNnprsSvzZ0
.Op Fl Fl apple
.Op Fl Fl exclude-quiet
.Op Fl Fl extension
.Op Fl Fl mime-encoding
.Op Fl Fl mime-type
.Op Fl e Ar testname
.Op Fl F Ar separator
.Op Fl f Ar namefile
.Op Fl m Ar magicfiles
.Op Fl P Ar name=value
.Ar
.Ek
.Nm
.Fl C
.Op Fl m Ar magicfiles
.Nm
.Op Fl Fl help
.Sh DESCRIPTION
This manual page documents version __VERSION__ of the
.Nm
command.
.Pp
.Nm
tests each argument in an attempt to classify it.
There are three sets of tests, performed in this order:
filesystem tests, magic tests, and language tests.
The
.Em first
test that succeeds causes the file type to be printed.
.Pp
The type printed will usually contain one of the words
.Em text
(the file contains only
printing characters and a few common control
characters and is probably safe to read on an
.Dv ASCII
terminal),
.Em executable
(the file contains the result of compiling a program
in a form understandable to some
.Tn UNIX
kernel or another),
or
.Em data
meaning anything else (data is usually
.Dq binary
or non-printable).
Exceptions are well-known file formats (core files, tar archives)
that are known to contain binary data.
When modifying magic files or the program itself, make sure to
.Em preserve these keywords .
Users depend on knowing that all the readable files in a directory
have the word
.Dq text
printed.
Don't do as Berkeley did and change
.Dq shell commands text
to
.Dq shell script .
.Pp
The filesystem tests are based on examining the return from a
.Xr stat 2
system call.
The program checks to see if the file is empty,
or if it's some sort of special file.
Any known file types appropriate to the system you are running on
(sockets, symbolic links, or named pipes (FIFOs) on those systems that
implement them)
are intuited if they are defined in the system header file
.In sys/stat.h .
.Pp
The magic tests are used to check for files with data in
particular fixed formats.
The canonical example of this is a binary executable (compiled program)
.Dv a.out
file, whose format is defined in
.In elf.h ,
.In a.out.h
and possibly
.In exec.h
in the standard include directory.
These files have a
.Dq magic number
stored in a particular place
near the beginning of the file that tells the
.Tn UNIX
operating system
that the file is a binary executable, and which of several types thereof.
The concept of a
.Dq magic number
has been applied by extension to data files.
Any file with some invariant identifier at a small fixed
offset into the file can usually be described in this way.
The information identifying these files is read from the compiled
magic file
.Pa __MAGIC__.mgc ,
or the files in the directory
.Pa __MAGIC__
if the compiled file does not exist.
In addition, if
.Pa $HOME/.magic.mgc
or
.Pa $HOME/.magic
exists, it will be used in preference to the system magic files.
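.Pp
As a minimal sketch of the magic-number idea, the commands below write a
one-line magic file for a made-up format and ask
.Nm
to consult it with
.Fl m .
The
.Dq DEMO
signature, its description, and the temporary file names are hypothetical
and chosen only for illustration; they are not part of the distributed magic.
.Bd -literal -offset indent
# Hypothetical format: files that begin with the four bytes "DEMO".
workdir=$(mktemp -d)

# One magic entry: offset 0, type "string", value DEMO, then the message.
echo "0 string DEMO Demo container (hypothetical format)" > "$workdir/demo.magic"

# A sample file that carries the signature at offset 0.
echo "DEMO payload" > "$workdir/sample.bin"

# Use only the custom magic file; -b omits the filename from the output.
desc=$(file -b -m "$workdir/demo.magic" "$workdir/sample.bin")
echo "$desc"

rm -rf "$workdir"
.Ed
.Pp
Because the fields of the entry are separated by white space, a signature
that itself contains a space would need to be escaped.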
.Pp
If a file does not match any of the entries in the magic file,
it is examined to see if it seems to be a text file.
ASCII, ISO-8859-x, non-ISO 8-bit extended-ASCII character sets
(such as those used on Macintosh and IBM PC systems),
UTF-8-encoded Unicode, UTF-16-encoded Unicode, and EBCDIC
character sets can be distinguished by the different
ranges and sequences of bytes that constitute printable text
in each set.
If a file passes any of these tests, its character set is reported.
ASCII, ISO-8859-x, UTF-8, and extended-ASCII files are identified
as
.Dq text
because they will be mostly readable on nearly any terminal;
UTF-16 and EBCDIC are only
.Dq character data
because, while
they contain text, it is text that will require translation
before it can be read.
In addition,
.Nm
will attempt to determine other characteristics of text-type files.
If the lines of a file are terminated by CR, CRLF, or NEL, instead
of the Unix-standard LF, this will be reported.
Files that contain embedded escape sequences or overstriking
will also be identified.
.Pp
Once
.Nm
has determined the character set used in a text-type file,
it will
attempt to determine in what language the file is written.
The language tests look for particular strings (cf.
.In names.h )
that can appear anywhere in the first few blocks of a file.
For example, the keyword
.Em .br
indicates that the file is most likely a
.Xr troff 1
input file, just as the keyword
.Em struct
indicates a C program.
These tests are less reliable than the previous
two groups, so they are performed last.
The language test routines also test for some miscellany
(such as
.Xr tar 1
archives, JSON files).
.Pp
Any file that cannot be identified as having been written
in any of the character sets listed above is simply said to be
.Dq data .
.Sh OPTIONS
.Bl -tag -width indent
.It Fl Fl apple
Causes the
.Nm
command to output the file type and creator code as
used by older MacOS versions.
The code consists of eight letters:
the first four describe the file type, the last four the creator.
This option works properly only for file formats that have the
apple-style output defined.
.It Fl b , Fl Fl brief
Do not prepend filenames to output lines (brief mode).
.It Fl C , Fl Fl compile
Write a
.Pa magic.mgc
output file that contains a pre-parsed version of the magic file or directory.
.It Fl c , Fl Fl checking-printout
Cause a checking printout of the parsed form of the magic file.
This is usually used in conjunction with the
.Fl m
option to debug a new magic file before installing it.
.It Fl d
Prints internal debugging information to stderr.
.It Fl E
On filesystem errors (file not found etc), instead of handling the error
as regular output and continuing, as POSIX mandates, issue an error message
and exit.
.It Fl e , Fl Fl exclude Ar testname
Exclude the test named in
.Ar testname
from the list of tests made to determine the file type.
Valid test names are:
.Bl -tag -width compress
.It apptype
.Dv EMX
application type (only on EMX).
.It ascii
Various types of text files (this test will try to guess the text
encoding, irrespective of the setting of the
.Sq encoding
option).
.It encoding
Different text encodings for soft magic tests.
.It tokens
Ignored for backwards compatibility.
.It cdf
Prints details of Compound Document Files.
.It compress
Checks for, and looks inside, compressed files.
.It csv
Checks Comma Separated Value files.
.It elf
Prints ELF file details, provided soft magic tests are enabled and the
elf magic is found.
.It json
Examines JSON (RFC-7159) files by parsing them for compliance.
.It soft
Consults magic files.
.It simh
Examines SIMH tape files.
.It tar
Examines tar files by verifying the checksum of the 512-byte tar header.
Excluding this test can provide more detailed content description by using
the soft magic method.
.It text
A synonym for
.Sq ascii .
.El
.It Fl Fl exclude-quiet
Like
.Fl Fl exclude
but ignore tests that
.Nm
does not know about.
This is intended for compatibility with older versions of
.Nm .
.It Fl Fl extension
Print a slash-separated list of valid extensions for the file type found.
.It Fl F , Fl Fl separator Ar separator
Use the specified string as the separator between the filename and the
file result returned.
Defaults to
.Sq \&: .
.It Fl f , Fl Fl files-from Ar namefile
Read the names of the files to be examined from
.Ar namefile
(one per line)
before the argument list.
Either
.Ar namefile
or at least one filename argument must be present;
to test the standard input, use
.Sq -
as a filename argument.
Please note that
.Ar namefile
is unwrapped and the enclosed filenames are processed when this option is
encountered and before any further options processing is done.
This allows one to process multiple lists of files with different command line
arguments on the same
.Nm
invocation.
Thus if you want to set the delimiter, you need to do it before you specify
the list of files, like:
.Dq Fl F Ar @ Fl f Ar namefile ,
instead of:
.Dq Fl f Ar namefile Fl F Ar @ .
.It Fl h , Fl Fl no-dereference
This option causes symlinks not to be followed
(on systems that support symbolic links).
This is the default if the environment variable
.Ev POSIXLY_CORRECT
is not defined.
.It Fl i , Fl Fl mime
Causes the
.Nm
command to output mime type strings rather than the more
traditional human readable ones.
Thus it may say
.Sq text/plain; charset=us-ascii
rather than
.Dq ASCII text .
.It Fl Fl mime-type , Fl Fl mime-encoding
Like
.Fl i ,
but print only the specified element(s).
.It Fl k , Fl Fl keep-going
Don't stop at the first match, keep going.
Subsequent matches will
have the string
.Sq "\[rs]012\- "
prepended.
(If you want a newline, see the
.Fl r
option.)
The magic pattern with the highest strength (see the
.Fl l
option) comes first.
.It Fl l , Fl Fl list
Shows a list of patterns and their strength sorted descending by
.Xr magic __FSECTION__
strength
which is used for the matching (see also the
.Fl k
option).
.It Fl L , Fl Fl dereference
This option causes symlinks to be followed, as the like-named option in
.Xr ls 1
(on systems that support symbolic links).
This is the default if the environment variable
.Ev POSIXLY_CORRECT
is defined.
.It Fl m , Fl Fl magic-file Ar magicfiles
Specify an alternate list of files and directories containing magic.
This can be a single item, or a colon-separated list.
If a compiled magic file is found alongside a file or directory,
it will be used instead.
.It Fl N , Fl Fl no-pad
Don't pad filenames so that they align in the output.
.It Fl n , Fl Fl no-buffer
Force stdout to be flushed after checking each file.
This is only useful if checking a list of files.
It is intended to be used by programs that want filetype output from a pipe.
.It Fl p , Fl Fl preserve-date
On systems that support
.Xr utime 3
or
.Xr utimes 2 ,
attempt to preserve the access time of files analyzed, to pretend that
.Nm
never read them.
.It Fl P , Fl Fl parameter Ar name=value
Set various parameter limits.
.Bl -column "elf_phnum" "Default" "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
.It Sy "Name" Ta Sy "Default" Ta Sy "Explanation"
.It Li bytes Ta 1M Ta max number of bytes to read from file
.It Li elf_notes Ta 256 Ta max ELF notes processed
.It Li elf_phnum Ta 2K Ta max ELF program sections processed
.It Li elf_shnum Ta 32K Ta max ELF sections processed
.It Li elf_shsize Ta 128MB Ta max ELF section size processed
.It Li encoding Ta 65K Ta max number of bytes to determine encoding
.It Li indir Ta 50 Ta recursion limit for indirect magic
.It Li name Ta 100 Ta use count limit for name/use magic
.It Li regex Ta 8K Ta length limit for regex searches
.El
.It Fl r , Fl Fl raw
Don't translate unprintable characters to \eooo.
Normally
.Nm
translates unprintable characters to their octal representation.
.It Fl s , Fl Fl special-files
Normally,
.Nm
only attempts to read and determine the type of argument files which
.Xr stat 2
reports are ordinary files.
This prevents problems, because reading special files may have peculiar
consequences.
Specifying the
.Fl s
option causes
.Nm
to also read argument files which are block or character special files.
This is useful for determining the filesystem types of the data in raw
disk partitions, which are block special files.
This option also causes
.Nm
to disregard the file size as reported by
.Xr stat 2
since on some systems it reports a zero size for raw disk partitions.
.It Fl S , Fl Fl no-sandbox
On systems where libseccomp
.Pa ( https://github.com/seccomp/libseccomp )
is available, the
.Fl S
option disables sandboxing which is enabled by default.
This option is needed for
.Nm
to execute external decompressing programs,
i.e. when the
.Fl z
option is specified and the built-in decompressors are not available.
On systems where sandboxing is not available, this option has no effect.
.It Fl v , Fl Fl version
Print the version of the program and exit.
.It Fl z , Fl Fl uncompress
Try to look inside compressed files.
.It Fl Z , Fl Fl uncompress-noreport
Try to look inside compressed files, but report information about the contents
only, not the compression.
.It Fl 0 , Fl Fl print0
Output a null character
.Sq \e0
after the end of the filename.
Nice to
.Xr cut 1
the output.
This does not affect the separator, which is still printed.
.Pp
If this option is repeated more than once, then
.Nm
prints just the filename followed by a NUL followed by the description
(or ERROR: text) followed by a second NUL for each entry.
.It Fl -help
Print a help message and exit.
.El
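.Pp
Several of the options above can be combined, as the following sketch
illustrates.
The temporary file names and the chosen limit of 2097152 bytes are
arbitrary assumptions made for the example; only the options themselves
come from the list above.
.Bd -literal -offset indent
tmp=$(mktemp -d)
echo "hello world" > "$tmp/note.txt"
echo "$tmp/note.txt" > "$tmp/names"

# Brief mode: print the description without the filename.
file -b "$tmp/note.txt"

# MIME type only, then MIME type and encoding together.
mime=$(file -b --mime-type "$tmp/note.txt")
echo "$mime"
file -b -i "$tmp/note.txt"

# The separator must be set before -f, because the name list is
# processed as soon as -f is encountered.
file -F @ -f "$tmp/names"

# Skip the tar and compress tests and read at most 2097152 bytes.
file -e tar -e compress -P bytes=2097152 "$tmp/note.txt"

rm -rf "$tmp"
.Ed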
.Sh ENVIRONMENT
The environment variable
.Ev MAGIC
can be used to set the default magic file name.
If that variable is set, then
.Nm
will not attempt to open
.Pa $HOME/.magic .
.Nm
adds
.Dq Pa .mgc
to the value of this variable as appropriate.
The environment variable
.Ev POSIXLY_CORRECT
controls (on systems that support symbolic links) whether
.Nm
will attempt to follow symlinks or not.
If set, then
.Nm
follows symlinks; otherwise it does not.
This is also controlled by the
.Fl L
and
.Fl h
options.
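.Pp
The effect of
.Ev POSIXLY_CORRECT
on symbolic links can be checked with a sketch like the one below.
The file and link names are hypothetical, and the example assumes a system
that supports symbolic links.
.Bd -literal -offset indent
tmp=$(mktemp -d)
echo "hello world" > "$tmp/target.txt"
ln -s target.txt "$tmp/link"

# POSIXLY_CORRECT unset (or -h given): the link itself is described.
linkdesc=$(unset POSIXLY_CORRECT; file -b -h "$tmp/link")
echo "$linkdesc"

# POSIXLY_CORRECT set: the link is followed and the target is described.
POSIXLY_CORRECT=1 file -b "$tmp/link"

# -L follows the link regardless of the environment.
followed=$(file -b -L "$tmp/link")
echo "$followed"

rm -rf "$tmp"
.Ed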
.Sh FILES
.Bl -tag -width __MAGIC__.mgc -compact
.It Pa __MAGIC__.mgc
Default compiled list of magic.
.It Pa __MAGIC__
Directory containing default magic files.
.El
.Sh EXIT STATUS
.Nm
will exit with
.Dv 0
if the operation was successful or
.Dv >0
if an error was encountered.
The following errors cause diagnostic messages, but don't affect the program
exit code (as POSIX requires), unless
.Fl E
is specified:
.Bl -bullet -compact -offset indent
.It
A file cannot be found
.It
There is no permission to read a file
.It
The file type cannot be determined
.El
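.Pp
A short sketch of the difference, using a path that is assumed not to
exist (the path name is hypothetical):
.Bd -literal -offset indent
missing=/nonexistent/hypothetical-path

# Without -E the failure is reported as ordinary output.
file "$missing"
plain_status=$?

# With -E the same failure is an error and the exit status is non-zero.
file -E "$missing"
strict_status=$?

echo "without -E: $plain_status, with -E: $strict_status"
.Ed
.Pp
Scripts that need to detect unreadable or missing arguments may therefore
prefer
.Fl E
over parsing the output text.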
.Sh EXAMPLES
.Bd -literal -offset indent
$ file file.c file /dev/{wd0a,hda}
file.c:	  C program text
file:	  ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
	  dynamically linked (uses shared libs), stripped
/dev/wd0a: block special (0/0)
/dev/hda: block special (3/0)

$ file -s /dev/wd0{b,d}
/dev/wd0b: data
/dev/wd0d: x86 boot sector

$ file -s /dev/hda{,1,2,3,4,5,6,7,8,9,10}
/dev/hda:   x86 boot sector
/dev/hda1:  Linux/i386 ext2 filesystem
/dev/hda2:  x86 boot sector
/dev/hda3:  x86 boot sector, extended partition table
/dev/hda4:  Linux/i386 ext2 filesystem
/dev/hda5:  Linux/i386 swap file
/dev/hda6:  Linux/i386 swap file
/dev/hda7:  Linux/i386 swap file
/dev/hda8:  Linux/i386 swap file
/dev/hda9:  empty
/dev/hda10: empty

$ file -i file.c file /dev/{wd0a,hda}
file.c:	     text/x-c
file:	     application/x-executable
/dev/hda:    application/x-not-regular-file
/dev/wd0a:   application/x-not-regular-file

.Ed
.Sh SEE ALSO
.Xr hexdump 1 ,
.Xr od 1 ,
.Xr strings 1 ,
.Xr magic __FSECTION__ ,
.Xr fstyp 8
.Sh STANDARDS CONFORMANCE
This program is believed to exceed the System V Interface Definition
of FILE(CMD), as near as one can determine from the vague language
contained therein.
Its behavior is mostly compatible with the System V program of the same name.
This version knows more magic, however, so it will produce
different (albeit more accurate) output in many cases.
.\" URL: http://www.opengroup.org/onlinepubs/009695399/utilities/file.html
.Pp
The one significant difference
between this version and System V
is that this version treats any white space
as a delimiter, so that spaces in pattern strings must be escaped.
For example,
.Bd -literal -offset indent
\*[Gt]10	string	language impress\	(imPRESS data)
.Ed
.Pp
in an existing magic file would have to be changed to
.Bd -literal -offset indent
\*[Gt]10	string	language\e impress	(imPRESS data)
.Ed
.Pp
In addition, in this version, if a pattern string contains a backslash,
it must be escaped.
For example
.Bd -literal -offset indent
0	string		\ebegindata	Andrew Toolkit document
.Ed
.Pp
in an existing magic file would have to be changed to
.Bd -literal -offset indent
0	string		\e\ebegindata	Andrew Toolkit document
.Ed
.Pp
SunOS releases 3.2 and later from Sun Microsystems include a
.Nm
command derived from the System V one, but with some extensions.
This version differs from Sun's only in minor ways.
It includes the extension of the
.Sq \*[Am]
operator, used as,
for example,
.Bd -literal -offset indent
\*[Gt]16	long\*[Am]0x7fffffff	\*[Gt]0		not stripped
.Ed
.Sh SECURITY
On systems where libseccomp
.Pa ( https://github.com/seccomp/libseccomp )
is available,
.Nm
enforces limiting system calls to only the ones necessary for the
operation of the program.
This enforcement does not provide any security benefit when
.Nm
is asked to decompress input files by running external programs with
the
.Fl z
option.
To enable execution of external decompressors, one needs to disable
sandboxing using the
.Fl S
option.
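.Pp
The sketch below looks inside a gzip-compressed file.
It assumes that
.Xr gzip 1
is installed and that
.Nm
was built with a built-in gzip decompressor, so that no external program
has to be run and the sandbox can stay enabled; the file names are
hypothetical.
.Bd -literal -offset indent
tmp=$(mktemp -d)
echo "hello world" > "$tmp/note.txt"
gzip "$tmp/note.txt"

# Describe the compressed container only.
file -b "$tmp/note.txt.gz"

# Look inside: -z reports contents and compression, -Z contents only.
file -b -z "$tmp/note.txt.gz"
inner=$(file -b -Z "$tmp/note.txt.gz")
echo "$inner"

rm -rf "$tmp"
.Ed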
.Sh MAGIC DIRECTORY
The magic file entries have been collected from various sources,
mainly USENET, and contributed by various authors.
Christos Zoulas (address below) will collect additional
or corrected magic file entries.
A consolidation of magic file entries
will be distributed periodically.
.Pp
The order of entries in the magic file is significant.
Depending on what system you are using, the order that
they are put together may be incorrect.
If your old
.Nm
command uses a magic file,
keep the old magic file around for comparison purposes
(rename it to
.Pa __MAGIC__.orig ) .
.Sh HISTORY
There has been a
.Nm
command in every
.Dv UNIX
since at least Research Version 4
(man page dated November, 1973).
The System V version introduced one significant change:
the external list of magic types.
This slowed the program down slightly but made it a lot more flexible.
.Pp
This program, based on the System V version,
was written by Ian Darwin
.Aq ian@darwinsys.com
without looking at anybody else's source code.
.Pp
John Gilmore revised the code extensively, making it better than
the first version.
Geoff Collyer found several inadequacies
and provided some magic file entries.
The
.Sq \*[Am]
operator was contributed by Rob McMahon,
.Aq cudcv@warwick.ac.uk ,
in 1989.
.Pp
Guy Harris,
.Aq guy@netapp.com ,
made many changes from 1993 to the present.
.Pp
Primary development and maintenance from 1990 to the present by
Christos Zoulas
.Aq christos@astron.com .
.Pp
Altered by Chris Lowth
.Aq chris@lowth.com ,
2000: handle the
.Fl i
option to output mime type strings, using an alternative
magic file and internal logic.
.Pp
Altered by Eric Fischer
.Aq enf@pobox.com ,
July, 2000,
to identify character codes and attempt to identify the languages
of non-ASCII files.
.Pp
Altered by Reuben Thomas
.Aq rrt@sc3d.org ,
2007-2011, to improve MIME support, merge MIME and non-MIME magic,
support directories as well as files of magic, apply many bug fixes,
update and fix a lot of magic, improve the build system, improve the
documentation, and rewrite the Python bindings in pure Python.
.Pp
The list of contributors to the
.Sq magic
directory (magic files)
is too long to include here.
You know who you are; thank you.
Many contributors are listed in the source files.
.Sh LEGAL NOTICE
Copyright (c) Ian F. Darwin, Toronto, Canada, 1986-1999.
Covered by the standard Berkeley Software Distribution copyright; see the file
COPYING in the source distribution.
.Pp
The files
.Pa tar.h
and
.Pa is_tar.c
were written by John Gilmore from his public-domain
.Xr tar 1
program, and are not covered by the above license.
.Sh BUGS
Please report bugs and send patches to the bug tracker at
.Pa https://bugs.astron.com/
or the mailing list at
.Aq file@astron.com
(visit
.Pa https://mailman.astron.com/mailman/listinfo/file
first to subscribe).
664b6cee71dSXin LI.Sh TODO
665b6cee71dSXin LIFix output so that tests for MIME and APPLE flags are not needed all
666b6cee71dSXin LIover the place, and actual output is only done in one place.
667b6cee71dSXin LIThis needs a design.
668b6cee71dSXin LISuggestion: push possible outputs onto a list, then pick the
669b6cee71dSXin LIlast-pushed (most specific, one hopes) value at the end, or
670b6cee71dSXin LIuse a default if the list is empty.
671b6cee71dSXin LIThis should not slow down evaluation.
672b6cee71dSXin LI.Pp
6735f0216bdSXin LIThe handling of
6745f0216bdSXin LI.Dv MAGIC_CONTINUE
6755f0216bdSXin LIand printing \e012- between entries is clumsy and complicated; refactor
6765f0216bdSXin LIand centralize.
6775f0216bdSXin LI.Pp
6785f0216bdSXin LISome of the encoding logic is hard-coded in encoding.c and could be moved
67943a5ec4eSXin LIto the magic files if we had a !:charset annotation.
6805f0216bdSXin LI.Pp
681b6cee71dSXin LIContinue to squash all magic bugs.
682b6cee71dSXin LISee Debian BTS for a good source.
683b6cee71dSXin LI.Pp
684b6cee71dSXin LIStore arbitrarily long strings, for example for %s patterns, so that
685b6cee71dSXin LIthey can be printed out.
686b6cee71dSXin LIFixes Debian bug #271672.
6875f0216bdSXin LIThis can be done by allocating strings in a string pool, storing the
6885f0216bdSXin LIstring pool at the end of the magic file and converting all the string
6895f0216bdSXin LIpointers to relative offsets from the string pool.
690b6cee71dSXin LI.Pp
691b6cee71dSXin LIAdd syntax for relative offsets after current level (Debian bug #466037).
692b6cee71dSXin LI.Pp
693b6cee71dSXin LIMake file -ki work, i.e. give multiple MIME types.
694b6cee71dSXin LI.Pp
695b6cee71dSXin LIAdd a zip library so we can peek inside Office2007 documents to
6965f0216bdSXin LIprint more details about their contents.
697b6cee71dSXin LI.Pp
698b6cee71dSXin LIAdd an option to print URLs for the sources of the file descriptions.
699b6cee71dSXin LI.Pp
700b6cee71dSXin LICombine script searches and add a way to map executable names to MIME
701b6cee71dSXin LItypes (e.g. have a magic value for !:mime which causes the resulting
702b6cee71dSXin LIstring to be looked up in a table).
703b6cee71dSXin LIThis would avoid adding the same magic repeatedly for each new
704b6cee71dSXin LIhash-bang interpreter.
705b6cee71dSXin LI.Pp
7065f0216bdSXin LIWhen a file descriptor is available, we can skip and adjust the buffer
7075f0216bdSXin LIinstead of the hacky buffer management we do now.
7085f0216bdSXin LI.Pp
709b6cee71dSXin LIFix
710b6cee71dSXin LI.Dq name
711b6cee71dSXin LIand
712b6cee71dSXin LI.Dq use
713b6cee71dSXin LIto check for consistency at compile time (duplicate
714b6cee71dSXin LI.Dq name ,
715b6cee71dSXin LI.Dq use
716b6cee71dSXin LIpointing to undefined
717b6cee71dSXin LI.Dq name
718b6cee71dSXin LI).
719b6cee71dSXin LIMake
720b6cee71dSXin LI.Dq name
721b6cee71dSXin LI/
722b6cee71dSXin LI.Dq use
723b6cee71dSXin LImore efficient by keeping a sorted list of names.
724b6cee71dSXin LISpecial-case ^ to flip endianness in the parser so that it does not
725b6cee71dSXin LIhave to be escaped, and document it.
7265f0216bdSXin LI.Pp
7275f0216bdSXin LIIf the offsets specified internally in the file exceed the buffer size
7285f0216bdSXin LI(
7295f0216bdSXin LI.Dv HOWMANY
7305f0216bdSXin LIvariable in file.h), then we give up instead of seeking to that offset.
7315f0216bdSXin LIIt would be better if buffer management were done when the file descriptor
73243a5ec4eSXin LIis available so we can seek around the file.
73343a5ec4eSXin LIThis must be done carefully, though, because it has performance and thus security
734898496eeSXin LIimplications: repeated seeking can slow things down.
73543a5ec4eSXin LI.Pp
73643a5ec4eSXin LIThere is support now for keeping separate buffers and having offsets from
73743a5ec4eSXin LIthe end of the file, but the internal buffer management still needs an
73843a5ec4eSXin LIoverhaul.
739b6cee71dSXin LI.Sh AVAILABILITY
740b6cee71dSXin LIYou can obtain the original author's latest version by anonymous FTP
741b6cee71dSXin LIon
742b6cee71dSXin LI.Pa ftp.astron.com
743b6cee71dSXin LIas the file
744b6cee71dSXin LI.Pa /pub/file/file-X.YZ.tar.gz .