.\" $File: magic.man,v 1.110 2024/11/27 15:37:00 christos Exp $
.Dd November 27, 2024
.Dt MAGIC __FSECTION__
.Os
.\" install as magic.4 on USG, magic.5 on V7, Berkeley and Linux systems.
.Sh NAME
.Nm magic
.Nd file command's magic pattern file
.Sh DESCRIPTION
This manual page documents the format of magic files as
used by the
.Xr file __CSECTION__
command, version __VERSION__.
The
.Xr file __CSECTION__
command identifies the type of a file using,
among other tests,
a test for whether the file contains certain
.Dq "magic patterns" .
The database of these
.Dq "magic patterns"
is usually located in a binary file in
.Pa __MAGIC__.mgc
or a directory of source text magic pattern fragment files in
.Pa __MAGIC__ .
The database specifies what patterns are to be tested for, what message or
MIME type to print if a particular pattern is found,
and additional information to extract from the file.
.Pp
The format of the source fragment files that are used to build this database
is as follows:
Each line of a fragment file specifies a test to be performed.
A test compares the data starting at a particular offset
in the file with a byte value, a string or a numeric value.
If the test succeeds, a message is printed.
The line consists of the following fields:
.Bl -tag -width ".Dv message"
.It Dv offset
A number specifying the offset (in bytes) into the file of the data
which is to be tested.
This offset can be a negative number if it is:
.Bl -bullet  -compact
.It
The first direct offset of the magic entry (at continuation level 0),
in which case it is interpreted as an offset from the end of the file
going backwards.
This works only when a file descriptor to the file is available and it
is a regular file.
.It
A continuation offset relative to the end of the last up-level field
.Dv ( \*[Am] ) .
.El
If the offset starts with the symbol
.Dq + ,
then all offsets are interpreted as from the beginning of the file (the
default).
.It Dv type
The type of the data to be tested.
The possible values are:
.Bl -tag -width ".Dv lestring16"
.It Dv byte
A one-byte value.
.It Dv short
A two-byte value in this machine's native byte order.
.It Dv long
A four-byte value in this machine's native byte order.
.It Dv quad
An eight-byte value in this machine's native byte order.
.It Dv float
A 32-bit single precision IEEE floating point number in this machine's native byte order.
.It Dv double
A 64-bit double precision IEEE floating point number in this machine's native byte order.
.It Dv string
A string of bytes.
The string type specification can be optionally followed by a /<width>
option and optionally followed by a set of flags /[bCcfTtWw]*.
The width limits the number of characters to be copied.
Zero means all characters.
The following flags are supported:
.Bl -tag -width B -compact -offset XXXX
.It b
Force binary file test.
.It C
Use upper case insensitive matching: upper case
characters in the magic match both lower and upper case characters in the
target, whereas lower case characters in the magic only match lower case
characters in the target.
.It c
Use lower case insensitive matching: lower case
characters in the magic match both lower and upper case characters in the
target, whereas upper case characters in the magic only match upper case
characters in the target.
To do a complete case insensitive match, specify both
.Dq c
and
.Dq C .
.It f
Require that the matched string is a full word, not a partial word match.
.It T
Trim the string, i.e. leading and trailing whitespace
is deleted before the string is printed.
.It t
Force text file test.
.It W
Compact whitespace in the target, which must
contain at least one whitespace character.
If the magic has
.Dv n
consecutive blanks, the target needs at least
.Dv n
consecutive blanks to match.
.It w
Treat every blank in the magic as an optional blank.
.El
.It Dv pstring
A Pascal-style string where the first byte/short/int is interpreted as the
unsigned length.
The length defaults to byte and can be specified as a modifier.
The following modifiers are supported:
.Bl -tag -width B -compact -offset XXXX
.It B
A byte length (default).
.It H
A 2 byte big endian length.
.It h
A 2 byte little endian length.
.It L
A 4 byte big endian length.
.It l
A 4 byte little endian length.
.It J
The length includes itself in its count.
.El
The string is not NUL terminated.
.Dq J
is used rather than the more
valuable
.Dq I
because this type of length is a feature of the JPEG
format.
.It Dv date
A four-byte value interpreted as a UNIX date.
.It Dv qdate
An eight-byte value interpreted as a UNIX date.
.It Dv ldate
A four-byte value interpreted as a UNIX-style date, but interpreted as
local time rather than UTC.
.It Dv qldate
An eight-byte value interpreted as a UNIX-style date, but interpreted as
local time rather than UTC.
.It Dv qwdate
An eight-byte value interpreted as a Windows-style date.
.It Dv msdosdate
A two-byte value interpreted as FAT/DOS-style date.
.It Dv msdostime
A two-byte value interpreted as FAT/DOS-style time.
.It Dv beid3
A 32-bit ID3 length in big-endian byte order.
.It Dv beshort
A two-byte value in big-endian byte order.
.It Dv belong
A four-byte value in big-endian byte order.
.It Dv bequad
An eight-byte value in big-endian byte order.
.It Dv befloat
A 32-bit single precision IEEE floating point number in big-endian byte order.
.It Dv bedouble
A 64-bit double precision IEEE floating point number in big-endian byte order.
.It Dv bedate
A four-byte value in big-endian byte order,
interpreted as a Unix date.
.It Dv beqdate
An eight-byte value in big-endian byte order,
interpreted as a Unix date.
.It Dv beldate
A four-byte value in big-endian byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv beqldate
An eight-byte value in big-endian byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv beqwdate
An eight-byte value in big-endian byte order,
interpreted as a Windows-style date.
.It Dv bemsdosdate
A two-byte value in big-endian byte order,
interpreted as FAT/DOS-style date.
.It Dv bemsdostime
A two-byte value in big-endian byte order,
interpreted as FAT/DOS-style time.
.It Dv bestring16
A two-byte unicode (UCS16) string in big-endian byte order.
.It Dv leid3
A 32-bit ID3 length in little-endian byte order.
.It Dv leshort
A two-byte value in little-endian byte order.
.It Dv lelong
A four-byte value in little-endian byte order.
.It Dv lequad
An eight-byte value in little-endian byte order.
.It Dv lefloat
A 32-bit single precision IEEE floating point number in little-endian byte order.
.It Dv ledouble
A 64-bit double precision IEEE floating point number in little-endian byte order.
.It Dv ledate
A four-byte value in little-endian byte order,
interpreted as a UNIX date.
.It Dv leqdate
An eight-byte value in little-endian byte order,
interpreted as a UNIX date.
.It Dv leldate
A four-byte value in little-endian byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv leqldate
An eight-byte value in little-endian byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv leqwdate
An eight-byte value in little-endian byte order,
interpreted as a Windows-style date.
.It Dv lemsdosdate
A two-byte value in little-endian byte order,
interpreted as FAT/DOS-style date.
.It Dv lemsdostime
A two-byte value in little-endian byte order,
interpreted as FAT/DOS-style time.
.It Dv lestring16
A two-byte unicode (UCS16) string in little-endian byte order.
.It Dv melong
A four-byte value in middle-endian (PDP-11) byte order.
.It Dv medate
A four-byte value in middle-endian (PDP-11) byte order,
interpreted as a UNIX date.
.It Dv meldate
A four-byte value in middle-endian (PDP-11) byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv indirect
Starting at the given offset, consult the magic database again.
The offset of the
.Dv indirect
magic is by default absolute in the file, but one can specify
.Dv /r
to indicate that the offset is relative from the beginning of the entry.
.It Dv name
Define a
.Dq named
magic instance that can be called from another
.Dv use
magic entry, like a subroutine call.
Named instance direct magic offsets are relative to the offset of the
previous matched entry, but indirect offsets are relative to the beginning
of the file as usual.
Named magic entries always match.
.It Dv use
Recursively call the named magic starting from the current offset.
If the name of the referenced magic begins with a
.Dv ^
then the endianness of the magic is switched; if the magic mentioned
.Dv leshort
for example,
it is treated as
.Dv beshort
and vice versa.
This is useful to avoid duplicating the rules for different endianness.
.It Dv regex
A regular expression match in extended POSIX regular expression syntax
(like egrep).
Regular expressions can take exponential time to process, and their
performance is hard to predict, so their use is discouraged.
When used in production environments, their performance
should be carefully checked.
The size of the string to search should also be limited by specifying
.Dv /<length> ,
to avoid performance issues scanning long files.
The type specification can also be optionally followed by
.Dv /[c][s][l] .
The
.Dq c
flag makes the match case insensitive, while the
.Dq s
flag updates the offset to the start offset of the match, rather than the end.
The
.Dq l
modifier changes the limit of length to mean number of lines instead of a
byte count.
Lines are delimited by the platform's native line delimiter.
When a line count is specified, an implicit byte count is also computed assuming
each line is 80 characters long.
If neither a byte nor a line count is specified, the search is limited automatically
to 8KiB.
.Dv ^
and
.Dv $
match the beginning and end of individual lines, respectively,
not beginning and end of file.
.It Dv search
A literal string search starting at the given offset.
The same modifier flags can be used as for string patterns.
The search expression must contain the range in the form
.Dv /number ,
that is the number of positions at which the match will be
attempted, starting from the start offset.
This is suitable for
searching larger binary expressions with variable offsets, using
.Dv \e
escapes for special characters.
The order of modifier and number is not relevant.
.It Dv default
This is intended to be used with the test
.Em x
(which is always true) and it has no type.
It matches when no other test at that continuation level has matched before.
The record of matched tests for a continuation level can be cleared using the
.Dv clear
test.
.It Dv clear
This test is always true and clears the match flag for that continuation level.
It is intended to be used with the
.Dv default
test.
.It Dv der
Parse the file as a DER Certificate file.
The test field is used as a DER type that needs to be matched.
The DER types are:
.Dv eoc ,
.Dv bool ,
.Dv int ,
.Dv bit_str ,
.Dv octet_str ,
.Dv null ,
.Dv obj_id ,
.Dv obj_desc ,
.Dv ext ,
.Dv real ,
.Dv enum ,
.Dv embed ,
.Dv utf8_str ,
.Dv rel_oid ,
.Dv time ,
.Dv res2 ,
.Dv seq ,
.Dv set ,
.Dv num_str ,
.Dv prt_str ,
.Dv t61_str ,
.Dv vid_str ,
.Dv ia5_str ,
.Dv utc_time ,
.Dv gen_time ,
.Dv gr_str ,
.Dv vis_str ,
.Dv gen_str ,
.Dv univ_str ,
.Dv char_str ,
.Dv bmp_str ,
.Dv date ,
.Dv tod ,
.Dv datetime ,
.Dv duration ,
.Dv oid-iri ,
.Dv rel-oid-iri .
These types can be followed by an optional numeric size, which indicates
the field width in bytes.
.It Dv guid
A Globally Unique Identifier, parsed and printed as
XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX.
Its format is a string.
.It Dv offset
This is a quad value indicating the current offset of the file.
It can be used to determine the size of the file or the magic buffer.
For example, consider the magic entries:
.Bd -literal -offset indent
-0	offset	x	this file is %lld bytes
-0	offset	<=100	must be more than 100 \e
    bytes and is only %lld
.Ed
.It Dv octal
A string representing an octal number.
.El
.Pp
For compatibility with the Single
.Ux
Standard, the type specifiers
.Dv dC
and
.Dv d1
are equivalent to
.Dv byte ,
the type specifiers
.Dv uC
and
.Dv u1
are equivalent to
.Dv ubyte ,
the type specifiers
.Dv dS
and
.Dv d2
are equivalent to
.Dv short ,
the type specifiers
.Dv uS
and
.Dv u2
are equivalent to
.Dv ushort ,
the type specifiers
.Dv dI ,
.Dv dL ,
and
.Dv d4
are equivalent to
.Dv long ,
the type specifiers
.Dv uI ,
.Dv uL ,
and
.Dv u4
are equivalent to
.Dv ulong ,
the type specifier
.Dv d8
is equivalent to
.Dv quad ,
the type specifier
.Dv u8
is equivalent to
.Dv uquad ,
and the type specifier
.Dv s
is equivalent to
.Dv string .
In addition, the type specifier
.Dv dQ
is equivalent to
.Dv quad
and the type specifier
.Dv uQ
is equivalent to
.Dv uquad .
.Pp
Each top-level magic pattern (see below for an explanation of levels)
is classified as text or binary according to the types used.
Types
.Dq regex
and
.Dq search
are classified as text tests, unless non-printable characters are used
in the pattern.
All other tests are classified as binary.
A top-level
pattern is considered to be a text pattern when all its patterns are text
patterns; otherwise, it is considered to be a binary pattern.
When
matching a file, binary patterns are tried first; if no match is
found, and the file looks like text, then its encoding is determined
and the text patterns are tried.
.Pp
The numeric types may optionally be followed by
.Dv \*[Am]
and a numeric value,
to specify that the value is to be AND'ed with the
numeric value before any comparisons are done.
Prepending a
.Dv u
to the type indicates that ordered comparisons should be unsigned.
.It Dv test
The value to be compared with the value from the file.
If the type is
numeric, this value
is specified in C form; if it is a string, it is specified as a C string
with the usual escapes permitted (e.g. \en for new-line).
.Pp
Numeric values
may be preceded by a character indicating the operation to be performed.
It may be
.Dv = ,
to specify that the value from the file must equal the specified value,
.Dv \*[Lt] ,
to specify that the value from the file must be less than the specified
value,
.Dv \*[Gt] ,
to specify that the value from the file must be greater than the specified
value,
.Dv \*[Am] ,
to specify that the value from the file must have set all of the bits
that are set in the specified value,
.Dv ^ ,
to specify that the value from the file must have clear at least one of
the bits that are set in the specified value,
.Dv ~ ,
to specify that the value specified after it is negated before the test
is performed, or
.Dv x ,
to specify that any value will match.
If the character is omitted, it is assumed to be
.Dv = .
Operators
.Dv \*[Am] ,
.Dv ^ ,
and
.Dv ~
don't work with floats and doubles.
The operator
.Dv !\&
specifies that the line matches if the test does
.Em not
succeed.
.Pp
Numeric values are specified in C form; e.g.
.Dv 13
is decimal,
.Dv 013
is octal, and
.Dv 0x13
is hexadecimal.
.Pp
Numeric operations are not performed on date types; instead, the numeric
value is interpreted as an offset.
.Pp
For string values, the string from the
file must match the specified string.
The operators
.Dv = ,
.Dv \*[Lt]
and
.Dv \*[Gt]
(but not
.Dv \*[Am] )
can be applied to strings.
The length used for matching is that of the string argument
in the magic file.
This means that a line can match any non-empty string (usually used to
then print the string), with
.Em \*[Gt]\e0
(because all non-empty strings are greater than the empty string).
.Pp
Dates are treated as numerical values in the respective internal
representation.
.Pp
The special test
.Em x
always evaluates to true.
.It Dv message
The message to be printed if the comparison succeeds.
If the string contains a
.Xr printf 3
format specification, the value from the file (with any specified masking
performed) is printed using the message as the format string.
If the string begins with
.Dq \eb ,
the message printed is the remainder of the string with no whitespace
added before it: multiple matches are normally separated by a single
space.
.El
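.Pp
As an illustrative sketch only, the four fields might be combined as follows
for a made-up format; the
.Dq HYPO
signature, the offsets and the header layout are all assumptions chosen for
this example and do not describe any real file type:
.Bd -literal -offset indent
# hypothetical format: 4-byte signature; the lower case letters
# in the magic match either case in the file
0	string/c	hypo	Hypothetical data
# low nibble of the byte at offset 4, printed via a printf(3) format
\*[Gt]4	byte\*[Am]0x0f	x	\eb, version %d
# unsigned comparison, so values 128..255 are greater than 127
\*[Gt]5	ubyte	\*[Gt]127	\eb, extended header
# little-endian four-byte UNIX date at offset 6
\*[Gt]6	ledate	x	\eb, created %s
# Pascal-style string with a 2 byte big endian length at offset 10
\*[Gt]10	pstring/H	x	\eb, title "%s"
.Ed
.Pp
The continuation messages in this sketch start with
.Dq \eb
so that they are appended without the separating space.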
.Pp
A 4+4 character Apple creator and type can be specified as:
.Bd -literal -offset indent
!:apple	CREATYPE
.Ed
.Pp
A slash-separated list of commonly found filename extensions can be specified
as:
.Bd -literal -offset indent
!:ext	ext[/ext...]
.Ed
.Pp
i.e. the literal string
.Dq !:ext
followed by a slash-separated list of commonly found extensions; for example
for JPEG images:
.Bd -literal -offset indent
!:ext jpeg/jpg/jpe/jfif
.Ed
.Pp
A MIME type is given on a separate line, which must be the next
line that is neither blank nor a comment after the magic line that
identifies the file type, and has the following format:
.Bd -literal -offset indent
!:mime	MIMETYPE
.Ed
.Pp
i.e. the literal string
.Dq !:mime
followed by the MIME type.
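.Pp
Put together, the annotations above might follow a magic line like this.
This is a sketch built from the well-known eight-byte PNG signature and is not
copied from the distributed magic database:
.Bd -literal -offset indent
0	string	\ex89PNG\ex0d\ex0a\ex1a\ex0a	PNG image data
!:mime	image/png
!:ext	png
.Ed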
.Pp
An optional strength can be supplied on a separate line which refers to
the current magic description using the following format:
.Bd -literal -offset indent
!:strength OP VALUE
.Ed
.Pp
The operator
.Dv OP
can be:
.Dv + ,
.Dv - ,
.Dv * ,
or
.Dv /
and
.Dv VALUE
is a constant between 0 and 255.
This constant is applied using the specified operator
to the currently computed default magic strength.
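.Pp
For instance, a hypothetical entry (the
.Dq HYPO
signature is invented for illustration) could add 10 to its computed default
strength like this:
.Bd -literal -offset indent
0	string	HYPO	Hypothetical container data
!:strength +10
.Ed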
.Pp
Some file formats contain additional information which is to be printed
along with the file type or need additional tests to determine the true
file type.
These additional tests are introduced by one or more
.Em \*[Gt]
characters preceding the offset.
The number of
.Em \*[Gt]
on the line indicates the level of the test; a line with no
.Em \*[Gt]
at the beginning is considered to be at level 0.
Tests are arranged in a tree-like hierarchy:
if the test on a line at level
.Em n
succeeds, all following tests at level
.Em n+1
are performed, and the messages printed if the tests succeed, until a line
with level
.Em n
(or less) appears.
For more complex files, one can use empty messages to get just the
"if/then" effect, in the following way:
.Bd -literal -offset indent
0      string    MZ
\*[Gt]0x18  uleshort  \*[Lt]0x40   MS-DOS executable
\*[Gt]0x18  uleshort  \*[Gt]0x3f   extended PC executable (e.g., MS Windows)
.Ed
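.Pp
The same hierarchy can go deeper than one level.
In the following sketch, which describes an invented format rather than a real
one, the level 2 tests are only reached when the level 1 test directly above
them succeeds:
.Bd -literal -offset indent
0	string	HYPO	Hypothetical data
\*[Gt]4	byte	1	\eb, version 1
\*[Gt]\*[Gt]5	byte	0	\eb, uncompressed
\*[Gt]\*[Gt]5	byte	1	\eb, compressed
\*[Gt]4	byte	2	\eb, version 2
.Ed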
.Pp
Offsets do not need to be constant, but can also be read from the file
being examined.
If the first character following the last
.Em \*[Gt]
is a
.Em \&(
then the string after the parenthesis is interpreted as an indirect offset.
That means that the number after the parenthesis is used as an offset in
the file.
The value at that offset is read, and is used again as an offset
in the file.
Indirect offsets are of the form:
.Em ( x [[.,][bBcCeEfFgGhHiIlLmosSqQ]][+\-][ y ]) .
The value of
.Em x
is used as an offset in the file.
A byte, id3 length, short or long is read at that offset depending on the
.Em [bBcCeEfFgGhHiIlLmosSqQ]
type specifier.
The value is treated as signed if
.Dq \&,
is specified or unsigned if
.Dq \&.
is specified.
The capitalized types interpret the number as a big endian
value, whereas the small letter versions interpret the number as a little
endian value;
the
.Em m
type interprets the number as a middle endian (PDP-11) value.
To that number the value of
.Em y
is added and the result is used as an offset in the file.
The default type if one is not specified is long.
The following types are recognized:
.Bl -column -offset indent "Type" "Half/Short" "Little" "Size"
.It Sy Type	Sy Mnemonic	Sy Endian	Sy Size
.It bcBC	Byte/Char	N/A	1
.It efg	Double	Little	8
.It EFG	Double	Big	8
.It hs	Half/Short	Little	2
.It HS	Half/Short	Big	2
.It i	ID3	Little	4
.It I	ID3	Big	4
.It l	Long	Little	4
.It L	Long	Big	4
.It m	Middle	Middle	4
.It o	Octal	Textual	Variable
.It q	Quad	Little	8
.It Q	Quad	Big	8
.El
.Pp
That way variable length structures can be examined:
.Bd -literal -offset indent
# MS Windows executables are also valid MS-DOS executables
0           string   MZ
\*[Gt]0x18       uleshort \*[Lt]0x40  MZ executable (MS-DOS)
# skip the whole block below if it is not an extended executable
\*[Gt]0x18       uleshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)  string   PE\e0\e0 PE executable (MS-Windows)
\*[Gt]\*[Gt](0x3c.l)  string   LX\e0\e0 LX executable (OS/2)
.Ed
.Pp
This strategy of examining has a drawback: you must make sure that you
eventually print something, or users may get empty output (such as when
there is neither PE\e0\e0 nor LX\e0\e0 in the above example).
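.Pp
A further hedged sketch shows how the type specifier and the added value
change what is read; the format, its
.Dq HYPO
and
.Dq DATA
markers and all offsets are hypothetical:
.Bd -literal -offset indent
0	string	HYPO	Hypothetical data
# read an unsigned big endian short at offset 4 and test the
# string found at the offset it contains
\*[Gt](4.S)	string	DATA	\eb, with DATA chunk
# read an unsigned little endian long at offset 8, add 4, and
# print the byte found at the resulting offset
\*[Gt](8.l+4)	byte	x	\eb, flags 0x%x
.Ed
.Pp
Because the level 0 line carries a message of its own, this sketch still
prints something when neither continuation matches.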
.Pp
If this indirect offset cannot be used directly, simple calculations are
possible: appending
.Em [+-*/%\*[Am]|^]number
inside parentheses allows one to modify
the value read from the file before it is used as an offset:
.Bd -literal -offset indent
# MS Windows executables are also valid MS-DOS executables
0           string   MZ
# sometimes, the value at 0x18 is less than 0x40 but there's still an
# extended executable, simply appended to the file
\*[Gt]0x18       uleshort \*[Lt]0x40
\*[Gt]\*[Gt](4.s*512) leshort  0x014c  COFF executable (MS-DOS, DJGPP)
\*[Gt]\*[Gt](4.s*512) leshort  !0x014c MZ executable (MS-DOS)
.Ed
.Pp
Sometimes you do not know the exact offset as this depends on the length or
position (when indirection was used before) of preceding fields.
You can specify an offset relative to the end of the last up-level
field using
.Sq \*[Am]
as a prefix to the offset:
.Bd -literal -offset indent
0           string   MZ
\*[Gt]0x18       uleshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)  string   PE\e0\e0    PE executable (MS-Windows)
# immediately following the PE signature is the CPU type
\*[Gt]\*[Gt]\*[Gt]\*[Am]0       leshort  0x14c     for Intel 80386
\*[Gt]\*[Gt]\*[Gt]\*[Am]0       leshort  0x8664    for x86-64
\*[Gt]\*[Gt]\*[Gt]\*[Am]0       leshort  0x184     for DEC Alpha
.Ed
.Pp
Indirect and relative offsets can be combined:
.Bd -literal -offset indent
0             string   MZ
\*[Gt]0x18         uleshort \*[Lt]0x40
\*[Gt]\*[Gt](4.s*512)   leshort  !0x014c MZ executable (MS-DOS)
# if it's not COFF, go back 512 bytes and add the offset taken
# from byte 2/3, which is yet another way of finding the start
# of the extended executable
\*[Gt]\*[Gt]\*[Gt]\*[Am](2.s-514) string   LE      LE executable (MS Windows VxD driver)
.Ed
.Pp
Or the other way around:
.Bd -literal -offset indent
0                 string   MZ
\*[Gt]0x18             uleshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)        string   LE\e0\e0  LE executable (MS-Windows)
# at offset 0x80 (-4, since relative offsets start at the end
# of the up-level match) inside the LE header, we find the absolute
# offset to the code area, where we look for a specific signature
\*[Gt]\*[Gt]\*[Gt](\*[Am]0x7c.l+0x26) string   UPX     \eb, UPX compressed
.Ed
.Pp
Or even both!
.Bd -literal -offset indent
0                string   MZ
\*[Gt]0x18            uleshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)       string   LE\e0\e0 LE executable (MS-Windows)
# at offset 0x58 inside the LE header, we find the relative offset
# to a data area where we look for a specific signature
\*[Gt]\*[Gt]\*[Gt]\*[Am](\*[Am]0x54.l-3)  string   UNACE  \eb, ACE self-extracting archive
.Ed
.Pp
If you have to deal with offset/length pairs in your file, even the
second value in a parenthesized expression can be taken from the file itself,
using another set of parentheses.
Note that this additional indirect offset is always relative to the
start of the main indirect offset.
.Bd -literal -offset indent
0                 string       MZ
\*[Gt]0x18             uleshort     \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)        string       PE\e0\e0 PE executable (MS-Windows)
# search for the PE section called ".idata"...
\*[Gt]\*[Gt]\*[Gt]\*[Am]0xf4          search/0x140 .idata
# ...and go to the end of it, calculated from start+length;
# these are located 14 and 10 bytes after the section name
\*[Gt]\*[Gt]\*[Gt]\*[Gt](\*[Am]0xe.l+(-4)) string       PK\e3\e4 \eb, ZIP self-extracting archive
.Ed
.Pp
If you have a list of known values at a particular continuation level,
and you want to provide a switch-like default case, use the
.Dv clear
and
.Dv default
tests:
.Bd -literal -offset indent
# clear that continuation level match
\*[Gt]18	clear	x
\*[Gt]18	lelong	1	one
\*[Gt]18	lelong	2	two
\*[Gt]18	default	x
# print default match
\*[Gt]\*[Gt]18	lelong	x	unmatched 0x%x
.Ed
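.Pp
The
.Dv name
and
.Dv use
types described earlier can be sketched in the same way.
The format, signatures and instance name below are invented; the backslash
before
.Dq ^
follows the convention seen in distributed magic files and is assumed to be
needed here:
.Bd -literal -offset indent
# shared body, written once for the little-endian layout
0	name	hypo-body
\*[Gt]0	leshort	x	\eb, width %d
\*[Gt]2	leshort	x	\eb, height %d
#
0	string	HYPL	Hypothetical image (little-endian)
\*[Gt]4	use	hypo-body
#
0	string	HYPB	Hypothetical image (big-endian)
# ^ switches endianness, so leshort is treated as beshort
\*[Gt]4	use	\e^hypo-body
.Ed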
.Sh SEE ALSO
.Xr file __CSECTION__
\- the command that reads this file.
.Sh BUGS
The formats
.Dv long ,
.Dv belong ,
.Dv lelong ,
.Dv melong ,
.Dv short ,
.Dv beshort ,
and
.Dv leshort
do not depend on the length of the C data types
.Dv short
and
.Dv long
on the platform, even though the Single
.Ux
Specification implies that they do.
However, as OS X Mountain Lion has passed the Single
.Ux
Specification validation suite, and supplies a version of
.Xr file __CSECTION__
in which they do not depend on the sizes of the C data types and that is
built for a 64-bit environment in which
.Dv long
is 8 bytes rather than 4 bytes, presumably the validation suite does not
test whether, for example,
.Dv long
refers to an item with the same size as the C data type
.Dv long .
There should probably be
.Dv type
names
.Dv int8 ,
.Dv uint8 ,
.Dv int16 ,
.Dv uint16 ,
.Dv int32 ,
.Dv uint32 ,
.Dv int64 ,
and
.Dv uint64 ,
and specified-byte-order variants of them,
to make it clearer that those types have specified widths.
.\"
.\" From: guy@sun.uucp (Guy Harris)
.\" Newsgroups: net.bugs.usg
.\" Subject: /etc/magic's format isn't well documented
.\" Message-ID: <2752@sun.uucp>
.\" Date: 3 Sep 85 08:19:07 GMT
.\" Organization: Sun Microsystems, Inc.
.\" Lines: 136
.\"
.\" Here's a manual page for the format accepted by the "file" made by adding
.\" the changes I posted to the S5R2 version.
.\"
.\" Modified for Ian Darwin's version of the file command.
.\"
.\" For emacs editor
.\" Local Variables:
.\" eval: (add-hook 'before-save-hook 'time-stamp)
.\" time-stamp-start: ".Dd "
.\" time-stamp-end: "$"
.\" time-stamp-format: "%:B %02d, %:Y"
.\" time-stamp-time-zone: "UTC0"
.\" system-time-locale: "C"
.\" eval:(setq compile-command (concat "groff -Tlatin1 -m man " (buffer-file-name)) )
.\" End:
.\"
864.\"
865