MAGIC 3
NAME
magic - magic file interface
SYNOPSIS
.EX
#include <magic.h>

Magic_t
{
    unsigned long flags;
};

Magic_t* magicopen(unsigned long flags);
void     magicclose(Magic_t* magic);
int      magicload(Magic_t* magic, const char* path, unsigned long flags);
int      magiclist(Magic_t* magic, Sfio_t* sp);
char*    magictype(Magic_t* magic, const char* path, struct stat* st);
.EE
DESCRIPTION
These routines provide an interface to the file (1) command magic file. .L magicopen returns a magic session handle that is passed to all of the other routines. flags may be

.L MAGIC_MIME Return the MIME type string rather than the magic file description.

.L MAGIC_PHYSICAL Don't follow symbolic links.

.L MAGIC_STAT The stat structure st passed to magictype will contain valid stat (2) information. See .L magictype below.

.L MAGIC_VERBOSE Enable verbose error messages.

.L magicclose closes the magic session.

.L magicload loads the magic file named by path into the magic session. flags are the same as with .LR magicopen . More than one magic file can be loaded into a session; the files are searched in load order. If path is .L 0 then the default magic file is loaded.

.L magiclist lists the magic file contents on the sfio (3) stream sp . This is used for debugging magic entries.

.L magictype returns the type string for path with optional stat (2) information st . If "st == 0" then .L magictype calls .L stat on a private stat buffer, else if .L magicopen was called with the .L MAGIC_STAT flag then st is assumed to contain valid stat information, otherwise .L magictype calls .L stat on st . .L magictype always returns a non-null string. If errors are encountered on path then the return value will contain information on those errors, e.g., .LR "cannot stat" .
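The three-way choice that magictype makes about its st argument can be sketched in isolation. This is only an illustration of the rule stated above, not the library's code: select_stat and SKETCH_MAGIC_STAT are hypothetical names, and the flag value is a placeholder because this page does not give the real value of MAGIC_STAT.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Placeholder for MAGIC_STAT; the real value is not given in this page. */
#define SKETCH_MAGIC_STAT 0x1UL

/*
 * Hypothetical helper mirroring the documented rule:
 *   st == 0            -> stat(2) into a private buffer
 *   MAGIC_STAT is set  -> trust the caller's st as-is
 *   otherwise          -> stat(2) into the caller's st
 * Returns the stat buffer to use, or NULL if stat(2) failed.
 */
static const struct stat *
select_stat(const char *path, struct stat *st, unsigned long flags)
{
    static struct stat private_st;

    assert(path != NULL);
    if (st == NULL)
        st = &private_st;
    else if (flags & SKETCH_MAGIC_STAT)
        return st;              /* caller promises st is already valid */
    if (stat(path, st) != 0)
        return NULL;            /* the real routine would report e.g. "cannot stat" */
    return st;
}
```

A caller that has already run stat(2) on the file, for example during a directory walk, would pass its buffer together with the flag so that no second stat call is made.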

FORMAT
The magic file format is a backward-compatible extension of an ancient System V file implementation. However, with the extended format it is possible to write a single magic file that works on all platforms. Most magic files circulating on the net work with .LR magic , but they usually duplicate entries for the le and be representations, which .L magic handles automatically.

A magic file entry describes a procedure for determining a single file type based on the file pathname, stat (2) information, and the file data. An entry is a sequence of lines, each line being a record of space separated fields. The general record format is:
.EX
[op]offset type [mask]expression description [mimetype]
.EE
.L # in the first column introduces a comment. The first record in an entry contains no .LR op ; the remaining records for an entry contain an .LR op . Integer constants are as in C: .L 0x* or .L 0X* for hexadecimal, .L 0* for octal and decimal otherwise.
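The C-style rule for integer constants matches what strtoul does when given base 0. A minimal sketch of that rule follows; parse_constant is a hypothetical helper and not part of the magic interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse an integer constant using the C prefix rules:
 * 0x or 0X selects hexadecimal, a leading 0 selects octal,
 * anything else is decimal.
 * Returns 0 and stores the value in *out, or -1 on malformed input.
 */
static int
parse_constant(const char *text, unsigned long *out)
{
    char *end;

    assert(text != NULL && out != NULL);
    errno = 0;
    *out = strtoul(text, &end, 0);  /* base 0 applies the C prefix rules */
    if (end == text || *end != '\0' || errno == ERANGE)
        return -1;
    return 0;
}
```

With this helper the constant 0407 used in the EXAMPLES section parses as octal, giving decimal 263.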

The .L op field may be one of:

.L + The previous records must match but the current record is optional. .L > is an old-style synonym for .LR + .

.L & The previous and current records must match.

.L { Starts a nesting block that is terminated by .LR } . A nesting block pushes a new context for the .L + and .L & ops. The .L { and .L } records have no other fields.

id{ A function declaration and call for the single character identifier id . The function return is a nesting block end record .LR } . Functions may be redefined. Functions have no arguments or return value.

id() A call to the function id .

The .L offset field is either the offset into the data upon which the current entry operates or a file metadata identifier. Offsets are either integer constants or offset expressions. An offset expression is contained in (...) and is a combination of integral arithmetic operators and the .L @ indirection operator. Indirections take the form .LI @ integer where integer is the data offset for the indirection value. The size of the indirection value is taken either from one of the suffixes .LR B (byte, 1 char), .LR H (short, 2 chars), .LR L (long, 4 chars), or .LR Q (quad, 8 chars), or from the .L type field. Valid file metadata identifiers are:

.L atime The string representation of .LR stat.st_atime .

.L blocks .LR stat.st_blocks .

.L ctime The string representation of .LR stat.st_ctime .

.L fstype The string representation of .LR stat.st_fstype .

.L gid The string representation of .LR stat.st_gid .

.L mode The .L stat.st_mode file mode bits in modecanon (3) canonical representation (i.e., the good old octal values).

.L mtime The string representation of .LR stat.st_mtime .

.L nlink .LR stat.st_nlink .

.L size .LR stat.st_size .

.L name The file path name sans directory.

.L uid The string representation of .LR stat.st_uid .

The .L type field specifies the type of the data at .LR offset . Integral types may be prefixed by .L le or .L be for specifying exact little-endian or big-endian representation, but the internal algorithm automatically loops through the standard representations to find integral matches, so representation prefixes are rarely used. However, this looping may cause some magic entry conflicts; use the .L le or .L be prefix in these cases. Only one representation is used for all the records in an entry. Valid types are:

.L byte A 1 byte integer.

.L short A 2 byte integer.

.L long A 4 byte integer.

.L quad An 8 byte integer. Tests on this type may fail if the local compiler does not support an 8 byte integral type and the corresponding value overflows 4 bytes.

.L date The data at .L offset is interpreted as a 4 byte seconds-since-the-epoch date and converted to a string.

.L edit The .L expression field is an ed (1) style substitution expression del old del new del [ flags ] , where del is the delimiter character and the substituted value is made available to the .L description field .L %s format. In addition to the flags supported by ed (1), .L l converts the substituted value to lower case and .L u converts it to upper case. If old does not match the string data at .L offset then the entry record fails.

.L match The .L expression field is a strmatch (3) pattern that is matched against the string data at .LR offset .

.L string The .L expression field is a string that is compared with the string data at .LR offset .

The optional .L mask field takes the form .LI & number where number is anded with the integral value at .L offset before the .L expression is applied.
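The interaction between the representation loop described under the .L type field and the .L mask field can be sketched as follows. The names byte_order, read_integer and match_long are hypothetical, and the order in which the representations are tried is an assumption; the page only says that the standard representations are looped through. The caller is assumed to have checked that the buffer holds enough bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum byte_order { ORDER_LE, ORDER_BE };

/* Assemble a size-byte (1..8) unsigned integer from data[offset...]. */
static uint64_t
read_integer(const unsigned char *data, size_t offset, size_t size,
             enum byte_order order)
{
    uint64_t value = 0;
    size_t i;

    assert(data != NULL && size >= 1 && size <= 8);
    for (i = 0; i < size; i++) {
        /* visit bytes from most significant to least significant */
        size_t k = (order == ORDER_BE) ? i : size - 1 - i;
        value = (value << 8) | data[offset + k];
    }
    return value;
}

/*
 * Test a 4 byte "long" at offset against expected, and-ing the data
 * value with mask first. Both representations are tried (big-endian
 * first, an assumption); on a match the representation is stored in
 * *order and 1 is returned, otherwise 0.
 */
static int
match_long(const unsigned char *data, size_t offset, uint32_t mask,
           uint32_t expected, enum byte_order *order)
{
    static const enum byte_order tries[] = { ORDER_BE, ORDER_LE };
    size_t i;

    assert(order != NULL);
    for (i = 0; i < sizeof tries / sizeof tries[0]; i++) {
        uint32_t v = (uint32_t)read_integer(data, offset, 4, tries[i]);
        if ((v & mask) == expected) {
            *order = tries[i];
            return 1;
        }
    }
    return 0;
}
```

Returning the matched representation lets a caller reuse it for the remaining records, in line with the statement that only one representation is used for all the records in an entry.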

The contents of the expression field depend on the .LR type . String type expressions are described in the .L type field entries above. .L * means any value and applies to all types. Integral .L type expressions take the form [operator] operand where operand is compared with the data value at .L offset using operator . operator may be one of .LR < , .LR <= , .LR == , .LR >= or .LR > . operator defaults to .L == if omitted. operand may be an integral constant or one of the following builtin function calls:

.L magic() A recursive call to the magic algorithm starting with the data at .LR offset .

.L loop(function,offset,increment) Call function starting at offset and increment offset by increment after each iteration. Iteration continues until the description text does not change.
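For an integral constant operand, the comparison step of an expression can be sketched as below. compare_integral is a hypothetical helper; it handles only the operators listed above, and the use of unsigned comparison is an assumption, since this page does not say whether data values are treated as signed.

```c
#include <assert.h>
#include <string.h>

/*
 * Compare the data value with operand using operator op.
 * An empty op selects the default "==".
 * Returns 1 on match, 0 on mismatch, -1 for an operator
 * this sketch does not recognize.
 */
static int
compare_integral(unsigned long value, const char *op, unsigned long operand)
{
    assert(op != NULL);
    if (*op == '\0' || strcmp(op, "==") == 0)
        return value == operand;
    if (strcmp(op, "<") == 0)
        return value < operand;
    if (strcmp(op, "<=") == 0)
        return value <= operand;
    if (strcmp(op, ">=") == 0)
        return value >= operand;
    if (strcmp(op, ">") == 0)
        return value > operand;
    return -1;
}
```

A record such as "+36 long >0" then corresponds to compare_integral(value, ">", 0), where value is the 4 byte integer read at offset 36.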

The .L description field is the most important because it is this field that is presented to the outside world. When constructing description fields one must be very careful to follow the style laid out in the magic file, lest yet another layer of inconsistency creep into the system. The descriptions for each matching record in an entry are concatenated to form the complete magic type. If the previous matching description in the current entry does not end with space and the current description is not empty and does not start with comma , dot or backspace then a space is placed between the descriptions (most optional descriptions start with comma ). The data value at .L offset can be referenced in the description using .L %s for the string types and .L %ld or .L %lu for the integral types.
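The spacing rule for concatenated descriptions can be sketched as a small append routine. append_description is a hypothetical helper. Two points are assumptions: no separator is added when nothing has been accumulated yet, and a leading backspace is copied unchanged, since this page does not describe how it is finally rendered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Append desc to the NUL-terminated text in buf (capacity size).
 * A single space is inserted first unless the accumulated text is
 * empty or ends with a space, or desc is empty or starts with
 * comma, dot or backspace.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int
append_description(char *buf, size_t size, const char *desc)
{
    size_t len;
    size_t need_space;

    assert(buf != NULL && desc != NULL && size > 0);
    len = strlen(buf);
    need_space = len > 0 && buf[len - 1] != ' ' && desc[0] != '\0' &&
                 desc[0] != ',' && desc[0] != '.' && desc[0] != '\b';
    if (len + need_space + strlen(desc) + 1 > size)
        return -1;
    if (need_space)
        buf[len++] = ' ';
    strcpy(buf + len, desc);
    return 0;
}
```

Appending "hp s200 executable" and then ", not stripped" yields "hp s200 executable, not stripped" with no space before the comma, which is why optional descriptions conventionally begin with a comma.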

The .L mimetype field specifies the MIME type, usually in the form a / b .

FILES
.L ../lib/file/magic located on .L $PATH
EXAMPLES
.EX
0       long    0x020c0108      hp s200 executable, pure
o{
+36     long    >0              , not stripped
+4      short   >0              , version %ld
}

0       long    0x020c0107      hp s200 executable
o()

0       long    0x020c010b      hp s200 executable, demand-load
o()
.EE
The function .LR o() , shared by 3 entries, determines if the executable is stripped and also extracts the version number.
.EX
0       long    0407            bsd 386 executable
&mode   long    &0111!=0
+16     long    >0              , not stripped
.EE
This entry requires that the file also has execute permission.
SEE ALSO
file(1), mime(4), tw(1), modecanon(3)