#------------------------------------------------------------------------------
# $File: archive,v 1.108 2017/08/30 13:45:10 christos Exp $
# archive:  file(1) magic for archive formats (see also "msdos" for self-
#           extracting compressed archives)
#
# cpio, ar, arc, arj, hpack, lha/lharc, rar, squish, uc2, zip, zoo, etc.
# pre-POSIX "tar" archives are handled in the C code.

# POSIX tar archives
257	string	ustar\0	POSIX tar archive
!:mime	application/x-tar # encoding: posix
257	string	ustar\040\040\0	GNU tar archive
!:mime	application/x-tar # encoding: gnu

# Incremental snapshot gnu-tar format from:
# http://www.gnu.org/software/tar/manual/html_node/Snapshot-Files.html
0	string	GNU\ tar-	GNU tar incremental snapshot data
>&0	regex	[0-9]\.[0-9]+-[0-9]+	version %s

# cpio archives
#
# Yes, the top two "cpio archive" formats *are* supposed to just be "short".
# The idea is to indicate archives produced on machines with the same
# byte order as the machine running "file" with "cpio archive", and
# to indicate archives produced on machines with the opposite byte order
# from the machine running "file" with "byte-swapped cpio archive".
#
# The SVR4 "cpio(4)" hints that there are additional formats, but they
# are defined as "short"s; I think all the new formats are
# character-header formats and thus are strings, not numbers.
0	short	070707	cpio archive
!:mime	application/x-cpio
0	short	0143561	byte-swapped cpio archive
!:mime	application/x-cpio # encoding: swapped
0	string	070707	ASCII cpio archive (pre-SVR4 or odc)
0	string	070701	ASCII cpio archive (SVR4 with no CRC)
0	string	070702	ASCII cpio archive (SVR4 with CRC)

#
# Various archive formats used by various versions of the "ar"
# command.
#

#
# Original UNIX archive formats.
# They were written with binary values in host byte order, and
# the magic number was a host "int", which might have been 16 bits
# or 32 bits.  We don't say "PDP-11" or "VAX", as there might have
# been ports to little-endian 16-bit-int or 32-bit-int platforms
# (x86?) using some of those formats; if none existed, feel free
# to use "PDP-11" for little-endian 16-bit and "VAX" for little-endian
# 32-bit.  There might have been big-endian ports of that sort as
# well.
#
0	leshort	0177555	very old 16-bit-int little-endian archive
0	beshort	0177555	very old 16-bit-int big-endian archive
0	lelong	0177555	very old 32-bit-int little-endian archive
0	belong	0177555	very old 32-bit-int big-endian archive

0	leshort	0177545	old 16-bit-int little-endian archive
>2	string	__.SYMDEF	random library
0	beshort	0177545	old 16-bit-int big-endian archive
>2	string	__.SYMDEF	random library
0	lelong	0177545	old 32-bit-int little-endian archive
>4	string	__.SYMDEF	random library
0	belong	0177545	old 32-bit-int big-endian archive
>4	string	__.SYMDEF	random library

#
# From "pdp" (but why a 4-byte quantity?)
#
0	lelong	0x39bed	PDP-11 old archive
0	lelong	0x39bee	PDP-11 4.0 archive

#
# XXX - what flavor of APL used this, and was it a variant of
# some ar archive format?  It's similar to, but not the same
# as, the APL workspace magic numbers in pdp.
#
0	long	0100554	apl workspace

#
# System V Release 1 portable(?) archive format.
#
0	string	=<ar>	System V Release 1 ar archive
!:mime	application/x-archive

#
# Debian package; it's in the portable archive format, and needs to go
# before the entry for regular portable archives, as it's recognized as
# a portable archive whose first member has a name beginning with
# "debian".
#
0	string	=!<arch>\ndebian
>8	string	debian-split	part of multipart Debian package
!:mime	application/vnd.debian.binary-package
>8	string	debian-binary	Debian binary package
!:mime	application/vnd.debian.binary-package
>8	string	!debian
>68	string	>\0	(format %s)
# These next two lines do not work, because a bzip2 Debian archive
# still uses gzip for the control.tar (first in the archive).  Only
# data.tar varies, and the location of its filename varies too.
# file/libmagic does not currently have support for ascii-string based
# (offsets) as of 2005-09-15.
#>81	string	bz2	\b, uses bzip2 compression
#>84	string	gz	\b, uses gzip compression
#>136	ledate	x	created: %s

#
# MIPS archive; they're in the portable archive format, and need to go
# before the entry for regular portable archives, as it's recognized as
# a portable archive whose first member has a name beginning with
# "__________E".
#
0	string	=!<arch>\n__________E	MIPS archive
!:mime	application/x-archive
>20	string	U	with MIPS Ucode members
>21	string	L	with MIPSEL members
>21	string	B	with MIPSEB members
>19	string	L	and an EL hash table
>19	string	B	and an EB hash table
>22	string	X	-- out of date

0	search/1	-h-	Software Tools format archive text

#
# BSD/SVR2-and-later portable archive formats.
#
0	string	=!<arch>	current ar archive
!:mime	application/x-archive
>8	string	__.SYMDEF	random library
>68	string	__.SYMDEF\ SORTED	random library

#
# "Thin" archive, as can be produced by GNU ar.
#
0	string	=!<thin>\n	thin archive with
>68	belong	0	no symbol entries
>68	belong	1	%d symbol entry
>68	belong	>1	%d symbol entries

# ARC archiver, from Daniel Quinlan (quinlan@yggdrasil.com)
#
# The first byte is the magic (0x1a), byte 2 is the compression type for
# the first file (0x01 through 0x09), and bytes 3 to 15 are the MS-DOS
# filename of the first file (null terminated).  Since some types collide
# we only test some types on basis of frequency: 0x08 (83%), 0x09 (5%),
# 0x02 (5%), 0x03 (3%), 0x04 (2%), 0x06 (2%).  0x01 collides with terminfo.
0	lelong&0x8080ffff	0x0000081a	ARC archive data, dynamic LZW
!:mime	application/x-arc
0	lelong&0x8080ffff	0x0000091a	ARC archive data, squashed
!:mime	application/x-arc
0	lelong&0x8080ffff	0x0000021a	ARC archive data, uncompressed
!:mime	application/x-arc
0	lelong&0x8080ffff	0x0000031a	ARC archive data, packed
!:mime	application/x-arc
0	lelong&0x8080ffff	0x0000041a	ARC archive data, squeezed
!:mime	application/x-arc
0	lelong&0x8080ffff	0x0000061a	ARC archive data, crunched
!:mime	application/x-arc
# [JW] stuff taken from idarc, obviously ARC successors:
0	lelong&0x8080ffff	0x00000a1a	PAK archive data
!:mime	application/x-arc
0	lelong&0x8080ffff	0x0000141a	ARC+ archive data
!:mime	application/x-arc
0	lelong&0x8080ffff	0x0000481a	HYP archive data
!:mime	application/x-arc

# Acorn archive formats (Disaster prone simpleton, m91dps@ecs.ox.ac.uk)
# I can't create either SPARK or ArcFS archives so I have not tested this stuff
# [GRR:  the original entries collide with ARC, above; replaced with combined
#  version (not tested)]
#0	byte	0x1a	RISC OS archive (spark format)
0	string	\032archive	RISC OS archive (ArcFS format)
0	string	Archive\000	RISC OS archive (ArcFS format)

# All these were taken from idarc, many could not be verified. Unfortunately,
# there were many low-quality sigs, i.e. easy to trigger false positives.
# Please notify me of any real-world fishy/ambiguous signatures and I'll try
# to get my hands on the actual archiver and see if I find something better. [JW]
# probably many can be enhanced by finding some 0-byte or control char near the start

# idarc calls this Crush/Uncompressed... *shrug*
0	string	CRUSH	Crush archive data
# Squeeze It (.sqz)
0	string	HLSQZ	Squeeze It archive data
# SQWEZ
0	string	SQWEZ	SQWEZ archive data
# HPack (.hpk)
0	string	HPAK	HPack archive data
# HAP
0	string	\x91\x33HF	HAP archive data
# MD/MDCD
0	string	MDmd	MDCD archive data
# LIM
0	string	LIM\x1a	LIM archive data
# SAR
3	string	LH5	SAR archive data
# BSArc/BS2
0	string	\212\3SB\020\0	BSArc/BS2 archive data
# Bethesda Softworks Archive (Oblivion)
0	string	BSA\0	BSArc archive data
>4	lelong	x	version %d
# MAR
2	string	=-ah	MAR archive data
# ACB
#0	belong&0x00f800ff	0x00800000	ACB archive data
# CPZ
# TODO, this is what idarc says: 0 string \0\0\0 CPZ archive data
# JRC
0	string	JRchive	JRC archive data
# Quantum
0	string	DS\0	Quantum archive data
# ReSOF
0	string	PK\3\6	ReSOF archive data
# QuArk
0	string	7\4	QuArk archive data
# YAC
14	string	YC	YAC archive data
# X1
0	string	X1	X1 archive data
0	string	XhDr	X1 archive data
# CDC Codec (.dqt)
0	belong&0xffffe000	0x76ff2000	CDC Codec archive data
# AMGC
0	string	\xad6"	AMGC archive data
# NuLIB
0	string	N\xc3\xb5F\xc3\xa9lx\xc3\xa5	NuLIB archive data
# PakLeo
0	string	LEOLZW	PAKLeo archive data
# ChArc
0	string	SChF	ChArc archive data
# PSA
0	string	PSA	PSA archive data
# CrossePAC
0	string	DSIGDCC	CrossePAC archive data
# Freeze
0	string	\x1f\x9f\x4a\x10\x0a	Freeze archive data
# KBoom
0	string	\xc2\xa8MP\xc2\xa8	KBoom archive data
# NSQ, must go after CDC Codec
0	string	\x76\xff	NSQ archive data
# DPA
0	string	Dirk\ Paehl	DPA archive data
# BA
# TODO: idarc says "bytes 0-2 == bytes 3-5"
# TTComp
# URL: http://fileformats.archiveteam.org/wiki/TTComp_archive
# Update: Joerg Jenderek
# GRR: line below is too general as it matches also Panorama database "TCDB 2003-10 demo.pan", others
0	string	\0\6
# look for first keyword of Panorama database *.pan
>12	search/261	DESIGN
# skip keyword with low entropy
>12	default	x	TTComp archive, binary, 4K dictionary
# (version 5.25) labeled the above entry as "TTComp archive data"
# ESP, could this conflict with Easy Software Products' (e.g.ESP ghostscript) documentation?
0	string	ESP	ESP archive data
# ZPack
0	string	\1ZPK\1	ZPack archive data
# Sky
0	string	\xbc\x40	Sky archive data
# UFA
0	string	UFA	UFA archive data
# Dry
0	string	=-H2O	DRY archive data
# FoxSQZ
0	string	FOXSQZ	FoxSQZ archive data
# AR7
0	string	,AR7	AR7 archive data
# PPMZ
0	string	PPMZ	PPMZ archive data
# MS Compress
4	string	\x88\xf0\x27	MS Compress archive data
# updated by Joerg Jenderek
>9	string	\0
>>0	string	KWAJ
>>>7	string	\321\003	MS Compress archive data
>>>>14	ulong	>0	\b, original size: %d bytes
>>>>18	ubyte	>0x65
>>>>>18	string	x	\b, was %.8s
>>>>>(10.b-4)	string	x	\b.%.3s
# MP3 (archiver, not lossy audio compression)
0	string	MP3\x1a	MP3-Archiver archive data
# ZET
0	string	OZ\xc3\x9d	ZET archive data
# TSComp
0	string	\x65\x5d\x13\x8c\x08\x01\x03\x00	TSComp archive data
# ARQ
0	string	gW\4\1	ARQ archive data
# Squash
3	string	OctSqu	Squash archive data
# Terse
0	string	\5\1\1\0	Terse archive data
# PUCrunch
0	string	\x01\x08\x0b\x08\xef\x00\x9e\x32\x30\x36\x31	PUCrunch archive data
# UHarc
0	string	UHA	UHarc archive data
# ABComp
0	string	\2AB	ABComp archive data
0	string	\3AB2	ABComp archive data
# CMP
0	string	CO\0	CMP archive data
# Splint
0	string	\x93\xb9\x06	Splint archive data
# InstallShield
0	string	\x13\x5d\x65\x8c	InstallShield Z archive Data
# Gather
1	string	GTH	Gather archive data
# BOA
0	string	BOA	BOA archive data
# RAX
0	string	ULEB\xa	RAX archive data
# Xtreme
0	string	ULEB\0	Xtreme archive data
# Pack Magic
0	string	@\xc3\xa2\1\0	Pack Magic archive data
# BTS
0	belong&0xfeffffff	0x1a034465	BTS archive data
# ELI 5750
0	string	Ora\ 	ELI 5750 archive data
# QFC
0	string	\x1aFC\x1a	QFC archive data
0	string	\x1aQF\x1a	QFC archive data
# PRO-PACK
0	string	RNC	PRO-PACK archive data
# 777
0	string	777	777 archive data
# LZS221
0	string	sTaC	LZS221 archive data
# HPA
0	string	HPA	HPA archive data
# Arhangel
0	string	LG	Arhangel archive data
# EXP1, uses bzip2
0	string	0123456789012345BZh	EXP1 archive data
# IMP
0	string	IMP\xa	IMP archive data
# NRV
0	string	\x00\x9E\x6E\x72\x76\xFF	NRV archive data
# Squish
0	string	\x73\xb2\x90\xf4	Squish archive data
# Par
0	string	PHILIPP	Par archive data
0	string	PAR	Par archive data
# HIT
0	string	UB	HIT archive data
# SBX
0	belong&0xfffff000	0x53423000	SBX archive data
# NaShrink
0	string	NSK	NaShrink archive data
# SAPCAR
0	string	#\ CAR\ archive\ header	SAPCAR archive data
0	string	CAR\ 2.00RG	SAPCAR archive data
# Disintegrator
0	string	DST	Disintegrator archive data
# ASD
0	string	ASD	ASD archive data
# InstallShield CAB
0	string	ISc(	InstallShield CAB
# TOP4
0	string	T4\x1a	TOP4 archive data
# BatComp left out: sig looks like COM executable
# so TODO: get real 4dos batcomp file and find sig
# BlakHole
0	string	BH\5\7	BlakHole archive data
# BIX
0	string	BIX0	BIX archive data
# ChiefLZA
0	string	ChfLZ	ChiefLZA archive data
# Blink
0	string	Blink	Blink archive data
# Logitech Compress
0	string	\xda\xfa	Logitech Compress archive data
# ARS-Sfx (FIXME: really a SFX? then goto COM/EXE)
1	string	(C)\ STEPANYUK	ARS-Sfx archive data
# AKT/AKT32
0	string	AKT32	AKT32 archive data
0	string	AKT	AKT archive data
# NPack
0	string	MSTSM	NPack archive data
# PFT
0	string	\0\x50\0\x14	PFT archive data
# SemOne
0	string	SEM	SemOne archive data
# PPMD
0	string	\x8f\xaf\xac\x84	PPMD archive data
# FIZ
0	string	FIZ	FIZ archive data
# MSXiE
0	belong&0xfffff0f0	0x4d530000	MSXiE archive data
# DeepFreezer
0	belong&0xfffffff0	0x797a3030	DeepFreezer archive data
# DC
0	string	=<DC-	DC archive data
# TPac
0	string	\4TPAC\3	TPac archive data
# Ai
0	string	Ai\1\1\0	Ai archive data
0	string	Ai\1\0\0	Ai archive data
# Ai32
0	string	Ai\2\0	Ai32 archive data
0	string	Ai\2\1	Ai32 archive data
# SBC
0	string	SBC	SBC archive data
# Ybs
0	string	YBS	Ybs archive data
# DitPack
0	string	\x9e\0\0	DitPack archive data
# DMS
0	string	DMS!	DMS archive data
# EPC
0	string	\x8f\xaf\xac\x8c	EPC archive data
# VSARC
0	string	VS\x1a	VSARC archive data
# PDZ
0	string	PDZ	PDZ archive data
# ReDuq
0	string	rdqx	ReDuq archive data
# GCA
0	string	GCAX	GCA archive data
# PPMN
0	string	pN	PPMN archive data
# WinImage
3	string	WINIMAGE	WinImage archive data
# Compressia
0	string	CMP0CMP	Compressia archive data
# UHBC
0	string	UHB	UHBC archive data
# WinHKI
0	string	\x61\x5C\x04\x05	WinHKI archive data
# WWPack data file
0	string	WWP	WWPack archive data
# BSN (BSA, PTS-DOS)
0	string	\xffBSG	BSN archive data
1	string	\xffBSG	BSN archive data
3	string	\xffBSG	BSN archive data
1	string	\0\xae\2	BSN archive data
1	string	\0\xae\3	BSN archive data
1	string	\0\xae\7	BSN archive data
# AIN
0	string	\x33\x18	AIN archive data
0	string	\x33\x17	AIN archive data
# XPA32 test moved and merged with XPA by Joerg Jenderek at Sep 2015
# SZip (TODO: doesn't catch all versions)
0	string	SZ\x0a\4	SZip archive data
# XPack DiskImage
# *.XDI updated by Joerg Jenderek Sep 2015
# ftp://ftp.sac.sk/pub/sac/pack/0index.txt
# GRR: this test is still too general as it catches also text files starting with jm
0	string	jm
# only found examples with this additional characteristic 2 bytes
>2	string	\x2\x4	Xpack DiskImage archive data
#!:ext	xdi
# XPack Data
# *.xpa updated by Joerg Jenderek Sep 2015
# ftp://ftp.elf.stuba.sk/pub/pc/pack/
0	string	xpa	XPA
!:ext	xpa
# XPA32
# ftp://ftp.elf.stuba.sk/pub/pc/pack/xpa32.zip
# created by XPA32.EXE version 1.0.2 for Windows
>0	string	xpa\0\1	\b32 archive data
# created by XPACK.COM version 1.67m or 1.67r with short 0x1800
>3	ubeshort	!0x0001	\bck archive data
# XPack Single Data
# changed by Joerg Jenderek Sep 2015 back to like in version 5.12
# letter 'I'+ acute accent is equivalent to \xcd
0	string	\xcd\ jm	Xpack single archive data
#!:mime	application/x-xpa-compressed
!:ext	xpa

# TODO: missing due to unknown magic/magic at end of file:
#DWC
#ARG
#ZAR
#PC/3270
#InstallIt
#RKive
#RK
#XPack Diskimage

# These were inspired by idarc, but actually verified
# Dzip archiver (.dz)
0	string	DZ	Dzip archive data
>2	byte	x	\b, version %i
>3	byte	x	\b.%i
# ZZip archiver (.zz)
0	string	ZZ\ \0\0	ZZip archive data
0	string	ZZ0	ZZip archive data
# PAQ archiver (.paq)
0	string	\xaa\x40\x5f\x77\x1f\xe5\x82\x0d	PAQ archive data
0	string	PAQ	PAQ archive data
>3	byte&0xf0	0x30
>>3	byte	x	(v%c)
# JAR archiver (.j), this is the successor to ARJ, not Java's JAR (which is essentially ZIP)
0xe	string	\x1aJar\x1b	JAR (ARJ Software, Inc.) archive data
0	string	JARCS	JAR (ARJ Software, Inc.) archive data

# ARJ archiver (jason@jarthur.Claremont.EDU)
0	leshort	0xea60	ARJ archive data
!:mime	application/x-arj
>5	byte	x	\b, v%d,
>8	byte	&0x04	multi-volume,
>8	byte	&0x10	slash-switched,
>8	byte	&0x20	backup,
>34	string	x	original name: %s,
>7	byte	0	os: MS-DOS
>7	byte	1	os: PRIMOS
>7	byte	2	os: Unix
>7	byte	3	os: Amiga
>7	byte	4	os: Macintosh
>7	byte	5	os: OS/2
>7	byte	6	os: Apple ][ GS
>7	byte	7	os: Atari ST
>7	byte	8	os: NeXT
>7	byte	9	os: VAX/VMS
>3	byte	>0	%d]
# [JW] idarc says this is also possible
2	leshort	0xea60	ARJ archive data

# HA archiver (Greg Roelofs, newt@uchicago.edu)
# This is a really bad format. A file containing HAWAII will match this...
#0	string	HA	HA archive data,
#>2	leshort	=1	1 file,
#>2	leshort	>1	%hu files,
#>4	byte&0x0f	=0	first is type CPY
#>4	byte&0x0f	=1	first is type ASC
#>4	byte&0x0f	=2	first is type HSC
#>4	byte&0x0f	=0x0e	first is type DIR
#>4	byte&0x0f	=0x0f	first is type SPECIAL
# suggestion: at least identify small archives (<1024 files)
0	belong&0xffff00fc	0x48410000	HA archive data
>2	leshort	=1	1 file,
>2	leshort	>1	%u files,
>4	byte&0x0f	=0	first is type CPY
>4	byte&0x0f	=1	first is type ASC
>4	byte&0x0f	=2	first is type HSC
>4	byte&0x0f	=0x0e	first is type DIR
>4	byte&0x0f	=0x0f	first is type SPECIAL

# HPACK archiver (Peter Gutmann, pgut1@cs.aukuni.ac.nz)
0	string	HPAK	HPACK archive data

# JAM Archive volume format, by Dmitry.Kohmanyuk@UA.net
0	string	\351,\001JAM\ 	JAM archive,
>7	string	>\0	version %.4s
>0x26	byte	=0x27	-
>>0x2b	string	>\0	label %.11s,
>>0x27	lelong	x	serial %08x,
>>0x36	string	>\0	fstype %.8s

# LHARC/LHA archiver (Greg Roelofs, newt@uchicago.edu)
# Update: Joerg Jenderek
# URL: https://en.wikipedia.org/wiki/LHA_(file_format)
# Reference: http://web.archive.org/web/20021005080911/http://www.osirusoft.com/joejared/lzhformat.html
#
# check and display information of lharc (LHa,PMarc) file
0	name	lharc-file
# check 1st character of method id like -lz4- -lh5- or -pm2-
>2	string	-
# check 5th character of method id
>>6	string	-
# check header level 0 1 2 3
>>>20	ubyte	<4
# check 2nd, 3rd and 4th character of method id
>>>>3	regex	\^(lh[0-9a-ex]|lz[s2-8]|pm[012]|pc1)	\b
!:mime	application/x-lzh-compressed
# creator type "LHA "
!:apple	????LHA 
# display archive type name like "LHa/LZS archive data" or "LArc archive"
>>>>>2	string	-lz	\b
!:ext	lzs
# already known -lzs- -lz4- -lz5- with old names
>>>>>>2	string	-lzs	LHa/LZS archive data
>>>>>>3	regex	\^lz[45]	LHarc 1.x archive data
# missing -lz?- with wikipedia names
>>>>>>3	regex	\^lz[2378]	LArc archive
# display archive type name like "LHa (2.x) archive data"
>>>>>2	string	-lh	\b
# already known -lh0- -lh1- -lh2- -lh3- -lh4- -lh5- -lh6- -lh7- -lhd- variants with old names
>>>>>>3	regex	\^lh[01]	LHarc 1.x/ARX archive data
# LHice archiver uses ".ICE" as name extension instead of the usual ".lzh"
# FOOBAR archiver uses ".foo" as name extension instead of the usual one
# "Florain Orjanov's and Olga Bachetska's ARchiver" not found at the moment
>>>>>>>2	string	-lh1	\b
!:ext	lha/lzh/ice
>>>>>>3	regex	\^lh[23d]	LHa 2.x? archive data
>>>>>>3	regex	\^lh[7]	LHa (2.x)/LHark archive data
>>>>>>3	regex	\^lh[456]	LHa (2.x) archive data
>>>>>>>2	string	-lh5	\b
# https://en.wikipedia.org/wiki/BIOS
# Some mainboard BIOS like Award use LHa compression. So archives with unusual extensions are found, like
# bios.rom , kd7_v14.bin, 1010.004, ...
!:ext	lha/lzh/rom/bin
# missing -lh?- variants (Joe Jared)
>>>>>>3	regex	\^lh[89a-ce]	LHa (Joe Jared) archive
# UNLHA32 2.67a
>>>>>>2	string	-lhx	LHa (UNLHA32) archive
# lha archives with standard file name extensions ".lha" ".lzh"
>>>>>>3	regex	!\^(lh1|lh5)	\b
!:ext	lha/lzh
# this should not happen if all -lh variants are described
>>>>>>2	default	x	LHa (unknown) archive
#!:ext	lha
# PMarc
>>>>>3	regex	\^pm[012]	PMarc archive data
!:ext	pma
# append method id without leading and trailing minus character
>>>>>3	string	x	[%3.3s]
>>>>>>0	use	lharc-header
#
# check and display information of lharc header
0	name	lharc-header
# header size 0x4 , 0x1b-0x61
>0	ubyte	x
# compressed data size != compressed file size
#>7	ulelong	x	\b, data size %d
# attribute: 0x2~?? 0x10~symlink|target 0x20~normal
#>19	ubyte	x	\b, 19_0x%x
# level identifier 0 1 2 3
#>20	ubyte	x	\b, level %d
# time stamp
#>15	ubelong	x	DATE 0x%8.8x
# OS ID for level 1
>20	ubyte	1
# 0x20 types find for *.rom files
>>(21.b+24)	ubyte	<0x21	\b, 0x%x OS
# ascii type like M for MSDOS
>>(21.b+24)	ubyte	>0x20	\b, '%c' OS
# OS ID for level 2
>20	ubyte	2
#>>23	ubyte	x	\b, OS ID 0x%x
>>23	ubyte	<0x21	\b, 0x%x OS
>>23	ubyte	>0x20	\b, '%c' OS
# filename only for level 0 and 1
>20	ubyte	<2
# length of filename
>>21	ubyte	>0	\b, with
# filename
>>>21	pstring	x	"%s"
#
#2	string	-lh0-	LHarc 1.x/ARX archive data [lh0]
#!:mime	application/x-lharc
2	string	-lh0-
>0	use	lharc-file
#2	string	-lh1-	LHarc 1.x/ARX archive data [lh1]
#!:mime	application/x-lharc
2	string	-lh1-
>0	use	lharc-file
# NEW -lz2- ... -lz8-
2	string	-lz2-
>0	use	lharc-file
2	string	-lz3-
>0	use	lharc-file
2	string	-lz4-
>0	use	lharc-file
2	string	-lz5-
>0	use	lharc-file
2	string	-lz7-
>0	use	lharc-file
2	string	-lz8-
>0	use	lharc-file
# [never seen any but the last; -lh4- reported in comp.compression:]
#2	string	-lzs-	LHa/LZS archive data [lzs]
2	string	-lzs-
>0	use	lharc-file
# According to wikipedia and others such a version does not exist
#2	string	-lh\40-	LHa 2.x? archive data [lh ]
#2	string	-lhd-	LHa 2.x? archive data [lhd]
2	string	-lhd-
>0	use	lharc-file
#2	string	-lh2-	LHa 2.x? archive data [lh2]
2	string	-lh2-
>0	use	lharc-file
#2	string	-lh3-	LHa 2.x? archive data [lh3]
2	string	-lh3-
>0	use	lharc-file
#2	string	-lh4-	LHa (2.x) archive data [lh4]
2	string	-lh4-
>0	use	lharc-file
#2	string	-lh5-	LHa (2.x) archive data [lh5]
2	string	-lh5-
>0	use	lharc-file
#2	string	-lh6-	LHa (2.x) archive data [lh6]
2	string	-lh6-
>0	use	lharc-file
#2	string	-lh7-	LHa (2.x)/LHark archive data [lh7]
2	string	-lh7-
# !:mime	application/x-lha
# >20	byte	x	- header level %d
>0	use	lharc-file
# NEW -lh8- ... -lhe- , -lhx-
2	string	-lh8-
>0	use	lharc-file
2	string	-lh9-
>0	use	lharc-file
2	string	-lha-
>0	use	lharc-file
2	string	-lhb-
>0	use	lharc-file
2	string	-lhc-
>0	use	lharc-file
2	string	-lhe-
>0	use	lharc-file
2	string	-lhx-
>0	use	lharc-file
# taken from idarc [JW]
2	string	-lZ	PUT archive data
# already done by LHarc magics
# this should never happen if all sub types of LZS archive are identified
#2	string	-lz	LZS archive data
2	string	-sw1-	Swag archive data

0	name	rar-file-header
>24	byte	15	\b, v1.5
>24	byte	20	\b, v2.0
>24	byte	29	\b, v4
>15	byte	0	\b, os: MS-DOS
>15	byte	1	\b, os: OS/2
>15	byte	2	\b, os: Win32
>15	byte	3	\b, os: Unix
>15	byte	4	\b, os: Mac OS
>15	byte	5	\b, os: BeOS

0	name	rar-archive-header
>3	leshort&0x1ff	>0	\b, flags:
>>3	leshort	&0x01	ArchiveVolume
>>3	leshort	&0x02	Commented
>>3	leshort	&0x04	Locked
>>3	leshort	&0x10	NewVolumeNaming
>>3	leshort	&0x08	Solid
>>3	leshort	&0x20	Authenticated
>>3	leshort	&0x40	RecoveryRecordPresent
>>3	leshort	&0x80	EncryptedBlockHeader
>>3	leshort	&0x100	FirstVolume

# RAR (Roshal Archive) archive
0	string	Rar!\x1a\7\0	RAR archive data
!:mime	application/x-rar
!:ext	rar/cbr
# file header
>(0xc.l+9)	byte	0x74
>>(0xc.l+7)	use	rar-file-header
# subblock seems to share information with file header
>(0xc.l+9)	byte	0x7a
>>(0xc.l+7)	use	rar-file-header
>9	byte	0x73
>>7	use	rar-archive-header

0	string	Rar!\x1a\7\1\0	RAR archive data, v5
!:mime	application/x-rar
!:ext	rar

# Very old RAR archive
# http://jasonblanks.com/wp-includes/images/papers/KnowyourarchiveRAR.pdf
0	string	RE\x7e\x5e	RAR archive data (<v1.5)
!:mime	application/x-rar
!:ext	rar/cbr

# SQUISH archiver (Greg Roelofs, newt@uchicago.edu)
0	string	SQSH	squished archive data (Acorn RISCOS)

# UC2 archiver (Greg Roelofs, newt@uchicago.edu)
# [JW] see exe section for self-extracting version
0	string	UC2\x1a	UC2 archive data

# PKZIP multi-volume archive
0	string	PK\x07\x08PK\x03\x04	Zip multi-volume archive data, at least PKZIP v2.50 to extract
!:mime	application/zip
!:ext	zip/cbz

# Zip archives (Greg Roelofs, c/o zip-bugs@wkuvx1.wku.edu)
0	string	PK\005\006	Zip archive data (empty)
!:mime	application/zip
!:ext	zip/cbz
0	string	PK\003\004

# Specialised zip formats which start with a member named 'mimetype'
# (stored uncompressed, with no 'extra field') containing the file's MIME type.
# Check for an 8-byte name, 0-byte extra field, name "mimetype", and
# contents starting with "application/":
>26	string	\x8\0\0\0mimetypeapplication/

# KOffice / OpenOffice & StarOffice / OpenDocument formats
# From: Abel Cheung <abel@oaka.org>

# KOffice (1.2 or above) formats
# (mimetype contains "application/vnd.kde.<SUBTYPE>")
>>50	string	vnd.kde.	KOffice (>=1.2)
>>>58	string	karbon	Karbon document
>>>58	string	kchart	KChart document
>>>58	string	kformula	KFormula document
>>>58	string	kivio	Kivio document
>>>58	string	kontour	Kontour document
>>>58	string	kpresenter	KPresenter document
>>>58	string	kspread	KSpread document
>>>58	string	kword	KWord document

# OpenOffice formats (for OpenOffice 1.x / StarOffice 6/7)
# (mimetype contains "application/vnd.sun.xml.<SUBTYPE>")
>>50	string	vnd.sun.xml.	OpenOffice.org 1.x
>>>62	string	writer	Writer
>>>>68	byte	!0x2e	document
>>>>68	string	.template	template
>>>>68	string	.global	global document
>>>62	string	calc	Calc
>>>>66	byte	!0x2e	spreadsheet
>>>>66	string	.template	template
>>>62	string	draw	Draw
>>>>66	byte	!0x2e	document
>>>>66	string	.template	template
>>>62	string	impress	Impress
>>>>69	byte	!0x2e	presentation
>>>>69	string	.template	template
>>>62	string	math	Math document
>>>62	string	base	Database file

# OpenDocument formats (for OpenOffice 2.x / StarOffice >= 8)
# http://lists.oasis-open.org/archives/office/200505/msg00006.html
# (mimetype contains "application/vnd.oasis.opendocument.<SUBTYPE>")
>>50	string	vnd.oasis.opendocument.	OpenDocument
>>>73	string	text
>>>>77	byte	!0x2d	Text
!:mime	application/vnd.oasis.opendocument.text
>>>>77	string	-template	Text Template
!:mime	application/vnd.oasis.opendocument.text-template
>>>>77	string	-web	HTML Document Template
!:mime	application/vnd.oasis.opendocument.text-web
>>>>77	string	-master	Master Document
!:mime	application/vnd.oasis.opendocument.text-master
>>>73	string	graphics
>>>>81	byte	!0x2d	Drawing
!:mime	application/vnd.oasis.opendocument.graphics
>>>>81	string	-template	Template
!:mime	application/vnd.oasis.opendocument.graphics-template
>>>73	string	presentation
>>>>85	byte	!0x2d	Presentation
!:mime	application/vnd.oasis.opendocument.presentation
>>>>85	string	-template	Template
!:mime	application/vnd.oasis.opendocument.presentation-template
>>>73	string	spreadsheet
>>>>84	byte	!0x2d	Spreadsheet
!:mime	application/vnd.oasis.opendocument.spreadsheet
>>>>84	string	-template	Template
!:mime	application/vnd.oasis.opendocument.spreadsheet-template
>>>73	string	chart
>>>>78	byte	!0x2d	Chart
!:mime	application/vnd.oasis.opendocument.chart
>>>>78	string	-template	Template
!:mime	application/vnd.oasis.opendocument.chart-template
>>>73	string	formula
>>>>80	byte	!0x2d	Formula
!:mime	application/vnd.oasis.opendocument.formula
>>>>80	string	-template	Template
!:mime	application/vnd.oasis.opendocument.formula-template
>>>73	string	database	Database
!:mime	application/vnd.oasis.opendocument.database
>>>73	string	image
>>>>78	byte	!0x2d	Image
!:mime	application/vnd.oasis.opendocument.image
>>>>78	string	-template	Template
!:mime	application/vnd.oasis.opendocument.image-template

# EPUB (OEBPS) books using OCF (OEBPS Container Format)
# http://www.idpf.org/ocf/ocf1.0/download/ocf10.htm, section 4.
# From: Ralf Brown <ralf.brown@gmail.com>
>>50	string	epub+zip	EPUB document
!:mime	application/epub+zip

# Catch other ZIP-with-mimetype formats
# In a ZIP file, the bytes immediately after a member's contents are
# always "PK". The 2 regex rules here print the "mimetype" member's
# contents up to the first 'P'. Luckily, most MIME types don't contain
# any capital 'P's. This is a kludge.
# (mimetype contains "application/<OTHER>")
>>50	string	!epub+zip
>>>50	string	!vnd.oasis.opendocument.
>>>>50	string	!vnd.sun.xml.
>>>>>50	string	!vnd.kde.
>>>>>>38	regex	[!-OQ-~]+	Zip data (MIME type "%s"?)
!:mime	application/zip
# (mimetype contents other than "application/*")
>26	string	\x8\0\0\0mimetype
>>38	string	!application/
>>>38	regex	[!-OQ-~]+	Zip data (MIME type "%s"?)
!:mime	application/zip

# Java Jar files
>(26.s+30)	leshort	0xcafe	Java archive data (JAR)
!:mime	application/java-archive

# iOS App
>(26.s+30)	leshort	!0xcafe
>>26	string	!\x8\0\0\0mimetype
>>>30	string	Payload/
>>>>38	search/64	.app/	iOS App
!:mime	application/x-ios-app


# Generic zip archives (Greg Roelofs, c/o zip-bugs@wkuvx1.wku.edu)
# Next line excludes specialized formats:
>(26.s+30)	leshort	!0xcafe
>>26	string	!\x8\0\0\0mimetype	Zip archive data
!:mime	application/zip
>>>4	byte	0x09	\b, at least v0.9 to extract
>>>4	byte	0x0a	\b, at least v1.0 to extract
>>>4	byte	0x0b	\b, at least v1.1 to extract
>>>4	byte	0x14	\b, at least v2.0 to extract
>>>4	byte	0x15	\b, at least v2.1 to extract
>>>4	byte	0x19	\b, at least v2.5 to extract
>>>4	byte	0x1b	\b, at least v2.7 to extract
>>>4	byte	0x2d	\b, at least v4.5 to extract
>>>4	byte	0x2e	\b, at least v4.6 to extract
>>>4	byte	0x32	\b, at least v5.0 to extract
>>>4	byte	0x33	\b, at least v5.1 to extract
>>>4	byte	0x34	\b, at least v5.2 to extract
>>>4	byte	0x3d	\b, at least v6.1 to extract
>>>4	byte	0x3e	\b, at least v6.2 to extract
>>>4	byte	0x3f	\b, at least v6.3 to extract
>>>0x161	string	WINZIP	\b, WinZIP self-extracting

# StarView Metafile
# From Pierre Ducroquet <pinaraf@pinaraf.info>
0	string	VCLMTF	StarView MetaFile
>6	beshort	x	\b, version %d
>8	belong	x	\b, size %d

# Zoo archiver
20	lelong	0xfdc4a7dc	Zoo archive data
!:mime	application/x-zoo
>4	byte	>48	\b, v%c.
>>6	byte	>47	\b%c
>>>7	byte	>47	\b%c
>32	byte	>0	\b, modify: v%d
>>33	byte	x	\b.%d+
>42	lelong	0xfdc4a7dc	\b,
>>70	byte	>0	extract: v%d
>>>71	byte	x	\b.%d+

# Shell archives
10	string	#\ This\ is\ a\ shell\ archive	shell archive text
!:mime	application/octet-stream

#
# LBR. NB: May conflict with the questionable
#          "binary Computer Graphics Metafile" format.
#
0	string	\0\ \ \ \ \ \ \ \ \ \ \ \0\0	LBR archive data
#
# PMA (CP/M derivative of LHA)
# Update: Joerg Jenderek
# URL: https://en.wikipedia.org/wiki/LHA_(file_format)
#
#2	string	-pm0-	PMarc archive data [pm0]
2	string	-pm0-
>0	use	lharc-file
#2	string	-pm1-	PMarc archive data [pm1]
2	string	-pm1-
>0	use	lharc-file
#2	string	-pm2-	PMarc archive data [pm2]
2	string	-pm2-
>0	use	lharc-file
2	string	-pms-	PMarc SFX archive (CP/M, DOS)
#!:mime	application/x-foobar-exec
!:ext	com
5	string	-pc1-	PopCom compressed executable (CP/M)
#!:mime	application/x-
#!:ext	com

# From Rafael Laboissiere <rafael@laboissiere.net>
# The Project Revision Control System (see
# http://prcs.sourceforge.net) generates a packaged project
# file which is recognized by the following entry:
0	leshort	0xeb81	PRCS packaged project

# Microsoft cabinets
# by David Necas (Yeti) <yeti@physics.muni.cz>
#0	string	MSCF\0\0\0\0	Microsoft cabinet file data,
#>25	byte	x	v%d
#>24	byte	x	\b.%d
# MPi: All CABs have version 1.3, so this is pointless.
# Better magic in debian-additions.

# GTKtalog catalogs
# by David Necas (Yeti) <yeti@physics.muni.cz>
4	string	gtktalog\ 	GTKtalog catalog data,
>13	string	3	version 3
>>14	beshort	0x677a	(gzipped)
>>14	beshort	!0x677a	(not gzipped)
>13	string	>3	version %s

############################################################################
# Parity archive reconstruction file, the 'par' file format now used on Usenet.
0	string	PAR\0	PARity archive data
>48	leshort	=0	- Index file
>48	leshort	>0	- file number %d

# Felix von Leitner <felix-file@fefe.de>
0	string	d8:announce	BitTorrent file
!:mime	application/x-bittorrent
# Durval Menezes, <jmgthbfile at durval dot com>
0	string	d13:announce-list	BitTorrent file
!:mime	application/x-bittorrent

# Atari MSA archive - Teemu Hukkanen <tjhukkan@iki.fi>
0	beshort	0x0e0f	Atari MSA archive data
>2	beshort	x	\b, %d sectors per track
>4	beshort	0	\b, 1 sided
>4	beshort	1	\b, 2 sided
>6	beshort	x	\b, starting track: %d
>8	beshort	x	\b, ending track: %d

# Alternate ZIP string (amc@arwen.cs.berkeley.edu)
0	string	PK00PK\003\004	Zip archive data

# ACE archive (from http://www.wotsit.org/download.asp?f=ace)
# by Stefan `Sec` Zehl <sec@42.org>
7	string	**ACE**	ACE archive data
>15	byte	>0	version %d
>16	byte	=0x00	\b, from MS-DOS
>16	byte	=0x01	\b, from OS/2
>16	byte	=0x02	\b, from Win/32
>16	byte	=0x03	\b, from Unix
>16	byte	=0x04	\b, from MacOS
>16	byte	=0x05	\b, from WinNT
>16	byte	=0x06	\b, from Primos
>16	byte	=0x07	\b, from AppleGS
>16	byte	=0x08	\b, from Atari
>16	byte	=0x09	\b, from Vax/VMS
>16	byte	=0x0A	\b, from Amiga
>16	byte	=0x0B	\b, from Next
>14	byte	x	\b, version %d to extract
>5	leshort	&0x0080	\b, multiple volumes,
>>17	byte	x	\b (part %d),
>5	leshort	&0x0002	\b, contains comment
>5	leshort	&0x0200	\b, sfx
>5	leshort	&0x0400	\b, small dictionary
>5	leshort	&0x0800	\b, multi-volume
>5	leshort	&0x1000	\b, contains AV-String
>>30	string	\x16*UNREGISTERED\x20VERSION*	(unregistered)
>5	leshort	&0x2000	\b, with recovery record
>5	leshort	&0x4000	\b, locked
>5	leshort	&0x8000	\b, solid
# Date in MS-DOS format (whatever that is)
#>18	lelong	x	Created on

# sfArk : compression program for Soundfonts (sf2) by Dirk Jagdmann
# <doj@cubic.org>
0x1A	string	sfArk	sfArk compressed Soundfont
>0x15	string	2
>>0x1	string	>\0	Version %s
>>0x2A	string	>\0	: %s

# DR-DOS 7.03 Packed File *.??_
0	string	Packed\ File\ 	Personal NetWare Packed File
>12	string	x	\b, was "%.12s"

# EET archive
# From: Tilman Sauerbeck <tilman@code-monkey.de>
0	belong	0x1ee7ff00	EET archive
!:mime	application/x-eet

# rzip archives
0	string	RZIP	rzip compressed data
>4	byte	x	- version %d
>5	byte	x	\b.%d
>6	belong	x	(%d bytes)

# From: "Robert Dale" <robdale@gmail.com>
0	belong	123	dar archive,
>4	belong	x	label "%.8x
>>8	belong	x	%.8x
>>>12	beshort	x	%.4x"
>14	byte	0x54	end slice
>14	beshort	0x4e4e	multi-part
>14	beshort	0x4e53	multi-part, with -S

# Symbian installation files
# http://www.thouky.co.uk/software/psifs/sis.html
# http://developer.symbian.com/main/downloads/papers/SymbianOSv91/softwareinstallsis.pdf
8	lelong	0x10000419	Symbian installation file
!:mime	application/vnd.symbian.install
>4	lelong	0x1000006D	(EPOC release 3/4/5)
>4	lelong	0x10003A12	(EPOC release 6)
0	lelong	0x10201A7A	Symbian installation file (Symbian OS 9.x)
!:mime	x-epoc/x-sisx-app

# From "Nelson A. de Oliveira" <naoliv@gmail.com>
0	string	MPQ\032	MoPaQ (MPQ) archive

# From: "Nelson A. de Oliveira" <naoliv@gmail.com>
# .kgb
0	string	KGB_arch	KGB Archiver file
>10	string	x	with compression level %.1s

# xar (eXtensible ARchiver) archive
# xar archive format: http://code.google.com/p/xar/
# From: "David Remahl" <dremahl@apple.com>
0	string	xar!
xar archive
!:mime	application/x-xar
#>4	beshort	x	header size %d
>6	beshort	x	version %d,
#>8	quad	x	compressed TOC: %d,
#>16	quad	x	uncompressed TOC: %d,
>24	belong	0	no checksum
>24	belong	1	SHA-1 checksum
>24	belong	2	MD5 checksum

# Type: Parity Archive
# From: Daniel van Eeden <daniel_e@dds.nl>
0	string	PAR2	Parity Archive Volume Set

# Bacula volume format. (Volumes always start with a block header.)
# URL: http://bacula.org/3.0.x-manuals/en/developers/developers/Block_Header.html
# From: Adam Buchbinder <adam.buchbinder@gmail.com>
12	string	BB02	Bacula volume
>20	bedate	x	\b, started %s

# ePub is XHTML + XML inside a ZIP archive. The first member of the
# archive must be an uncompressed file called 'mimetype' with contents
# 'application/epub+zip'


# From: "Michael Gorny" <mgorny@gentoo.org>
# ZPAQ: http://mattmahoney.net/dc/zpaq.html
0	string	zPQ	ZPAQ stream
>3	byte	x	\b, level %d
# From: Barry Carter <carter.barry@gmail.com>
# http://encode.ru/threads/456-zpaq-updates/page32
0	string	7kSt	ZPAQ file

# BBeB ebook, unencrypted (LRF format)
# URL: http://www.sven.de/librie/Librie/LrfFormat
# From: Adam Buchbinder <adam.buchbinder@gmail.com>
0	string	L\0R\0F\0\0\0	BBeB ebook data, unencrypted
>8	beshort	x	\b, version %d
>36	byte	1	\b, front-to-back
>36	byte	16	\b, back-to-front
>42	beshort	x	\b, (%dx,
>44	beshort	x	%d)

# Symantec GHOST image by Joerg Jenderek at May 2014
# http://us.norton.com/ghost/
# http://www.garykessler.net/library/file_sigs.html
0	ubelong&0xFFFFf7f0	0xFEEF0100	Norton GHost image
# *.GHO
>2	ubyte&0x08	0x00	\b, first file
# *.GHS or *.[0-9] with cns program option
>2	ubyte&0x08	0x08	\b, split file
# part of split index interesting for *.ghs
>>4	ubyte	x	id=0x%x
# compression tag minus one equals numeric compression command line
switch z[1-9]
>3	ubyte	0	\b, no compression
>3	ubyte	2	\b, fast compression (Z1)
>3	ubyte	3	\b, medium compression (Z2)
>3	ubyte	>3
>>3	ubyte	<11	\b, compression (Z%d-1)
>2	ubyte&0x08	0x00
# ~ 30 byte password field only for *.gho
>>12	ubequad	!0	\b, password protected
>>44	ubyte	!1
# 1~Image All, sector-by-sector only for *.gho
>>>10	ubyte	1	\b, sector copy
# 1~Image Boot track only for *.gho
>>>43	ubyte	1	\b, boot track
# 1~Image Disc only for *.gho implies Image Boot track and sector copy
>>44	ubyte	1	\b, disc sector copy
# optional image description only *.gho
>>0xff	string	>\0	"%-.254s"
# look for DOS sector end sequence
>0xE08	search/7776	\x55\xAA
>>&-512	indirect	x	\b; contains

# Google Chrome extensions
# https://developer.chrome.com/extensions/crx
# https://developer.chrome.com/extensions/hosting
0	string	Cr24	Google Chrome extension
!:mime	application/x-chrome-extension
>4	ulong	x	\b, version %u

# SeqBox - Sequenced container
# ext: sbx, seqbox
# Marco Pontello marcopon@gmail.com
# reference: https://github.com/MarcoPon/SeqBox
0	string	SBx	SeqBox,
>3	byte	x	version %d