#------------------------------------------------------------------------------
# $File: rtf,v 1.9 2020/12/12 20:01:47 christos Exp $
# rtf:	file(1) magic for Rich Text Format (RTF)
#
# Duncan P. Simpson, D.P.Simpson@dcs.warwick.ac.uk
# Update: Joerg Jenderek
# URL: https://en.wikipedia.org/wiki/Rich_Text_Format
# Reference: http://www.snake.net/software/RTF/RTF-Spec-1.7.rtf
#		http://www.kleinlercher.at/tools/Windows_Protocols/Word2007RTFSpec9.pdf
0	string		{\\rtf
# skip DROID fmt-355-signature-id-522.rtf by looking for valid version
>5	ubyte		!0xAB
# skip also \ in DROID fmt-50-signature-id-158.rtf by looking for valid version
>>5	ubyte		!0x5C		Rich Text Format data
!:mime	text/rtf
!:apple	????RTF
!:ext	rtf
>>>0	use		rtf-info
# display information like version, language and code page of RTF
0	name		rtf-info
# 1 mostly, 2 for newer Pocket Word documents, space for test like fdo78502.rtf, { for some urtf
>5	ubyte		!0x7b		\b, version %c
# The word for character set must precede any text or most other control words
>6	string		\\mac		\b, Apple Macintosh
>6	string		\\pc
# control word \pca
>>9	ubyte		=0x61		\b, IBM PS/2, code page 850
>>9	ubyte		!0x61		\b, IBM PC, code page 437
# unknown character set or ANSI later after control words like
# \adeflang1025 \info \title \author \category \manager
# "Burow, Steffanie - Im Tal des Schneeleoparden.rtf"
#>6	search/105	\\ansi		\b, ANSI
>6	search/502	\\ansi		\b, ANSI
>6	default		x		\b, unknown character set
# look for explicit codepage keyword
# "Burow, Steffanie - Im Tal des Schneeleoparden.rtf"
#>5	search/110	\\ansicpg
>5	search/500	\\ansicpg
# skip unknown or buggy codepage string 0 like in fdo78502.rtf
>>&0	ubyte		!0x30		\b, code page
# codepage string: 437~United States IBM, ..., 1252~WesternEuropean, ..., 57011~Punjabi
>>>&-1	string		x		%-.3s
# skip space or \ and display possible 4th digit of code page string
>>>&2	ubyte		>0x2F
>>>>&-1	ubyte		<0x3A		\b%c
# possible 5th digit of code page string
>>>>>&0	ubyte		>0x2F
>>>>>>&-1	ubyte		<0x3A		\b%c
# look again at version byte to use default clause
>5	ubyte		x
# Default language ID for South Asian/Middle Eastern text
# language ID: 1025, ..., 1065~Persian, ..., 2057~English_UnitedKingdom, ..., 58380~French_NorthAfrica
# Readme-0.72-Persian.rtf
#>6	search/1	\\adeflang	\b, default middle east language ID
>>6	search/497	\\adeflang	\b, default middle east language ID
# https://docs.microsoft.com/en-us/openspecs/office_standards/ms-oe376/6c085406-a698-4e12-9d4d-c3b0ee3dbc4a
>>>&0	string		x		%.4s
# skip \ and NL and show possible 5th digit of language string
>>>&4	ubyte		>0x2F
>>>>&-1	ubyte		<0x3A		\b%c
# else look for default language to be used when the \plain control word is encountered
>>6	default		x
# "Burow, Steffanie - Im Tal des Schneeleoparden.rtf"
#>>>6	search/127	\\deflang
>>>6	search/505	\\deflang
>>>>&0	string		>0		\b, default language ID %-.4s
# possible 5th digit of language string
>>>>&4	ubyte		>0x2F
>>>>>&-1	ubyte		<0x3A		\b%c

# Reference: http://latex2rtf.sourceforge.net/rtfspec_63.html
# Note: no real world example found
0	string		{\\urtf		Rich Text Format unicoded data
!:mime	text/rtf
#!:apple	????RTF
!:ext	rtf
>1	use		rtf-info

# URL: https://en.wikipedia.org/wiki/Microsoft_Word
# Reference: http://fileformats.archiveteam.org/wiki/Microsoft_Word
# Note: called by TrID "Pocket Word document"
#	by PlanMaker "Pocket Word-Handheld PC" for pwd
#	by PlanMaker "Pocket Word-Pocket PC" for psw
0	string		{\\pwd		Pocket Word document or template
# by SoftMaker Office http://extension.nirsoft.net/pwd
#!:mime	application/msword
# https://reposcope.com/mimetype/application/x-pocket-word
!:mime	application/x-pocket-word
# PWD for Handheld PC variant and PSW for Pocket PC variant
# PWT for template
!:ext	pwd/psw/pwt
>0	use		rtf-info