#------------------------------------------------------------------------------
# $File: rtf,v 1.8 2020/05/17 19:28:49 christos Exp $
# rtf:	file(1) magic for Rich Text Format (RTF)
#
# Duncan P. Simpson, D.P.Simpson@dcs.warwick.ac.uk
# Update: Joerg Jenderek
# URL:		https://en.wikipedia.org/wiki/Rich_Text_Format
# Reference:	http://www.snake.net/software/RTF/RTF-Spec-1.7.rtf
#		http://www.kleinlercher.at/tools/Windows_Protocols/Word2007RTFSpec9.pdf
0	string		{\\rtf
# skip DROID fmt-355-signature-id-522.rtf by looking for valid version
>5	ubyte		!0xAB
# skip also \ in DROID fmt-50-signature-id-158.rtf by looking for valid version
>>5	ubyte		!0x5C		Rich Text Format data
!:mime	text/rtf
!:apple	????RTF
!:ext	rtf
>>>0	use		rtf-info
#	display information like version, language and code page of RTF
0	name		rtf-info
# 1 mostly, 2 for newer Pocket Word documents, space for test like fdo78502.rtf, { for some urtf
>5	ubyte		!0x7b		\b, version %c
# The word for character set must precede any text or most other control words
>6	string		\\mac		\b, Apple Macintosh
>6	string		\\pc
# control word \pca
>>9	ubyte		=0x61		\b, IBM PS/2, code page 850
>>9	ubyte		!0x61		\b, IBM PC, code page 437
# unknown character set or ANSI later after control words like
# \adeflang1025 \info \title \author \category \manager
# "Burow, Steffanie - Im Tal des Schneeleoparden.rtf"
#>6	search/105	\\ansi		\b, ANSI
>6	search/502	\\ansi		\b, ANSI
>6	default		x		\b, unknown character set
# look for explicit codepage keyword
# "Burow, Steffanie - Im Tal des Schneeleoparden.rtf"
#>5	search/110	\\ansicpg
>5	search/500	\\ansicpg
# skip unknown or buggy codepage string 0 like in fdo78502.rtf
>>&0	ubyte		!0x30		\b, code page
# codepage string: 437~United States IBM, ..., 1252~WesternEuropean, ..., 57011~Punjabi
>>>&-1	string		x		%-.3s
# skip space or \ and display possible 4th digit of code page string
>>>&2	ubyte		>0x2F
>>>>&-1	ubyte		<0x3A		\b%c
# possible 5th digit of code page string
>>>>>&0	ubyte		>0x2F
>>>>>>&-1	ubyte		<0x3A		\b%c
# look again at version byte to use default clause
>5	ubyte		x
# Default language ID for South Asian/Middle Eastern text
# language ID: 1025, ..., 1065~Persian, ..., 2057~English_UnitedKingdom, ..., 58380~French_NorthAfrica
# Readme-0.72-Persian.rtf
#>6	search/1	\\adeflang	\b, default middle east language ID
>>6	search/497	\\adeflang	\b, default middle east language ID
# https://docs.microsoft.com/en-us/openspecs/office_standards/ms-oe376/6c085406-a698-4e12-9d4d-c3b0ee3dbc4a
>>>&0	string		x		%.4s
# skip \ and NL and show possible 5th digit of language string
>>>&4	ubyte		>0x2F
>>>>&-1	ubyte		<0x3A		\b%c
# else look for default language to be used when the \plain control word is encountered
>>6	default		x
# "Burow, Steffanie - Im Tal des Schneeleoparden.rtf"
#>>>6	search/127	\\deflang
>>>6	search/505	\\deflang
>>>>&0	string		>0		\b, default language ID %-.4s
# possible 5th digit of language string
>>>>&4	ubyte		>0x2F
>>>>>&-1	ubyte		<0x3A		\b%c

# Reference: http://latex2rtf.sourceforge.net/rtfspec_63.html
# Note: no real world example found
0	string		{\\urtf		Rich Text Format unicoded data
!:mime	text/rtf
#!:apple	????RTF
!:ext	rtf
>1	use		rtf-info

# URL:		https://en.wikipedia.org/wiki/Microsoft_Word
# Reference:	http://fileformats.archiveteam.org/wiki/Microsoft_Word
# Note:		called by TrID "Pocket Word document"
#		by PlanMaker "Pocket Word-Handheld PC" for pwd
#		by PlanMaker "Pocket Word-Pocket PC" for psw
0	string		{\\pwd		Pocket Word document or template
# by SoftMaker Office http://extension.nirsoft.net/pwd
#!:mime	application/msword
# https://reposcope.com/mimetype/application/x-pocket-word
!:mime	application/x-pocket-word
# PWD for Handheld PC variant and PSW for Pocket PC variant
# PWT for template
!:ext	pwd/psw/pwt
>0	use		rtf-info