        @(#)README      8.1 (Berkeley) 6/9/93
$FreeBSD$

Compress version 4.0 improvements over 3.0:
        o compress() speedup (10-50%) by changing division hash to xor
        o decompress() speedup (5-10%)
        o Memory requirements reduced (3-30%)
        o Stack requirements reduced to less than 4kb
        o Removed 'Big+Fast' compress code (FBITS) because of compress speedup
        o Portability mods for Z8000 and PC/XT (but not zeus 3.2)
        o Default to 'quiet' mode
        o Unification of 'force' flags
        o Manual page overhaul
        o Portability enhancement for M_XENIX
        o Removed text on #else and #endif
        o Added "-V" switch to print version and options
        o Added #defines for SIGNED_COMPARE_SLOW
        o Added Makefile and "usermem" program
        o Removed all floating point computations
        o New programs:  [deleted]

The "usermem" script attempts to determine the maximum process size.  Some
editing of the script may be necessary (see the comments).  [It should work
fine on 4.3 BSD.]  If you can't get it to work at all, just create file
"USERMEM" containing the maximum process size in decimal.

The following preprocessor symbols control the compilation of "compress.c":

        o USERMEM               Maximum process memory on the system
        o SACREDMEM             Amount to reserve for other processes
        o SIGNED_COMPARE_SLOW   Unsigned compare instructions are faster
        o NO_UCHAR              Don't use "unsigned char" types
        o BITS                  Overrules default set by USERMEM-SACREDMEM
        o vax                   Generate inline assembler
        o interdata             Defines SIGNED_COMPARE_SLOW
        o M_XENIX               Makes arrays < 65536 bytes each
        o pdp11                 BITS=12, NO_UCHAR
        o z8000                 BITS=12
        o pcxt                  BITS=12
        o BSD4_2                Allow long filenames ( > 14 characters) &
                                Call setlinebuf(stderr)

The difference "usermem-sacredmem" determines the maximum BITS that can be
specified with the "-b" flag.

memory: at least                BITS
------  -- -----                ----
     433,484                     16
     229,600                     15
     127,536                     14
      73,464                     13
           0                     12

The default is BITS=16.

The maximum bits can be overruled by specifying "-DBITS=bits" at
compilation time.

WARNING: files compressed on a large machine with more bits than allowed by
a version of compress on a smaller machine cannot be decompressed!  Use the
"-b12" flag to generate a file on a large machine that can be uncompressed
on a 16-bit machine.
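
The memory table and the WARNING above can be read as two small rules.
The following is only a sketch of those rules in C, not code taken from
compress.c: the function names max_bits_for_memory() and can_decompress()
are made up for illustration, and the only data used is the set of
thresholds from the table above.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of compress.c): return the largest BITS
 * value permitted for a given "usermem-sacredmem" difference, using the
 * thresholds from the compress 4.0 table above.
 */
int
max_bits_for_memory(long usermem, long sacredmem)
{
	/* Rows of the table, largest threshold first. */
	static const struct {
		long at_least;	/* minimum usermem-sacredmem, in bytes */
		int  bits;	/* maximum BITS allowed at that size */
	} table[] = {
		{ 433484L, 16 },
		{ 229600L, 15 },
		{ 127536L, 14 },
		{  73464L, 13 },
		{      0L, 12 },
	};
	long avail = usermem - sacredmem;
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (avail >= table[i].at_least)
			return table[i].bits;

	/* Assumption: a negative difference is treated as the 12-bit floor. */
	return 12;
}

/*
 * Hypothetical check for the WARNING above: a file written with
 * file_bits can be decompressed only where the local maximum BITS is
 * at least as large.
 */
int
can_decompress(int file_bits, int local_max_bits)
{
	return file_bits <= local_max_bits;
}
```

Under these thresholds a difference of 100,000 bytes allows at most
BITS=13, and a file written with 16 bits fails the check on a machine
whose maximum is 12, which is the case "-b12" is meant to avoid.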

The output of compress 4.0 is fully compatible with that of compress 3.0.
In other words, the output of compress 4.0 may be fed into uncompress 3.0 or
the output of compress 3.0 may be fed into uncompress 4.0.

The output of compress 4.0 is not compatible with that of
compress 2.0.  However, compress 4.0 still accepts the output of
compress 2.0.  To generate output that is compatible with compress
2.0, use the undocumented "-C" flag.

        -from mod.sources, submitted by vax135!petsd!joe (Joe Orost), 8/1/85
--------------------------------

Enclosed is compress version 3.0 with the following changes:

1.      "Block" compression is performed.  After the BITS run out, the
        compression ratio is checked every so often.  If it is decreasing,
        the table is cleared and a new set of substrings is generated.

        This makes the output of compress 3.0 not compatible with that of
        compress 2.0.  However, compress 3.0 still accepts the output of
        compress 2.0.  To generate output that is compatible with compress
        2.0, use the undocumented "-C" flag.

2.      A quiet "-q" flag has been added for use by the news system.

3.      The character chaining has been deleted and the program now uses
        hashing.  This improves the speed of the program, especially
        during decompression.  Other speed improvements have been made,
        such as using putc() instead of fwrite().

4.      A large table is used on large machines when a relatively small
        number of bits is specified.  This saves much time when compressing
        for a 16-bit machine on a 32-bit virtual machine.  Note that the
        speed improvement only occurs when the input file is > 30000
        characters, and the -b BITS is less than or equal to the cutoff
        described below.

Most of these changes were made by James A. Woods (ames!jaw).  Thank you
James!

To compile compress:

        cc -O -DUSERMEM=usermem -o compress compress.c

Where "usermem" is the amount of physical user memory available (in bytes).
If any physical memory is to be reserved for other processes, put in
"-DSACREDMEM=sacredmem", where "sacredmem" is the amount to be reserved.

The difference "usermem-sacredmem" determines the maximum BITS that can be
specified, and the cutoff bits where the large+fast table is used.

memory: at least                BITS            cutoff
------  -- -----                ----            ------
   4,718,592                     16               13
   2,621,440                     16               12
   1,572,864                     16               11
   1,048,576                     16               10
     631,808                     16               --
     329,728                     15               --
     178,176                     14               --
      99,328                     13               --
           0                     12               --

The default memory size is 750,000 which gives a maximum BITS=16 and no
large+fast table.

The maximum bits can be overruled by specifying "-DBITS=bits" at
compilation time.

If your machine doesn't support unsigned characters, define "NO_UCHAR"
when compiling.

If your machine has "int" as 16-bits, define "SHORT_INT" when compiling.

After compilation, move "compress" to a standard executable location, such
as /usr/local.  Then:
        cd /usr/local
        ln compress uncompress
        ln compress zcat

On machines that have a fixed stack size (such as Perkin-Elmer), set the
stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).

Next, install the manual (compress.l).
        cp compress.l /usr/man/manl
        cd /usr/man/manl
        ln compress.l uncompress.l
        ln compress.l zcat.l

                - or -

        cp compress.l /usr/man/man1/compress.1
        cd /usr/man/man1
        ln compress.1 uncompress.1
        ln compress.1 zcat.1

                regards,
                petsd!joe

Here is a note from the net:

>From hplabs!pesnta!amd!turtlevax!ken Sat Jan  5 03:35:20 1985
Path: ames!hplabs!pesnta!amd!turtlevax!ken
From: ken@turtlevax.UUCP (Ken Turkowski)
Newsgroups: net.sources
Subject: Re: Compress release 3.0 : sample Makefile
Organization: CADLINC, Inc. @ Menlo Park, CA

In the compress 3.0 source recently posted to mod.sources, there is a
#define variable which can be set for optimum performance on a machine
with a large amount of memory.  A program (usermem) to calculate the
usable amount of physical user memory is enclosed, as well as a sample
4.2BSD Vax Makefile for compress.

Here is the README file from the previous version of compress (2.0):

>Enclosed is compress.c version 2.0 with the following bugs fixed:
>
>1.  The packed files produced by compress are different on different
>    machines and dependent on the vax sysgen option.
>       The bug was in the different byte/bit ordering on the
>       various machines.  This has been fixed.
>
>       This version is NOT compatible with the original vax posting
>       unless the '-DCOMPATIBLE' option is specified to the C
>       compiler.
>       The original posting has a bug which I fixed,
>       causing incompatible files.  I recommend you NOT to use this
>       option unless you already have a lot of packed files from
>       the original posting by Thomas.
>2.  The exit status is not well defined (on some machines) causing the
>    scripts to fail.
>       The exit status is now 0,1 or 2 and is documented in
>       compress.l.
>3.  The function getopt() is not available in all C libraries.
>       The function getopt() is no longer referenced by the
>       program.
>4.  Error status is not being checked on the fwrite() and fflush() calls.
>       Fixed.
>
>The following enhancements have been made:
>
>1.  Added facilities of "compact" into the compress program.  "Pack",
>    "Unpack", and "Pcat" are no longer required (no longer supplied).
>2.  Installed work around for C compiler bug with "-O".
>3.  Added a magic number header (\037\235).  Put the bits specified
>    in the file.
>4.  Added "-f" flag to force overwrite of output file.
>5.  Added "-c" flag and "zcat" program.  'ln compress zcat' after you
>    compile.
>6.  The 'uncompress' script has been deleted; simply
>    'ln compress uncompress' after you compile and it will work.
>7.  Removed extra bit masking for machines that support unsigned
>    characters.
>    If your machine doesn't support unsigned characters,
>    define "NO_UCHAR" when compiling.
>
>Compile "compress.c" with "-O -o compress" flags.  Move "compress" to a
>standard executable location, such as /usr/local.  Then:
>       cd /usr/local
>       ln compress uncompress
>       ln compress zcat
>
>On machines that have a fixed stack size (such as Perkin-Elmer), set the
>stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).
>
>Next, install the manual (compress.l).
>       cp compress.l /usr/man/manl             - or -
>       cp compress.l /usr/man/man1/compress.1
>
>Here is the README that I sent with my first posting:
>
>>Enclosed is a modified version of compress.c, along with scripts to make it
>>run identically to pack(1), unpack(1), and pcat(1).  Here is what I
>>(petsd!joe) and a colleague (petsd!peora!srd) did:
>>
>>1. Removed VAX dependencies.
>>2. Changed the struct to separate arrays; saves mucho memory.
>>3. Did comparisons in unsigned, where possible.  (Faster on Perkin-Elmer.)
>>4. Sorted the character next chain and changed the search to stop
>>prematurely.  This saves a lot on the execution time when compressing.
>>
>>This version is totally compatible with the original version.
>>Even though lint(1) -p has no complaints about compress.c, it won't run
>>on a 16-bit machine, due to the size of the arrays.
>>
>>Here is the README file from the original author:
>>
>>>Well, with all this discussion about file compression (for news batching
>>>in particular) going around, I decided to implement the text compression
>>>algorithm described in the June Computer magazine.  The author claimed
>>>blinding speed and good compression ratios.  It's certainly faster than
>>>compact (but, then, what wouldn't be), but it's also the same speed as
>>>pack, and gets better compression than both of them.  On 350K bytes of
>>>Unix-wizards, compact took about 8 minutes of CPU, pack took about 80
>>>seconds, and compress (herein) also took 80 seconds.  But, compact and
>>>pack got about 30% compression, whereas compress got over 50%.  So, I
>>>decided I had something, and that others might be interested, too.
>>>
>>>As is probably true of compact and pack (although I haven't checked),
>>>the byte order within a word is probably relevant here, but as long as
>>>you stay on a single machine type, you should be ok.  (Can anybody
>>>elucidate on this?)  There are a couple of asm's in the code (extv and
>>>insv instructions), so anyone porting it to another machine will have to
>>>deal with this anyway (and could probably make it compatible with Vax
>>>byte order at the same time).  Anyway, I've linted the code (both with
>>>and without -p), so it should run elsewhere.  Note the longs in the
>>>code, you can take these out if you reduce BITS to <= 15.
>>>
>>>Have fun, and as always, if you make good enhancements, or bug fixes,
>>>I'd like to see them.
>>>
>>>=Spencer   (thomas@utah-20, {harpo,hplabs,arizona}!utah-cs!thomas)
>>
>>                      regards,
>>                      joe
>>
>>--
>>Full-Name:  Joseph M. Orost
>>UUCP:       ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
>>US Mail:    MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ  07724
>>Phone:      (201) 870-5844