Compress version 4.0 improvements over 3.0:
    o compress() speedup (10-50%) by changing division hash to xor
    o decompress() speedup (5-10%)
    o Memory requirements reduced (3-30%)
    o Stack requirements reduced to less than 4kb
    o Removed 'Big+Fast' compress code (FBITS) because of compress speedup
    o Portability mods for Z8000 and PC/XT (but not zeus 3.2)
    o Default to 'quiet' mode
    o Unification of 'force' flags
    o Manual page overhaul
    o Portability enhancement for M_XENIX
    o Removed text on #else and #endif
    o Added "-V" switch to print version and options
    o Added #defines for SIGNED_COMPARE_SLOW
    o Added Makefile and "usermem" program
    o Removed all floating point computations
    o New programs: [deleted]

The "usermem" script attempts to determine the maximum process size.  Some
editing of the script may be necessary (see the comments).  [It should work
fine on 4.3 BSD.]  If you can't get it to work at all, just create file
"USERMEM" containing the maximum process size in decimal.

The following preprocessor symbols control the compilation of "compress.c":

    o USERMEM               Maximum process memory on the system
    o SACREDMEM             Amount to reserve for other processes
    o SIGNED_COMPARE_SLOW   Unsigned compare instructions are faster
    o NO_UCHAR              Don't use "unsigned char" types
    o BITS                  Overrules default set by USERMEM-SACREDMEM
    o vax                   Generate inline assembler
    o interdata             Defines SIGNED_COMPARE_SLOW
    o M_XENIX               Makes arrays < 65536 bytes each
    o pdp11                 BITS=12, NO_UCHAR
    o z8000                 BITS=12
    o pcxt                  BITS=12
    o BSD4_2                Allow long filenames ( > 14 characters) &
                            Call setlinebuf(stderr)

The difference "usermem-sacredmem" determines the maximum BITS that can be
specified with the "-b" flag.

memory: at least        BITS
------  -- -----        ----
     433,484             16
     229,600             15
     127,536             14
      73,464             13
           0             12

The default is BITS=16.

The maximum bits can be overruled by specifying "-DBITS=bits" at
compilation time.

WARNING: files compressed on a large machine with more bits than allowed by
a version of compress on a smaller machine cannot be decompressed!  Use the
"-b12" flag to generate a file on a large machine that can be uncompressed
on a 16-bit machine.

The output of compress 4.0 is fully compatible with that of compress 3.0.
In other words, the output of compress 4.0 may be fed into uncompress 3.0 or
the output of compress 3.0 may be fed into uncompress 4.0.

The output of compress 4.0 is not compatible with that of
compress 2.0.  However, compress 4.0 still accepts the output of
compress 2.0.  To generate output that is compatible with compress
2.0, use the undocumented "-C" flag.

    -from mod.sources, submitted by vax135!petsd!joe (Joe Orost), 8/1/85
--------------------------------

Enclosed is compress version 3.0 with the following changes:

1.  "Block" compression is performed.  After the BITS run out, the
    compression ratio is checked every so often.  If it is decreasing,
    the table is cleared and a new set of substrings is generated.

    This makes the output of compress 3.0 not compatible with that of
    compress 2.0.  However, compress 3.0 still accepts the output of
    compress 2.0.  To generate output that is compatible with compress
    2.0, use the undocumented "-C" flag.

2.  A quiet "-q" flag has been added for use by the news system.

3.  The character chaining has been deleted and the program now uses
    hashing.  This improves the speed of the program, especially
    during decompression.  Other speed improvements have been made,
    such as using putc() instead of fwrite().

4.  A large table is used on large machines when a relatively small
    number of bits is specified.  This saves much time when compressing
    for a 16-bit machine on a 32-bit virtual machine.  Note that the
    speed improvement only occurs when the input file is > 30000
    characters, and the -b BITS is less than or equal to the cutoff
    described below.

Most of these changes were made by James A. Woods (ames!jaw).  Thank you
James!

To compile compress:

    cc -O -DUSERMEM=usermem -o compress compress.c

Where "usermem" is the amount of physical user memory available (in bytes).
If any physical memory is to be reserved for other processes, put in
"-DSACREDMEM=sacredmem", where "sacredmem" is the amount to be reserved.

The difference "usermem-sacredmem" determines the maximum BITS that can be
specified, and the cutoff bits where the large+fast table is used.

memory: at least        BITS            cutoff
------  -- -----        ----            ------
   4,718,592             16               13
   2,621,440             16               12
   1,572,864             16               11
   1,048,576             16               10
     631,808             16               --
     329,728             15               --
     178,176             14               --
      99,328             13               --
           0             12               --
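
As a rough illustration of how the 3.0 table above is read (a sketch only,
not the code in compress.c): the names mem_row, mem_table and select_row
are hypothetical, "at least" is assumed to mean greater than or equal, and
a cutoff of 0 stands for the "--" entries.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the compress 3.0 table: with at least min_mem bytes of
 * (usermem - sacredmem), up to `bits` BITS may be used.  `cutoff` is the
 * largest -b value that gets the large+fast table; 0 stands for "--". */
struct mem_row {
    long min_mem;
    int  bits;
    int  cutoff;
};

static const struct mem_row mem_table[] = {
    { 4718592L, 16, 13 },
    { 2621440L, 16, 12 },
    { 1572864L, 16, 11 },
    { 1048576L, 16, 10 },
    {  631808L, 16,  0 },
    {  329728L, 15,  0 },
    {  178176L, 14,  0 },
    {   99328L, 13,  0 },
    {       0L, 12,  0 },
};

/* Hypothetical helper: pick the first row whose threshold is met. */
const struct mem_row *select_row(long usermem, long sacredmem)
{
    long avail = usermem - sacredmem;
    size_t n = sizeof mem_table / sizeof mem_table[0];
    size_t i;

    assert(avail >= 0);    /* reserving more than exists makes no sense */
    for (i = 0; i < n; i++)
        if (avail >= mem_table[i].min_mem)
            return &mem_table[i];
    return &mem_table[n - 1];    /* not reached: the last threshold is 0 */
}
```

With usermem=750000 and sacredmem=0, select_row() lands on the 631,808 row:
BITS=16 and no large+fast table.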

The default memory size is 750,000 which gives a maximum BITS=16 and no
large+fast table.

The maximum bits can be overruled by specifying "-DBITS=bits" at
compilation time.

If your machine doesn't support unsigned characters, define "NO_UCHAR"
when compiling.

If your machine has "int" as 16-bits, define "SHORT_INT" when compiling.

After compilation, move "compress" to a standard executable location, such
as /usr/local.  Then:
    cd /usr/local
    ln compress uncompress
    ln compress zcat

On machines that have a fixed stack size (such as Perkin-Elmer), set the
stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).

Next, install the manual (compress.l).
    cp compress.l /usr/man/manl
    cd /usr/man/manl
    ln compress.l uncompress.l
    ln compress.l zcat.l

        - or -

    cp compress.l /usr/man/man1/compress.1
    cd /usr/man/man1
    ln compress.1 uncompress.1
    ln compress.1 zcat.1

                    regards,
                    petsd!joe

Here is a note from the net:

>From hplabs!pesnta!amd!turtlevax!ken Sat Jan  5 03:35:20 1985
Path: ames!hplabs!pesnta!amd!turtlevax!ken
From: ken@turtlevax.UUCP (Ken Turkowski)
Newsgroups: net.sources
Subject: Re: Compress release 3.0 : sample Makefile
Organization: CADLINC, Inc. @ Menlo Park, CA

In the compress 3.0 source recently posted to mod.sources, there is a
#define variable which can be set for optimum performance on a machine
with a large amount of memory.  A program (usermem) to calculate the
usable amount of physical user memory is enclosed, as well as a sample
4.2BSD Vax Makefile for compress.

Here is the README file from the previous version of compress (2.0):

>Enclosed is compress.c version 2.0 with the following bugs fixed:
>
>1.  The packed files produced by compress are different on different
>    machines and dependent on the vax sysgen option.
>        The bug was in the different byte/bit ordering on the
>        various machines.  This has been fixed.
>
>        This version is NOT compatible with the original vax posting
>        unless the '-DCOMPATIBLE' option is specified to the C
>        compiler.  The original posting has a bug which I fixed,
>        causing incompatible files.  I recommend you NOT to use this
>        option unless you already have a lot of packed files from
>        the original posting by Thomas.
>2.  The exit status is not well defined (on some machines) causing the
>    scripts to fail.
>        The exit status is now 0,1 or 2 and is documented in
>        compress.l.
>3.  The function getopt() is not available in all C libraries.
>        The function getopt() is no longer referenced by the
>        program.
>4.  Error status is not being checked on the fwrite() and fflush() calls.
>        Fixed.
>
>The following enhancements have been made:
>
>1.  Added facilities of "compact" into the compress program.  "Pack",
>    "Unpack", and "Pcat" are no longer required (no longer supplied).
>2.  Installed work around for C compiler bug with "-O".
>3.  Added a magic number header (\037\235).  Put the bits specified
>    in the file.
>4.  Added "-f" flag to force overwrite of output file.
>5.  Added "-c" flag and "zcat" program.  'ln compress zcat' after you
>    compile.
>6.  The 'uncompress' script has been deleted; simply
>    'ln compress uncompress' after you compile and it will work.
>7.  Removed extra bit masking for machines that support unsigned
>    characters.  If your machine doesn't support unsigned characters,
>    define "NO_UCHAR" when compiling.
>
>Compile "compress.c" with "-O -o compress" flags.  Move "compress" to a
>standard executable location, such as /usr/local.  Then:
>    cd /usr/local
>    ln compress uncompress
>    ln compress zcat
>
>On machines that have a fixed stack size (such as Perkin-Elmer), set the
>stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).
>
>Next, install the manual (compress.l).
>    cp compress.l /usr/man/manl        - or -
>    cp compress.l /usr/man/man1/compress.1
>
>Here is the README that I sent with my first posting:
>
>>Enclosed is a modified version of compress.c, along with scripts to make it
>>run identically to pack(1), unpack(1), and pcat(1).  Here is what I
>>(petsd!joe) and a colleague (petsd!peora!srd) did:
>>
>>1. Removed VAX dependencies.
>>2. Changed the struct to separate arrays; saves mucho memory.
>>3. Did comparisons in unsigned, where possible.  (Faster on Perkin-Elmer.)
>>4. Sorted the character next chain and changed the search to stop
>>prematurely.  This saves a lot on the execution time when compressing.
>>
>>This version is totally compatible with the original version.  Even though
>>lint(1) -p has no complaints about compress.c, it won't run on a 16-bit
>>machine, due to the size of the arrays.
>>
>>Here is the README file from the original author:
>>
>>>Well, with all this discussion about file compression (for news batching
>>>in particular) going around, I decided to implement the text compression
>>>algorithm described in the June Computer magazine.  The author claimed
>>>blinding speed and good compression ratios.  It's certainly faster than
>>>compact (but, then, what wouldn't be), but it's also the same speed as
>>>pack, and gets better compression than both of them.  On 350K bytes of
>>>Unix-wizards, compact took about 8 minutes of CPU, pack took about 80
>>>seconds, and compress (herein) also took 80 seconds.  But, compact and
>>>pack got about 30% compression, whereas compress got over 50%.  So, I
>>>decided I had something, and that others might be interested, too.
>>>
>>>As is probably true of compact and pack (although I haven't checked),
>>>the byte order within a word is probably relevant here, but as long as
>>>you stay on a single machine type, you should be ok.  (Can anybody
>>>elucidate on this?)  There are a couple of asm's in the code (extv and
>>>insv instructions), so anyone porting it to another machine will have to
>>>deal with this anyway (and could probably make it compatible with Vax
>>>byte order at the same time).  Anyway, I've linted the code (both with
>>>and without -p), so it should run elsewhere.
>>>Note the longs in the code, you can take these out if you reduce BITS
>>>to <= 15.
>>>
>>>Have fun, and as always, if you make good enhancements, or bug fixes,
>>>I'd like to see them.
>>>
>>>=Spencer   (thomas@utah-20, {harpo,hplabs,arizona}!utah-cs!thomas)
>>
>>                    regards,
>>                    joe
>>
>>--
>>Full-Name:  Joseph M. Orost
>>UUCP:       ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
>>US Mail:    MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724
>>Phone:      (201) 870-5844
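
The 2.0 notes quoted above mention a magic number header (\037\235) followed
by the bits setting.  A minimal sketch of checking such a header follows.
The helper name parse_z_header and the macro names are hypothetical.  The
README says only that the bits are put in the file; the layout of the third
byte used here (low five bits for BITS, high bit for block compression) is
an assumption taken from the commonly described .Z layout, not from this
README.

```c
#include <assert.h>
#include <stddef.h>

#define Z_MAGIC_0    0x1f   /* \037 */
#define Z_MAGIC_1    0x9d   /* \235 */
#define Z_BIT_MASK   0x1f   /* assumed: low five bits hold BITS */
#define Z_BLOCK_FLAG 0x80   /* assumed: set when block compression is on */

/* Hypothetical helper.  Returns 0 and fills *bits and *block_mode when
 * buf starts with the magic number, -1 otherwise. */
int parse_z_header(const unsigned char *buf, size_t len,
                   int *bits, int *block_mode)
{
    assert(buf != NULL && bits != NULL && block_mode != NULL);
    if (len < 3 || buf[0] != Z_MAGIC_0 || buf[1] != Z_MAGIC_1)
        return -1;
    *bits = buf[2] & Z_BIT_MASK;
    *block_mode = (buf[2] & Z_BLOCK_FLAG) != 0;
    return 0;
}
```

A reader built with a smaller BITS limit could compare the returned value
against its own maximum and refuse the file, which is the situation the
4.0 WARNING describes.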