This is a patched version of zlib, modified to use
Pentium-Pro-optimized assembly code in the deflation algorithm. The
files changed/added by this patch are:

README.686
match.S

The speedup that this patch provides varies, depending on whether the
compiler used to build the original version of zlib falls afoul of the
PPro's speed traps. My own tests show a speedup of around 10-20% at
the default compression level, and 20-30% using -9, against a version
compiled using gcc 2.7.2.3. Your mileage may vary.

Note that this code has been tailored for the PPro/PII in particular,
and will not perform particularly well on a Pentium.

If you are using an assembler other than GNU as, you will have to
translate match.S to use your assembler's syntax. (Have fun.)

Brian Raiter
breadbox@muppetlabs.com
April, 1998


Added for zlib 1.1.3:

The patches come from
http://www.muppetlabs.com/~breadbox/software/assembly.html

To compile zlib with this asm file, copy match.S to the zlib directory
then do:

CFLAGS="-O3 -DASMV" ./configure
make OBJA=match.o

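A note for orientation: with -DASMV, zlib takes its longest_match()
routine from match.S instead of compiling the C version in deflate.c.
The following is only a simplified sketch, in C, of the kind of search
such a routine performs: walk a chain of earlier candidate positions
and keep the longest match. It is not zlib's code; find_longest_match,
match_length, prev, NO_POS and MAX_MATCH_LEN are names made up for
this illustration.

```c
#include <stddef.h>

#define MAX_MATCH_LEN 258          /* illustrative cap on match length */
#define NO_POS ((size_t)-1)        /* hypothetical "end of chain" marker */

/* Length of the common prefix of buf[a..] and buf[b..], assuming a < b < n. */
static size_t match_length(const unsigned char *buf, size_t n,
                           size_t a, size_t b)
{
    size_t len = 0;
    while (b + len < n && len < MAX_MATCH_LEN && buf[a + len] == buf[b + len])
        len++;
    return len;
}

/*
 * Examine at most max_chain earlier positions, starting at cand and
 * following prev[], and return the length of the longest match found
 * for the string starting at cur. prev[p] is assumed to hold an earlier
 * position worth trying after p, or NO_POS when the chain ends.
 * On return, *best_pos is the start of the best match, or NO_POS.
 */
size_t find_longest_match(const unsigned char *buf, size_t n, size_t cur,
                          size_t cand, const size_t *prev,
                          unsigned max_chain, size_t *best_pos)
{
    size_t best_len = 0;
    *best_pos = NO_POS;
    while (cand != NO_POS && max_chain-- > 0) {
        size_t len = match_length(buf, n, cand, cur);
        if (len > best_len) {      /* keep only strictly longer matches */
            best_len = len;
            *best_pos = cand;
        }
        cand = prev[cand];         /* step to the next older candidate */
    }
    return best_len;
}
```

In this sketch max_chain bounds how many candidates are examined, so a
larger bound means more time spent in this one loop.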

Update:

I've been ignoring these assembly routines for years, believing that
gcc's generated code had caught up with them sometime around gcc 2.95
and the major rearchitecting of the Pentium 4. However, I recently
learned that, despite what I believed, this code still has some life
in it. On the Pentium 4 and AMD64 chips, it continues to run about 8%
faster than the code produced by gcc 4.1.

In acknowledgement of its continuing usefulness, I've altered the
license to match that of the rest of zlib. Share and Enjoy!

Brian Raiter
breadbox@muppetlabs.com
April, 2007