This is a patched version of zlib, modified to use
Pentium-Pro-optimized assembly code in the deflation algorithm. The
files changed/added by this patch are:

README.686
match.S

The speedup that this patch provides varies, depending on whether the
compiler used to build the original version of zlib falls afoul of the
PPro's speed traps. My own tests show a speedup of around 10-20% at
the default compression level, and 20-30% using -9, against a version
compiled using gcc 2.7.2.3. Your mileage may vary.

Note that this code has been tailored for the PPro/PII in particular,
and will not perform particularly well on a Pentium.

If you are using an assembler other than GNU as, you will have to
translate match.S to use your assembler's syntax. (Have fun.)

Brian Raiter
breadbox@muppetlabs.com
April, 1998


Added for zlib 1.1.3:

The patches come from
http://www.muppetlabs.com/~breadbox/software/assembly.html

To compile zlib with this asm file, copy match.S to the zlib directory,
then do:

CFLAGS="-O3 -DASMV" ./configure
make OBJA=match.o

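To see whether the assembly version pays off on a particular machine
and compiler, the two builds can be timed against each other. What
follows is only a rough sketch and not part of the patch: the
directory names zlib-c and zlib-asm and the input file sample.dat are
placeholders invented for this example, and the bench helper is
hypothetical. It relies on the minigzip program that zlib's make
builds alongside the library, and it reports whole seconds only, so
pick an input large enough for the difference to show.

```shell
#!/bin/sh
# Rough timing comparison between two zlib builds (a sketch only).
# Assumptions: ./zlib-c holds a plain C build and ./zlib-asm holds a
# build made with -DASMV and OBJA=match.o. Both names are placeholders.

# bench DIR LEVEL FILE [RUNS]: print the whole seconds needed for RUNS
# compressions of FILE at LEVEL using DIR/minigzip (default: 10 runs).
bench() {
    dir=$1 level=$2 file=$3 runs=${4:-10}
    if [ ! -x "$dir/minigzip" ]; then
        echo "skip $dir: no minigzip"
        return 0
    fi
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$runs" ]; do
        # minigzip reads stdin and writes stdout when given no file names
        "$dir/minigzip" "-$level" < "$file" > /dev/null
        i=$((i + 1))
    done
    end=$(date +%s)
    echo "$dir level $level: $((end - start))s for $runs runs"
}

# Compare both builds at the default level (6) and at -9.
for level in 6 9; do
    bench ./zlib-c "$level" sample.dat
    bench ./zlib-asm "$level" sample.dat
done
```

Level 6 is zlib's default compression level, and level 9 corresponds
to the -9 figure quoted earlier. A build that is missing is skipped
rather than treated as an error.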

Update:

I've been ignoring these assembly routines for years, believing that
gcc's generated code had caught up with it sometime around gcc 2.95
and the major rearchitecting of the Pentium 4. However, I recently
learned that, despite what I believed, this code still has some life
in it. On the Pentium 4 and AMD64 chips, it continues to run about 8%
faster than the code produced by gcc 4.1.

In acknowledgement of its continuing usefulness, I've altered the
license to match that of the rest of zlib. Share and Enjoy!

Brian Raiter
breadbox@muppetlabs.com
April, 2007
