    compress/flate: optimize history-copy decoding. · 4de15a5c
    Nigel Tao authored
    The forwardCopy function could be re-written in asm, and the copyHuff
    method could probably be rolled into huffmanBlock and copyHist, but
    I'm leaving those changes for future CLs.
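
For readers unfamiliar with history copies: a DEFLATE match may overlap the bytes it is producing (the distance can be smaller than the length), so the copy has to proceed strictly forward, one byte at a time. The following is a minimal sketch of such a forward copy, not necessarily the code in this CL. The body of forwardCopy and the hist buffer in main are illustrative assumptions.

```go
package main

import "fmt"

// forwardCopy copies src into dst one byte at a time, always moving forward
// from the start. If dst overlaps src, bytes written earlier in the call are
// read again later, which is what an overlapping LZ77 match requires.
// (Sketch only; the function in the CL may differ.)
func forwardCopy(dst, src []byte) int {
	if len(src) > len(dst) {
		src = src[:len(dst)]
	}
	for i, x := range src {
		dst[i] = x
	}
	return len(src)
}

func main() {
	// hist is a hypothetical history buffer: 2 decoded bytes, 6 still unfilled.
	hist := []byte{'a', 'b', 0, 0, 0, 0, 0, 0}
	// A match with distance 2 and length 6 repeats "ab" three times.
	n := forwardCopy(hist[2:], hist[:6])
	fmt.Println(n, string(hist)) // prints "6 abababab"
}
```

The built-in copy cannot be used for this case: it handles overlapping slices the way memmove does, so it would reproduce the original contents of src rather than the repeating pattern.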
    
    compress/flate benchmarks:
    benchmark                                 old ns/op    new ns/op    delta
    BenchmarkDecoderBestSpeed1K                  385327       435140  +12.93%
    BenchmarkDecoderBestSpeed10K                1245190      1062112  -14.70%
    BenchmarkDecoderBestSpeed100K               8512365      5833680  -31.47%
    BenchmarkDecoderDefaultCompression1K         382225       421301  +10.22%
    BenchmarkDecoderDefaultCompression10K        867950       613890  -29.27%
    BenchmarkDecoderDefaultCompression100K      5658240      2466726  -56.40%
    BenchmarkDecoderBestCompression1K            383760       421634   +9.87%
    BenchmarkDecoderBestCompression10K           867743       614671  -29.16%
    BenchmarkDecoderBestCompression100K         5660160      2464996  -56.45%
    
    image/png benchmarks:
    benchmark                       old ns/op    new ns/op    delta
    BenchmarkDecodeGray               2540834      2389624   -5.95%
    BenchmarkDecodeNRGBAGradient     10052700      9534565   -5.15%
    BenchmarkDecodeNRGBAOpaque        8704710      8163430   -6.22%
    BenchmarkDecodePaletted           1458779      1325017   -9.17%
    BenchmarkDecodeRGB                7183606      6794668   -5.41%
    
    Wall time for Denis Cheremisov's PNG-decoding program given in
    https://groups.google.com/group/golang-nuts/browse_thread/thread/22aa8a05040fdd49
    Before: 3.07s
    After:  2.32s
    Delta:  -24%
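
The delta columns and the wall-time delta above are consistent with the usual relative change, (new - old) / old. A small sketch for checking the figures, where deltaPercent is a hypothetical helper and not part of any Go tool:

```go
package main

import "fmt"

// deltaPercent returns the relative change from oldVal to newVal in percent.
// Negative values mean the new measurement is faster.
func deltaPercent(oldVal, newVal float64) float64 {
	return (newVal - oldVal) / oldVal * 100
}

func main() {
	// BenchmarkDecoderBestSpeed1K: 385327 ns/op before, 435140 ns/op after.
	fmt.Printf("%+.2f%%\n", deltaPercent(385327, 435140))
	// Wall time of the PNG-decoding program: 3.07s before, 2.32s after.
	fmt.Printf("%+.0f%%\n", deltaPercent(3.07, 2.32))
}
```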
    
    Before profile:
    Total: 304 samples
             159  52.3%  52.3%      251  82.6% compress/flate.(*decompressor).huffmanBlock
              58  19.1%  71.4%       76  25.0% compress/flate.(*decompressor).huffSym
              32  10.5%  81.9%       32  10.5% hash/adler32.update
              16   5.3%  87.2%       22   7.2% bufio.(*Reader).ReadByte
              16   5.3%  92.4%       37  12.2% compress/flate.(*decompressor).moreBits
               7   2.3%  94.7%        7   2.3% hash/crc32.update
               7   2.3%  97.0%        7   2.3% runtime.memmove
               5   1.6%  98.7%        5   1.6% scanblock
               2   0.7%  99.3%        9   3.0% runtime.copy
               1   0.3%  99.7%        1   0.3% compress/flate.(*huffmanDecoder).init
    
    After profile:
    Total: 230 samples
              59  25.7%  25.7%       70  30.4% compress/flate.(*decompressor).huffSym
              45  19.6%  45.2%       45  19.6% hash/adler32.update
              35  15.2%  60.4%       35  15.2% compress/flate.forwardCopy
              20   8.7%  69.1%      151  65.7% compress/flate.(*decompressor).huffmanBlock
              16   7.0%  76.1%       24  10.4% compress/flate.(*decompressor).moreBits
              15   6.5%  82.6%       15   6.5% runtime.memmove
              11   4.8%  87.4%       50  21.7% compress/flate.(*decompressor).copyHist
               7   3.0%  90.4%        7   3.0% hash/crc32.update
               6   2.6%  93.0%        9   3.9% bufio.(*Reader).ReadByte
               4   1.7%  94.8%        4   1.7% runtime.slicearray
    
    R=rsc, rogpeppe, dave
    CC=golang-dev, krasin
    https://golang.org/cl/6127064