1. 01 May, 2012 1 commit
    • compress/flate: optimize history-copy decoding. · 4de15a5c
      Nigel Tao authored
      The forwardCopy function could be re-written in asm, and the copyHuff
      method could probably be rolled into huffmanBlock and copyHist, but
      I'm leaving those changes for future CLs.
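
For readers unfamiliar with why a dedicated forwardCopy is needed at all: in DEFLATE, a length/distance back-reference may overlap the bytes it is producing (distance smaller than length), and the copy must then proceed strictly front to back so that freshly written bytes are re-read as input. The sketch below illustrates that behavior under stated assumptions. The function body is a guess at the semantics implied by the name, not the code from this CL, and the buffer contents are made up for the demonstration.

```go
package main

import "fmt"

// forwardCopy is an illustrative sketch (an assumption, not the CL's code):
// it copies src into dst one byte at a time, always from the start, even
// when the two slices overlap. It returns the number of bytes copied.
func forwardCopy(dst, src []byte) int {
	if len(src) > len(dst) {
		src = src[:len(dst)]
	}
	for i := range src {
		// Reading src[i] at copy time means bytes written earlier in
		// this loop are visible when dst and src share memory.
		dst[i] = src[i]
	}
	return len(src)
}

func main() {
	// History holds "ab"; a back-reference with distance 2 and length 6
	// should repeat the pattern: "abababab".
	a := []byte("ab______")
	n := forwardCopy(a[2:], a[:6])
	fmt.Println(n, string(a)) // prints: 6 abababab

	// The built-in copy behaves like memmove on overlapping slices, so
	// it preserves the original source bytes instead of repeating them.
	b := []byte("ab______")
	copy(b[2:], b[:6])
	fmt.Println(string(b)) // prints: abab____
}
```

The byte-at-a-time loop is the simplest way to get the repeating-pattern behavior; it is also the kind of loop the message suggests could later be rewritten in asm.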
      
      compress/flate benchmarks:
      benchmark                                 old ns/op    new ns/op    delta
      BenchmarkDecoderBestSpeed1K                  385327       435140  +12.93%
      BenchmarkDecoderBestSpeed10K                1245190      1062112  -14.70%
      BenchmarkDecoderBestSpeed100K               8512365      5833680  -31.47%
      BenchmarkDecoderDefaultCompression1K         382225       421301  +10.22%
      BenchmarkDecoderDefaultCompression10K        867950       613890  -29.27%
      BenchmarkDecoderDefaultCompression100K      5658240      2466726  -56.40%
      BenchmarkDecoderBestCompression1K            383760       421634   +9.87%
      BenchmarkDecoderBestCompression10K           867743       614671  -29.16%
      BenchmarkDecoderBestCompression100K         5660160      2464996  -56.45%
      
      image/png benchmarks:
      benchmark                       old ns/op    new ns/op    delta
      BenchmarkDecodeGray               2540834      2389624   -5.95%
      BenchmarkDecodeNRGBAGradient     10052700      9534565   -5.15%
      BenchmarkDecodeNRGBAOpaque        8704710      8163430   -6.22%
      BenchmarkDecodePaletted           1458779      1325017   -9.17%
      BenchmarkDecodeRGB                7183606      6794668   -5.41%
      
      Wall time for Denis Cheremisov's PNG-decoding program given in
      https://groups.google.com/group/golang-nuts/browse_thread/thread/22aa8a05040fdd49
      Before: 3.07s
      After:  2.32s
      Delta:  -24%
      
      Before profile:
      Total: 304 samples
               159  52.3%  52.3%      251  82.6% compress/flate.(*decompressor).huffmanBlock
                58  19.1%  71.4%       76  25.0% compress/flate.(*decompressor).huffSym
                32  10.5%  81.9%       32  10.5% hash/adler32.update
                16   5.3%  87.2%       22   7.2% bufio.(*Reader).ReadByte
                16   5.3%  92.4%       37  12.2% compress/flate.(*decompressor).moreBits
                 7   2.3%  94.7%        7   2.3% hash/crc32.update
                 7   2.3%  97.0%        7   2.3% runtime.memmove
                 5   1.6%  98.7%        5   1.6% scanblock
                 2   0.7%  99.3%        9   3.0% runtime.copy
                 1   0.3%  99.7%        1   0.3% compress/flate.(*huffmanDecoder).init
      
      After profile:
      Total: 230 samples
                59  25.7%  25.7%       70  30.4% compress/flate.(*decompressor).huffSym
                45  19.6%  45.2%       45  19.6% hash/adler32.update
                35  15.2%  60.4%       35  15.2% compress/flate.forwardCopy
                20   8.7%  69.1%      151  65.7% compress/flate.(*decompressor).huffmanBlock
                16   7.0%  76.1%       24  10.4% compress/flate.(*decompressor).moreBits
                15   6.5%  82.6%       15   6.5% runtime.memmove
                11   4.8%  87.4%       50  21.7% compress/flate.(*decompressor).copyHist
                 7   3.0%  90.4%        7   3.0% hash/crc32.update
                 6   2.6%  93.0%        9   3.9% bufio.(*Reader).ReadByte
                 4   1.7%  94.8%        4   1.7% runtime.slicearray
      
      R=rsc, rogpeppe, dave
      CC=golang-dev, krasin
      https://golang.org/cl/6127064
  2. 30 Apr, 2012 7 commits
  3. 29 Apr, 2012 1 commit
  4. 27 Apr, 2012 10 commits
  5. 26 Apr, 2012 13 commits
  6. 25 Apr, 2012 8 commits