    compress/flate: Performance improvement for inflate · ebf35167
    Raph Levien authored
    Decode as much as possible of a Huffman symbol in a single table
    lookup (much like the zlib implementation). When the code prefix
    indicates that more bits are needed, fill more bits conservatively,
    so we don't consume past the end of the stream. This results in
    about a 50% speedup in the decode benchmarks (1.47x to 1.66x).
    The following benchcmp results are from a Retina MacBook Pro:
    
    benchmark                            old MB/s     new MB/s  speedup
    BenchmarkDecodeDigitsSpeed1e4           28.41        42.79    1.51x
    BenchmarkDecodeDigitsSpeed1e5           30.18        47.62    1.58x
    BenchmarkDecodeDigitsSpeed1e6           30.81        48.14    1.56x
    BenchmarkDecodeDigitsDefault1e4         30.28        44.61    1.47x
    BenchmarkDecodeDigitsDefault1e5         32.18        51.94    1.61x
    BenchmarkDecodeDigitsDefault1e6         35.57        53.28    1.50x
    BenchmarkDecodeDigitsCompress1e4        30.39        44.83    1.48x
    BenchmarkDecodeDigitsCompress1e5        33.05        51.64    1.56x
    BenchmarkDecodeDigitsCompress1e6        35.69        53.04    1.49x
    BenchmarkDecodeTwainSpeed1e4            25.90        43.04    1.66x
    BenchmarkDecodeTwainSpeed1e5            29.97        48.19    1.61x
    BenchmarkDecodeTwainSpeed1e6            31.36        49.43    1.58x
    BenchmarkDecodeTwainDefault1e4          28.79        45.02    1.56x
    BenchmarkDecodeTwainDefault1e5          37.12        55.65    1.50x
    BenchmarkDecodeTwainDefault1e6          39.28        58.16    1.48x
    BenchmarkDecodeTwainCompress1e4         28.64        44.90    1.57x
    BenchmarkDecodeTwainCompress1e5         37.40        55.98    1.50x
    BenchmarkDecodeTwainCompress1e6         39.35        58.06    1.48x
    
    R=rsc, dave, minux.ma, bradfitz, nigeltao
    CC=golang-dev
    https://golang.org/cl/6872063
flate_test.go 716 Bytes