    crypto/md5: native arm assembler version · 085159da
    Nick Craig-Wood authored
    Adds an ARM assembly version of md5block.go, improving
    throughput by up to 2.58x and reducing object size by 21%.
    
    Code size
    
      Before 3100 bytes
      After 2424 bytes
      21% smaller
    
    Benchmarks on Raspberry Pi
    
    benchmark                       old ns/op    new ns/op    delta
    BenchmarkHash8Bytes                 11703         6636  -43.30%
    BenchmarkHash1K                     38057        21881  -42.50%
    BenchmarkHash8K                    208131       142735  -31.42%
    BenchmarkHash8BytesUnaligned        11457         6570  -42.66%
    BenchmarkHash1KUnaligned            69334        26841  -61.29%
    BenchmarkHash8KUnaligned           455120       182223  -59.96%
    
    benchmark                        old MB/s     new MB/s  speedup
    BenchmarkHash8Bytes                  0.68         1.21    1.78x
    BenchmarkHash1K                     26.91        46.80    1.74x
    BenchmarkHash8K                     39.36        57.39    1.46x
    BenchmarkHash8BytesUnaligned         0.70         1.22    1.74x
    BenchmarkHash1KUnaligned            14.77        38.15    2.58x
    BenchmarkHash8KUnaligned            18.00        44.96    2.50x
    
    benchmark                      old allocs   new allocs    delta
    BenchmarkHash8Bytes                     1            0  -100.00%
    BenchmarkHash1K                         2            0  -100.00%
    BenchmarkHash8K                         2            0  -100.00%
    BenchmarkHash8BytesUnaligned            1            0  -100.00%
    BenchmarkHash1KUnaligned                2            0  -100.00%
    BenchmarkHash8KUnaligned                2            0  -100.00%
    
    benchmark                       old bytes    new bytes    delta
    BenchmarkHash8Bytes                    64            0  -100.00%
    BenchmarkHash1K                       128            0  -100.00%
    BenchmarkHash8K                       128            0  -100.00%
    BenchmarkHash8BytesUnaligned           64            0  -100.00%
    BenchmarkHash1KUnaligned              128            0  -100.00%
    BenchmarkHash8KUnaligned              128            0  -100.00%
    
    This also adds another test which makes sure that the sums
    over larger blocks work properly. I wrote this test when I was
    worried about memory corruption.
    
    R=golang-dev, dave, bradfitz, rsc, ajstarks
    CC=golang-dev, minux.ma, remyoudompheng
    https://golang.org/cl/11648043
gen.go 6.16 KB