• Meng Zhuo's avatar
    crypto/md5: add assembly implementation on arm64 · b834cd9a
    Meng Zhuo authored
    This change improves the performance of the block
    function used within crypto/md5 on arm64.  The following
    improvement was seen:
    
    name                 old time/op    new time/op     delta
    Hash8Bytes             1.62µs ± 0%     0.85µs ± 0%  -47.83%  (p=0.000 n=8+10)
    Hash1K                 8.82µs ± 0%     7.10µs ± 3%  -19.52%  (p=0.000 n=10+10)
    Hash8K                 59.5µs ± 2%     50.2µs ± 0%  -15.63%  (p=0.000 n=9+10)
    Hash8BytesUnaligned    1.63µs ± 0%     0.85µs ± 0%  -47.92%  (p=0.000 n=9+10)
    Hash1KUnaligned        14.1µs ± 0%      7.4µs ± 0%  -47.45%  (p=0.000 n=9+10)
    Hash8KUnaligned         101µs ± 0%       53µs ± 0%  -47.57%  (p=0.000 n=10+10)
    
    name                 old speed      new speed       delta
    Hash8Bytes           4.93MB/s ± 0%   9.44MB/s ± 0%  +91.61%  (p=0.000 n=9+10)
    Hash1K                116MB/s ± 0%    144MB/s ± 3%  +24.28%  (p=0.000 n=10+10)
    Hash8K                138MB/s ± 2%    163MB/s ± 0%  +18.52%  (p=0.000 n=9+10)
    Hash8BytesUnaligned  4.92MB/s ± 0%   9.44MB/s ± 0%  +92.04%  (p=0.000 n=9+10)
    Hash1KUnaligned      72.8MB/s ± 0%  138.6MB/s ± 0%  +90.29%  (p=0.000 n=9+8)
    Hash8KUnaligned      80.9MB/s ± 0%  154.2MB/s ± 0%  +90.71%  (p=0.000 n=10+10)
    
    Change-Id: I9e121a3132ff1b59e30f2de64e46106269065ecd
    Reviewed-on: https://go-review.googlesource.com/101875Reviewed-by: 's avatarCherry Zhang <cherryyz@google.com>
    b834cd9a
md5block_arm64.s 4.11 KB