• templexxx's avatar
    crypto/cipher: use SIMD for xor on amd64 · 5168fcf6
    templexxx authored
    cpu: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
    
    Benchmark: xor
    
    name                   old time/op    new time/op     delta
    XORBytes/8Bytes-8        8.21ns ± 1%     6.35ns ± 3%   -22.66%  (p=0.008 n=5+5)
    XORBytes/128Bytes-8      17.9ns ± 1%     10.4ns ± 1%   -41.68%  (p=0.008 n=5+5)
    XORBytes/2048Bytes-8      187ns ± 1%       78ns ± 0%   -58.44%  (p=0.008 n=5+5)
    XORBytes/32768Bytes-8    2.87µs ± 1%     1.38µs ± 0%   -52.05%  (p=0.008 n=5+5)
    
    name                   old speed      new speed       delta
    XORBytes/8Bytes-8       974MB/s ± 1%   1260MB/s ± 2%   +29.33%  (p=0.008 n=5+5)
    XORBytes/128Bytes-8    7.15GB/s ± 0%  12.25GB/s ± 1%   +71.17%  (p=0.008 n=5+5)
    XORBytes/2048Bytes-8   10.9GB/s ± 1%   26.4GB/s ± 0%  +140.99%  (p=0.008 n=5+5)
    XORBytes/32768Bytes-8  11.4GB/s ± 1%   23.8GB/s ± 0%  +108.52%  (p=0.008 n=5+5)
    
    Benchmark: cipher
    
    name               old time/op    new time/op    delta
    AESGCMSeal1K-8        269ns ± 6%     261ns ± 2%     ~     (p=0.246 n=5+5)
    AESGCMOpen1K-8        242ns ± 1%     240ns ± 2%     ~     (p=0.190 n=5+5)
    AESGCMSign8K-8        869ns ± 0%     870ns ± 1%     ~     (p=0.683 n=5+5)
    AESGCMSeal8K-8       1.64µs ± 6%    1.59µs ± 7%     ~     (p=0.151 n=5+5)
    AESGCMOpen8K-8       1.48µs ± 2%    1.46µs ± 0%   -1.39%  (p=0.008 n=5+5)
    AESCFBEncrypt1K-8    1.88µs ± 5%    1.62µs ± 1%  -13.52%  (p=0.008 n=5+5)
    AESCFBDecrypt1K-8    1.76µs ± 1%    1.58µs ± 1%  -10.24%  (p=0.016 n=4+5)
    AESOFB1K-8           1.10µs ± 4%    1.03µs ± 2%   -6.36%  (p=0.008 n=5+5)
    AESCTR1K-8           1.24µs ± 1%    1.17µs ± 0%   -5.96%  (p=0.008 n=5+5)
    AESCBCEncrypt1K-8    1.74µs ± 0%    1.14µs ± 1%  -34.36%  (p=0.008 n=5+5)
    AESCBCDecrypt1K-8    1.28µs ± 1%    1.10µs ± 1%  -14.04%  (p=0.008 n=5+5)
    
    name               old speed      new speed      delta
    AESGCMSeal1K-8     3.81GB/s ± 6%  3.91GB/s ± 2%     ~     (p=0.310 n=5+5)
    AESGCMOpen1K-8     4.23GB/s ± 1%  4.27GB/s ± 2%     ~     (p=0.222 n=5+5)
    AESGCMSign8K-8     9.43GB/s ± 0%  9.41GB/s ± 1%     ~     (p=0.841 n=5+5)
    AESGCMSeal8K-8     5.01GB/s ± 6%  5.16GB/s ± 6%     ~     (p=0.151 n=5+5)
    AESGCMOpen8K-8     5.54GB/s ± 2%  5.62GB/s ± 0%   +1.41%  (p=0.008 n=5+5)
    AESCFBEncrypt1K-8   543MB/s ± 5%   627MB/s ± 1%  +15.55%  (p=0.008 n=5+5)
    AESCFBDecrypt1K-8   580MB/s ± 1%   646MB/s ± 1%  +11.40%  (p=0.016 n=4+5)
    AESOFB1K-8          925MB/s ± 4%   988MB/s ± 2%   +6.73%  (p=0.008 n=5+5)
    AESCTR1K-8          821MB/s ± 1%   873MB/s ± 1%   +6.34%  (p=0.008 n=5+5)
    AESCBCEncrypt1K-8   588MB/s ± 1%   897MB/s ± 1%  +52.36%  (p=0.008 n=5+5)
    AESCBCDecrypt1K-8   799MB/s ± 1%   929MB/s ± 1%  +16.32%  (p=0.008 n=5+5)
    
    Change-Id: I42e6ba66c23dad853d33c924fca7b0ed805cefdd
    Reviewed-on: https://go-review.googlesource.com/c/125316Reviewed-by: 's avatarIlya Tocar <ilya.tocar@intel.com>
    Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    5168fcf6
xor_amd64.s 1.31 KB