• Balaram Makam's avatar
    cmd/compile: intrinsify math/big.mulWW on ARM64 · d7c7d88b
    Balaram Makam authored
    Performance numbers on amberwing:
    
    pkg: math/big
    name                            old time/op    new time/op    delta
    QuoRem                            3.08µs ± 0%    2.93µs ± 1%   -4.89%  (p=0.008 n=5+5)
    ModSqrt225_Tonelli                 721µs ± 0%     718µs ± 0%   -0.46%  (p=0.008 n=5+5)
    ModSqrt224_3Mod4                   218µs ± 0%     217µs ± 0%   -0.27%  (p=0.008 n=5+5)
    ModSqrt5430_Tonelli                2.91s ± 0%     2.91s ± 0%     ~     (p=0.222 n=5+5)
    ModSqrt5430_3Mod4                  970ms ± 0%     970ms ± 0%     ~     (p=0.151 n=5+5)
    Sqrt                              45.9µs ± 0%    43.8µs ± 0%   -4.63%  (p=0.008 n=5+5)
    IntSqr/1                          19.9ns ± 0%    17.3ns ± 0%  -13.07%  (p=0.008 n=5+5)
    IntSqr/2                          52.6ns ± 0%    50.8ns ± 0%   -3.35%  (p=0.008 n=5+5)
    IntSqr/3                          70.4ns ± 0%    69.4ns ± 0%     ~     (p=0.079 n=4+5)
    IntSqr/5                           103ns ± 0%      99ns ± 0%   -3.98%  (p=0.008 n=5+5)
    IntSqr/8                           179ns ± 0%     178ns ± 0%   -0.56%  (p=0.008 n=5+5)
    IntSqr/10                          272ns ± 0%     272ns ± 0%     ~     (all equal)
    IntSqr/20                          763ns ± 0%     787ns ± 0%   +3.15%  (p=0.016 n=5+4)
    IntSqr/30                         1.25µs ± 1%    1.29µs ± 1%   +3.27%  (p=0.008 n=5+5)
    IntSqr/50                         2.64µs ± 0%    2.71µs ± 0%   +2.61%  (p=0.008 n=5+5)
    IntSqr/80                         5.67µs ± 0%    5.72µs ± 0%   +0.88%  (p=0.008 n=5+5)
    IntSqr/100                        8.05µs ± 0%    8.09µs ± 0%   +0.45%  (p=0.008 n=5+5)
    IntSqr/200                        28.0µs ± 0%    28.1µs ± 0%     ~     (p=0.151 n=5+5)
    IntSqr/300                        59.4µs ± 0%    59.6µs ± 0%   +0.36%  (p=0.008 n=5+5)
    IntSqr/500                         141µs ± 0%     141µs ± 0%   +0.08%  (p=0.008 n=5+5)
    IntSqr/800                         280µs ± 0%     280µs ± 0%   -0.12%  (p=0.008 n=5+5)
    IntSqr/1000                        429µs ± 0%     428µs ± 0%   -0.27%  (p=0.008 n=5+5)
    
    pkg: crypto-ecdsa
    name      old time/op    new time/op    delta
    SignP384    7.85ms ± 1%    7.61ms ± 1%  -3.12%  (p=0.008 n=5+5)
    
    Change-Id: I1ab30856cc0e570f6312f0bd8914779b55adbc16
    Reviewed-on: https://go-review.googlesource.com/104135Reviewed-by: 's avatarCherry Zhang <cherryyz@google.com>
    Run-TryBot: Cherry Zhang <cherryyz@google.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    d7c7d88b
rewriteARM64.go 472 KB