• Carlos Eduardo Seo's avatar
    cmd/compile: intrinsify math/big.mulWW on ppc64x · 1e8ecefc
    Carlos Eduardo Seo authored
    This change implements mulWW as an intrinsic for ppc64x. Performance
    numbers below:
    
    name                            old time/op    new time/op    delta
    QuoRem                            4.54µs ±45%    3.22µs ± 0%  -29.22%  (p=0.029 n=4+4)
    ModSqrt225_Tonelli                 765µs ± 3%     757µs ± 0%   -1.02%  (p=0.029 n=4+4)
    ModSqrt225_3Mod4                   231µs ± 0%     231µs ± 0%   -0.10%  (p=0.029 n=4+4)
    ModSqrt231_Tonelli                 789µs ± 0%     788µs ± 0%   -0.14%  (p=0.029 n=4+4)
    ModSqrt231_5Mod8                   267µs ± 0%     267µs ± 0%   -0.13%  (p=0.029 n=4+4)
    Sqrt                              49.5µs ±17%    45.3µs ± 0%   -8.48%  (p=0.029 n=4+4)
    IntSqr/1                          32.2ns ±22%    24.2ns ± 0%  -24.79%  (p=0.029 n=4+4)
    IntSqr/2                          60.6ns ± 0%    60.9ns ± 0%   +0.50%  (p=0.029 n=4+4)
    IntSqr/3                          82.8ns ± 0%    83.3ns ± 0%   +0.51%  (p=0.029 n=4+4)
    IntSqr/5                           122ns ± 0%     121ns ± 0%   -1.22%  (p=0.029 n=4+4)
    IntSqr/8                           227ns ± 0%     226ns ± 0%   -0.44%  (p=0.029 n=4+4)
    IntSqr/10                          300ns ± 0%     298ns ± 0%   -0.67%  (p=0.029 n=4+4)
    IntSqr/20                         1.02µs ± 0%    0.89µs ± 0%  -13.08%  (p=0.029 n=4+4)
    IntSqr/30                         1.73µs ± 0%    1.51µs ± 0%  -12.73%  (p=0.029 n=4+4)
    IntSqr/50                         3.69µs ± 1%    3.29µs ± 0%  -10.70%  (p=0.029 n=4+4)
    IntSqr/80                         7.64µs ± 0%    7.04µs ± 0%   -7.91%  (p=0.029 n=4+4)
    IntSqr/100                        11.1µs ± 0%    10.3µs ± 0%   -7.04%  (p=0.029 n=4+4)
    IntSqr/200                        37.9µs ± 0%    36.4µs ± 0%   -4.13%  (p=0.029 n=4+4)
    IntSqr/300                        69.4µs ± 0%    66.0µs ± 0%   -4.94%  (p=0.029 n=4+4)
    IntSqr/500                         174µs ± 0%     168µs ± 0%   -3.10%  (p=0.029 n=4+4)
    IntSqr/800                         347µs ± 0%     333µs ± 0%   -4.06%  (p=0.029 n=4+4)
    IntSqr/1000                        524µs ± 0%     507µs ± 0%   -3.21%  (p=0.029 n=4+4)
    
    Change-Id: If067452f5b6579ad3a2e9daa76a7ffe6fceae1bb
    Reviewed-on: https://go-review.googlesource.com/c/143217
    Run-TryBot: Giovanni Bajo <rasky@develer.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: 's avatarGiovanni Bajo <rasky@develer.com>
    1e8ecefc
ssa.go 182 KB