    math/big: improve performance on ppc64x by unrolling loops · fc8967e3
    Carlos Eduardo Seo authored
    This change improves performance of addVV, subVV and mulAddVWW
    by unrolling the loops, with improvements up to 1.45x.
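
The change itself is in ppc64x assembly, which is not reproduced on this page. As a rough illustration of the technique only, here is a pure Go sketch of a word-by-word add and a 4x unrolled variant. The names `addVVSimple` and `addVVUnrolled`, the unroll factor of four, and the use of `math/bits.Add64` are assumptions made for this example, not the code from the commit.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addVVSimple adds x and y word by word into z and returns the final carry.
// All three slices are assumed to have the same length.
func addVVSimple(z, x, y []uint64) (c uint64) {
	for i := range z {
		z[i], c = bits.Add64(x[i], y[i], c)
	}
	return c
}

// addVVUnrolled computes the same result, but handles four words per
// iteration, so the loop-control overhead is paid once per four additions.
// The factor of four is a hypothetical choice for this sketch.
func addVVUnrolled(z, x, y []uint64) (c uint64) {
	n := len(z)
	i := 0
	for ; i+4 <= n; i += 4 {
		z[i], c = bits.Add64(x[i], y[i], c)
		z[i+1], c = bits.Add64(x[i+1], y[i+1], c)
		z[i+2], c = bits.Add64(x[i+2], y[i+2], c)
		z[i+3], c = bits.Add64(x[i+3], y[i+3], c)
	}
	// Tail: the remaining zero to three words.
	for ; i < n; i++ {
		z[i], c = bits.Add64(x[i], y[i], c)
	}
	return c
}

func main() {
	x := []uint64{^uint64(0), ^uint64(0), 1, 2, 3}
	y := []uint64{1, 0, 0, 0, 0}
	z := make([]uint64, len(x))
	c := addVVUnrolled(z, x, y)
	fmt.Println(z, c)
}
```

An unrolled loop needs a separate tail loop for lengths that are not a multiple of the unroll factor. That extra bookkeeping is one plausible reason short vectors gain little or nothing from unrolling, while long vectors benefit most.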
    
    benchmark                    old ns/op     new ns/op     delta
    BenchmarkAddVV/1-16          5.79          5.85          +1.04%
    BenchmarkAddVV/2-16          6.41          6.62          +3.28%
    BenchmarkAddVV/3-16          6.89          7.35          +6.68%
    BenchmarkAddVV/4-16          7.47          8.26          +10.58%
    BenchmarkAddVV/5-16          8.04          8.18          +1.74%
    BenchmarkAddVV/10-16         10.9          11.2          +2.75%
    BenchmarkAddVV/100-16        81.7          57.0          -30.23%
    BenchmarkAddVV/1000-16       714           500           -29.97%
    BenchmarkAddVV/10000-16      7088          4946          -30.22%
    BenchmarkAddVV/100000-16     71514         49364         -30.97%
    BenchmarkSubVV/1-16          5.94          5.89          -0.84%
    BenchmarkSubVV/2-16          12.9          6.82          -47.13%
    BenchmarkSubVV/3-16          7.03          7.34          +4.41%
    BenchmarkSubVV/4-16          7.58          8.23          +8.58%
    BenchmarkSubVV/5-16          8.15          8.19          +0.49%
    BenchmarkSubVV/10-16         11.2          11.4          +1.79%
    BenchmarkSubVV/100-16        82.4          57.0          -30.83%
    BenchmarkSubVV/1000-16       715           499           -30.21%
    BenchmarkSubVV/10000-16      7089          4947          -30.22%
    BenchmarkSubVV/100000-16     71568         49378         -31.01%
    
    benchmark                    old MB/s     new MB/s      speedup
    BenchmarkAddVV/1-16          11048.49     10939.92      0.99x
    BenchmarkAddVV/2-16          19973.41     19323.60      0.97x
    BenchmarkAddVV/3-16          27847.09     26123.06      0.94x
    BenchmarkAddVV/4-16          34276.46     30976.54      0.90x
    BenchmarkAddVV/5-16          39781.92     39140.68      0.98x
    BenchmarkAddVV/10-16         58559.29     56894.68      0.97x
    BenchmarkAddVV/100-16        78354.88     112243.69     1.43x
    BenchmarkAddVV/1000-16       89592.74     127889.04     1.43x
    BenchmarkAddVV/10000-16      90292.39     129387.06     1.43x
    BenchmarkAddVV/100000-16     89492.92     129647.78     1.45x
    BenchmarkSubVV/1-16          10781.03     10861.22      1.01x
    BenchmarkSubVV/2-16          9949.27      18760.21      1.89x
    BenchmarkSubVV/3-16          27319.40     26166.01      0.96x
    BenchmarkSubVV/4-16          33764.35     31123.02      0.92x
    BenchmarkSubVV/5-16          39272.40     39050.31      0.99x
    BenchmarkSubVV/10-16         57262.87     56206.33      0.98x
    BenchmarkSubVV/100-16        77641.78     112280.86     1.45x
    BenchmarkSubVV/1000-16       89486.27     128064.08     1.43x
    BenchmarkSubVV/10000-16      90274.37     129356.59     1.43x
    BenchmarkSubVV/100000-16     89424.42     129610.50     1.45x
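
For readers checking the tables, the delta and speedup columns can be reproduced from the old and new columns. The sketch below assumes delta is the relative change in ns/op and speedup is the ratio of new to old throughput; the helper names `deltaPercent` and `speedup` are made up for this example.

```go
package main

import "fmt"

// deltaPercent returns the relative change in ns/op as a percentage.
// Negative values mean the new code is faster.
func deltaPercent(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

// speedup returns the throughput ratio new/old.
// Values above 1 mean the new code is faster.
func speedup(oldMBs, newMBs float64) float64 {
	return newMBs / oldMBs
}

func main() {
	// Values taken from the BenchmarkAddVV/100-16 rows above.
	fmt.Printf("%+.2f%%\n", deltaPercent(81.7, 57.0))   // prints -30.23%
	fmt.Printf("%.2fx\n", speedup(78354.88, 112243.69)) // prints 1.43x
}
```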
    
    Change-Id: I2795a82134d1e3b75e2634c76b8ca165a723ec7b
    Reviewed-on: https://go-review.googlesource.com/103495
    Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Name
big
bits
cmplx
rand
abs.go
acos_s390x.s
acosh.go
acosh_s390x.s
all_test.go
arith_s390x.go
arith_s390x_test.go
asin.go
asin_386.s
asin_amd64.s
asin_amd64p32.s
asin_arm.s
asin_s390x.s
asinh.go
asinh_s390x.s
asinh_stub.s
atan.go
atan2.go
atan2_386.s
atan2_amd64.s
atan2_amd64p32.s
atan2_arm.s
atan2_s390x.s
atan_386.s
atan_amd64.s
atan_amd64p32.s
atan_arm.s
atan_s390x.s
atanh.go
atanh_s390x.s
bits.go
cbrt.go
cbrt_s390x.s
cbrt_stub.s
const.go
copysign.go
cosh_s390x.s
dim.go
dim_386.s
dim_amd64.s
dim_amd64p32.s
dim_arm.s
dim_arm64.s
dim_s390x.s
erf.go
erf_s390x.s
erf_stub.s
erfc_s390x.s
erfinv.go
example_test.go
exp.go
exp2_386.s
exp2_amd64.s
exp2_amd64p32.s
exp2_arm.s
exp_386.s
exp_amd64.s
exp_amd64p32.s
exp_arm.s
exp_arm64.s
exp_asm.go
exp_s390x.s
expm1.go
expm1_386.s
expm1_amd64.s
expm1_amd64p32.s
expm1_arm.s
expm1_s390x.s
export_s390x_test.go
export_test.go
floor.go
floor_386.s
floor_amd64.s
floor_amd64p32.s
floor_arm.s
floor_arm64.s
floor_ppc64x.s
floor_s390x.s
frexp.go
frexp_386.s
frexp_amd64.s
frexp_amd64p32.s
frexp_arm.s
gamma.go
hypot.go
hypot_386.s
hypot_amd64.s
hypot_amd64p32.s
hypot_arm.s
j0.go
j1.go
jn.go
ldexp.go
ldexp_386.s
ldexp_amd64.s
ldexp_amd64p32.s
ldexp_arm.s
lgamma.go
log.go
log10.go
log10_386.s
log10_amd64.s
log10_amd64p32.s
log10_arm.s
log10_s390x.s
log1p.go
log1p_386.s
log1p_amd64.s
log1p_amd64p32.s
log1p_arm.s
log1p_s390x.s
log_386.s
log_amd64.s
log_amd64p32.s
log_arm.s
log_s390x.s
logb.go
mod.go
mod_386.s
mod_amd64.s
mod_amd64p32.s
mod_arm.s
modf.go
modf_386.s
modf_amd64.s
modf_amd64p32.s
modf_arm.s
modf_arm64.s
modf_ppc64x.s
nextafter.go
pow.go
pow10.go
pow_s390x.s
pow_stub.s
remainder.go
remainder_386.s
remainder_amd64.s
remainder_amd64p32.s
remainder_arm.s
signbit.go
sin.go
sin_386.s
sin_amd64.s
sin_amd64p32.s
sin_arm.s
sin_s390x.s
sincos.go
sincos_386.go
sincos_386.s
sinh.go
sinh_s390x.s
sinh_stub.s
sqrt.go
sqrt_386.s
sqrt_amd64.s
sqrt_amd64p32.s
sqrt_arm.s
sqrt_arm64.s
sqrt_mipsx.s
sqrt_ppc64x.s
sqrt_s390x.s
stubs_arm64.s
stubs_mips64x.s
stubs_mipsx.s
stubs_ppc64x.s
stubs_s390x.s
tan.go
tan_386.s
tan_amd64.s
tan_amd64p32.s
tan_arm.s
tan_s390x.s
tanh.go
tanh_s390x.s
unsafe.go