    math/big: make division faster · 3a907282
    Russ Cox authored
    - Add new BenchmarkQuoRem.
    - Eliminate allocation in divLarge nat pool.
    - Unroll mulAddVWW body 4x.
    - Remove some redundant slice loads in divLarge.
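
As an illustration of the 4x unrolling item, here is a minimal pure-Go sketch of a multiply-add loop over a word vector with its body unrolled four times. It is not the assembly from arith_amd64.s: the helper name mulAddWWW, the use of math/bits, and the choice of uint as the word type are assumptions made for this example.

```go
package main

import "math/bits"

// mulAddWWW returns the two-word value x*y + c as (hi, lo).
// The sum cannot overflow two words, so adding the carry into hi is safe.
func mulAddWWW(x, y, c uint) (hi, lo uint) {
	hi, lo = bits.Mul(x, y)
	lo, carry := bits.Add(lo, c, 0)
	return hi + carry, lo
}

// mulAddVWW sets z = x*y + r word by word and returns the final carry.
// Sketch only: it assumes len(z) >= len(x) and panics otherwise.
func mulAddVWW(z, x []uint, y, r uint) (c uint) {
	n := len(x)
	z = z[:n] // one bounds check up front instead of one per word
	c = r
	i := 0
	// Main loop: four words per iteration, so the loop overhead
	// (compare, increment, branch) is paid once per four multiplies.
	for ; i+4 <= n; i += 4 {
		c, z[i] = mulAddWWW(x[i], y, c)
		c, z[i+1] = mulAddWWW(x[i+1], y, c)
		c, z[i+2] = mulAddWWW(x[i+2], y, c)
		c, z[i+3] = mulAddWWW(x[i+3], y, c)
	}
	// Tail loop: the remaining zero to three words.
	for ; i < n; i++ {
		c, z[i] = mulAddWWW(x[i], y, c)
	}
	return c
}
```

Calling mulAddVWW(z, x, y, r) with a five-word x exercises both loops: the first four words go through the unrolled body and the last word through the tail.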
    
    name      old time/op  new time/op  delta
    QuoRem-8  2.18µs ± 1%  1.93µs ± 1%  -11.38%  (p=0.000 n=19+18)
    
    The starting point in the comparison here is Cherry's
    pending CL to turn mulWW and divWW into intrinsics.
    The optimizations in divLarge work best because all
    the function calls are gone. The effect of this CL is not
    as large if you don't assume Cherry's CL.
    
    Change-Id: Ia6138907489c5b9168497912e43705634e163b35
    Reviewed-on: https://go-review.googlesource.com/30613
    Run-TryBot: Russ Cox <rsc@golang.org>
    Reviewed-by: Cherry Zhang <cherryyz@google.com>
arith_amd64.s 8.55 KB