• Brian Kessler's avatar
    math/big: implement Lehmer's extended GCD algorithm · 50649a96
    Brian Kessler authored
    Updates #15833
    
    The extended GCD algorithm can be implemented using
    Lehmer's algorithm with additional updates for the
    cosequences following Algorithm 10.45 from Cohen et al.
    "Handbook of Elliptic and Hyperelliptic Curve Cryptography" pp 192.
    This brings the speed of the extended GCD calculation within
    ~2x of the base GCD calculation.  There is a slight degradation in
    the non-extended GCD speed for small inputs (1-2 words) due to the
    additional code to handle the extended updates.
    
    name                          old time/op    new time/op    delta
    GCD10x10/WithoutXY-4             262ns ± 1%     266ns ± 2%     ~     (p=0.333 n=5+5)
    GCD10x10/WithXY-4               1.42µs ± 2%    0.74µs ± 3%  -47.90%  (p=0.008 n=5+5)
    GCD10x100/WithoutXY-4            520ns ± 2%     539ns ± 1%   +3.81%  (p=0.008 n=5+5)
    GCD10x100/WithXY-4              2.32µs ± 1%    1.67µs ± 0%  -27.80%  (p=0.008 n=5+5)
    GCD10x1000/WithoutXY-4          1.40µs ± 1%    1.45µs ± 2%   +3.26%  (p=0.016 n=4+5)
    GCD10x1000/WithXY-4             4.78µs ± 1%    3.43µs ± 1%  -28.37%  (p=0.008 n=5+5)
    GCD10x10000/WithoutXY-4         10.0µs ± 0%    10.2µs ± 3%   +1.80%  (p=0.008 n=5+5)
    GCD10x10000/WithXY-4            20.9µs ± 3%    17.9µs ± 1%  -14.20%  (p=0.008 n=5+5)
    GCD10x100000/WithoutXY-4        96.8µs ± 0%    96.3µs ± 1%     ~     (p=0.310 n=5+5)
    GCD10x100000/WithXY-4            196µs ± 3%     159µs ± 2%  -18.61%  (p=0.008 n=5+5)
    GCD100x100/WithoutXY-4          2.53µs ±15%    2.34µs ± 0%   -7.35%  (p=0.008 n=5+5)
    GCD100x100/WithXY-4             19.3µs ± 0%     3.9µs ± 1%  -79.58%  (p=0.008 n=5+5)
    GCD100x1000/WithoutXY-4         4.23µs ± 0%    4.17µs ± 3%     ~     (p=0.127 n=5+5)
    GCD100x1000/WithXY-4            22.8µs ± 1%     7.5µs ±10%  -67.00%  (p=0.008 n=5+5)
    GCD100x10000/WithoutXY-4        19.1µs ± 0%    19.0µs ± 0%     ~     (p=0.095 n=5+5)
    GCD100x10000/WithXY-4           75.1µs ± 2%    30.5µs ± 2%  -59.38%  (p=0.008 n=5+5)
    GCD100x100000/WithoutXY-4        170µs ± 5%     167µs ± 1%     ~     (p=1.000 n=5+5)
    GCD100x100000/WithXY-4           542µs ± 2%     267µs ± 2%  -50.79%  (p=0.008 n=5+5)
    GCD1000x1000/WithoutXY-4        28.0µs ± 0%    27.1µs ± 0%   -3.29%  (p=0.008 n=5+5)
    GCD1000x1000/WithXY-4            329µs ± 0%      42µs ± 1%  -87.12%  (p=0.008 n=5+5)
    GCD1000x10000/WithoutXY-4       47.2µs ± 0%    46.4µs ± 0%   -1.65%  (p=0.016 n=5+4)
    GCD1000x10000/WithXY-4           607µs ± 9%     123µs ± 1%  -79.70%  (p=0.008 n=5+5)
    GCD1000x100000/WithoutXY-4       260µs ±17%     245µs ± 0%     ~     (p=0.056 n=5+5)
    GCD1000x100000/WithXY-4         3.64ms ± 1%    0.93ms ± 1%  -74.41%  (p=0.016 n=4+5)
    GCD10000x10000/WithoutXY-4       513µs ± 0%     507µs ± 0%   -1.22%  (p=0.008 n=5+5)
    GCD10000x10000/WithXY-4         7.44ms ± 1%    1.00ms ± 0%  -86.58%  (p=0.008 n=5+5)
    GCD10000x100000/WithoutXY-4     1.23ms ± 0%    1.23ms ± 1%     ~     (p=0.056 n=5+5)
    GCD10000x100000/WithXY-4        37.3ms ± 0%     7.3ms ± 1%  -80.45%  (p=0.008 n=5+5)
    GCD100000x100000/WithoutXY-4    24.2ms ± 0%    24.2ms ± 0%     ~     (p=0.841 n=5+5)
    GCD100000x100000/WithXY-4        505ms ± 1%      56ms ± 1%  -88.92%  (p=0.008 n=5+5)
    
    Change-Id: I25f42ab8c55033acb83cc32bb03c12c1963925e8
    Reviewed-on: https://go-review.googlesource.com/78755Reviewed-by: 's avatarRobert Griesemer <gri@golang.org>
    Run-TryBot: Robert Griesemer <gri@golang.org>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    50649a96
rat.go 12.6 KB