    cmd/compile: fuse before branchelim · a55f3ee4
    Josh Bleecher Snyder authored
    The branchelim pass works better after fuse.
    Running fuse before branchelim also increases
    the stability of generated code amidst other compiler changes,
    which was the original motivation behind this change.
    
    The fuse pass is not cheap enough to run in its entirety
    before branchelim, but the most important half of it is.
    This change makes it possible to run "plain fuse" independently
    and does so before branchelim.
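
    One way to make a single half of a pass runnable on its own is a
    bitmask that selects which parts run. The sketch below is only an
    illustration of that idea: the names fuseType, fuseTypePlain,
    fuseTypeRest, and fuseTypeAll, and the fuse function that merely
    reports which parts it would run, are assumptions and are not taken
    from this commit message.

```go
package main

import "fmt"

// fuseType is a hypothetical bitmask selecting which parts of fusion run.
type fuseType uint8

const (
	fuseTypePlain fuseType = 1 << iota // the half cheap enough to run before branchelim
	fuseTypeRest                       // everything else in the full pass (hypothetical name)
	fuseTypeAll = fuseTypePlain | fuseTypeRest
)

// fuse returns the names of the parts it would run for typ.
// A real pass would rewrite the control-flow graph instead.
func fuse(typ fuseType) []string {
	var ran []string
	if typ&fuseTypeRest != 0 {
		ran = append(ran, "rest")
	}
	if typ&fuseTypePlain != 0 {
		ran = append(ran, "plain")
	}
	return ran
}

func main() {
	fmt.Println("before branchelim:", fuse(fuseTypePlain))
	fmt.Println("full fuse later:  ", fuse(fuseTypeAll))
}
```

    With this shape, the early call pays only for the selected half,
    and the full pass can still run later unchanged.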
    
    During make.bash, elimIf occurrences increase from 4244 to 4288 (1%),
    and elimIfElse occurrences increase from 989 to 1079 (9%).
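
    For orientation, the two counters roughly correspond to two source
    shapes: a one-armed if (elimIf) and an if with both arms (elimIfElse).
    The functions below are made-up examples of those shapes, not code
    from this change. Whether the compiler actually replaces either
    branch with a conditional move depends on the target architecture
    and on the pass's heuristics.

```go
package main

import "fmt"

// maxOf has a one-armed if: x is overwritten on only one path.
// This is the kind of shape elimIf is assumed to handle.
func maxOf(a, b int) int {
	x := a
	if b > a {
		x = b
	}
	return x
}

// choose assigns x on both paths.
// This is the kind of shape elimIfElse is assumed to handle.
func choose(c bool, a, b int) int {
	var x int
	if c {
		x = a + 1
	} else {
		x = b - 1
	}
	return x
}

func main() {
	fmt.Println(maxOf(3, 7), choose(true, 10, 20), choose(false, 10, 20))
}
```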
    
    Toolspeed impact is marginal; plain fuse pays for itself.
    
    name        old time/op       new time/op       delta
    Template          189ms ± 2%        189ms ± 2%    ~     (p=0.890 n=45+46)
    Unicode          93.2ms ± 5%       93.4ms ± 7%    ~     (p=0.790 n=48+48)
    GoTypes           662ms ± 4%        660ms ± 4%    ~     (p=0.186 n=48+49)
    Compiler          2.89s ± 4%        2.91s ± 3%  +0.89%  (p=0.050 n=49+44)
    SSA               8.23s ± 2%        8.21s ± 1%    ~     (p=0.165 n=46+44)
    Flate             123ms ± 4%        123ms ± 3%  +0.58%  (p=0.031 n=47+49)
    GoParser          154ms ± 4%        154ms ± 4%    ~     (p=0.492 n=49+48)
    Reflect           430ms ± 4%        429ms ± 4%    ~     (p=1.000 n=48+48)
    Tar               171ms ± 3%        170ms ± 4%    ~     (p=0.122 n=48+48)
    XML               232ms ± 3%        232ms ± 2%    ~     (p=0.850 n=46+49)
    [Geo mean]        394ms             394ms       +0.02%
    
    name        old user-time/op  new user-time/op  delta
    Template          236ms ± 5%        236ms ± 4%    ~     (p=0.934 n=50+50)
    Unicode           132ms ± 7%        130ms ± 9%    ~     (p=0.087 n=50+50)
    GoTypes           861ms ± 3%        867ms ± 4%    ~     (p=0.124 n=48+50)
    Compiler          3.93s ± 4%        3.94s ± 3%    ~     (p=0.584 n=49+44)
    SSA               12.2s ± 2%        12.3s ± 1%    ~     (p=0.610 n=46+45)
    Flate             149ms ± 4%        150ms ± 4%    ~     (p=0.194 n=48+49)
    GoParser          193ms ± 5%        191ms ± 6%    ~     (p=0.239 n=49+50)
    Reflect           553ms ± 5%        556ms ± 5%    ~     (p=0.091 n=49+49)
    Tar               218ms ± 5%        218ms ± 5%    ~     (p=0.359 n=49+50)
    XML               299ms ± 5%        298ms ± 4%    ~     (p=0.482 n=50+49)
    [Geo mean]        516ms             516ms       -0.01%
    
    name        old alloc/op      new alloc/op      delta
    Template         36.3MB ± 0%       36.3MB ± 0%  -0.02%  (p=0.000 n=49+49)
    Unicode          29.7MB ± 0%       29.7MB ± 0%    ~     (p=0.270 n=50+50)
    GoTypes           126MB ± 0%        126MB ± 0%  -0.34%  (p=0.000 n=50+49)
    Compiler          534MB ± 0%        531MB ± 0%  -0.50%  (p=0.000 n=50+50)
    SSA              1.98GB ± 0%       1.98GB ± 0%  -0.06%  (p=0.000 n=49+49)
    Flate            24.6MB ± 0%       24.6MB ± 0%  -0.29%  (p=0.000 n=50+50)
    GoParser         29.5MB ± 0%       29.4MB ± 0%  -0.15%  (p=0.000 n=49+50)
    Reflect          87.3MB ± 0%       87.2MB ± 0%  -0.13%  (p=0.000 n=49+50)
    Tar              35.6MB ± 0%       35.5MB ± 0%  -0.17%  (p=0.000 n=50+50)
    XML              48.2MB ± 0%       48.0MB ± 0%  -0.30%  (p=0.000 n=48+50)
    [Geo mean]       83.1MB            82.9MB       -0.20%
    
    name        old allocs/op     new allocs/op     delta
    Template           352k ± 0%         352k ± 0%  -0.01%  (p=0.004 n=49+49)
    Unicode            341k ± 0%         341k ± 0%    ~     (p=0.341 n=48+50)
    GoTypes           1.28M ± 0%        1.28M ± 0%  -0.03%  (p=0.000 n=50+49)
    Compiler          4.96M ± 0%        4.96M ± 0%  -0.05%  (p=0.000 n=50+49)
    SSA               15.5M ± 0%        15.5M ± 0%  -0.01%  (p=0.000 n=50+49)
    Flate              233k ± 0%         233k ± 0%  +0.01%  (p=0.032 n=49+49)
    GoParser           294k ± 0%         294k ± 0%    ~     (p=0.052 n=46+48)
    Reflect           1.04M ± 0%        1.04M ± 0%    ~     (p=0.171 n=50+47)
    Tar                343k ± 0%         343k ± 0%  -0.03%  (p=0.000 n=50+50)
    XML                429k ± 0%         429k ± 0%  -0.04%  (p=0.000 n=50+50)
    [Geo mean]         812k              812k       -0.02%
    
    Object files grow slightly; branchelim often increases binary size, at least on amd64.
    
    name        old object-bytes  new object-bytes  delta
    Template          509kB ± 0%        509kB ± 0%  -0.01%  (p=0.008 n=5+5)
    Unicode           224kB ± 0%        224kB ± 0%    ~     (all equal)
    GoTypes          1.84MB ± 0%       1.84MB ± 0%  +0.00%  (p=0.008 n=5+5)
    Compiler         6.71MB ± 0%       6.71MB ± 0%  +0.01%  (p=0.008 n=5+5)
    SSA              21.2MB ± 0%       21.2MB ± 0%  +0.01%  (p=0.008 n=5+5)
    Flate             324kB ± 0%        324kB ± 0%  -0.00%  (p=0.008 n=5+5)
    GoParser          404kB ± 0%        404kB ± 0%  -0.02%  (p=0.008 n=5+5)
    Reflect          1.40MB ± 0%       1.40MB ± 0%  +0.09%  (p=0.008 n=5+5)
    Tar               452kB ± 0%        452kB ± 0%  +0.06%  (p=0.008 n=5+5)
    XML               596kB ± 0%        596kB ± 0%  +0.00%  (p=0.008 n=5+5)
    [Geo mean]       1.04MB            1.04MB       +0.01%
    
    Change-Id: I535c711b85380ff657fc0f022bebd9cb14ddd07f
    Reviewed-on: https://go-review.googlesource.com/c/129378
    Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: Keith Randall <khr@golang.org>