• Philip Hofer's avatar
    cmd/internal/obj/arm: improve static branch prediction for wrapper prologue · a143f5d6
    Philip Hofer authored
    This is a follow-up to CL 36893.
    
    Move the unlikely branch in the wrapper prologue to the end
    of the function, where it has minimal impact on the instruction
    cache. Static branch prediction is also less likely to choose
    a forward branch.
    
    Updates #19042
    
    sort benchmarks:
    name                  old time/op  new time/op  delta
    SearchWrappers-4      1.44µs ± 0%  1.45µs ± 0%  +1.15%  (p=0.000 n=9+10)
    SortString1K-4        1.02ms ± 0%  1.04ms ± 0%  +2.39%  (p=0.000 n=10+10)
    SortString1K_Slice-4   960µs ± 0%   989µs ± 0%  +2.95%  (p=0.000 n=9+10)
    StableString1K-4       218µs ± 0%   213µs ± 0%  -2.13%  (p=0.000 n=10+10)
    SortInt1K-4            541µs ± 0%   543µs ± 0%  +0.30%  (p=0.003 n=9+10)
    StableInt1K-4          760µs ± 1%   763µs ± 1%  +0.38%  (p=0.011 n=10+10)
    StableInt1K_Slice-4    840µs ± 1%   779µs ± 0%  -7.31%  (p=0.000 n=9+10)
    SortInt64K-4          55.2ms ± 0%  55.4ms ± 1%  +0.34%  (p=0.012 n=10+8)
    SortInt64K_Slice-4    56.2ms ± 0%  55.6ms ± 1%  -1.16%  (p=0.000 n=10+10)
    StableInt64K-4        70.9ms ± 1%  71.0ms ± 0%    ~     (p=0.315 n=10+7)
    Sort1e2-4              250µs ± 0%   249µs ± 1%    ~     (p=0.315 n=9+10)
    Stable1e2-4            600µs ± 0%   594µs ± 0%  -1.09%  (p=0.000 n=9+10)
    Sort1e4-4             51.2ms ± 0%  51.4ms ± 1%  +0.40%  (p=0.001 n=9+10)
    Stable1e4-4            204ms ± 1%   199ms ± 1%  -2.27%  (p=0.000 n=10+10)
    Sort1e6-4              8.42s ± 0%   8.44s ± 0%  +0.28%  (p=0.000 n=8+9)
    Stable1e6-4            43.3s ± 0%   42.5s ± 1%  -1.89%  (p=0.000 n=9+9)
    
    Change-Id: I827559aa557fdba211a38ce3f77137b471c5c67e
    Reviewed-on: https://go-review.googlesource.com/37611
    Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
    Reviewed-by: 's avatarJosh Bleecher Snyder <josharian@gmail.com>
    a143f5d6
Name
Last commit
Last update
..
addr2line Loading commit data...
api Loading commit data...
asm Loading commit data...
cgo Loading commit data...
compile Loading commit data...
cover Loading commit data...
dist Loading commit data...
doc Loading commit data...
fix Loading commit data...
go Loading commit data...
gofmt Loading commit data...
internal Loading commit data...
link Loading commit data...
nm Loading commit data...
objdump Loading commit data...
pack Loading commit data...
pprof Loading commit data...
trace Loading commit data...
vendor Loading commit data...
vet Loading commit data...