    cmd/internal/obj/arm64: encode large constants into MOVZ/MOVN and MOVK instructions · 644ddaa8
    fanzha02 authored
    The current assembler loads large constants from the constant pool.
    This CL avoids the pool by using MOVZ/MOVN and MOVK to load large
    constants.
    
    This CL changes the assembler behavior as follows.
    
    1. go assembly:
         1) MOVD $0x1111222233334444, R1
         2) MOVD $0x1111ffff1111ffff, R1
       previous version: MOVD 0x9a4, R1 (loads the constant from the pool).
       optimized version:
         1) MOVD $0x4444, R1; MOVK $(0x3333<<16), R1; MOVK $(0x2222<<32), R1;
            MOVK $(0x1111<<48), R1
         2) MOVN $(0xeeee<<16), R1; MOVK $(0x1111<<48), R1
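
The decomposition above can be illustrated with a small standalone sketch. This is not the assembler's actual code: the names step, splitConst and eval are hypothetical, and the selection rule (start with MOVN when more 16-bit halfwords are 0xffff than 0x0000, otherwise MOVZ, then patch the remaining halfwords with MOVK) is an assumption chosen to reproduce the two sequences shown in this message. The real assembler may make different choices, for example for constants that fit other immediate encodings.

```go
package main

import "fmt"

// step is one instruction of the hypothetical sequence.
type step struct {
	Op    string // "MOVZ", "MOVN" or "MOVK"
	Imm   uint16 // 16-bit immediate as written in the instruction
	Shift uint   // left shift applied to Imm: 0, 16, 32 or 48
}

// splitConst returns a MOVZ/MOVN + MOVK sequence that builds v.
// Assumption: MOVN is preferred when 0xffff halfwords outnumber
// 0x0000 halfwords, since those halfwords then need no MOVK.
func splitConst(v uint64) []step {
	zeros, ones := 0, 0
	for s := uint(0); s < 64; s += 16 {
		switch uint16(v >> s) {
		case 0x0000:
			zeros++
		case 0xffff:
			ones++
		}
	}
	useMOVN := ones > zeros
	first, fill := "MOVZ", uint16(0x0000)
	if useMOVN {
		first, fill = "MOVN", 0xffff
	}

	var seq []step
	for s := uint(0); s < 64; s += 16 {
		h := uint16(v >> s)
		if h == fill {
			continue // already produced by the first instruction
		}
		switch {
		case len(seq) > 0:
			seq = append(seq, step{"MOVK", h, s})
		case useMOVN:
			seq = append(seq, step{"MOVN", ^h, s}) // MOVN writes the inverted immediate
		default:
			seq = append(seq, step{"MOVZ", h, s})
		}
	}
	if len(seq) == 0 {
		// v is 0 or all ones: a single instruction with a zero immediate.
		seq = append(seq, step{first, 0, 0})
	}
	return seq
}

// eval simulates the sequence on a single register and returns its value.
func eval(seq []step) uint64 {
	var r uint64
	for _, st := range seq {
		imm := uint64(st.Imm) << st.Shift
		switch st.Op {
		case "MOVZ":
			r = imm
		case "MOVN":
			r = ^imm
		case "MOVK":
			r = r&^(uint64(0xffff)<<st.Shift) | imm
		}
	}
	return r
}

func main() {
	for _, v := range []uint64{0x1111222233334444, 0x1111ffff1111ffff} {
		fmt.Printf("MOVD $%#x, R1 =>\n", v)
		for _, st := range splitConst(v) {
			fmt.Printf("\t%s $(%#x<<%d), R1\n", st.Op, st.Imm, st.Shift)
		}
	}
}
```

The eval helper exists only to check that a generated sequence rebuilds the original constant; it models the three instructions on one 64-bit register and nothing else.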
    
    Add test cases. Binary size comparisons and benchmark results are below.
    
    1. Binary size before/after
    binary                 size change
    pkg/linux_arm64        +25.4KB
    pkg/tool/linux_arm64   -2.9KB
    go                     -2KB
    gofmt                  no change
    
    2. compiler benchmark.
    name       old time/op       new time/op       delta
    Template         574ms ±21%        577ms ±14%     ~     (p=0.853 n=10+10)
    Unicode          327ms ±29%        353ms ±23%     ~     (p=0.360 n=10+8)
    GoTypes          1.97s ± 8%        2.04s ±11%     ~     (p=0.143 n=10+10)
    Compiler         9.13s ± 9%        9.25s ± 8%     ~     (p=0.684 n=10+10)
    SSA              29.2s ± 5%        27.0s ± 4%   -7.40%  (p=0.000 n=10+10)
    Flate            402ms ±40%        308ms ± 6%  -23.29%  (p=0.004 n=10+10)
    GoParser         470ms ±26%        382ms ±10%  -18.82%  (p=0.000 n=9+10)
    Reflect          1.36s ±16%        1.17s ± 7%  -13.92%  (p=0.001 n=9+10)
    Tar              561ms ±19%        466ms ±15%  -17.08%  (p=0.000 n=9+10)
    XML              745ms ±20%        679ms ±20%     ~     (p=0.123 n=10+10)
    StdCmd           35.5s ± 6%        37.2s ± 3%   +4.81%  (p=0.001 n=9+8)
    
    name       old user-time/op  new user-time/op  delta
    Template         625ms ±14%        660ms ±18%     ~     (p=0.343 n=10+10)
    Unicode          355ms ±10%        373ms ±20%     ~     (p=0.346 n=9+10)
    GoTypes          2.39s ± 8%        2.37s ± 5%     ~     (p=0.897 n=10+10)
    Compiler         11.1s ± 4%        11.4s ± 2%   +2.63%  (p=0.010 n=10+9)
    SSA              35.4s ± 3%        34.9s ± 2%     ~     (p=0.113 n=10+9)
    Flate            402ms ±13%        371ms ±30%     ~     (p=0.089 n=10+9)
    GoParser         513ms ± 8%        489ms ±24%   -4.76%  (p=0.039 n=9+9)
    Reflect          1.52s ±12%        1.41s ± 5%   -7.32%  (p=0.001 n=9+10)
    Tar              607ms ±10%        558ms ± 8%   -7.96%  (p=0.009 n=9+10)
    XML              828ms ±10%        789ms ±12%     ~     (p=0.059 n=10+10)
    
    name       old text-bytes    new text-bytes    delta
    HelloSize        714kB ± 0%        712kB ± 0%   -0.23%  (p=0.000 n=10+10)
    CmdGoSize       8.26MB ± 0%       8.25MB ± 0%   -0.14%  (p=0.000 n=10+10)
    
    name       old data-bytes    new data-bytes    delta
    HelloSize       10.5kB ± 0%       10.5kB ± 0%     ~     (all equal)
    CmdGoSize        258kB ± 0%        258kB ± 0%     ~     (all equal)
    
    name       old bss-bytes     new bss-bytes     delta
    HelloSize        125kB ± 0%        125kB ± 0%     ~     (all equal)
    CmdGoSize        146kB ± 0%        146kB ± 0%     ~     (all equal)
    
    name       old exe-bytes     new exe-bytes     delta
    HelloSize       1.18MB ± 0%       1.18MB ± 0%     ~     (all equal)
    CmdGoSize       11.2MB ± 0%       11.2MB ± 0%   -0.13%  (p=0.000 n=10+10)
    
    3. go1 benchmark.
    name                   old time/op    new time/op    delta
    BinaryTree17              6.60s ±18%     7.36s ±22%    ~     (p=0.222 n=5+5)
    Fannkuch11                4.04s ± 0%     4.05s ± 0%    ~     (p=0.421 n=5+5)
    FmtFprintfEmpty          91.8ns ±14%    91.2ns ± 9%    ~     (p=0.667 n=5+5)
    FmtFprintfString          145ns ± 0%     151ns ± 6%    ~     (p=0.397 n=4+5)
    FmtFprintfInt             169ns ± 0%     176ns ± 5%  +4.14%  (p=0.016 n=4+5)
    FmtFprintfIntInt          229ns ± 2%     243ns ± 6%    ~     (p=0.143 n=5+5)
    FmtFprintfPrefixedInt     343ns ± 0%     350ns ± 3%  +1.92%  (p=0.048 n=5+5)
    FmtFprintfFloat           400ns ± 3%     394ns ± 3%    ~     (p=0.063 n=5+5)
    FmtManyArgs              1.04µs ± 0%    1.05µs ± 0%  +1.62%  (p=0.029 n=4+4)
    GobDecode                13.9ms ± 4%    13.9ms ± 5%    ~     (p=1.000 n=5+5)
    GobEncode                10.6ms ± 4%    10.6ms ± 5%    ~     (p=0.421 n=5+5)
    Gzip                      567ms ± 1%     563ms ± 4%    ~     (p=0.548 n=5+5)
    Gunzip                   60.2ms ± 1%    60.4ms ± 0%    ~     (p=0.056 n=5+5)
    HTTPClientServer          114µs ± 4%     108µs ± 7%    ~     (p=0.095 n=5+5)
    JSONEncode               18.4ms ± 2%    17.8ms ± 2%  -3.06%  (p=0.016 n=5+5)
    JSONDecode                105ms ± 1%     103ms ± 2%    ~     (p=0.056 n=5+5)
    Mandelbrot200            5.48ms ± 0%    5.49ms ± 0%    ~     (p=0.841 n=5+5)
    GoParse                  6.05ms ± 1%    6.05ms ± 2%    ~     (p=1.000 n=5+5)
    RegexpMatchEasy0_32       143ns ± 1%     146ns ± 4%  +2.10%  (p=0.048 n=4+5)
    RegexpMatchEasy0_1K       499ns ± 1%     492ns ± 2%    ~     (p=0.079 n=5+5)
    RegexpMatchEasy1_32       137ns ± 0%     136ns ± 1%  -0.73%  (p=0.016 n=4+5)
    RegexpMatchEasy1_1K       826ns ± 4%     823ns ± 2%    ~     (p=0.841 n=5+5)
    RegexpMatchMedium_32      224ns ± 5%     233ns ± 8%    ~     (p=0.119 n=5+5)
    RegexpMatchMedium_1K     59.6µs ± 0%    59.3µs ± 1%  -0.66%  (p=0.016 n=4+5)
    RegexpMatchHard_32       3.29µs ± 3%    3.26µs ± 1%    ~     (p=0.889 n=5+5)
    RegexpMatchHard_1K       98.8µs ± 2%    99.0µs ± 0%    ~     (p=0.690 n=5+5)
    Revcomp                   1.02s ± 1%     1.01s ± 1%    ~     (p=0.095 n=5+5)
    Template                  135ms ± 5%     131ms ± 1%    ~     (p=0.151 n=5+5)
    TimeParse                 591ns ± 0%     593ns ± 0%  +0.20%  (p=0.048 n=5+5)
    TimeFormat                655ns ± 2%     607ns ± 0%  -7.42%  (p=0.016 n=5+4)
    [Geo mean]               93.5µs         93.8µs       +0.23%
    
    name                   old speed      new speed      delta
    GobDecode              55.1MB/s ± 4%  55.1MB/s ± 4%    ~     (p=1.000 n=5+5)
    GobEncode              72.4MB/s ± 4%  72.3MB/s ± 5%    ~     (p=0.421 n=5+5)
    Gzip                   34.2MB/s ± 1%  34.5MB/s ± 4%    ~     (p=0.548 n=5+5)
    Gunzip                  322MB/s ± 1%   321MB/s ± 0%    ~     (p=0.056 n=5+5)
    JSONEncode              106MB/s ± 2%   109MB/s ± 2%  +3.16%  (p=0.016 n=5+5)
    JSONDecode             18.5MB/s ± 1%  18.8MB/s ± 2%    ~     (p=0.056 n=5+5)
    GoParse                9.57MB/s ± 1%  9.57MB/s ± 2%    ~     (p=0.952 n=5+5)
    RegexpMatchEasy0_32     223MB/s ± 1%   221MB/s ± 0%  -1.10%  (p=0.029 n=4+4)
    RegexpMatchEasy0_1K    2.05GB/s ± 1%  2.08GB/s ± 2%    ~     (p=0.095 n=5+5)
    RegexpMatchEasy1_32     232MB/s ± 0%   234MB/s ± 1%  +0.76%  (p=0.016 n=4+5)
    RegexpMatchEasy1_1K    1.24GB/s ± 4%  1.24GB/s ± 2%    ~     (p=0.841 n=5+5)
    RegexpMatchMedium_32   4.45MB/s ± 5%  4.20MB/s ± 1%  -5.63%  (p=0.000 n=5+4)
    RegexpMatchMedium_1K   17.2MB/s ± 0%  17.3MB/s ± 1%  +0.66%  (p=0.016 n=4+5)
    RegexpMatchHard_32     9.73MB/s ± 3%  9.83MB/s ± 1%    ~     (p=0.889 n=5+5)
    RegexpMatchHard_1K     10.4MB/s ± 2%  10.3MB/s ± 0%    ~     (p=0.635 n=5+5)
    Revcomp                 249MB/s ± 1%   252MB/s ± 1%    ~     (p=0.095 n=5+5)
    Template               14.4MB/s ± 4%  14.8MB/s ± 1%    ~     (p=0.151 n=5+5)
    [Geo mean]             62.1MB/s       62.3MB/s       +0.34%
    
    Fixes #10108
    
    Change-Id: I79038f3c4c2ff874c136053d1a2b1c8a5a9cfac5
    Reviewed-on: https://go-review.googlesource.com/c/118796
    Reviewed-by: Cherry Zhang <cherryyz@google.com>
    Run-TryBot: Cherry Zhang <cherryyz@google.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
Name
archive
bufio
builtin
bytes
cmd
compress
container
context
crypto
database/sql
debug
encoding
errors
expvar
flag
fmt
go
hash
html
image
index/suffixarray
internal
io
log
math
mime
net
os
path
plugin
reflect
regexp
runtime
sort
strconv
strings
sync
syscall
testdata
testing
text
time
unicode
unsafe
vendor/golang_org/x
Make.dist
all.bash
all.bat
all.rc
androidtest.bash
bootstrap.bash
buildall.bash
clean.bash
clean.bat
clean.rc
cmp.bash
iostest.bash
make.bash
make.bat
make.rc
naclmake.bash
nacltest.bash
race.bash
race.bat
run.bash
run.bat
run.rc