• Ben Shi's avatar
    cmd/compile: optimize MOVBS/MOVBU/MOVHS/MOVHU on ARMv6 and ARMv7 · 78ea9a71
    Ben Shi authored
    MOVBS/MOVBU/MOVHS/MOVHU can be optimized with a single instruction
    on ARMv6 and ARMv7, instead of a pair of left/right shifts.
    
    The benchmark tests show big improvement in special cases and a little
    improvement in total.
    
    1. A special case gets about 29% improvement.
    name                     old time/op    new time/op    delta
    TypePro-4                  3.81ms ± 1%    2.71ms ± 1%  -28.97%  (p=0.000 n=26+25)
    The source code of this case can be found at
    https://github.com/benshi001/ugo1/blob/master/typepromotion_test.go
    
    2. There is a little improvement in the go1 benchmark, excluding the noise.
    name                     old time/op    new time/op    delta
    BinaryTree17-4              42.1s ± 3%     42.1s ± 2%    ~     (p=0.883 n=28+30)
    Fannkuch11-4                24.3s ± 4%     24.7s ± 7%  +1.64%  (p=0.026 n=30+30)
    FmtFprintfEmpty-4           833ns ± 2%     835ns ± 2%    ~     (p=0.371 n=26+28)
    FmtFprintfString-4         1.36µs ± 3%    1.35µs ± 1%    ~     (p=0.202 n=26+23)
    FmtFprintfInt-4            1.42µs ± 3%    1.43µs ± 1%  +0.66%  (p=0.000 n=26+27)
    FmtFprintfIntInt-4         2.10µs ± 1%    2.10µs ± 2%    ~     (p=0.104 n=25+26)
    FmtFprintfPrefixedInt-4    2.37µs ± 2%    2.33µs ± 1%  -1.75%  (p=0.000 n=25+28)
    FmtFprintfFloat-4          4.50µs ± 0%    4.37µs ± 1%  -2.81%  (p=0.000 n=23+25)
    FmtManyArgs-4              8.08µs ± 0%    8.13µs ± 3%    ~     (p=0.160 n=23+26)
    GobDecode-4                 102ms ± 4%     103ms ± 4%  +1.08%  (p=0.001 n=28+26)
    GobEncode-4                96.0ms ± 2%    95.2ms ± 3%  -0.81%  (p=0.000 n=24+25)
    Gzip-4                      4.17s ± 3%     4.11s ± 2%  -1.45%  (p=0.000 n=25+25)
    Gunzip-4                    597ms ± 2%     594ms ± 2%  -0.57%  (p=0.000 n=24+26)
    HTTPClientServer-4          708µs ± 4%     708µs ± 4%    ~     (p=0.852 n=28+28)
    JSONEncode-4                241ms ± 1%     245ms ± 3%  +1.62%  (p=0.000 n=27+28)
    JSONDecode-4                906ms ± 3%     889ms ± 3%  -1.85%  (p=0.000 n=23+24)
    Mandelbrot200-4            41.8ms ± 1%    41.8ms ± 1%    ~     (p=0.929 n=25+24)
    GoParse-4                  47.1ms ± 2%    45.3ms ± 4%  -3.80%  (p=0.000 n=28+24)
    RegexpMatchEasy0_32-4      1.27µs ± 2%    1.28µs ± 1%  +0.77%  (p=0.000 n=26+28)
    RegexpMatchEasy0_1K-4      8.08µs ± 9%    7.83µs ±10%  -3.10%  (p=0.012 n=26+26)
    RegexpMatchEasy1_32-4      1.29µs ± 5%    1.29µs ± 2%    ~     (p=0.301 n=26+29)
    RegexpMatchEasy1_1K-4      10.5µs ± 4%    10.3µs ± 5%  -1.95%  (p=0.003 n=26+26)
    RegexpMatchMedium_32-4     1.94µs ± 1%    1.95µs ± 1%    ~     (p=0.251 n=24+27)
    RegexpMatchMedium_1K-4      502µs ± 2%     502µs ± 2%    ~     (p=0.336 n=25+28)
    RegexpMatchHard_32-4       26.7µs ± 1%    26.6µs ± 3%    ~     (p=0.454 n=27+26)
    RegexpMatchHard_1K-4        801µs ± 3%     799µs ± 2%    ~     (p=0.097 n=24+26)
    Revcomp-4                  73.5ms ± 5%    73.2ms ± 3%    ~     (p=0.240 n=26+26)
    Template-4                  1.07s ± 2%     1.05s ± 1%  -2.39%  (p=0.000 n=26+24)
    TimeParse-4                6.87µs ± 1%    6.85µs ± 1%    ~     (p=0.094 n=28+23)
    TimeFormat-4               13.4µs ± 1%    13.4µs ± 1%    ~     (p=0.664 n=25+29)
    [Geo mean]                  717µs          713µs       -0.54%
    
    name                     old speed      new speed      delta
    GobDecode-4              7.52MB/s ± 4%  7.44MB/s ± 4%  -1.10%  (p=0.001 n=28+26)
    GobEncode-4              7.99MB/s ± 2%  8.06MB/s ± 3%  +0.81%  (p=0.000 n=24+25)
    Gzip-4                   4.66MB/s ± 3%  4.72MB/s ± 2%  +1.43%  (p=0.000 n=25+25)
    Gunzip-4                 32.5MB/s ± 2%  32.7MB/s ± 2%  +0.56%  (p=0.001 n=24+26)
    JSONEncode-4             8.04MB/s ± 1%  7.92MB/s ± 3%  -1.59%  (p=0.000 n=27+28)
    JSONDecode-4             2.14MB/s ± 3%  2.18MB/s ± 3%  +1.90%  (p=0.000 n=23+24)
    GoParse-4                1.23MB/s ± 3%  1.28MB/s ± 4%  +4.23%  (p=0.000 n=30+24)
    RegexpMatchEasy0_32-4    25.2MB/s ± 2%  25.0MB/s ± 1%  -0.76%  (p=0.000 n=26+28)
    RegexpMatchEasy0_1K-4     127MB/s ± 8%   131MB/s ± 9%  +3.29%  (p=0.012 n=26+26)
    RegexpMatchEasy1_32-4    24.8MB/s ± 5%  24.8MB/s ± 2%    ~     (p=0.339 n=26+29)
    RegexpMatchEasy1_1K-4    97.9MB/s ± 4%  99.8MB/s ± 5%  +1.98%  (p=0.004 n=26+26)
    RegexpMatchMedium_32-4    514kB/s ± 3%   515kB/s ± 3%    ~     (p=0.391 n=28+28)
    RegexpMatchMedium_1K-4   2.04MB/s ± 2%  2.04MB/s ± 2%    ~     (p=0.517 n=25+28)
    RegexpMatchHard_32-4     1.20MB/s ± 3%  1.20MB/s ± 3%    ~     (p=0.203 n=28+28)
    RegexpMatchHard_1K-4     1.28MB/s ± 3%  1.28MB/s ± 2%    ~     (p=0.499 n=24+26)
    Revcomp-4                34.6MB/s ± 4%  34.7MB/s ± 3%    ~     (p=0.245 n=26+26)
    Template-4               1.81MB/s ± 2%  1.85MB/s ± 3%  +2.30%  (p=0.000 n=26+25)
    [Geo mean]               6.82MB/s       6.88MB/s       +0.84%
    
    fixes #20653
    
    Change-Id: Ief0d6e726e517e51ae511325b21ee72598e759ff
    Reviewed-on: https://go-review.googlesource.com/71992Reviewed-by: 's avatarCherry Zhang <cherryyz@google.com>
    Run-TryBot: Cherry Zhang <cherryyz@google.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    78ea9a71
Name
Last commit
Last update
.github Loading commit data...
api Loading commit data...
doc Loading commit data...
lib/time Loading commit data...
misc Loading commit data...
src Loading commit data...
test Loading commit data...
.gitattributes Loading commit data...
.gitignore Loading commit data...
AUTHORS Loading commit data...
CONTRIBUTING.md Loading commit data...
CONTRIBUTORS Loading commit data...
LICENSE Loading commit data...
PATENTS Loading commit data...
README.md Loading commit data...
favicon.ico Loading commit data...
robots.txt Loading commit data...