    cmd/compile/internal/ssa: add patterns for arm64 bitfield opcodes · e244a7a7
    Geoff Berry authored
    Add patterns to match common idioms for EXTR, BFI, BFXIL, SBFIZ, SBFX,
    UBFIZ and UBFX opcodes.
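
For readers unfamiliar with these opcodes, the Go functions below sketch the kind of shift-and-mask source shapes whose semantics correspond to each instruction. The function names, field position (#3) and field width (#8) are illustrative assumptions; the message does not list the exact patterns added, so whether the compiler emits the named instruction for each function is not guaranteed.

```go
package main

import "fmt"

// UBFX-like: take the 8-bit field at bit 3 and move it to bit 0, zero-extended.
func extractUnsigned(x uint64) uint64 { return (x >> 3) & 0xff }

// UBFIZ-like: take the low 8 bits and place them at bit 3, all other bits zero.
func insertInZeros(x uint64) uint64 { return (x & 0xff) << 3 }

// SBFX-like: take the 8-bit field at bit 3 and sign-extend it to 64 bits.
func extractSigned(x int64) int64 { return int64(int8(x >> 3)) }

// SBFIZ-like: sign-extend the low 8 bits, then shift left by 3.
func insertSigned(x int64) int64 { return int64(int8(x)) << 3 }

// BFI-like: replace the 8-bit field at bit 3 of x with the low 8 bits of y.
func bitfieldInsert(x, y uint64) uint64 {
	return (x &^ (0xff << 3)) | ((y & 0xff) << 3)
}

// BFXIL-like: replace the low 8 bits of x with the 8-bit field at bit 3 of y.
func bitfieldExtractInsertLow(x, y uint64) uint64 {
	return (x &^ 0xff) | ((y >> 3) & 0xff)
}

// EXTR-like: take 64 bits out of the 128-bit pair hi:lo, starting at bit 8 of lo.
func extractPair(hi, lo uint64) uint64 { return hi<<56 | lo>>8 }

func main() {
	fmt.Printf("%#x\n", extractUnsigned(0xABCD))
	fmt.Printf("%#x\n", insertInZeros(0x1ff))
	fmt.Println(extractSigned(0x7f8), insertSigned(0x80))
	fmt.Printf("%#x\n", bitfieldInsert(0xffff, 0))
	fmt.Printf("%#x\n", bitfieldExtractInsertLow(0xff00, 0x7f8))
	fmt.Printf("%#x\n", extractPair(0x12, 0x3400))
}
```

Each function is a single expression on purpose: a rewrite rule of this kind matches an expression tree, so splitting the shift and the mask across statements with other uses in between may prevent a match.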
    
    go1 benchmark results on Amberwing:
    name                   old time/op    new time/op    delta
    FmtManyArgs               786ns ± 2%     714ns ± 1%  -9.20%  (p=0.000 n=10+10)
    Gzip                      437ms ± 0%     402ms ± 0%  -7.99%  (p=0.000 n=10+10)
    FmtFprintfIntInt          196ns ± 0%     182ns ± 0%  -7.28%  (p=0.000 n=10+9)
    FmtFprintfPrefixedInt     207ns ± 0%     199ns ± 0%  -3.86%  (p=0.000 n=10+10)
    FmtFprintfFloat           324ns ± 0%     316ns ± 0%  -2.47%  (p=0.000 n=10+8)
    FmtFprintfInt             119ns ± 0%     117ns ± 0%  -1.68%  (p=0.000 n=10+9)
    GobDecode                12.8ms ± 2%    12.6ms ± 1%  -1.62%  (p=0.002 n=10+10)
    JSONDecode               94.4ms ± 1%    93.4ms ± 0%  -1.10%  (p=0.000 n=10+10)
    RegexpMatchEasy0_32       247ns ± 0%     245ns ± 0%  -0.65%  (p=0.000 n=10+10)
    RegexpMatchMedium_32      314ns ± 0%     312ns ± 0%  -0.64%  (p=0.000 n=10+10)
    RegexpMatchEasy0_1K       541ns ± 0%     538ns ± 0%  -0.55%  (p=0.000 n=10+9)
    TimeParse                 450ns ± 1%     448ns ± 1%  -0.42%  (p=0.035 n=9+9)
    RegexpMatchEasy1_32       244ns ± 0%     243ns ± 0%  -0.41%  (p=0.000 n=10+10)
    GoParse                  6.03ms ± 0%    6.00ms ± 0%  -0.40%  (p=0.002 n=10+10)
    RegexpMatchEasy1_1K       779ns ± 0%     777ns ± 0%  -0.26%  (p=0.000 n=10+10)
    RegexpMatchHard_32       2.75µs ± 0%    2.74µs ± 1%  -0.06%  (p=0.026 n=9+9)
    BinaryTree17              11.7s ± 0%     11.6s ± 0%    ~     (p=0.089 n=10+10)
    HTTPClientServer         89.1µs ± 1%    89.5µs ± 2%    ~     (p=0.436 n=10+10)
    RegexpMatchHard_1K       78.9µs ± 0%    79.5µs ± 2%    ~     (p=0.469 n=10+10)
    FmtFprintfEmpty          58.5ns ± 0%    58.5ns ± 0%    ~     (all equal)
    GobEncode                12.0ms ± 1%    12.1ms ± 0%    ~     (p=0.075 n=10+10)
    Revcomp                   669ms ± 0%     668ms ± 0%    ~     (p=0.091 n=7+9)
    Mandelbrot200            5.35ms ± 0%    5.36ms ± 0%  +0.07%  (p=0.000 n=9+9)
    RegexpMatchMedium_1K     52.1µs ± 0%    52.1µs ± 0%  +0.10%  (p=0.000 n=9+9)
    Fannkuch11                3.25s ± 0%     3.26s ± 0%  +0.36%  (p=0.000 n=9+10)
    FmtFprintfString          114ns ± 1%     115ns ± 0%  +0.52%  (p=0.011 n=10+10)
    JSONEncode               20.2ms ± 0%    20.3ms ± 0%  +0.65%  (p=0.000 n=10+10)
    Template                 91.3ms ± 0%    92.3ms ± 0%  +1.08%  (p=0.000 n=10+10)
    TimeFormat                484ns ± 0%     495ns ± 1%  +2.30%  (p=0.000 n=9+10)
    
    There are some opportunities to improve this change further by adding
    patterns to match the "extended register" versions of ADD/SUB/CMP, but I
    think that should be evaluated on its own.  The regressions in Template
    and TimeFormat would likely be recovered by those patterns, as they seem
    to be due to generating:
    
        ubfiz x0, x0, #3, #8
        add x1, x2, x0
    
    instead of
    
        add x1, x2, x0, lsl #3
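
The relationship between the two sequences can be modelled in Go. This is only a sketch of the instruction semantics: `ubfiz`, `addViaUBFIZ` and `addShifted` are hypothetical helper names, and the assumption that the upper bits of x0 are already zero in the shorter sequence is mine, since the message does not say how the value is narrowed there.

```go
package main

import "fmt"

// ubfiz models UBFIZ: keep the low `width` bits of x and place them at bit
// `lsb`, with every other bit zero. Assumes width < 64.
func ubfiz(x uint64, lsb, width uint) uint64 {
	mask := uint64(1)<<width - 1
	return (x & mask) << lsb
}

// addViaUBFIZ models "ubfiz x0, x0, #3, #8" followed by "add x1, x2, x0".
func addViaUBFIZ(x2, x0 uint64) uint64 { return x2 + ubfiz(x0, 3, 8) }

// addShifted models the single instruction "add x1, x2, x0, lsl #3".
func addShifted(x2, x0 uint64) uint64 { return x2 + x0<<3 }

func main() {
	// x0 fits in 8 bits: both forms produce the same sum.
	fmt.Println(addViaUBFIZ(100, 0x12), addShifted(100, 0x12))
	// x0 has bits above bit 7: the forms differ, so the one-instruction
	// form is only a valid substitute when those bits are known to be zero.
	fmt.Println(addViaUBFIZ(100, 0x112), addShifted(100, 0x112))
}
```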
    
    Change-Id: I5644a8d70ac7a98e784a377a2b76ab47a3415a4b
    Reviewed-on: https://go-review.googlesource.com/88355
    Reviewed-by: Cherry Zhang <cherryyz@google.com>
    Run-TryBot: Cherry Zhang <cherryyz@google.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
rewrite.go 21.1 KB