    cmd/compile: add patterns for bit set/clear/complement on amd64 · 79112707
    Giovanni Bajo authored
    This patch completes the implementation of BT(Q|L) and adds support
    for BT(S|R|C)(Q|L).
    
    Example of code changes from time.(*Time).addSec (before, then after):
    
            if t.wall&hasMonotonic != 0 {
      0x1073465               488b08                  MOVQ 0(AX), CX
      0x1073468               4889ca                  MOVQ CX, DX
      0x107346b               48c1e93f                SHRQ $0x3f, CX
      0x107346f               48c1e13f                SHLQ $0x3f, CX
      0x1073473               48f7c1ffffffff          TESTQ $-0x1, CX
      0x107347a               746b                    JE 0x10734e7
    
            if t.wall&hasMonotonic != 0 {
      0x1073435               488b08                  MOVQ 0(AX), CX
      0x1073438               480fbae13f              BTQ $0x3f, CX
      0x107343d               7363                    JAE 0x10734a2
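As a rough illustration of the source shape behind the listing above, here is a minimal Go sketch of single-bit tests. The constant `hasMonotonic` mirrors the one in package `time` (bit 63, consistent with the `BTQ $0x3f` shown); the function names `hasBit63` and `hasBit` are hypothetical and exist only for this example. Whether a given expression is actually lowered to `BTQ`/`BTL` depends on the compiler version and its rewrite rules, so treat this as the kind of code the patch targets, not a guarantee of the emitted instructions.

```go
package main

import "fmt"

// hasMonotonic mirrors the flag of the same name in package time: bit 63.
const hasMonotonic = 1 << 63

// hasBit63 (hypothetical name) tests a constant bit, the same shape as
// `t.wall&hasMonotonic != 0` in the commit message.
func hasBit63(wall uint64) bool {
	return wall&hasMonotonic != 0
}

// hasBit (hypothetical name) tests a variable bit index. Masking n with 63
// keeps the shift count in range for a 64-bit operand.
func hasBit(x uint64, n uint) bool {
	return x&(1<<(n&63)) != 0
}

func main() {
	fmt.Println(hasBit63(1<<63), hasBit63(1))
	fmt.Println(hasBit(0x4, 2), hasBit(0x4, 3))
}
```

To see what your own toolchain emits for such functions, building the program and running `go tool objdump -S` on the binary produces listings in the same format as the ones in this message.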
    
    Another example (before, then after):
    
                            t.wall = t.wall&nsecMask | uint64(dsec)<<nsecShift | hasMonotonic
      0x10734c8               4881e1ffffff3f          ANDQ $0x3fffffff, CX
      0x10734cf               48c1e61e                SHLQ $0x1e, SI
      0x10734d3               4809ce                  ORQ CX, SI
      0x10734d6               48b90000000000000080    MOVQ $0x8000000000000000, CX
      0x10734e0               4809f1                  ORQ SI, CX
      0x10734e3               488908                  MOVQ CX, 0(AX)
    
                            t.wall = t.wall&nsecMask | uint64(dsec)<<nsecShift | hasMonotonic
      0x107348b               4881e2ffffff3f          ANDQ $0x3fffffff, DX
      0x1073492               48c1e61e                SHLQ $0x1e, SI
      0x1073496               4809f2                  ORQ SI, DX
      0x1073499               480fbaea3f              BTSQ $0x3f, DX
      0x107349e               488910                  MOVQ DX, 0(AX)
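The set, clear, and complement forms can be sketched the same way. The constants below mirror `nsecMask`, `nsecShift`, and `hasMonotonic` from package `time` (they agree with the `$0x3fffffff`, `$0x1e`, and `$0x3f` operands in the listing above). The function names `pack`, `setBit`, `clearBit`, and `flipBit` are hypothetical, and `pack` takes `dsec` as a `uint64` directly as a simplification of the original `uint64(dsec)` conversion. This only shows the Go-level operations that correspond to BTS, BTR, and BTC; which instructions a particular compiler version emits is not guaranteed by this sketch.

```go
package main

import "fmt"

// Constants mirroring package time's wall-clock encoding.
const (
	nsecMask     = 1<<30 - 1
	nsecShift    = 30
	hasMonotonic = 1 << 63
)

// pack (hypothetical name) has the same shape as the assignment in the
// commit message: the final OR sets a single high bit (bit 63).
func pack(wall, dsec uint64) uint64 {
	return wall&nsecMask | dsec<<nsecShift | hasMonotonic
}

// setBit sets bit n of x (the operation BTS performs).
func setBit(x uint64, n uint) uint64 { return x | 1<<(n&63) }

// clearBit clears bit n of x (the operation BTR performs).
func clearBit(x uint64, n uint) uint64 { return x &^ (1 << (n & 63)) }

// flipBit complements bit n of x (the operation BTC performs).
func flipBit(x uint64, n uint) uint64 { return x ^ 1<<(n&63) }

func main() {
	fmt.Printf("%#x\n", pack(5, 2))
	fmt.Printf("%#x %#x %#x\n", setBit(0, 63), clearBit(0xff, 0), flipBit(1, 0))
}
```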
    
    Go1 benchmarks seem unaffected, and I would be surprised
    otherwise:
    
    name                     old time/op    new time/op     delta
    BinaryTree17-4              2.64s ± 4%      2.56s ± 9%  -2.92%  (p=0.008 n=9+9)
    Fannkuch11-4                2.90s ± 1%      2.95s ± 3%  +1.76%  (p=0.010 n=10+9)
    FmtFprintfEmpty-4          35.3ns ± 1%     34.5ns ± 2%  -2.34%  (p=0.004 n=9+8)
    FmtFprintfString-4         57.0ns ± 1%     58.4ns ± 5%  +2.52%  (p=0.029 n=9+10)
    FmtFprintfInt-4            59.8ns ± 3%     59.8ns ± 6%    ~     (p=0.565 n=10+10)
    FmtFprintfIntInt-4         93.9ns ± 3%     91.2ns ± 5%  -2.94%  (p=0.014 n=10+9)
    FmtFprintfPrefixedInt-4     107ns ± 6%      104ns ± 6%    ~     (p=0.099 n=10+10)
    FmtFprintfFloat-4           187ns ± 3%      188ns ± 3%    ~     (p=0.505 n=10+9)
    FmtManyArgs-4               410ns ± 1%      415ns ± 6%    ~     (p=0.649 n=8+10)
    GobDecode-4                5.30ms ± 3%     5.27ms ± 3%    ~     (p=0.436 n=10+10)
    GobEncode-4                4.62ms ± 5%     4.47ms ± 2%  -3.24%  (p=0.001 n=9+10)
    Gzip-4                      197ms ± 4%      193ms ± 3%    ~     (p=0.123 n=10+10)
    Gunzip-4                   30.4ms ± 3%     30.1ms ± 3%    ~     (p=0.481 n=10+10)
    HTTPClientServer-4         76.3µs ± 1%     76.0µs ± 1%    ~     (p=0.236 n=8+9)
    JSONEncode-4               10.5ms ± 9%     10.3ms ± 3%    ~     (p=0.280 n=10+10)
    JSONDecode-4               42.3ms ±10%     41.3ms ± 2%    ~     (p=0.053 n=9+10)
    Mandelbrot200-4            3.80ms ± 2%     3.72ms ± 2%  -2.15%  (p=0.001 n=9+10)
    GoParse-4                  2.88ms ±10%     2.81ms ± 2%    ~     (p=0.247 n=10+10)
    RegexpMatchEasy0_32-4      69.5ns ± 4%     68.6ns ± 2%    ~     (p=0.171 n=10+10)
    RegexpMatchEasy0_1K-4       165ns ± 3%      162ns ± 3%    ~     (p=0.137 n=10+10)
    RegexpMatchEasy1_32-4      65.7ns ± 6%     64.4ns ± 2%  -2.02%  (p=0.037 n=10+10)
    RegexpMatchEasy1_1K-4       278ns ± 2%      279ns ± 3%    ~     (p=0.991 n=8+9)
    RegexpMatchMedium_32-4     99.3ns ± 3%     98.5ns ± 4%    ~     (p=0.457 n=10+9)
    RegexpMatchMedium_1K-4     30.1µs ± 1%     30.4µs ± 2%    ~     (p=0.173 n=8+10)
    RegexpMatchHard_32-4       1.40µs ± 2%     1.41µs ± 4%    ~     (p=0.565 n=10+10)
    RegexpMatchHard_1K-4       42.5µs ± 1%     41.5µs ± 3%  -2.13%  (p=0.002 n=8+9)
    Revcomp-4                   332ms ± 4%      328ms ± 5%    ~     (p=0.720 n=9+10)
    Template-4                 48.3ms ± 2%     49.6ms ± 3%  +2.56%  (p=0.002 n=8+10)
    TimeParse-4                 252ns ± 2%      249ns ± 3%    ~     (p=0.116 n=9+10)
    TimeFormat-4                262ns ± 4%      252ns ± 3%  -4.01%  (p=0.000 n=9+10)
    
    name                     old speed      new speed       delta
    GobDecode-4               145MB/s ± 3%    146MB/s ± 3%    ~     (p=0.436 n=10+10)
    GobEncode-4               166MB/s ± 5%    172MB/s ± 2%  +3.28%  (p=0.001 n=9+10)
    Gzip-4                   98.6MB/s ± 4%  100.4MB/s ± 3%    ~     (p=0.123 n=10+10)
    Gunzip-4                  639MB/s ± 3%    645MB/s ± 3%    ~     (p=0.481 n=10+10)
    JSONEncode-4              185MB/s ± 8%    189MB/s ± 3%    ~     (p=0.280 n=10+10)
    JSONDecode-4             46.0MB/s ± 9%   47.0MB/s ± 2%  +2.21%  (p=0.046 n=9+10)
    GoParse-4                20.1MB/s ± 9%   20.6MB/s ± 2%    ~     (p=0.239 n=10+10)
    RegexpMatchEasy0_32-4     460MB/s ± 4%    467MB/s ± 2%    ~     (p=0.165 n=10+10)
    RegexpMatchEasy0_1K-4    6.19GB/s ± 3%   6.28GB/s ± 3%    ~     (p=0.165 n=10+10)
    RegexpMatchEasy1_32-4     487MB/s ± 5%    497MB/s ± 2%  +2.00%  (p=0.043 n=10+10)
    RegexpMatchEasy1_1K-4    3.67GB/s ± 2%   3.67GB/s ± 3%    ~     (p=0.963 n=8+9)
    RegexpMatchMedium_32-4   10.1MB/s ± 3%   10.1MB/s ± 4%    ~     (p=0.435 n=10+9)
    RegexpMatchMedium_1K-4   34.0MB/s ± 1%   33.7MB/s ± 2%    ~     (p=0.173 n=8+10)
    RegexpMatchHard_32-4     22.9MB/s ± 2%   22.7MB/s ± 4%    ~     (p=0.565 n=10+10)
    RegexpMatchHard_1K-4     24.0MB/s ± 3%   24.7MB/s ± 3%  +2.64%  (p=0.001 n=9+9)
    Revcomp-4                 766MB/s ± 4%    775MB/s ± 5%    ~     (p=0.720 n=9+10)
    Template-4               40.2MB/s ± 2%   39.2MB/s ± 3%  -2.47%  (p=0.002 n=8+10)
    
    The rules match ~1800 times during all.bash.
    
    Fixes #18943
    
    Change-Id: I64be1ada34e89c486dfd935bf429b35652117ed4
    Reviewed-on: https://go-review.googlesource.com/94766
    Run-TryBot: Giovanni Bajo <rasky@develer.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: Keith Randall <khr@golang.org>
Name
.github
api
doc
lib/time
misc
src
test
.gitattributes
.gitignore
AUTHORS
CONTRIBUTING.md
CONTRIBUTORS
LICENSE
PATENTS
README.md
favicon.ico
robots.txt