    cmd/compile: compiler support for buffered write barrier · 7e343134
    Austin Clements authored
    This CL implements the compiler support for calling the buffered write
    barrier added by the previous CL.
    
    Since the buffered write barrier is only implemented on amd64 right
    now, this still supports the old, eager write barrier as well. There's
    little overhead to supporting both, and this way a few tests in
    test/fixedbugs that expect to have liveness maps at write barrier
    calls can easily opt in to the old, eager barrier.
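
As a rough illustration of the difference between the eager and buffered
approaches described above, here is a toy model in Go. It is not the runtime's
implementation: the wbBuf type, its capacity of 4, and the shaded slice are
all hypothetical names chosen for this sketch. It only shows the general idea
that a buffered barrier records pointer writes cheaply and enters the slow
path once per full buffer, whereas an eager barrier enters it on every write.

```go
package main

import "fmt"

// wbBuf is a hypothetical, fixed-size buffer of pending pointer writes.
// Each entry holds the new pointer value and the value it overwrote.
type wbBuf struct {
	entries [4][2]uintptr // capacity of 4 is an arbitrary choice for the sketch
	n       int           // number of pending entries
	flushes int           // how many times the slow path ran
	shaded  []uintptr     // stand-in for "pointers handed to the collector"
}

// record is the fast path: append the pair and flush only when full.
func (b *wbBuf) record(newVal, oldVal uintptr) {
	b.entries[b.n] = [2]uintptr{newVal, oldVal}
	b.n++
	if b.n == len(b.entries) {
		b.flush()
	}
}

// flush is the slow path: process every pending entry in one batch.
func (b *wbBuf) flush() {
	for _, e := range b.entries[:b.n] {
		b.shaded = append(b.shaded, e[0], e[1])
	}
	b.n = 0
	b.flushes++
}

func main() {
	const writes = 10

	// Eager model: one slow-path call per pointer write.
	eagerCalls := 0
	for i := 0; i < writes; i++ {
		eagerCalls++
	}

	// Buffered model: one slow-path call per full buffer.
	b := &wbBuf{}
	for i := 0; i < writes; i++ {
		b.record(uintptr(i+1), uintptr(i))
	}

	fmt.Println("eager slow-path calls:", eagerCalls)
	fmt.Println("buffered slow-path calls:", b.flushes, "pending:", b.n)
}
```

In this model the number of slow-path entries drops from one per write to one
per full buffer, which is the kind of saving the benchmark numbers in this
message measure for the real barrier.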
    
    This significantly improves the performance of the write barrier:
    
    name             old time/op  new time/op  delta
    WriteBarrier-12  73.5ns ±20%  19.2ns ±27%  -73.90%  (p=0.000 n=19+18)
    
    It also reduces the size of binaries because the write barrier call is
    more compact:
    
    name        old object-bytes  new object-bytes  delta
    Template           398k ± 0%         393k ± 0%  -1.14%  (p=0.008 n=5+5)
    Unicode            208k ± 0%         206k ± 0%  -1.00%  (p=0.008 n=5+5)
    GoTypes           1.18M ± 0%        1.15M ± 0%  -2.00%  (p=0.008 n=5+5)
    Compiler          4.05M ± 0%        3.88M ± 0%  -4.26%  (p=0.008 n=5+5)
    SSA               8.25M ± 0%        8.11M ± 0%  -1.59%  (p=0.008 n=5+5)
    Flate              228k ± 0%         224k ± 0%  -1.83%  (p=0.008 n=5+5)
    GoParser           295k ± 0%         284k ± 0%  -3.62%  (p=0.008 n=5+5)
    Reflect           1.00M ± 0%        0.99M ± 0%  -0.70%  (p=0.008 n=5+5)
    Tar                339k ± 0%         333k ± 0%  -1.67%  (p=0.008 n=5+5)
    XML                404k ± 0%         395k ± 0%  -2.10%  (p=0.008 n=5+5)
    [Geo mean]         704k              690k       -2.00%
    
    name        old exe-bytes     new exe-bytes     delta
    HelloSize         1.05M ± 0%        1.04M ± 0%  -1.55%  (p=0.008 n=5+5)
    
    https://perf.golang.org/search?q=upload:20171027.1
    
    (Amusingly, this also reduces compiler allocations by 0.75%, which,
    combined with the better write barrier, speeds up the compiler overall
    by 2.10%. See the perf link.)
    
    It slightly improves the performance of most of the go1 benchmarks and
    improves the performance of the x/benchmarks:
    
    name                      old time/op    new time/op    delta
    BinaryTree17-12              2.40s ± 1%     2.47s ± 1%  +2.69%  (p=0.000 n=19+19)
    Fannkuch11-12                2.95s ± 0%     2.95s ± 0%  +0.21%  (p=0.000 n=20+19)
    FmtFprintfEmpty-12          41.8ns ± 4%    41.4ns ± 2%  -1.03%  (p=0.014 n=20+20)
    FmtFprintfString-12         68.7ns ± 2%    67.5ns ± 1%  -1.75%  (p=0.000 n=20+17)
    FmtFprintfInt-12            79.0ns ± 3%    77.1ns ± 1%  -2.40%  (p=0.000 n=19+17)
    FmtFprintfIntInt-12          127ns ± 1%     123ns ± 3%  -3.42%  (p=0.000 n=20+20)
    FmtFprintfPrefixedInt-12     152ns ± 1%     150ns ± 1%  -1.02%  (p=0.000 n=18+17)
    FmtFprintfFloat-12           211ns ± 1%     209ns ± 0%  -0.99%  (p=0.000 n=20+16)
    FmtManyArgs-12               500ns ± 0%     496ns ± 0%  -0.73%  (p=0.000 n=17+20)
    GobDecode-12                6.44ms ± 1%    6.53ms ± 0%  +1.28%  (p=0.000 n=20+19)
    GobEncode-12                5.46ms ± 0%    5.46ms ± 1%    ~     (p=0.550 n=19+20)
    Gzip-12                      220ms ± 1%     216ms ± 0%  -1.75%  (p=0.000 n=19+19)
    Gunzip-12                   38.8ms ± 0%    38.6ms ± 0%  -0.30%  (p=0.000 n=18+19)
    HTTPClientServer-12         79.0µs ± 1%    78.2µs ± 1%  -1.01%  (p=0.000 n=20+20)
    JSONEncode-12               11.9ms ± 0%    11.9ms ± 0%  -0.29%  (p=0.000 n=20+19)
    JSONDecode-12               52.6ms ± 0%    52.2ms ± 0%  -0.68%  (p=0.000 n=19+20)
    Mandelbrot200-12            3.69ms ± 0%    3.68ms ± 0%  -0.36%  (p=0.000 n=20+20)
    GoParse-12                  3.13ms ± 1%    3.18ms ± 1%  +1.67%  (p=0.000 n=19+20)
    RegexpMatchEasy0_32-12      73.2ns ± 1%    72.3ns ± 1%  -1.19%  (p=0.000 n=19+18)
    RegexpMatchEasy0_1K-12       241ns ± 0%     239ns ± 0%  -0.83%  (p=0.000 n=17+16)
    RegexpMatchEasy1_32-12      68.6ns ± 1%    69.0ns ± 1%  +0.47%  (p=0.015 n=18+16)
    RegexpMatchEasy1_1K-12       364ns ± 0%     361ns ± 0%  -0.67%  (p=0.000 n=16+17)
    RegexpMatchMedium_32-12      104ns ± 1%     103ns ± 1%  -0.79%  (p=0.001 n=20+15)
    RegexpMatchMedium_1K-12     33.8µs ± 3%    34.0µs ± 2%    ~     (p=0.267 n=20+19)
    RegexpMatchHard_32-12       1.64µs ± 1%    1.62µs ± 2%  -1.25%  (p=0.000 n=19+18)
    RegexpMatchHard_1K-12       49.2µs ± 0%    48.7µs ± 1%  -0.93%  (p=0.000 n=19+18)
    Revcomp-12                   391ms ± 5%     396ms ± 7%    ~     (p=0.154 n=19+19)
    Template-12                 63.1ms ± 0%    59.5ms ± 0%  -5.76%  (p=0.000 n=18+19)
    TimeParse-12                 307ns ± 0%     306ns ± 0%  -0.39%  (p=0.000 n=19+17)
    TimeFormat-12                325ns ± 0%     323ns ± 0%  -0.50%  (p=0.000 n=19+19)
    [Geo mean]                  47.3µs         46.9µs       -0.67%
    
    https://perf.golang.org/search?q=upload:20171026.1
    
    name                       old time/op  new time/op  delta
    Garbage/benchmem-MB=64-12  2.25ms ± 1%  2.20ms ± 1%  -2.31%  (p=0.000 n=18+18)
    HTTP-12                    12.6µs ± 0%  12.6µs ± 0%  -0.72%  (p=0.000 n=18+17)
    JSON-12                    11.0ms ± 0%  11.0ms ± 1%  -0.68%  (p=0.000 n=17+19)
    
    https://perf.golang.org/search?q=upload:20171026.2
    
    Updates #14951.
    Updates #22460.
    
    Change-Id: Id4c0932890a1d41020071bec73b8522b1367d3e7
    Reviewed-on: https://go-review.googlesource.com/73712
    Run-TryBot: Austin Clements <austin@google.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: Cherry Zhang <cherryyz@google.com>
issue15747.go 1.42 KB