• Austin Clements's avatar
    runtime: allow 5% mutator assist over 25% background mark · af192a3e
    Austin Clements authored
    Currently, both the background mark worker and the goal GC CPU are
    both fixed at 25%. The trigger controller's goal is to achieve the
    goal CPU usage, and with the previous commit it can actually achieve
    this. But this means there are *no* assists, which sounds ideal but
    actually causes problems for the trigger controller. Since the
    controller can't lower CPU usage below the background mark worker CPU,
    it saturates at the CPU goal and no longer gets feedback, which
    translates into higher variability in heap growth.
    
    This commit fixes this by allowing assists 5% CPU beyond the 25% fixed
    background mark. This avoids saturating the trigger controller, since
    it can now get feedback from both sides of the CPU goal. This leads to
    low variability in both CPU usage and heap growth, at the cost of
    reintroducing a low rate of mark assists.
    
    We also experimented with 20% background plus 5% assist, but 25%+5%
    clearly performed better in benchmarks.
    
    Updates #14951.
    Updates #14812.
    Updates #18534.
    
    Combined with the previous CL, this significantly improves tail
    mutator utilization in the x/bechmarks garbage benchmark. On a sample
    trace, it increased the 99.9%ile mutator utilization at 10ms from 26%
    to 59%, and at 5ms from 17% to 52%. It reduced the 99.9%ile zero
    utilization window from 2ms to 700µs. It also helps the mean mutator
    utilization: it increased the 10s mutator utilization from 83% to 94%.
    The minimum mutator utilization is also somewhat improved, though
    there is still some unknown artifact that causes a miniscule fraction
    of mutator assists to take 5--10ms (in fact, there was exactly one
    10ms mutator assist in my sample trace).
    
    This has no significant effect on the throughput of the
    github.com/dr2chase/bent benchmarks-50.
    
    This has little effect on the go1 benchmarks (and the slight overall
    improvement makes up for the slight overall slowdown from the previous
    commit):
    
    name                      old time/op    new time/op    delta
    BinaryTree17-12              2.40s ± 0%     2.41s ± 1%  +0.26%  (p=0.010 n=18+19)
    Fannkuch11-12                2.95s ± 0%     2.93s ± 0%  -0.62%  (p=0.000 n=18+15)
    FmtFprintfEmpty-12          42.2ns ± 0%    42.3ns ± 1%  +0.37%  (p=0.001 n=15+14)
    FmtFprintfString-12         67.9ns ± 2%    67.2ns ± 3%  -1.03%  (p=0.002 n=20+18)
    FmtFprintfInt-12            75.6ns ± 3%    76.8ns ± 2%  +1.59%  (p=0.000 n=19+17)
    FmtFprintfIntInt-12          123ns ± 1%     124ns ± 1%  +0.77%  (p=0.000 n=17+14)
    FmtFprintfPrefixedInt-12     148ns ± 1%     150ns ± 1%  +1.28%  (p=0.000 n=20+20)
    FmtFprintfFloat-12           212ns ± 0%     211ns ± 1%  -0.67%  (p=0.000 n=16+17)
    FmtManyArgs-12               499ns ± 1%     500ns ± 0%  +0.23%  (p=0.004 n=19+16)
    GobDecode-12                6.49ms ± 1%    6.51ms ± 1%  +0.32%  (p=0.008 n=19+19)
    GobEncode-12                5.47ms ± 0%    5.43ms ± 1%  -0.68%  (p=0.000 n=19+20)
    Gzip-12                      220ms ± 1%     216ms ± 1%  -1.66%  (p=0.000 n=20+19)
    Gunzip-12                   38.8ms ± 0%    38.5ms ± 0%  -0.80%  (p=0.000 n=19+20)
    HTTPClientServer-12         78.5µs ± 1%    78.1µs ± 1%  -0.53%  (p=0.008 n=20+19)
    JSONEncode-12               12.2ms ± 0%    11.9ms ± 0%  -2.38%  (p=0.000 n=17+19)
    JSONDecode-12               52.3ms ± 0%    53.3ms ± 0%  +1.84%  (p=0.000 n=19+20)
    Mandelbrot200-12            3.69ms ± 0%    3.69ms ± 0%  -0.19%  (p=0.000 n=19+19)
    GoParse-12                  3.17ms ± 1%    3.19ms ± 1%  +0.61%  (p=0.000 n=20+20)
    RegexpMatchEasy0_32-12      73.7ns ± 0%    73.2ns ± 1%  -0.66%  (p=0.000 n=17+20)
    RegexpMatchEasy0_1K-12       238ns ± 0%     239ns ± 0%  +0.32%  (p=0.000 n=17+16)
    RegexpMatchEasy1_32-12      69.1ns ± 1%    69.2ns ± 1%    ~     (p=0.669 n=19+13)
    RegexpMatchEasy1_1K-12       365ns ± 1%     367ns ± 1%  +0.49%  (p=0.000 n=19+19)
    RegexpMatchMedium_32-12      104ns ± 1%     105ns ± 1%  +1.33%  (p=0.000 n=16+20)
    RegexpMatchMedium_1K-12     33.6µs ± 3%    34.1µs ± 4%  +1.67%  (p=0.001 n=20+20)
    RegexpMatchHard_32-12       1.67µs ± 1%    1.62µs ± 1%  -2.78%  (p=0.000 n=18+17)
    RegexpMatchHard_1K-12       50.3µs ± 2%    48.7µs ± 1%  -3.09%  (p=0.000 n=19+18)
    Revcomp-12                   384ms ± 0%     386ms ± 0%  +0.59%  (p=0.000 n=19+19)
    Template-12                 61.1ms ± 1%    60.5ms ± 1%  -1.02%  (p=0.000 n=19+20)
    TimeParse-12                 307ns ± 0%     303ns ± 1%  -1.23%  (p=0.000 n=19+15)
    TimeFormat-12                323ns ± 0%     323ns ± 0%  -0.12%  (p=0.011 n=15+20)
    [Geo mean]                  47.1µs         47.0µs       -0.20%
    
    https://perf.golang.org/search?q=upload:20171030.4
    
    It slightly improve the performance the x/benchmarks:
    
    name                         old time/op  new time/op  delta
    Garbage/benchmem-MB=1024-12  2.29ms ± 3%  2.22ms ± 2%  -2.97%  (p=0.000 n=18+18)
    Garbage/benchmem-MB=64-12    2.24ms ± 2%  2.21ms ± 2%  -1.64%  (p=0.000 n=18+18)
    HTTP-12                      12.6µs ± 1%  12.6µs ± 1%    ~     (p=0.690 n=19+17)
    JSON-12                      11.3ms ± 2%  11.3ms ± 1%    ~     (p=0.163 n=17+18)
    
    and fixes some of the heap size bloat caused by the previous commit:
    
    name                         old peak-RSS-bytes  new peak-RSS-bytes  delta
    Garbage/benchmem-MB=1024-12          1.88G ± 2%          1.77G ± 2%  -5.52%  (p=0.000 n=20+18)
    Garbage/benchmem-MB=64-12             248M ± 8%           226M ± 5%  -8.93%  (p=0.000 n=20+20)
    HTTP-12                              47.0M ±27%          47.2M ±12%    ~     (p=0.512 n=20+20)
    JSON-12                               206M ±11%           206M ±10%    ~     (p=0.841 n=20+20)
    
    https://perf.golang.org/search?q=upload:20171030.5
    
    Combined with the change to add a soft goal in the previous commit,
    the achieves a decent performance improvement on the garbage
    benchmark:
    
    name                         old time/op  new time/op  delta
    Garbage/benchmem-MB=1024-12  2.40ms ± 4%  2.22ms ± 2%  -7.40%  (p=0.000 n=19+18)
    Garbage/benchmem-MB=64-12    2.23ms ± 1%  2.21ms ± 2%  -1.06%  (p=0.000 n=19+18)
    HTTP-12                      12.5µs ± 1%  12.6µs ± 1%    ~     (p=0.330 n=20+17)
    JSON-12                      11.1ms ± 1%  11.3ms ± 1%  +1.87%  (p=0.000 n=16+18)
    
    https://perf.golang.org/search?q=upload:20171030.6
    
    Change-Id: If04ddb57e1e58ef2fb9eec54c290eb4ae4bea121
    Reviewed-on: https://go-review.googlesource.com/59971
    Run-TryBot: Austin Clements <austin@google.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: 's avatarRick Hudson <rlh@golang.org>
    af192a3e
mgc.go 72.5 KB