• Austin Clements's avatar
    runtime: deduct correct sweep credit · 6383fb61
    Austin Clements authored
    deductSweepCredit expects the size in bytes of the span being
    allocated, but mCentral_CacheSpan passes the size of a single object
    in the span. As a result, we don't sweep enough on that call and when
    mCentral_CacheSpan later calls reimburseSweepCredit, it's very likely
    to underflow mheap_.spanBytesAlloc, which causes the next call to
    deductSweepCredit to think it owes a huge number of pages and finish
    off the whole sweep.
    
    In addition to causing the occasional allocation that triggers the
    full sweep to be potentially extremely expensive relative to other
    allocations, this can indirectly slow down many other allocations.
    deductSweepCredit uses sweepone to sweep spans, which returns
    fully-unused spans to the heap, where these spans are freed and
    coalesced with neighboring free spans. On the other hand, when
    mCentral_CacheSpan sweeps a span, it does so with the intent to
    immediately reuse that span and, as a result, will not return the span
    to the heap even if it is fully unused. This saves on the cost of
    locking the heap, finding a span, and initializing that span. For
    example, before this change, with GOMAXPROCS=1 (or the background
    sweeper disabled) BinaryTree17 returned roughly 220K spans to the heap
    and allocated new spans from the heap roughly 232K times. After this
    change, it returns 1.3K spans to the heap and allocates new spans from
    the heap 39K times. (With background sweeping these numbers are
    effectively unchanged because the background sweeper sweeps almost all
    of the spans with sweepone; however, parallel sweeping saves more than
    the cost of allocating spans from the heap.)
    
    Fixes #13535.
    Fixes #13589.
    
    name                      old time/op    new time/op    delta
    BinaryTree17-12              3.03s ± 1%     2.86s ± 4%  -5.61%  (p=0.000 n=18+20)
    Fannkuch11-12                2.48s ± 1%     2.49s ± 1%    ~     (p=0.060 n=17+20)
    FmtFprintfEmpty-12          50.7ns ± 1%    50.9ns ± 1%  +0.43%  (p=0.025 n=15+16)
    FmtFprintfString-12          174ns ± 2%     174ns ± 2%    ~     (p=0.539 n=19+20)
    FmtFprintfInt-12             158ns ± 1%     158ns ± 1%    ~     (p=0.300 n=18+20)
    FmtFprintfIntInt-12          269ns ± 2%     269ns ± 2%    ~     (p=0.784 n=20+18)
    FmtFprintfPrefixedInt-12     233ns ± 1%     234ns ± 1%    ~     (p=0.389 n=18+18)
    FmtFprintfFloat-12           309ns ± 1%     310ns ± 1%  +0.25%  (p=0.048 n=18+18)
    FmtManyArgs-12              1.10µs ± 1%    1.10µs ± 1%    ~     (p=0.259 n=18+19)
    GobDecode-12                7.81ms ± 1%    7.72ms ± 1%  -1.17%  (p=0.000 n=19+19)
    GobEncode-12                6.56ms ± 0%    6.55ms ± 1%    ~     (p=0.433 n=17+19)
    Gzip-12                      318ms ± 2%     317ms ± 1%    ~     (p=0.578 n=19+18)
    Gunzip-12                   42.1ms ± 2%    42.0ms ± 0%  -0.45%  (p=0.007 n=18+16)
    HTTPClientServer-12         63.9µs ± 1%    64.0µs ± 1%    ~     (p=0.146 n=17+19)
    JSONEncode-12               16.4ms ± 1%    16.4ms ± 1%    ~     (p=0.271 n=19+19)
    JSONDecode-12               58.1ms ± 1%    58.0ms ± 1%    ~     (p=0.152 n=18+18)
    Mandelbrot200-12            3.85ms ± 0%    3.85ms ± 0%    ~     (p=0.126 n=19+18)
    GoParse-12                  3.71ms ± 1%    3.64ms ± 1%  -1.86%  (p=0.000 n=20+18)
    RegexpMatchEasy0_32-12       100ns ± 2%     100ns ± 1%    ~     (p=0.588 n=20+20)
    RegexpMatchEasy0_1K-12       346ns ± 1%     347ns ± 1%  +0.27%  (p=0.014 n=17+20)
    RegexpMatchEasy1_32-12      82.9ns ± 3%    83.5ns ± 3%    ~     (p=0.096 n=19+20)
    RegexpMatchEasy1_1K-12       506ns ± 1%     506ns ± 1%    ~     (p=0.530 n=19+19)
    RegexpMatchMedium_32-12      129ns ± 2%     129ns ± 1%    ~     (p=0.566 n=20+19)
    RegexpMatchMedium_1K-12     39.4µs ± 1%    39.4µs ± 1%    ~     (p=0.713 n=19+20)
    RegexpMatchHard_32-12       2.05µs ± 1%    2.06µs ± 1%  +0.36%  (p=0.008 n=18+20)
    RegexpMatchHard_1K-12       61.6µs ± 1%    61.7µs ± 1%    ~     (p=0.286 n=19+20)
    Revcomp-12                   538ms ± 1%     541ms ± 2%    ~     (p=0.081 n=18+19)
    Template-12                 71.5ms ± 2%    71.6ms ± 1%    ~     (p=0.513 n=20+19)
    TimeParse-12                 357ns ± 1%     357ns ± 1%    ~     (p=0.935 n=19+18)
    TimeFormat-12                352ns ± 1%     352ns ± 1%    ~     (p=0.293 n=19+20)
    [Geo mean]                  62.0µs         61.9µs       -0.21%
    
    name              old time/op  new time/op  delta
    XBenchGarbage-12  5.83ms ± 2%  5.86ms ± 3%    ~     (p=0.247 n=19+20)
    
    Change-Id: I790bb530adace27ccf25d372f24a11954b88443c
    Reviewed-on: https://go-review.googlesource.com/17745
    Run-TryBot: Austin Clements <austin@google.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: 's avatarRick Hudson <rlh@golang.org>
    Reviewed-by: 's avatarRuss Cox <rsc@golang.org>
    6383fb61
Name
Last commit
Last update
api Loading commit data...
doc Loading commit data...
lib/time Loading commit data...
misc Loading commit data...
src Loading commit data...
test Loading commit data...
.gitattributes Loading commit data...
.gitignore Loading commit data...
AUTHORS Loading commit data...
CONTRIBUTING.md Loading commit data...
CONTRIBUTORS Loading commit data...
LICENSE Loading commit data...
PATENTS Loading commit data...
README.md Loading commit data...
favicon.ico Loading commit data...
robots.txt Loading commit data...