1. 29 Sep, 2016 5 commits
  2. 28 Sep, 2016 8 commits
  3. 27 Sep, 2016 21 commits
  4. 26 Sep, 2016 6 commits
    • runtime: optimize defer code · f8b2314c
      Austin Clements authored
      This optimizes deferproc and deferreturn in various ways.
      
      The most important optimization is that it more carefully arranges to
      prevent preemption or stack growth. Currently we do this by switching
      to the system stack on every deferproc and every deferreturn. While we
      need to be on the system stack for the slow path of allocating and
      freeing defers, in the common case we can fit in the nosplit stack.
      Hence, this change pushes the system stack switch down into the slow
      paths and makes everything now exposed to the user stack nosplit. This
      also eliminates the need for various acquirem/releasem pairs, since we
      are now preventing preemption by preventing stack split checks.
      
      As another smaller optimization, we special case the common cases of
      zero-sized and pointer-sized defer frames to respectively skip the
      copy and perform the copy in line instead of calling memmove.
      
      This speeds up the runtime defer benchmark by 42%:
      
      name           old time/op  new time/op  delta
      Defer-4        75.1ns ± 1%  43.3ns ± 1%  -42.31%   (p=0.000 n=8+10)
      
      In reality, this speeds up defer by about 2.2X. The two benchmarks
      below compare a Lock/defer Unlock pair (DeferLock) with a Lock/Unlock
      pair (NoDeferLock). NoDeferLock establishes a baseline cost, so these
      two benchmarks together show that this change reduces the overhead of
      defer from 61.4ns to 27.9ns.
      
      name           old time/op  new time/op  delta
      DeferLock-4    77.4ns ± 1%  43.9ns ± 1%  -43.31%  (p=0.000 n=10+10)
      NoDeferLock-4  16.0ns ± 0%  15.9ns ± 0%   -0.39%    (p=0.000 n=9+8)
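      A comparison of this shape can be reproduced outside the runtime with
      testing.Benchmark. The sketch below is an assumption about what such a
      pair of benchmarks might look like, not the benchmark code used for
      the numbers above; withDefer, withoutDefer, mu, and counter are
      hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

var (
	mu      sync.Mutex
	counter int
)

// withDefer releases the lock through a deferred call.
func withDefer() {
	mu.Lock()
	defer mu.Unlock()
	counter++
}

// withoutDefer releases the lock with a direct call and serves as the baseline.
func withoutDefer() {
	mu.Lock()
	counter++
	mu.Unlock()
}

func main() {
	deferLock := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			withDefer()
		}
	})
	noDeferLock := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			withoutDefer()
		}
	})
	fmt.Println("DeferLock:  ", deferLock)
	fmt.Println("NoDeferLock:", noDeferLock)
	// Subtracting the baseline isolates the approximate cost of defer itself.
	fmt.Printf("approx. defer overhead: %d ns/op\n",
		deferLock.NsPerOp()-noDeferLock.NsPerOp())
}
```

      Results depend heavily on the toolchain and machine. Later Go
      releases changed how defer is compiled, so the measured overhead on a
      current toolchain will not match the figures in this message.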
      
      This also shaves 34ns off cgo calls:
      
      name       old time/op  new time/op  delta
      CgoNoop-4   122ns ± 1%  88.3ns ± 1%  -27.72%  (p=0.000 n=8+9)
      
      Updates #14939, #16051.
      
      Change-Id: I2baa0dea378b7e4efebbee8fca919a97d5e15f38
      Reviewed-on: https://go-review.googlesource.com/29656
      Reviewed-by: Keith Randall <khr@golang.org>
    • runtime: implement getcallersp in Go · d211c2d3
      Austin Clements authored
      This makes it possible to inline getcallersp. getcallersp is on the
      hot path of defers, so this slightly speeds up defer:
      
      name           old time/op  new time/op  delta
      Defer-4        78.3ns ± 2%  75.1ns ± 1%  -4.00%   (p=0.000 n=9+8)
      
      Updates #14939.
      
      Change-Id: Icc1cc4cd2f0a81fc4c8344432d0b2e783accacdd
      Reviewed-on: https://go-review.googlesource.com/29655
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      Reviewed-by: David Crawshaw <crawshaw@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • runtime: update malloc.go documentation · aaf4099a
      Austin Clements authored
      The big documentation comment at the top of malloc.go has gotten
      woefully out of date. Update it.
      
      Change-Id: Ibdb1bdcfdd707a6dc9db79d0633a36a28882301b
      Reviewed-on: https://go-review.googlesource.com/29731
      Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
      Reviewed-by: Rick Hudson <rlh@golang.org>
    • runtime: document MemStats · f67c9de6
      Austin Clements authored
      This documents all fields in MemStats and more clearly documents where
      mstats differs from MemStats.
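      For readers unfamiliar with the type, a minimal sketch of reading a
      few of the documented fields follows. runtime.ReadMemStats and the
      field names are part of the public runtime package; snapshot is a
      hypothetical helper introduced only for this example.

```go
package main

import (
	"fmt"
	"runtime"
)

// snapshot returns a copy of the current runtime memory statistics.
func snapshot() runtime.MemStats {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m
}

func main() {
	m := snapshot()
	fmt.Printf("Sys         = %d bytes obtained from the OS\n", m.Sys)
	fmt.Printf("HeapAlloc   = %d bytes of allocated heap objects\n", m.HeapAlloc)
	fmt.Printf("HeapSys     = %d bytes of heap memory obtained from the OS\n", m.HeapSys)
	fmt.Printf("HeapObjects = %d allocated heap objects\n", m.HeapObjects)
	fmt.Printf("NumGC       = %d completed GC cycles\n", m.NumGC)
}
```

      ReadMemStats briefly stops the world, so it is better suited to
      periodic diagnostics than to a hot path.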
      
      Fixes #15849.
      
      Change-Id: Ie09374bcdb3a5fdd2d25fe4bba836aaae92cb1dd
      Reviewed-on: https://go-review.googlesource.com/28972
      Reviewed-by: Rob Pike <r@golang.org>
      Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
    • runtime: eliminate memstats.heap_reachable · 2098e5d3
      Austin Clements authored
      We used to compute an estimate of the reachable heap size that was
      different from the marked heap size. This ultimately caused more
      problems than it solved, so we pulled it out, but memstats still has
      both heap_reachable and heap_marked, and there are some leftover TODOs
      about the problems with this estimate.
      
      Clean this up by eliminating heap_reachable in favor of heap_marked
      and deleting the stale TODOs.
      
      Change-Id: I713bc20a7c90683d2b43ff63c0b21a440269cc4d
      Reviewed-on: https://go-review.googlesource.com/29271
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
    • runtime: disentangle next_gc from GC trigger · ec9c84c8
      Austin Clements authored
      Back in Go 1.4, memstats.next_gc was both the heap size at which GC
      would trigger, and the size GC kept the heap under. When we switched
      to concurrent GC in Go 1.5, we got somewhat confused and made this
      variable the trigger heap size, while gcController.heapGoal became the
      goal heap size.
      
      memstats.next_gc is exposed to the user via MemStats.NextGC, while
      gcController.heapGoal is not. This is unfortunate because 1) the heap
      goal is far more useful for diagnostics, and 2) the trigger heap size
      is just part of the GC trigger heuristic, which means it wouldn't be
      useful to an application even if it tried to use it.
      
      We never noticed this mess because MemStats.NextGC is practically
      undocumented. Now that we're trying to document MemStats, it became
      clear that this field had diverged from its original usefulness.
      
      Clean up this mess by shuffling things back around so that next_gc is
      the goal heap size and the new (unexposed) memstats.gc_trigger field
      is the trigger heap size. This eliminates gcController.heapGoal.
      
      Updates #15849.
      
      Change-Id: I2cbbd43b1d78bdf613cb43f53488bd63913189b7
      Reviewed-on: https://go-review.googlesource.com/29270
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
      Reviewed-by: Rick Hudson <rlh@golang.org>