1. 27 Apr, 2016 8 commits
  2. 26 Apr, 2016 23 commits
    • runtime: make stack re-scan O(# dirty stacks) · 2a889b9d
      Austin Clements authored
      Currently the stack re-scan during mark termination is O(# stacks)
      because we enqueue a root marking job for every goroutine. It takes
      ~34ns to process this root marking job for a valid (clean) stack, so
      at around 300k goroutines we exceed the 10ms pause goal. A non-trivial
      portion of this time is spent simply taking the cache miss to check
      the gcscanvalid flag, so simply optimizing the path that handles clean
      stacks can only improve this so much.
      
      Fix this by keeping an explicit list of goroutines with dirty stacks
      that need to be rescanned. When a goroutine first transitions to
      running after a stack scan and marks its stack dirty, it adds itself
      to this list. We enqueue root marking jobs only for the goroutines in
      this list, so this improves stack re-scanning asymptotically by
      completely eliminating time spent on clean goroutines.
      
      This reduces mark termination time for 500k idle goroutines from 15ms
      to 238µs. Overall performance effect is negligible.
      
      name \ 95%ile-time/markTerm     old           new         delta
      IdleGs/gs:500000/gomaxprocs:12  15000µs ± 0%  238µs ± 5%  -98.41% (p=0.000 n=10+10)
      
      name              old time/op  new time/op  delta
      XBenchGarbage-12  2.30ms ± 3%  2.29ms ± 1%  -0.43%  (p=0.049 n=17+18)
      
      name                      old time/op    new time/op    delta
      BinaryTree17-12              2.57s ± 3%     2.59s ± 2%    ~     (p=0.141 n=19+20)
      Fannkuch11-12                2.09s ± 0%     2.10s ± 1%  +0.53%  (p=0.000 n=19+19)
      FmtFprintfEmpty-12          45.3ns ± 3%    45.2ns ± 2%    ~     (p=0.845 n=20+20)
      FmtFprintfString-12          129ns ± 0%     127ns ± 0%  -1.55%  (p=0.000 n=16+16)
      FmtFprintfInt-12             123ns ± 0%     119ns ± 1%  -3.24%  (p=0.000 n=19+19)
      FmtFprintfIntInt-12          195ns ± 1%     189ns ± 1%  -3.11%  (p=0.000 n=17+17)
      FmtFprintfPrefixedInt-12     193ns ± 1%     187ns ± 1%  -3.06%  (p=0.000 n=19+19)
      FmtFprintfFloat-12           254ns ± 0%     255ns ± 1%  +0.35%  (p=0.001 n=14+17)
      FmtManyArgs-12               781ns ± 0%     770ns ± 0%  -1.48%  (p=0.000 n=16+19)
      GobDecode-12                7.00ms ± 1%    6.98ms ± 1%    ~     (p=0.563 n=19+19)
      GobEncode-12                5.91ms ± 1%    5.92ms ± 0%    ~     (p=0.118 n=19+18)
      Gzip-12                      219ms ± 1%     215ms ± 1%  -1.81%  (p=0.000 n=18+18)
      Gunzip-12                   37.2ms ± 0%    37.4ms ± 0%  +0.45%  (p=0.000 n=17+19)
      HTTPClientServer-12         76.9µs ± 3%    77.5µs ± 2%  +0.81%  (p=0.030 n=20+19)
      JSONEncode-12               15.0ms ± 0%    14.8ms ± 1%  -0.88%  (p=0.001 n=15+19)
      JSONDecode-12               50.6ms ± 0%    53.2ms ± 2%  +5.07%  (p=0.000 n=17+19)
      Mandelbrot200-12            4.05ms ± 0%    4.05ms ± 1%    ~     (p=0.581 n=16+17)
      GoParse-12                  3.34ms ± 1%    3.30ms ± 1%  -1.21%  (p=0.000 n=15+20)
      RegexpMatchEasy0_32-12      69.6ns ± 1%    69.8ns ± 2%    ~     (p=0.566 n=19+19)
      RegexpMatchEasy0_1K-12       238ns ± 1%     236ns ± 0%  -0.91%  (p=0.000 n=17+13)
      RegexpMatchEasy1_32-12      69.8ns ± 1%    70.0ns ± 1%  +0.23%  (p=0.026 n=17+16)
      RegexpMatchEasy1_1K-12       371ns ± 1%     363ns ± 1%  -2.07%  (p=0.000 n=19+19)
      RegexpMatchMedium_32-12      107ns ± 2%     106ns ± 1%  -0.51%  (p=0.031 n=18+20)
      RegexpMatchMedium_1K-12     33.0µs ± 0%    32.9µs ± 0%  -0.30%  (p=0.004 n=16+16)
      RegexpMatchHard_32-12       1.70µs ± 0%    1.70µs ± 0%  +0.45%  (p=0.000 n=16+17)
      RegexpMatchHard_1K-12       51.1µs ± 2%    51.4µs ± 1%  +0.53%  (p=0.000 n=17+19)
      Revcomp-12                   378ms ± 1%     385ms ± 1%  +1.92%  (p=0.000 n=19+18)
      Template-12                 64.3ms ± 2%    65.0ms ± 2%  +1.09%  (p=0.001 n=19+19)
      TimeParse-12                 315ns ± 1%     317ns ± 2%    ~     (p=0.108 n=18+20)
      TimeFormat-12                360ns ± 1%     337ns ± 0%  -6.30%  (p=0.000 n=18+13)
      [Geo mean]                  51.8µs         51.6µs       -0.48%
      
      Change-Id: Icf8994671476840e3998236e15407a505d4c760c
      Reviewed-on: https://go-review.googlesource.com/20700
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
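The dirty-stack list described in this commit can be sketched roughly as follows. This is a single-threaded toy model, not the runtime's code: the `g` struct, `rescanList`, `markRunning`, `rescan`, and `simulate` are hypothetical names chosen for illustration, and the real runtime needs synchronization that is omitted here.

```go
package main

import "fmt"

// g is a hypothetical stand-in for the runtime's goroutine descriptor.
type g struct {
	id          int
	gcscanvalid bool // true while the last stack scan is still valid
}

// rescanList is a hypothetical explicit list of goroutines with dirty stacks.
type rescanList struct {
	dirty []*g
}

// markRunning models a goroutine transitioning to running after a scan.
// Only the first transition after a scan dirties the stack and enqueues it.
func (l *rescanList) markRunning(gp *g) {
	if gp.gcscanvalid {
		gp.gcscanvalid = false
		l.dirty = append(l.dirty, gp)
	}
}

// rescan visits only the dirty goroutines and returns how many it visited.
func (l *rescanList) rescan() int {
	n := len(l.dirty)
	for _, gp := range l.dirty {
		gp.gcscanvalid = true // stand-in for the actual stack scan
	}
	l.dirty = l.dirty[:0]
	return n
}

// simulate creates total already-scanned goroutines, runs the first
// active of them, and returns how many stacks the rescan had to visit.
func simulate(total, active int) int {
	gs := make([]*g, total)
	for i := range gs {
		gs[i] = &g{id: i, gcscanvalid: true}
	}
	var l rescanList
	for _, gp := range gs[:active] {
		l.markRunning(gp)
		l.markRunning(gp) // a second transition must not enqueue again
	}
	return l.rescan()
}

func main() {
	// Work is proportional to the dirty goroutines, not to all 500000.
	fmt.Println(simulate(500000, 3)) // prints 3
}
```

The point of the sketch is that `rescan` never touches a clean goroutine, so its cost depends only on the length of the dirty list.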
    • runtime: don't clear gcscanvalid in casfrom_Gscanstatus · 5b765ce3
      Austin Clements authored
      Currently we clear gcscanvalid in both casgstatus and
      casfrom_Gscanstatus if the new status is _Grunning. This is very
      important to do in casgstatus. However, this is potentially wrong in
      casfrom_Gscanstatus because in this case the caller doesn't own gp and
      hence the write is racy. Unlike the other _Gscan statuses, during
      _Gscanrunning, the G is still running. This does not indicate that
      it's transitioning into a running state. The scan simply hasn't
      happened yet, so it's neither valid nor invalid.
      
      Conveniently, this also means clearing gcscanvalid is unnecessary in
      this case because the G was already in _Grunning, so we can simply
      remove this code. What will happen instead is that the G will be
      preempted to scan itself, that scan will set gcscanvalid to true, and
      then the G will return to _Grunning via casgstatus, clearing
      gcscanvalid.
      
      This fix will become necessary shortly when we start keeping track of
      the set of G's with dirty stacks, since it will no longer be
      idempotent to simply set gcscanvalid to false.
      
      Change-Id: I688c82e6fbf00d5dbbbff49efa66acb99ee86785
      Reviewed-on: https://go-review.googlesource.com/20669
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime: fix typos in comment about gcscanvalid · c707d838
      Austin Clements authored
      Change-Id: Id4ad7ebf88a21eba2bc5714b96570ed5cfaed757
      Reviewed-on: https://go-review.googlesource.com/22210
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime: remove stack barriers during sweep · 9f263c14
      Austin Clements authored
      This adds a best-effort pass to remove stack barriers immediately
      after the end of mark termination. This isn't necessary for the Go
      runtime, but should help external tools, such as GDB, perf, and VTune,
      that perform stack walks but aren't aware of Go's stack barriers.
      (Though clearly they'll still have trouble unwinding stacks during
      mark.)
      
      Change-Id: I66600fae1f03ee36b5459d2b00dcc376269af18e
      Reviewed-on: https://go-review.googlesource.com/20668
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime: remove stack barriers during concurrent mark · 269c969c
      Austin Clements authored
      Currently we remove stack barriers during STW mark termination, which
      has a non-trivial per-goroutine cost and means that we have to touch
      even clean stacks during mark termination. However, there's no problem
      with leaving them in during the sweep phase. They just have to be out
      by the time we install new stack barriers immediately prior to
      scanning the stack, such as during the mark phase of the next GC cycle
      or during mark termination in a STW GC.
      
      Hence, move the gcRemoveStackBarriers from STW mark termination to
      just before we install new stack barriers during concurrent mark. This
      removes the cost from STW. Furthermore, this combined with concurrent
      stack shrinking means that the mark termination scan of a clean stack
      is a complete no-op, which will make it possible to skip clean stacks
      entirely during mark termination.
      
      This has the downside that anything outside of Go that tries to walk
      Go stacks will now be confused by stack barriers all the time instead
      of just some of the time. This includes tools like GDB, perf, and
      VTune. We'll improve the situation shortly.
      
      Change-Id: Ia40baad8f8c16aeefac05425e00b0cf478137097
      Reviewed-on: https://go-review.googlesource.com/20667
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime: avoid span root marking entirely during mark termination · efb0c554
      Austin Clements authored
      Currently we enqueue span root mark jobs during both concurrent mark
      and mark termination, but we make the job a no-op during mark
      termination.
      
      This is silly. Instead of queueing them up just to not do them, don't
      queue them up in the first place.
      
      Change-Id: Ie1d36de884abfb17dd0db6f0449a2b7c997affab
      Reviewed-on: https://go-review.googlesource.com/20666
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime: free dead G stacks concurrently · e8337491
      Austin Clements authored
      Currently we free cached stacks of dead Gs during STW stack root
      marking. We do this during STW because there's no way to take
      ownership of a particular dead G, so attempting to free a dead G's
      stack during concurrent stack root marking could race with reusing
      that G.
      
      However, we can do this concurrently if we take a completely different
      approach. One way to prevent reuse of a dead G is to remove it from
      the free G list. Hence, this adds a new fixed root marking task that
      simply removes all Gs from the list of dead Gs with cached stacks,
      frees their stacks, and then adds them to the list of dead Gs without
      cached stacks.
      
      This is also a necessary step toward rescanning only dirty stacks,
      since it eliminates another task from STW stack marking.
      
      Change-Id: Iefbad03078b284a2e7bf30fba397da4ca87fe095
      Reviewed-on: https://go-review.googlesource.com/20665
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
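The approach in this commit (take dead Gs off the free list so they cannot be reused, free their stacks, then put them on the stack-less list) might look roughly like the sketch below. All names (`g`, `gfreeLists`, `freeCachedStacks`, `count`) are hypothetical, and the locking the real scheduler would need around the list heads is left out.

```go
package main

import "fmt"

// g is a hypothetical dead goroutine descriptor.
type g struct {
	id    int
	stack []byte // nil means no cached stack
	next  *g
}

// gfreeLists holds the two free lists: with and without cached stacks.
type gfreeLists struct {
	withStack *g
	noStack   *g
}

// freeCachedStacks detaches the whole with-stack list, so none of those
// Gs can be handed out for reuse, frees each stack, and then pushes
// each G onto the no-stack list. It returns the number of stacks freed.
func (f *gfreeLists) freeCachedStacks() int {
	list := f.withStack
	f.withStack = nil // Gs on list are now unreachable for reuse
	freed := 0
	for list != nil {
		gp := list
		list = gp.next
		gp.stack = nil // stand-in for returning the stack memory
		gp.next = f.noStack
		f.noStack = gp
		freed++
	}
	return freed
}

// count returns the length of a free list.
func count(list *g) int {
	n := 0
	for ; list != nil; list = list.next {
		n++
	}
	return n
}

func main() {
	f := &gfreeLists{}
	for i := 0; i < 3; i++ {
		f.withStack = &g{id: i, stack: make([]byte, 2048), next: f.withStack}
	}
	f.noStack = &g{id: 3}
	fmt.Println(f.freeCachedStacks(), count(f.withStack), count(f.noStack)) // prints 3 0 4
}
```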
    • runtime: split gfree list into with-stacks and without-stacks · 1a2cf91f
      Austin Clements authored
      Currently all free Gs are added to one list. Split this into two
      lists: one for free Gs with cached stacks and one for Gs without
      cached stacks.
      
      This lets us preferentially allocate Gs that already have a stack, but
      more importantly, it sets us up to free cached G stacks concurrently.
      
      Change-Id: Idbe486f708997e1c9d166662995283f02d1eeb3c
      Reviewed-on: https://go-review.googlesource.com/20664
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
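As an illustration of the split described in this commit, here is a minimal sketch of a two-list free pool that prefers Gs that already have a stack. The types and the `put`/`get` methods are hypothetical and ignore the per-P caching and locking of the real scheduler.

```go
package main

import "fmt"

// g is a hypothetical free goroutine descriptor.
type g struct {
	id       int
	hasStack bool
	next     *g
}

// gfree keeps free Gs on two LIFO lists.
type gfree struct {
	stack   *g // Gs with a cached stack
	noStack *g // Gs without a cached stack
}

// put files gp on the list that matches whether it still has a stack.
func (f *gfree) put(gp *g) {
	if gp.hasStack {
		gp.next = f.stack
		f.stack = gp
		return
	}
	gp.next = f.noStack
	f.noStack = gp
}

// get prefers a G that already has a stack and falls back to one without.
// It returns nil when both lists are empty.
func (f *gfree) get() *g {
	if gp := f.stack; gp != nil {
		f.stack = gp.next
		gp.next = nil
		return gp
	}
	if gp := f.noStack; gp != nil {
		f.noStack = gp.next
		gp.next = nil
		return gp
	}
	return nil
}

func main() {
	var f gfree
	f.put(&g{id: 1})
	f.put(&g{id: 2, hasStack: true})
	f.put(&g{id: 3})
	fmt.Println(f.get().id, f.get().id, f.get().id) // prints 2 3 1
}
```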
    • cmd/compile: a rule's line number is at its -> · 3b0efa68
      Keith Randall authored
      Let's define the line number of a multiline rule as the line
      number on which the -> appears.  This helps make the rule
      cover analysis look a bit nicer.
      
      Change-Id: I4ac4c09f2240285976590ecfd416bc4c05e78946
      Reviewed-on: https://go-review.googlesource.com/22473
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
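The convention defined in this commit can be illustrated with a small helper. This is only a sketch: `arrowLine` and the sample rule text are made up for the example and are not taken from the rule generator.

```go
package main

import (
	"fmt"
	"strings"
)

// arrowLine returns the file line number of the line containing "->"
// for a rule whose first line is at startLine. If the rule has no
// arrow, it falls back to startLine.
func arrowLine(rule string, startLine int) int {
	for i, line := range strings.Split(rule, "\n") {
		if strings.Contains(line, "->") {
			return startLine + i
		}
	}
	return startLine
}

func main() {
	// A hypothetical three-line rule that starts on line 10 of its file.
	rule := "(Add64 x\n  (Const64 [0]))\n  -> x"
	fmt.Println(arrowLine(rule, 10)) // prints 12
}
```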
    • cmd/compile: lazily initialize litbuf · 8d075bee
      Matthew Dempsky authored
      Instead of eagerly creating strings like "literal 2.01" for every
      lexed number in case we need to mention it in an error message, defer
      this work to (*parser).syntax_error.
      
      name      old allocs/op  new allocs/op  delta
      Template      482k ± 0%      482k ± 0%  -0.12%   (p=0.000 n=9+10)
      GoTypes      1.35M ± 0%     1.35M ± 0%  -0.04%  (p=0.015 n=10+10)
      Compiler     5.45M ± 0%     5.44M ± 0%  -0.12%    (p=0.000 n=9+8)
      
      Change-Id: I333b3c80e583864914412fb38f8c0b7f1d8c8821
      Reviewed-on: https://go-review.googlesource.com/22480
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/dist: sort entries in zcgo.go generated file for deterministic build · 19912e1d
      Robert Griesemer authored
      This simplifies comparison of object files across different builds
      by ensuring that the strings in zcgo.go always appear in the
      same order.
      
      Change-Id: I3639ea4fd10e0d645b838d1bbb03cd33deca340e
      Reviewed-on: https://go-review.googlesource.com/22478
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
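Go map iteration order is not specified, so generated output that ranges over a map directly can differ between runs. A minimal sketch of the usual fix, sorting the keys before emitting them, is shown below; the `render` function and the shape of the emitted lines are assumptions for illustration, not the actual cmd/dist code.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// render emits one line per enabled key, always in sorted key order,
// so the generated text is identical from build to build.
func render(enabled map[string]bool) string {
	keys := make([]string, 0, len(enabled))
	for k, ok := range enabled {
		if ok {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "\t%q: true,\n", k)
	}
	return b.String()
}

func main() {
	fmt.Print(render(map[string]bool{
		"linux/amd64":  true,
		"darwin/amd64": true,
		"plan9/386":    false,
	}))
}
```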
    • unicode: improve SimpleFold performance for ascii · e607abbf
      Egon Elbre authored
      This change significantly speeds up case-insensitive regexp matching.
      
      benchmark                      old ns/op      new ns/op      delta
      BenchmarkMatchEasy0i_32-8      2690           1473           -45.24%
      BenchmarkMatchEasy0i_1K-8      80404          42269          -47.43%
      BenchmarkMatchEasy0i_32K-8     3272187        2076118        -36.55%
      BenchmarkMatchEasy0i_1M-8      104805990      66503805       -36.55%
      BenchmarkMatchEasy0i_32M-8     3360192200     2126121600     -36.73%
      
      benchmark                      old MB/s     new MB/s     speedup
      BenchmarkMatchEasy0i_32-8      11.90        21.72        1.83x
      BenchmarkMatchEasy0i_1K-8      12.74        24.23        1.90x
      BenchmarkMatchEasy0i_32K-8     10.01        15.78        1.58x
      BenchmarkMatchEasy0i_1M-8      10.00        15.77        1.58x
      BenchmarkMatchEasy0i_32M-8     9.99         15.78        1.58x
      
      Issue #13288
      
      Change-Id: I94af7bb29e75d60b4f6ee760124867ab271b9642
      Reviewed-on: https://go-review.googlesource.com/16943
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
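One way to give ASCII a fast path is a small lookup table consulted before the general case-folding search. The sketch below is an assumption about the general shape of such a fast path, not a copy of the change: `asciiFold` and `simpleFold` are illustrative names, and the table is built at start-up here for brevity. Note that 'k' and 's' fold to non-ASCII code points, so a plain case flip would be wrong for them.

```go
package main

import (
	"fmt"
	"unicode"
)

// asciiFold holds the SimpleFold result for every ASCII code point.
var asciiFold [unicode.MaxASCII + 1]rune

func init() {
	for r := rune(0); r <= unicode.MaxASCII; r++ {
		switch {
		case r == 'k':
			asciiFold[r] = '\u212A' // KELVIN SIGN follows 'k' in the fold orbit
		case r == 's':
			asciiFold[r] = '\u017F' // LATIN SMALL LETTER LONG S follows 's'
		case 'A' <= r && r <= 'Z':
			asciiFold[r] = r + ('a' - 'A')
		case 'a' <= r && r <= 'z':
			asciiFold[r] = r - ('a' - 'A')
		default:
			asciiFold[r] = r
		}
	}
}

// simpleFold answers ASCII input from the table and defers everything
// else to unicode.SimpleFold.
func simpleFold(r rune) rune {
	if 0 <= r && r <= unicode.MaxASCII {
		return asciiFold[r]
	}
	return unicode.SimpleFold(r)
}

func main() {
	// Cross-check the table against the standard library.
	for r := rune(0); r <= unicode.MaxASCII; r++ {
		if simpleFold(r) != unicode.SimpleFold(r) {
			fmt.Printf("mismatch at %q\n", r)
		}
	}
	fmt.Printf("%c %c\n", simpleFold('A'), simpleFold('a')) // prints a A
}
```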
    • gc: use AbsFileLine for deterministic binary export data · 6e4a8615
      Alan Donovan authored
      This version of the file name honors the -trimpath flag,
      which strips off variable parts like $WORK or $PWD.
      The TestCgoConsistentResults test now passes.
      
      Change-Id: If93980b054f9b13582dd314f9d082c26eaac4f41
      Reviewed-on: https://go-review.googlesource.com/22444
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • cmd/compile: don't discard inlineable but empty functions with binary export format · 17db07f9
      Robert Griesemer authored
      Change-Id: I0f016fa000f949d27847d645b4cdebe68a8abf20
      Reviewed-on: https://go-review.googlesource.com/22474
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/link: pass -no-pie (if supported) when creating a race-enabled executable. · 3a72d626
      Michael Hudson-Doyle authored
      Fixes #15443
      
      Change-Id: Ia3593104fc1a4255926ae5675c25390604b44b7b
      Reviewed-on: https://go-review.googlesource.com/22453
      Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • cmd/link: fix gdb backtrace on architectures using a link register · 55154cf0
      Michael Munday authored
      Also adds TestGdbBacktrace to the runtime package.
      
      DWARF modifications written by Bryan Chan (@bryanpkc), who is also
      at IBM and covered by the same CLA.
      
      Fixes #14628
      
      Change-Id: I106a1f704c3745a31f29cdadb0032e3905829850
      Reviewed-on: https://go-review.googlesource.com/20193
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile/internal/gc: rewrite comment to avoid automated meaning · 01d5e63f
      Russ Cox authored
      The comment says 'DΟ NΟT SUBMIT', and that text being in a file can cause
      automated errors or warnings when trying to check the Go sources into other
      source control systems.
      
      (We reject that string in CL commit messages, which I've avoided here
      by changing the O's to Ο's above.)
      
      Change-Id: I6cdd57a8612ded5208f05a8bd6b137f44424a030
      Reviewed-on: https://go-review.googlesource.com/22434
      Run-TryBot: Russ Cox <rsc@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • cmd/compile: more sanity checks on rewrite rules · e4355aee
      Keith Randall authored
      Make sure ops have the right number of args, set
      aux and auxint only if allowed, etc.
      
      Normalize error reporting format.
      
      Change-Id: Ie545fcc5990c8c7d62d40d9a0a55885f941eb645
      Reviewed-on: https://go-review.googlesource.com/22320
      Reviewed-by: David Chase <drchase@google.com>
    • crypto/sha512: add s390x assembly implementation · 24a29728
      Michael Munday authored
      Renames block to blockGeneric so that it can be called when the
      assembly feature check fails. This means making block a var on
      platforms without an assembly implementation (similar to the sha1
      package).
      
      Also adds a test to check that the fallback path works correctly
      when the feature check fails.
      
      name        old speed      new speed       delta
      Hash8Bytes  7.13MB/s ± 2%  19.89MB/s ± 1%  +178.82%   (p=0.000 n=9+10)
      Hash1K       121MB/s ± 1%    661MB/s ± 1%  +444.54%   (p=0.000 n=10+9)
      Hash8K       137MB/s ± 0%    918MB/s ± 1%  +569.29%  (p=0.000 n=10+10)
      
      Change-Id: Id65dd6e943f14eeffe39a904dc88065fc6a60179
      Reviewed-on: https://go-review.googlesource.com/22402
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Michael Munday <munday@ca.ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
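The fallback arrangement described in this commit can be sketched in a single file as follows. Everything here is a toy: `blockGeneric` is not SHA-512, `blockAccel` merely stands in for an assembly routine, and `hasAccel` is a hypothetical feature flag. The real package would split these pieces across build-tagged files.

```go
package main

import "fmt"

// digest is a toy hash state.
type digest struct{ state uint64 }

// blockGeneric is a toy stand-in for the portable block function.
// It is NOT SHA-512; it only gives the example something to compute.
func blockGeneric(d *digest, p []byte) {
	for _, b := range p {
		d.state = d.state*31 + uint64(b)
	}
}

// accelCalls counts uses of the accelerated path.
var accelCalls int

// blockAccel stands in for an assembly implementation. It reuses the
// generic logic here so both paths produce the same result.
func blockAccel(d *digest, p []byte) {
	accelCalls++
	blockGeneric(d, p)
}

// hasAccel is a hypothetical result of a CPU feature check.
var hasAccel = false

// block dispatches to the accelerated routine when the feature check
// passes and falls back to blockGeneric otherwise.
func block(d *digest, p []byte) {
	if hasAccel {
		blockAccel(d, p)
		return
	}
	blockGeneric(d, p)
}

func main() {
	var fast, slow digest
	hasAccel = true
	block(&fast, []byte("abc"))
	hasAccel = false // force the fallback path, as a test would
	block(&slow, []byte("abc"))
	fmt.Println(fast.state == slow.state, accelCalls) // prints true 1
}
```

Keeping the generic function under its own name is what lets a test force the fallback path and compare it against the accelerated one.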
    • net: ignore lame referral responses like libresolv · 98b99d56
      Matthew Dempsky authored
      Fixes #15434.
      
      Change-Id: Ia88b740df5418a6d3af1c29a03756f4234f388b0
      Reviewed-on: https://go-review.googlesource.com/22428
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/link: correctly decode name length · 96b8f70e
      David Crawshaw authored
      The linker was incorrectly decoding type name lengths, causing
      typelinks to be sorted out of order and, in cases where the name was
      exactly the right length, linker panics.
      
      Added a test to the reflect package that causes TestTypelinksSorted
      to fail before this CL. It's not the exact failure seen in #15448
      but it has the same cause: decodetype_name calculating the wrong
      length.
      
      The equivalent decoders in reflect/type.go and runtime/type.go
      have the parentheses in the right place.
      
      Fixes #15448
      
      Change-Id: I33257633d812b7d2091393cb9d6cc8a73e0138c8
      Reviewed-on: https://go-review.googlesource.com/22403
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: Russ Cox <rsc@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
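A misplaced parenthesis in a two-byte length decode is easy to reproduce. The sketch below assumes a layout of one flag byte followed by a big-endian 16-bit length; that layout and both function names are assumptions for illustration rather than the linker's actual code. In the wrong version the shift is applied to a `byte`, so the high bits are discarded before the conversion to `uint16`.

```go
package main

import "fmt"

// nameLenWrong shifts the byte before widening it. A byte shifted left
// by 8 is always 0, so the high byte of the length is lost.
func nameLenWrong(data []byte) int {
	return int(uint16(data[1]<<8) | uint16(data[2]))
}

// nameLenRight widens to uint16 first and then shifts.
func nameLenRight(data []byte) int {
	return int(uint16(data[1])<<8 | uint16(data[2]))
}

func main() {
	// Hypothetical encoding: flag byte, then length 0x0102 = 258.
	data := []byte{0x00, 0x01, 0x02}
	fmt.Println(nameLenWrong(data), nameLenRight(data)) // prints 2 258
}
```

Both versions agree for names shorter than 256 bytes, which is why a bug of this shape can go unnoticed until a long enough name appears.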
    • cmd/compile: fix another bug in dominator computation · 0b6332eb
      David Chase authored
      Here, "fix" means "replace".  The new dominator computation
      is the "simple" algorithm from Lengauer and Tarjan's TOPLAS
      paper, with minimal changes.
      
      Also included is a test that tweaks the fixed error.
      
      Change-Id: I0abdf53d5d64df1e67e4e62f55e88957045cd63b
      Reviewed-on: https://go-review.googlesource.com/22401
      Run-TryBot: David Chase <drchase@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • strings: use SSE4.2 in strings.Index on AMD64 · 6b02a192
      Ilya Tocar authored
      Use PCMPESTRI instruction if available.
      
      name          old time/op  new time/op  delta
      Index-4       21.1ns ± 0%  21.1ns ± 0%     ~     (all samples are equal)
      IndexHard1-4   395µs ± 0%   105µs ± 0%  -73.53%  (p=0.000 n=19+20)
      IndexHard2-4   300µs ± 0%   147µs ± 0%  -51.11%  (p=0.000 n=19+20)
      IndexHard3-4   665µs ± 0%   665µs ± 0%     ~     (p=0.942 n=16+19)
      
      Change-Id: I4f66794164740a2b939eb1c78934e2390b489064
      Reviewed-on: https://go-review.googlesource.com/22337
      Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Russ Cox <rsc@golang.org>
  3. 25 Apr, 2016 9 commits