1. 23 Apr, 2016 2 commits
      runtime: use per-goroutine sequence numbers in tracer · a3703618
      Dmitry Vyukov authored
      Currently the tracer uses a global sequencer, which introduces
      a significant slowdown on parallel machines (up to 10x).
      Replace the global sequencer with per-goroutine sequencers.
      
      If we assign per-goroutine sequence numbers to only 3 types
      of events (start, unblock and syscall exit), that is enough to
      restore a consistent partial ordering of all events. Even these
      events don't need sequence numbers all the time (if a goroutine
      starts on the same P where it was unblocked, then the start does
      not need a sequence number).
      The burden of restoring the order is put on the trace parser.
      Details of the algorithm are described in the comments.
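      As a rough sketch of the parser's side (hypothetical names, not the
      runtime's actual data structures): a goroutine's events can be put back
      in order using their own sequence numbers, without trusting possibly
      skewed cross-P timestamps:

```go
package main

import (
	"fmt"
	"sort"
)

// Hypothetical illustration; names are not the runtime's. Only a few
// event types (start, unblock, syscall exit) carry a per-goroutine
// sequence number, and the parser uses it to reorder one goroutine's
// events regardless of which P's buffer they were flushed from.
type event struct {
	seq int    // per-goroutine sequence number
	typ string // "start", "unblock", "syscallexit"
}

// orderBySeq restores a goroutine's event order from sequence numbers.
func orderBySeq(evs []event) []event {
	out := append([]event(nil), evs...)
	sort.Slice(out, func(i, j int) bool { return out[i].seq < out[j].seq })
	return out
}

func main() {
	// Events may arrive out of order because each P flushes its own buffer.
	evs := []event{{2, "unblock"}, {1, "start"}, {3, "start"}}
	for _, e := range orderBySeq(evs) {
		fmt.Println(e.seq, e.typ)
	}
}
```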
      
      On http benchmark with GOMAXPROCS=48:
      no tracing: 5026 ns/op
      tracing: 27803 ns/op (+453%)
      with this change: 6369 ns/op (+26%, mostly for traceback)
      
      Also trace size is reduced by ~22%. Average event size before: 4.63
      bytes/event, after: 3.62 bytes/event.
      
      Besides running the trace tests, I've also tested with manually broken
      cputicks (random skew for each event, per-P skew, and episodic random
      skew). In all cases the broken timestamps were detected and there were
      no test failures.
      
      Change-Id: I078bde421ccc386a66f6c2051ab207bcd5613efa
      Reviewed-on: https://go-review.googlesource.com/21512
      Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      doc: mention security from contribution guidelines · ba966f5d
      Francesc Campoy authored
      Fixes #15413
      
      Change-Id: I837a391276eed565cf66d3715ec68b7b959ce143
      Reviewed-on: https://go-review.googlesource.com/22390
      Reviewed-by: Andrew Gerrand <adg@golang.org>
  2. 22 Apr, 2016 26 commits
  3. 21 Apr, 2016 12 commits
      flag: update test case (fix build) · 70184087
      Robert Griesemer authored
      Change-Id: I2275dc703be4fda3feedf76483148eab853b43b8
      Reviewed-on: https://go-review.googlesource.com/22360
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Rob Pike <r@golang.org>
      cmd/link: convert Link.Filesyms into a slice · 25d95ee9
      Michael Hudson-Doyle authored
      Change-Id: I6490de325b0f4ba962c679503102d30d41dcc384
      Reviewed-on: https://go-review.googlesource.com/22359
      Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      cmd/link: fix Codeblk printing when -a to use Textp as a slice · 4b175fd2
      Michael Hudson-Doyle authored
      Does anyone actually pass -a to the linker?
      
      Change-Id: I1d31ea66aa5604b7fd42adf15bdab71e9f52d0ed
      Reviewed-on: https://go-review.googlesource.com/22356
      Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Crawshaw <crawshaw@golang.org>
      doc/go1.7.txt: 0s for zero duration, go doc groups constructors with types · 0ef041cf
      Rob Pike authored
      Change-Id: I4fc35649ff5a3510f5667b62e7e84e113e95dffe
      Reviewed-on: https://go-review.googlesource.com/22358
      Reviewed-by: Rob Pike <r@golang.org>
      time: print zero duration as 0s, not 0 · 9c4295b5
      Rob Pike authored
      There should be a unit, and s is the SI unit name, so use that.
      The other obvious possibility is ns (nanosecond), but the fact
      that durations are measured in nanoseconds is an internal detail.
      
      Fixes #14058.
      
      Change-Id: Id1f8f3c77088224d9f7cd643778713d5cc3be5d9
      Reviewed-on: https://go-review.googlesource.com/22357
      Reviewed-by: Robert Griesemer <gri@golang.org>
      cmd/doc: group constructors with type in package presentation · a33e9cf7
      Rob Pike authored
      Fixes #14004.
      
      $ go doc encoding.gob
      Before:
      func Register(value interface{})
      func RegisterName(name string, value interface{})
      func NewDecoder(r io.Reader) *Decoder
      func NewEncoder(w io.Writer) *Encoder
      type CommonType struct { ... }
      type Decoder struct { ... }
      type Encoder struct { ... }
      type GobDecoder interface { ... }
      type GobEncoder interface { ... }
      
      After:
      func Register(value interface{})
      func RegisterName(name string, value interface{})
      type CommonType struct { ... }
      type Decoder struct { ... }
          func NewDecoder(r io.Reader) *Decoder
      type Encoder struct { ... }
          func NewEncoder(w io.Writer) *Encoder
      type GobDecoder interface { ... }
      type GobEncoder interface { ... }
      
      Change-Id: I021db25bce4a16b3dfa22ab323ca1f4e68d50111
      Reviewed-on: https://go-review.googlesource.com/22354
      Reviewed-by: Robert Griesemer <gri@golang.org>
      cmd/compile: Use pre-regalloc value ID in lateSpillUse · 8ad8d7d8
      Keith Randall authored
      The cached copy's ID is sometimes outside the bounds of the orig array.
      
      There's no reason to start at the cached copy and work backwards
      to the original value. We already have the original value ID at
      all the callsites.
      
      Fixes the noopt build.
      
      Change-Id: I313508a1917e838a87e8cc83b2ef3c2e4a8db304
      Reviewed-on: https://go-review.googlesource.com/22355
      Run-TryBot: Keith Randall <khr@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      cmd/compile: split TSLICE into separate Type kind · 40f1d0ca
      Matthew Dempsky authored
      Instead of using TARRAY for both arrays and slices, create a new
      TSLICE kind to handle slices.
      
      Also, get rid of the "DDDArray" distinction. While kinda ugly, it
      seems likely we'll need to defer evaluating the constant bounds
      expressions for golang.org/issue/13890.
      
      Passes toolstash/buildall.
      
      Change-Id: I8e45d4900e7df3a04cce59428ec8b38035d3cc3a
      Reviewed-on: https://go-review.googlesource.com/22329
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      spec: fix incorrect comment in shift example · 5213cd70
      Robert Griesemer authored
      - adjusted example code
      - fixed comments
      
      Fixes #14785.
      
      Change-Id: Ia757dc93b0a69b8408559885ece7f3685a37daaa
      Reviewed-on: https://go-review.googlesource.com/22353
      Reviewed-by: Rob Pike <r@golang.org>
      runtime: eliminate floating garbage estimate · c8bd293e
      Austin Clements authored
      Currently when we compute the trigger for the next GC, we do it based
      on an estimate of the reachable heap size at the start of the GC
      cycle, which is itself based on an estimate of the floating garbage.
      This was introduced by 4655aadd to fix a bad feedback loop that allowed
      the heap to grow to many times the true reachable size.
      
      However, this estimate gets easily confused by rapidly allocating
      applications and, worse, it differs from the heap size the trigger
      controller uses to compute the trigger itself. This results in the
      trigger controller often thinking that GC finished before it started.
      Since this would be a pretty great outcome from its perspective, it
      sets the trigger for the next cycle as close to the next goal as
      possible (which is limited to 95% of the goal).
      
      Furthermore, the bad feedback loop this estimate originally fixed
      seems not to happen any more, suggesting it was fixed more correctly
      by some other change in the meantime. Finally, with the change to
      allocate black, it shouldn't even be theoretically possible for this
      bad feedback loop to occur.
      
      Hence, eliminate the floating garbage estimate and simply consider the
      reachable heap to be the marked heap. This harms overall throughput
      slightly for allocation-heavy benchmarks, but significantly improves
      mutator availability.
      
      Fixes #12204. This brings the average trigger in this benchmark from
      0.95 (the cap) to 0.7 and the active GC utilization from ~90% to ~45%.
      
      Updates #14951. This makes the trigger controller much better behaved,
      so it pulls the trigger lower if assists are consuming a lot of CPU
      like it's supposed to, increasing mutator availability.
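      As a simplified sketch of the pacing described above (illustrative
      names and constants, not the runtime's actual code): with the estimate
      gone, the reachable heap is simply the heap marked by the previous
      cycle, and the next trigger is derived from it and capped at 95% of
      the goal:

```go
package main

import "fmt"

// nextTrigger is an illustrative sketch, not the runtime's code.
// markedHeap is the heap marked by the previous GC cycle, used
// directly as the reachable-heap size.
func nextTrigger(markedHeap uint64, triggerRatio float64, gogc uint64) (trigger, goal uint64) {
	// The goal grows the marked heap by GOGC percent.
	goal = markedHeap + markedHeap*gogc/100
	// The trigger ratio is what the feedback controller adjusts.
	trigger = uint64(float64(markedHeap) * (1 + triggerRatio))
	// Cap the trigger at 95% of the goal so marking always starts
	// before the heap reaches its target size.
	if max := goal * 95 / 100; trigger > max {
		trigger = max
	}
	return trigger, goal
}

func main() {
	trigger, goal := nextTrigger(100<<20, 0.5, 100)
	fmt.Printf("trigger=%d goal=%d\n", trigger, goal)
}
```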
      
      name              old time/op  new time/op  delta
      XBenchGarbage-12  2.21ms ± 1%  2.28ms ± 3%  +3.29%  (p=0.000 n=17+17)
      
      Some of this slow down we paid for in earlier commits. Relative to the
      start of the series to switch to allocate-black (the parent of "count
      black allocations toward scan work"), the garbage benchmark is 2.62%
      slower.
      
      name                      old time/op    new time/op    delta
      BinaryTree17-12              2.53s ± 3%     2.53s ± 3%    ~     (p=0.708 n=20+19)
      Fannkuch11-12                2.08s ± 0%     2.08s ± 0%  -0.22%  (p=0.002 n=19+18)
      FmtFprintfEmpty-12          45.3ns ± 2%    45.2ns ± 3%    ~     (p=0.505 n=20+20)
      FmtFprintfString-12          129ns ± 0%     131ns ± 2%  +1.80%  (p=0.000 n=16+19)
      FmtFprintfInt-12             121ns ± 2%     121ns ± 2%    ~     (p=0.768 n=19+19)
      FmtFprintfIntInt-12          186ns ± 1%     188ns ± 3%  +0.99%  (p=0.000 n=19+19)
      FmtFprintfPrefixedInt-12     188ns ± 1%     188ns ± 1%    ~     (p=0.947 n=18+16)
      FmtFprintfFloat-12           254ns ± 1%     255ns ± 1%  +0.30%  (p=0.002 n=19+17)
      FmtManyArgs-12               763ns ± 0%     770ns ± 0%  +0.92%  (p=0.000 n=18+18)
      GobDecode-12                7.00ms ± 1%    7.04ms ± 1%  +0.61%  (p=0.049 n=20+20)
      GobEncode-12                5.88ms ± 1%    5.88ms ± 0%    ~     (p=0.641 n=18+19)
      Gzip-12                      214ms ± 1%     215ms ± 1%  +0.43%  (p=0.002 n=18+19)
      Gunzip-12                   37.6ms ± 0%    37.6ms ± 0%  +0.11%  (p=0.015 n=17+18)
      HTTPClientServer-12         76.9µs ± 2%    78.1µs ± 2%  +1.44%  (p=0.000 n=20+18)
      JSONEncode-12               15.2ms ± 2%    15.1ms ± 1%    ~     (p=0.271 n=19+18)
      JSONDecode-12               53.1ms ± 1%    53.3ms ± 0%  +0.49%  (p=0.000 n=18+19)
      Mandelbrot200-12            4.04ms ± 1%    4.03ms ± 0%  -0.33%  (p=0.005 n=18+18)
      GoParse-12                  3.29ms ± 1%    3.28ms ± 1%    ~     (p=0.146 n=16+17)
      RegexpMatchEasy0_32-12      69.9ns ± 3%    69.5ns ± 1%    ~     (p=0.785 n=20+19)
      RegexpMatchEasy0_1K-12       237ns ± 0%     237ns ± 0%    ~     (p=1.000 n=18+18)
      RegexpMatchEasy1_32-12      69.5ns ± 1%    69.2ns ± 1%  -0.44%  (p=0.020 n=16+19)
      RegexpMatchEasy1_1K-12       372ns ± 1%     371ns ± 2%    ~     (p=0.086 n=20+19)
      RegexpMatchMedium_32-12      108ns ± 3%     107ns ± 1%  -1.00%  (p=0.004 n=19+14)
      RegexpMatchMedium_1K-12     34.2µs ± 4%    34.0µs ± 2%    ~     (p=0.380 n=19+20)
      RegexpMatchHard_32-12       1.77µs ± 4%    1.76µs ± 3%    ~     (p=0.558 n=18+20)
      RegexpMatchHard_1K-12       53.4µs ± 4%    52.8µs ± 2%  -1.10%  (p=0.020 n=18+20)
      Revcomp-12                   359ms ± 4%     377ms ± 0%  +5.19%  (p=0.000 n=20+18)
      Template-12                 63.7ms ± 2%    62.9ms ± 2%  -1.27%  (p=0.005 n=18+20)
      TimeParse-12                 316ns ± 2%     313ns ± 1%    ~     (p=0.059 n=20+16)
      TimeFormat-12                329ns ± 0%     331ns ± 0%  +0.39%  (p=0.000 n=16+18)
      [Geo mean]                  51.6µs         51.7µs       +0.18%
      
      Change-Id: I1dce4640c8205d41717943b021039fffea863c57
      Reviewed-on: https://go-review.googlesource.com/21324
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      runtime: allocate black during GC · 6002e01e
      Austin Clements authored
      Currently we allocate white for most of concurrent marking. This is
      based on the classical argument that it produces less floating
      garbage, since allocations during GC may not get linked into the heap
      and allocating white lets us reclaim these. However, it's not clear
      how often this actually happens, especially since our write barrier
      shades any pointer as soon as it's installed in the heap regardless of
      the color of the slot.
      
      On the other hand, allocating black has several advantages that seem
      to significantly outweigh this downside.
      
      1) It naturally bounds the total scan work to the live heap size at
      the start of a GC cycle. Allocating white does not, and thus depends
      entirely on assists to prevent the heap from growing faster than it
      can be scanned.
      
      2) It reduces the total amount of scan work per GC cycle by the size
      of newly allocated objects that are linked into the heap graph, since
      objects allocated black never need to be scanned.
      
      3) It reduces total write barrier work since more objects will already
      be black when they are linked into the heap graph.
      
      This gives a slight overall improvement in benchmarks.
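      A minimal sketch of the tri-color idea behind point (2) (illustrative
      types, not runtime code): an object allocated black during marking is
      never a candidate for scanning or reclamation in that cycle:

```go
package main

import "fmt"

type color int

const (
	white color = iota // not yet reached by the collector
	grey               // reached, fields not yet scanned
	black              // reached and fully scanned
)

type object struct {
	c    color
	refs []*object
}

// allocate models allocation while GC is (or is not) marking.
// Allocating black means the new object contributes no scan work
// and cannot be reclaimed this cycle.
func allocate(gcMarking bool) *object {
	if gcMarking {
		return &object{c: black}
	}
	return &object{c: white}
}

func main() {
	obj := allocate(true)
	fmt.Println(obj.c == black) // allocated black during marking
}
```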
      
      name              old time/op  new time/op  delta
      XBenchGarbage-12  2.24ms ± 0%  2.21ms ± 1%  -1.32%  (p=0.000 n=18+17)
      
      name                      old time/op    new time/op    delta
      BinaryTree17-12              2.60s ± 3%     2.53s ± 3%  -2.56%  (p=0.000 n=20+20)
      Fannkuch11-12                2.08s ± 1%     2.08s ± 0%    ~     (p=0.452 n=19+19)
      FmtFprintfEmpty-12          45.1ns ± 2%    45.3ns ± 2%    ~     (p=0.367 n=19+20)
      FmtFprintfString-12          131ns ± 3%     129ns ± 0%  -1.60%  (p=0.000 n=20+16)
      FmtFprintfInt-12             122ns ± 0%     121ns ± 2%  -0.86%  (p=0.000 n=16+19)
      FmtFprintfIntInt-12          187ns ± 1%     186ns ± 1%    ~     (p=0.514 n=18+19)
      FmtFprintfPrefixedInt-12     189ns ± 0%     188ns ± 1%  -0.54%  (p=0.000 n=16+18)
      FmtFprintfFloat-12           256ns ± 0%     254ns ± 1%  -0.43%  (p=0.000 n=17+19)
      FmtManyArgs-12               769ns ± 0%     763ns ± 0%  -0.72%  (p=0.000 n=18+18)
      GobDecode-12                7.08ms ± 2%    7.00ms ± 1%  -1.22%  (p=0.000 n=20+20)
      GobEncode-12                5.88ms ± 0%    5.88ms ± 1%    ~     (p=0.406 n=18+18)
      Gzip-12                      214ms ± 0%     214ms ± 1%    ~     (p=0.103 n=17+18)
      Gunzip-12                   37.6ms ± 0%    37.6ms ± 0%    ~     (p=0.563 n=17+17)
      HTTPClientServer-12         77.2µs ± 3%    76.9µs ± 2%    ~     (p=0.606 n=20+20)
      JSONEncode-12               15.1ms ± 1%    15.2ms ± 2%    ~     (p=0.138 n=19+19)
      JSONDecode-12               53.3ms ± 1%    53.1ms ± 1%  -0.33%  (p=0.000 n=19+18)
      Mandelbrot200-12            4.04ms ± 1%    4.04ms ± 1%    ~     (p=0.075 n=19+18)
      GoParse-12                  3.30ms ± 1%    3.29ms ± 1%  -0.57%  (p=0.000 n=18+16)
      RegexpMatchEasy0_32-12      69.5ns ± 1%    69.9ns ± 3%    ~     (p=0.822 n=18+20)
      RegexpMatchEasy0_1K-12       237ns ± 1%     237ns ± 0%    ~     (p=0.398 n=19+18)
      RegexpMatchEasy1_32-12      69.8ns ± 2%    69.5ns ± 1%    ~     (p=0.090 n=20+16)
      RegexpMatchEasy1_1K-12       371ns ± 1%     372ns ± 1%    ~     (p=0.178 n=19+20)
      RegexpMatchMedium_32-12      108ns ± 2%     108ns ± 3%    ~     (p=0.124 n=20+19)
      RegexpMatchMedium_1K-12     33.9µs ± 2%    34.2µs ± 4%    ~     (p=0.309 n=20+19)
      RegexpMatchHard_32-12       1.75µs ± 2%    1.77µs ± 4%  +1.28%  (p=0.018 n=19+18)
      RegexpMatchHard_1K-12       52.7µs ± 1%    53.4µs ± 4%  +1.23%  (p=0.013 n=15+18)
      Revcomp-12                   354ms ± 1%     359ms ± 4%  +1.27%  (p=0.043 n=20+20)
      Template-12                 63.6ms ± 2%    63.7ms ± 2%    ~     (p=0.654 n=20+18)
      TimeParse-12                 313ns ± 1%     316ns ± 2%  +0.80%  (p=0.014 n=17+20)
      TimeFormat-12                332ns ± 0%     329ns ± 0%  -0.66%  (p=0.000 n=16+16)
      [Geo mean]                  51.7µs         51.6µs       -0.09%
      
      Change-Id: I2214a6a0e4f544699ea166073249a8efdf080dc0
      Reviewed-on: https://go-review.googlesource.com/21323
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      runtime: simplify/optimize allocate-black a bit · 64a26b79
      Austin Clements authored
      Currently allocating black switches to the system stack (which is
      probably a historical accident) and atomically updates the global
      bytes marked stat. Since we're about to depend on this much more,
      optimize it a bit by putting it back on the regular stack and updating
      the per-P bytes marked stat, which gets lazily folded into the global
      bytes marked stat.
      
      Change-Id: Ibbe16e5382d3fd2256e4381f88af342bf7020b04
      Reviewed-on: https://go-review.googlesource.com/22170
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>