03 Jun, 2015 (33 commits)
  02 Jun, 2015 (7 commits)
    • text/template: clarify the documentation around template definitions · f9ed2f75
      Rob Pike authored
      Due to the requirements of parsing template definitions that mention
      other templates that are not yet defined, a Template can be in two states:
      defined and undefined. Thus, although one calls New, the resulting
      template has no definition even though it exists as a data structure.
      
      Thus Lookup, for example, will return nil for a template that is named
      but not yet defined.
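
      As a minimal sketch of the resulting behavior (ordinary use of the
      text/template API, not code from this change):

      package main

      import (
          "fmt"
          "text/template"
      )

      func main() {
          // New creates the template as a data structure, but it has no
          // definition until Parse (or similar) succeeds.
          root := template.New("root")
          fmt.Println(root.Lookup("root")) // <nil>: named but not yet defined

          // Parse supplies a definition; Lookup now finds the templates.
          template.Must(root.Parse(`{{define "helper"}}hi{{end}}main`))
          fmt.Println(root.Lookup("root") != nil)   // true
          fmt.Println(root.Lookup("helper") != nil) // true, defined via {{define}}
      }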
      
      Fixes #10910
      Fixes #10926
      
      Clarify the documentation a little to explain this. Also tidy up the
      code a little and remove a spurious call to init.
      
      Change-Id: I22cc083291500bca424e83dc12807e0de7b00b7a
      Reviewed-on: https://go-review.googlesource.com/10641
      Reviewed-by: Andrew Gerrand <adg@golang.org>
    • runtime: implement GC stack barriers · faa7a7e8
      Austin Clements authored
      This commit implements stack barriers to minimize the amount of
      stack re-scanning that must be done during mark termination.
      
      Currently the GC scans stacks of active goroutines twice during every
      GC cycle: once at the beginning during root discovery and once at the
      end during mark termination. The second scan happens while the world
      is stopped and guarantees that we've seen all of the roots (since
      there are no write barriers on writes to local stack
      variables). However, this means pause time is proportional to stack
      size. In particularly recursive programs, this can drive pause time up
      past our 10ms goal (e.g., it takes about 150ms to scan a 50MB stack).
      
      Re-scanning the entire stack is rarely necessary, especially for large
      stacks, because usually most of the frames on the stack were not
      active between the first and second scans and hence any changes to
      these frames (via non-escaping pointers passed down the stack) were
      tracked by write barriers.
      
      To efficiently track how far a stack has been unwound since the first
      scan (and, hence, how much needs to be re-scanned), this commit
      introduces stack barriers. During the first scan, at exponentially
      spaced points in each stack, the scan overwrites return PCs with the
      PC of the stack barrier function. When "returned" to, the stack
      barrier function records how far the stack has unwound and jumps to
      the original return PC for that point in the stack. Then the second
      scan only needs to proceed as far as the lowest barrier that hasn't
      been hit.
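
      As a toy model of the mechanism (purely illustrative; the real code
      overwrites actual return PCs on goroutine stacks inside the runtime):

      package main

      import "fmt"

      const barrierPC = ^uintptr(0) // stand-in PC for the barrier function

      // toyStack models saved return PCs, with frame 0 innermost.
      type toyStack struct {
          retPC   []uintptr
          origPC  map[int]uintptr // bookkeeping: frame index -> original PC
          unwound int             // deepest frame returned through so far
      }

      // scan plays the first GC scan: it installs barriers at
      // exponentially spaced frames by overwriting their return PCs.
      func (s *toyStack) scan() {
          s.origPC = map[int]uintptr{}
          for i := 1; i < len(s.retPC); i *= 2 {
              s.origPC[i] = s.retPC[i]
              s.retPC[i] = barrierPC
          }
      }

      // ret simulates frame i returning. A hit barrier records how far
      // the stack has unwound, then jumps to the original return PC.
      func (s *toyStack) ret(i int) uintptr {
          pc := s.retPC[i]
          if pc == barrierPC {
              if i > s.unwound {
                  s.unwound = i
              }
              pc = s.origPC[i]
          }
          return pc // "return" to pc
      }

      // rescanTo reports how far mark termination must rescan: only up
      // to the lowest barrier that hasn't been hit.
      func (s *toyStack) rescanTo() int {
          for i := 1; i < len(s.retPC); i *= 2 {
              if i > s.unwound {
                  return i
              }
          }
          return len(s.retPC)
      }

      func main() {
          s := &toyStack{retPC: make([]uintptr, 64)}
          for i := range s.retPC {
              s.retPC[i] = uintptr(0x1000 + i) // fake return PCs
          }
          s.scan()
          for i := 0; i < 10; i++ { // frames 0..9 return after the scan
              s.ret(i)
          }
          fmt.Println("second scan covers frames 0 ..", s.rescanTo()) // 16
      }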
      
      For deeply recursive programs, this substantially reduces mark
      termination time (and hence pause time). For the goscheme example
      linked in issue #10898, prior to this change, mark termination times
      were typically between 100 and 500ms; with this change, mark
      termination times are typically between 10 and 20ms. As a result of
      the reduced stack scanning work, this reduces overall execution time
      of the goscheme example by 20%.
      
      Fixes #10898.
      
      The effect of this on programs that are not deeply recursive is
      minimal:
      
      name                   old time/op    new time/op    delta
      BinaryTree17              3.16s ± 2%     3.26s ± 1%  +3.31%  (p=0.000 n=19+19)
      Fannkuch11                2.42s ± 1%     2.48s ± 1%  +2.24%  (p=0.000 n=17+19)
      FmtFprintfEmpty          50.0ns ± 3%    49.8ns ± 1%    ~     (p=0.534 n=20+19)
      FmtFprintfString          173ns ± 0%     175ns ± 0%  +1.49%  (p=0.000 n=16+19)
      FmtFprintfInt             170ns ± 1%     175ns ± 1%  +2.97%  (p=0.000 n=20+19)
      FmtFprintfIntInt          288ns ± 0%     295ns ± 0%  +2.73%  (p=0.000 n=16+19)
      FmtFprintfPrefixedInt     242ns ± 1%     252ns ± 1%  +4.13%  (p=0.000 n=18+18)
      FmtFprintfFloat           324ns ± 0%     323ns ± 0%  -0.36%  (p=0.000 n=20+19)
      FmtManyArgs              1.14µs ± 0%    1.12µs ± 1%  -1.01%  (p=0.000 n=18+19)
      GobDecode                8.88ms ± 1%    8.87ms ± 0%    ~     (p=0.480 n=19+18)
      GobEncode                6.80ms ± 1%    6.85ms ± 0%  +0.82%  (p=0.000 n=20+18)
      Gzip                      363ms ± 1%     363ms ± 1%    ~     (p=0.077 n=18+20)
      Gunzip                   90.6ms ± 0%    90.0ms ± 1%  -0.71%  (p=0.000 n=17+18)
      HTTPClientServer         51.5µs ± 1%    50.8µs ± 1%  -1.32%  (p=0.000 n=18+18)
      JSONEncode               17.0ms ± 0%    17.1ms ± 0%  +0.40%  (p=0.000 n=18+17)
      JSONDecode               61.8ms ± 0%    63.8ms ± 1%  +3.11%  (p=0.000 n=18+17)
      Mandelbrot200            3.84ms ± 0%    3.84ms ± 1%    ~     (p=0.583 n=19+19)
      GoParse                  3.71ms ± 1%    3.72ms ± 1%    ~     (p=0.159 n=18+19)
      RegexpMatchEasy0_32       100ns ± 0%     100ns ± 1%  -0.19%  (p=0.033 n=17+19)
      RegexpMatchEasy0_1K       342ns ± 1%     331ns ± 0%  -3.41%  (p=0.000 n=19+19)
      RegexpMatchEasy1_32      82.5ns ± 0%    81.7ns ± 0%  -0.98%  (p=0.000 n=18+18)
      RegexpMatchEasy1_1K       505ns ± 0%     494ns ± 1%  -2.16%  (p=0.000 n=18+18)
      RegexpMatchMedium_32      137ns ± 1%     137ns ± 1%  -0.24%  (p=0.048 n=20+18)
      RegexpMatchMedium_1K     41.6µs ± 0%    41.3µs ± 1%  -0.57%  (p=0.004 n=18+20)
      RegexpMatchHard_32       2.11µs ± 0%    2.11µs ± 1%  +0.20%  (p=0.037 n=17+19)
      RegexpMatchHard_1K       63.9µs ± 2%    63.3µs ± 0%  -0.99%  (p=0.000 n=20+17)
      Revcomp                   560ms ± 1%     522ms ± 0%  -6.87%  (p=0.000 n=18+16)
      Template                 75.0ms ± 0%    75.1ms ± 1%  +0.18%  (p=0.013 n=18+19)
      TimeParse                 358ns ± 1%     364ns ± 0%  +1.74%  (p=0.000 n=20+15)
      TimeFormat                360ns ± 0%     372ns ± 0%  +3.55%  (p=0.000 n=20+18)
      
      Change-Id: If8a9bfae6c128d15a4f405e02bcfa50129df82a2
      Reviewed-on: https://go-review.googlesource.com/10314
      Reviewed-by: Russ Cox <rsc@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime: avoid double-scanning of stacks · 724f8298
      Austin Clements authored
      Currently there's a race between stopg scanning another G's stack and
      the G reaching a preemption point and scanning its own stack. When
      this race occurs, the G's stack is scanned twice. Scanning twice is
      harmless today, so the race is benign.
      
      However, we will shortly be adding stack barriers during the first
      stack scan, so scanning will no longer be idempotent. To prepare for
      this, this change ensures that each stack is scanned only once during
      each GC phase by checking the flag that indicates that the stack has
      been scanned in this phase before scanning the stack.
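
      Schematically, the guard is "test and set the scanned flag before
      scanning" (a sketch; the g type and flag name here are illustrative
      stand-ins, not the runtime's declarations):

      package main

      import (
          "fmt"
          "sync/atomic"
      )

      // g stands in for the scheduler's goroutine descriptor.
      type g struct {
          id        int
          gcScanned atomic.Bool // "stack already scanned in this GC phase"
      }

      // scanStackOnce makes the two racing scan paths (stopg scanning
      // another G, and a G scanning itself at a preemption point) safe
      // even when scanning is not idempotent: only one wins the flag.
      func scanStackOnce(gp *g) {
          if !gp.gcScanned.CompareAndSwap(false, true) {
              return // already scanned this phase
          }
          fmt.Printf("scanning stack of g%d\n", gp.id)
      }

      func main() {
          gp := &g{id: 7}
          scanStackOnce(gp) // scans
          scanStackOnce(gp) // no-op
      }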
      
      Change-Id: Id9f4d5e2e5b839bc3f200ec1723a4a12dd677ab4
      Reviewed-on: https://go-review.googlesource.com/10458
      Reviewed-by: Rick Hudson <rlh@golang.org>
    • runtime: steal space for stack barrier tracking from stack · 3f6e69ac
      Austin Clements authored
      The stack barrier code will need a bookkeeping structure to keep track
      of the overwritten return PCs. This commit introduces and allocates
      this structure, but does not yet use it.
      
      We don't want to allocate space for this structure during garbage
      collection, so this commit allocates it along with the allocation of
      the corresponding stack. However, we can't do a regular allocation in
      newstack because mallocgc may itself grow the stack (which would lead
      to a recursive allocation). Hence, this commit makes the bookkeeping
      structure part of the stack allocation itself by stealing the
      necessary space from the top of the stack allocation. Since the size
      of this bookkeeping structure is logarithmic in the size of the stack,
      this has minimal impact on stack behavior.
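
      A sketch of the size arithmetic (firstBarrier and entrySize are
      illustrative assumptions, not the runtime's constants):

      package main

      import (
          "fmt"
          "math/bits"
      )

      const (
          firstBarrier = 1024 // assumed spacing of the first barrier
          entrySize    = 16   // assumed bytes per saved-PC entry
      )

      // stealForBarriers carves the barrier bookkeeping array off the top
      // of a stack allocation. Because barriers are exponentially spaced,
      // the entry count is logarithmic in the allocation size.
      func stealForBarriers(allocSize int) (usable, bookkeeping int) {
          n := bits.Len(uint(allocSize / firstBarrier)) // ~log2 entries
          bookkeeping = n * entrySize
          return allocSize - bookkeeping, bookkeeping
      }

      func main() {
          for _, size := range []int{8 << 10, 32 << 10, 1 << 20} {
              usable, bk := stealForBarriers(size)
              fmt.Printf("%8d-byte alloc: %3d bytes stolen, %8d usable\n",
                  size, bk, usable)
          }
      }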
      
      Change-Id: Ia14408be06aafa9ca4867f4e70bddb3fe0e96665
      Reviewed-on: https://go-review.googlesource.com/10313
      Reviewed-by: Russ Cox <rsc@golang.org>
    • runtime: decouple stack bounds and stack allocation size · e610c25d
      Austin Clements authored
      Currently the runtime assumes that the allocation for the stack is
      exactly [stack.lo, stack.hi). We're about to steal a small part of
      this allocation for per-stack GC metadata. To prepare for this, this
      commit adds a field to the G for the allocated size of the stack.
      With this change, stack.lo and stack.hi continue to act as the true
      bounds on the stack, but are no longer also used as the bounds on the
      stack allocation.
      
      (I also tried this the other way around, where stack.lo and stack.hi
      remained the allocation bounds and I introduced a new top of stack.
      However, there are far more places that assume stack.hi is the true
      top of the stack than there are places that assume it's the top of the
      allocation.)
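
      Schematically (illustrative stand-ins, not the runtime's real g
      struct):

      package main

      import "fmt"

      // stack is the usable region [lo, hi), as before.
      type stack struct {
          lo, hi uintptr
      }

      // g sketches the new arrangement: stack holds the true bounds,
      // while stackAlloc records the size of the underlying allocation,
      // which may extend above stack.hi.
      type g struct {
          stack      stack
          stackAlloc uintptr // allocation is [stack.lo, stack.lo+stackAlloc)
      }

      func main() {
          const allocSize, stolen = 32 << 10, 96 // 96 bytes reserved above hi
          base := uintptr(0x1000)
          gp := g{
              stack:      stack{lo: base, hi: base + allocSize - stolen},
              stackAlloc: allocSize,
          }
          fmt.Printf("usable: %d bytes, allocated: %d bytes\n",
              gp.stack.hi-gp.stack.lo, gp.stackAlloc)
      }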
      
      Change-Id: Ifa9d956753be53d286d09cbc73d47fb34a18c0c6
      Reviewed-on: https://go-review.googlesource.com/10312
      Reviewed-by: Russ Cox <rsc@golang.org>
    • runtime: clean up signalstack API · c02b8911
      Austin Clements authored
      Currently signalstack takes a lower limit and a length and all calls
      hard-code the passed length. Change the API to take a *stack and
      compute the lower limit and length from the passed stack.
      
      This will make it easier for the runtime to steal some space from the
      top of the stack since it eliminates the hard-coded stack sizes.
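
      A sketch of the new shape (stand-in types; the real routine ends in
      a sigaltstack system call):

      package main

      import "fmt"

      // stack is the usable region [lo, hi).
      type stack struct {
          lo, hi uintptr
      }

      // signalstack now takes the stack itself and derives the lower
      // limit and length from its bounds, instead of each call site
      // passing a hard-coded size.
      func signalstack(s *stack) {
          lo, n := s.lo, s.hi-s.lo
          fmt.Printf("sigaltstack(lo=%#x, len=%d)\n", lo, n) // stand-in
      }

      func main() {
          s := &stack{lo: 0x1000, hi: 0x1000 + 32<<10}
          signalstack(s) // no size constant at the call site
      }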
      
      Change-Id: I7d2a9f45894b221f4e521628c2165530bbc57d53
      Reviewed-on: https://go-review.googlesource.com/10311
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Reviewed-by: Russ Cox <rsc@golang.org>
    • runtime: increase precision of gctrace times · cc6a7fce
      Austin Clements authored
      Currently we truncate gctrace clock and CPU times to millisecond
      precision. As a result, many phases are typically printed as 0, which
      is fine for user consumption, but makes gathering statistics and
      reports over GC traces difficult.
      
      In 1.4, the gctrace line printed times in microseconds. This was
      better for statistics, but not as easy for users to read or interpret,
      and it generally made the trace lines longer.
      
      This change strikes a balance between these extremes by printing
      milliseconds, but including the decimal part to two significant
      figures down to microsecond precision. This remains easy to read and
      interpret, but includes more precision when it's useful.
      
      For example, where the code currently prints:
      
      gc #29 @1.629s 0%: 0+2+0+12+0 ms clock, 0+2+0+0/12/0+0 ms cpu, 4->4->2 MB, 4 MB goal, 1 P
      
      this change prints:
      
      gc #29 @1.629s 0%: 0.005+2.1+0+12+0.29 ms clock, 0.005+2.1+0+0/12/0+0.29 ms cpu, 4->4->2 MB, 4 MB goal, 1 P
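
      The rule can be approximated at user level like this (an
      approximation for illustration, not the runtime's formatting code):

      package main

      import (
          "fmt"
          "strconv"
          "time"
      )

      // fmtMS renders a duration as milliseconds, keeping two significant
      // figures of fractional detail down to 1µs precision.
      func fmtMS(d time.Duration) string {
          ms := float64(d.Microseconds()) / 1000 // truncate below 1µs
          if ms >= 10 {
              return strconv.FormatFloat(ms, 'f', 0, 64) // e.g. "12"
          }
          return strconv.FormatFloat(ms, 'g', 2, 64) // e.g. "2.1", "0.005"
      }

      func main() {
          for _, d := range []time.Duration{
              5 * time.Microsecond,    // "0.005"
              2100 * time.Microsecond, // "2.1"
              290 * time.Microsecond,  // "0.29"
              12 * time.Millisecond,   // "12"
          } {
              fmt.Println(fmtMS(d))
          }
      }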
      
      Fixes #10970.
      
      Change-Id: I249624779433927cd8b0947b986df9060c289075
      Reviewed-on: https://go-review.googlesource.com/10554
      Reviewed-by: Russ Cox <rsc@golang.org>