1. 21 May, 2015 2 commits
    • Rick Hudson's avatar
      runtime: turn work buffer tracing off by default · 5b66e5d0
      Rick Hudson authored
      During development we ran with monitoring code turned
      on by default. This CL turns the work buffer monitoring
      off. Performance change on most go1 benchmarks is small
      or insignificant.
      
      name                   old mean              new mean              delta
      BinaryTree17            3.35s × (0.99,1.01)   3.35s × (0.99,1.01)    ~    (p=0.841 n=5+5)
      Fannkuch11              2.59s × (1.00,1.01)   2.55s × (1.00,1.00)  -1.65% (p=0.008 n=5+5)
      FmtFprintfEmpty        52.5ns × (0.99,1.02)  53.2ns × (0.98,1.01)    ~    (p=0.063 n=5+5)
      FmtFprintfString        181ns × (1.00,1.00)   180ns × (1.00,1.00)  -0.55% (p=0.029 n=4+4)
      FmtFprintfInt           176ns × (1.00,1.01)   174ns × (1.00,1.00)  -0.91% (p=0.000 n=5+4)
      FmtFprintfIntInt        298ns × (1.00,1.00)   299ns × (1.00,1.00)    ~    (p=0.143 n=4+4)
      FmtFprintfPrefixedInt   250ns × (1.00,1.01)   246ns × (1.00,1.00)  -1.68% (p=0.000 n=5+4)
      FmtFprintfFloat         340ns × (1.00,1.00)   340ns × (1.00,1.01)    ~    (p=0.643 n=5+5)
      FmtManyArgs            1.16µs × (1.00,1.00)  1.15µs × (1.00,1.00)  -0.47% (p=0.016 n=5+5)
      GobDecode              9.22ms × (1.00,1.00)  9.23ms × (1.00,1.00)    ~    (p=0.841 n=5+5)
      GobEncode              7.00ms × (1.00,1.01)  7.09ms × (0.99,1.01)  +1.26% (p=0.016 n=5+5)
      Gzip                    387ms × (1.00,1.00)   389ms × (0.99,1.02)    ~    (p=1.000 n=5+5)
      Gunzip                 97.8ms × (1.00,1.00)  98.3ms × (1.00,1.00)  +0.51% (p=0.016 n=5+4)
      HTTPClientServer       52.6µs × (1.00,1.01)  52.7µs × (1.00,1.01)    ~    (p=1.000 n=5+5)
      JSONEncode             18.0ms × (0.99,1.02)  17.9ms × (1.00,1.00)    ~    (p=0.310 n=5+5)
      JSONDecode             64.8ms × (0.99,1.02)  63.6ms × (1.00,1.00)  -1.94% (p=0.008 n=5+5)
      Mandelbrot200          4.05ms × (1.00,1.00)  4.05ms × (1.00,1.00)    ~    (p=0.421 n=5+5)
      GoParse                3.86ms × (1.00,1.01)  3.84ms × (0.99,1.01)    ~    (p=0.421 n=5+5)
      RegexpMatchEasy0_32     101ns × (1.00,1.00)   102ns × (0.99,1.02)    ~    (p=0.238 n=4+5)
      RegexpMatchEasy0_1K     346ns × (1.00,1.01)   345ns × (1.00,1.00)    ~    (p=0.333 n=5+4)
      RegexpMatchEasy1_32    87.3ns × (0.99,1.02)  87.4ns × (1.00,1.00)    ~    (p=0.190 n=5+4)
      RegexpMatchEasy1_1K     520ns × (1.00,1.00)   520ns × (1.00,1.01)    ~    (p=1.000 n=4+5)
      RegexpMatchMedium_32    143ns × (1.00,1.00)   142ns × (1.00,1.00)  -0.70% (p=0.029 n=4+4)
      RegexpMatchMedium_1K   43.2µs × (1.00,1.01)  43.2µs × (1.00,1.00)    ~    (p=0.841 n=5+5)
      RegexpMatchHard_32     2.24µs × (1.00,1.01)  2.23µs × (1.00,1.01)  -0.63% (p=0.048 n=5+5)
      RegexpMatchHard_1K     68.7µs × (1.00,1.00)  68.3µs × (1.00,1.00)  -0.56% (p=0.008 n=5+5)
      Revcomp                 577ms × (1.00,1.01)   579ms × (1.00,1.00)    ~    (p=0.151 n=5+5)
      Template               74.9ms × (1.00,1.00)  76.5ms × (1.00,1.00)  +2.11% (p=0.008 n=5+5)
      TimeParse               359ns × (1.00,1.00)   362ns × (1.00,1.00)  +0.72% (p=0.008 n=5+5)
      TimeFormat              369ns × (1.00,1.00)   371ns × (1.00,1.01)    ~    (p=0.071 n=5+5)
      
      Change-Id: I4206a3f77a3d1450966b7a62ea7597aec44cb72f
      Reviewed-on: https://go-review.googlesource.com/10294Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: 's avatarAustin Clements <austin@google.com>
      5b66e5d0
    • Austin Clements's avatar
      runtime: make runtime.callers walk calling G, not g0 · 719efc70
      Austin Clements authored
      Currently runtime.callers invokes gentraceback with the pc and sp of
      the G it is called from, but always passes g0 even if it was called
      from a regular g. Right now this has no ill effects because
      runtime.callers does not use either callback argument or the
      _TraceJumpStack flag, but it makes the code fragile and will break
      some upcoming changes.
      
      Fix this by lifting the getg() call outside of the systemstack in
      runtime.callers.
      
      Change-Id: I4e1e927961c0e0cd4dcf28693be47df7bae9e122
      Reviewed-on: https://go-review.googlesource.com/10292Reviewed-by: 's avatarDaniel Morsing <daniel.morsing@gmail.com>
      Reviewed-by: 's avatarRick Hudson <rlh@golang.org>
      719efc70
  2. 20 May, 2015 8 commits
  3. 19 May, 2015 16 commits
  4. 18 May, 2015 14 commits
    • Josh Bleecher Snyder's avatar
      cmd/internal/gc: rearrange Node fields · 82833b31
      Josh Bleecher Snyder authored
      Rearrange Node fields to enable better struct packing.
      This reduces readability in favor of shrinking
      the size of Nodes.
      
      This reduces the size of Node from 328 to 312.
      This reduces the memory usage to compile the
      rotate tests by about 4.4%.
      
      No functional changes. Passes toolstash -cmp.
      
      Updates #9933.
      
      Change-Id: I2764c5847fb1635ddc898e2ee385d007d67f03c5
      Reviewed-on: https://go-review.googlesource.com/10141Reviewed-by: 's avatarRuss Cox <rsc@golang.org>
      82833b31
    • Josh Bleecher Snyder's avatar
      cmd/internal/gc: separate Node param fields · f4ab8203
      Josh Bleecher Snyder authored
      Param will be converted from an anonymous to a
      named field in a subsequent, automated CL.
      
      Reduces Node size from 368 to 328.
      Reduces inuse_space on the rotate tests by about 3%.
      
      No functional changes. Passes toolstash -cmp.
      
      Updates #9933.
      
      Change-Id: I5867b00328abf17ee24aea6ca58876bae9d8bfed
      Reviewed-on: https://go-review.googlesource.com/10210Reviewed-by: 's avatarRuss Cox <rsc@golang.org>
      f4ab8203
    • Josh Bleecher Snyder's avatar
      cmd/6g, cmd/internal/gc: use Etype instead of Ostk · ddc93398
      Josh Bleecher Snyder authored
      Change-Id: Ifda5d84b28717986c93b63767298180a6d6236c0
      Reviewed-on: https://go-review.googlesource.com/10140Reviewed-by: 's avatarRuss Cox <rsc@golang.org>
      ddc93398
    • Josh Bleecher Snyder's avatar
      cmd/internal/gc: make all Node depths int32 · 2b063bdf
      Josh Bleecher Snyder authored
      Funcdepth was already int32. Make Escloopdepth
      and Decldepth also int32 instead of int.
      
      No functional changes for non-absurd code. Passes toolstash -cmp.
      
      Change-Id: I47e145dd732b6a73cfcc6d45956df0dbccdcd999
      Reviewed-on: https://go-review.googlesource.com/10129Reviewed-by: 's avatarRuss Cox <rsc@golang.org>
      2b063bdf
    • Josh Bleecher Snyder's avatar
      runtime/pprof: write heap statistics to heap profile always · 79986e24
      Josh Bleecher Snyder authored
      This is a duplicate of CL 9491.
      That CL broke the build due to pprof shortcomings
      and was reverted in CL 9565.
      
      CL 9623 fixed pprof, so this can go in again.
      
      Fixes #10659.
      
      Change-Id: If470fc90b3db2ade1d161b4417abd2f5c6c330b8
      Reviewed-on: https://go-review.googlesource.com/10212Reviewed-by: 's avatarMatthew Dempsky <mdempsky@google.com>
      79986e24
    • Daniel Morsing's avatar
      cmd/pprof/internal/profile: ignore comments when parsing heap profiles · 19354b9d
      Daniel Morsing authored
      Fixes #10659.
      
      Change-Id: I22dc306ce6f398dd40010ac430928a718d67d466
      Reviewed-on: https://go-review.googlesource.com/9623Reviewed-by: 's avatarRuss Cox <rsc@golang.org>
      19354b9d
    • Rob Pike's avatar
      cmd/doc: put blank lines around comment for types, etc. · 6f7b4e89
      Rob Pike authored
      Better layout.
      
      Fixes #10859.
      
      The issue suggests rearranging so the comment comes out
      after the methods. I tried this and it looks good but it is less
      useful, since the stuff you're probably looking for - the methods
      - are scrolled away by the comment. The most important
      information should be last because that leaves it on your
      screen after the print if the output is long.
      
      Change-Id: I560f992601ccbe2293c347fa1b1018a3f5346c82
      Reviewed-on: https://go-review.googlesource.com/10160Reviewed-by: 's avatarRuss Cox <rsc@golang.org>
      6f7b4e89
    • Michael Hudson-Doyle's avatar
      misc/cgo/testshared: rewrite in Go · 362a40e3
      Michael Hudson-Doyle authored
      And fix to work on filesystems with only 1s resolution.
      
      Fixes #10724
      
      Change-Id: Ia07463f090b4290fc27f5953fa94186463d7afc7
      Reviewed-on: https://go-review.googlesource.com/9768Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      362a40e3
    • Robert Griesemer's avatar
      spec: fix typo · f9ec929a
      Robert Griesemer authored
      Fixes #10893.
      
      Change-Id: I8afeb55acda1e1c8e181379dbaf443716d63ded1
      Reviewed-on: https://go-review.googlesource.com/10201Reviewed-by: 's avatarRob Pike <r@golang.org>
      f9ec929a
    • David Chase's avatar
      cmd/internal/gc: extend escape analysis to pointers in slices · a21cf5b6
      David Chase authored
      Modified esc.go to allow slice literals (before append)
      to be non-escaping.  Modified tests to account for changes
      in escape behavior and to also test the two cases that
      were previously not tested.
      
      Also minor cleanups to debug-printing within esc.go
      
      Allocation stats for running compiler
      ( cd src/html/template;
        for i in {1..5} ; do
           go tool 6g -memprofile=testzz.${i}.prof  -memprofilerate=1 *.go ;
           go tool pprof -alloc_objects -text  testzz.${i}.prof ;
           done ; )
      before about 86k allocations
      after  about 83k allocations
      
      Fixes #8972
      
      Change-Id: Ib61dd70dc74adb40d6f6fdda6eaa4bf7d83481de
      Reviewed-on: https://go-review.googlesource.com/10118Reviewed-by: 's avatarRuss Cox <rsc@golang.org>
      a21cf5b6
    • Austin Clements's avatar
      runtime: use separate count and note for forEachP · f0dd0028
      Austin Clements authored
      Currently, forEachP reuses the stopwait and stopnote fields from
      stopTheWorld to track how many Ps have not responded to the safe-point
      request and to sleep until all Ps have responded.
      
      It was assumed this was safe because both stopTheWorld and forEachP
      must occur under the worlsema and hence stopwait and stopnote cannot
      be used for both purposes simultaneously and callers could always
      determine the appropriate use based on sched.gcwaiting (which is only
      set by stopTheWorld). However, this is not the case, since it's
      possible for there to be a window between when an M observes that
      gcwaiting is set and when it checks stopwait during which stopwait
      could have changed meanings. When this happens, the M decrements
      stopwait and may wakeup stopnote, but does not otherwise participate
      in the forEachP protocol. As a result, stopwait is decremented too
      many times, so it may reach zero before all Ps have run the safe-point
      function, causing forEachP to wake up early. It will then either
      observe that some P has not run the safe-point function and panic with
      "P did not run fn", or the remaining P (or Ps) will run the safe-point
      function before it wakes up and it will observe that stopwait is
      negative and panic with "not stopped".
      
      Fix this problem by giving forEachP its own safePointWait and
      safePointNote fields.
      
      One known sequence of events that can cause this race is as
      follows. It involves three actors:
      
      G1 is running on M1 on P1. P1 has an empty run queue.
      
      G2/M2 is in a blocked syscall and has lost its P. (The details of this
      don't matter, it just needs to be in a position where it needs to grab
      an idle P.)
      
      GC just started on G3/M3/P3. (These aren't very involved, they just
      have to be separate from the other G's, M's, and P's.)
      
      1. GC calls stopTheWorld(), which sets sched.gcwaiting to 1.
      
      Now G1/M1 begins to enter a syscall:
      
      2. G1/M1 invokes reentersyscall, which sets the P1's status to
         _Psyscall.
      
      3. G1/M1's reentersyscall observes gcwaiting != 0 and calls
         entersyscall_gcwait.
      
      4. G1/M1's entersyscall_gcwait blocks acquiring sched.lock.
      
      Back on GC:
      
      5. stopTheWorld cas's P1's status to _Pgcstop, does other stuff, and
         returns.
      
      6. GC does stuff and then calls startTheWorld().
      
      7. startTheWorld() calls procresize(), which sets P1's status to
         _Pidle and puts P1 on the idle list.
      
      Now G2/M2 returns from its syscall and takes over P1:
      
      8. G2/M2 returns from its blocked syscall and gets P1 from the idle
         list.
      
      9. G2/M2 acquires P1, which sets P1's status to _Prunning.
      
      10. G2/M2 starts a new syscall and invokes reentersyscall, which sets
          P1's status to _Psyscall.
      
      Back on G1/M1:
      
      11. G1/M1 finally acquires sched.lock in entersyscall_gcwait.
      
      At this point, G1/M1 still thinks it's running on P1. P1's status is
      _Psyscall, which is consistent with what G1/M1 is doing, but it's
      _Psyscall because *G2/M2* put it in to _Psyscall, not G1/M1. This is
      basically an ABA race on P1's status.
      
      Because forEachP currently shares stopwait with stopTheWorld. G1/M1's
      entersyscall_gcwait observes the non-zero stopwait set by forEachP,
      but mistakes it for a stopTheWorld. It cas's P1's status from
      _Psyscall (set by G2/M2) to _Pgcstop and proceeds to decrement
      stopwait one more time than forEachP was expecting.
      
      Fixes #10618. (See the issue for details on why the above race is safe
      when forEachP is not involved.)
      
      Prior to this commit, the command
        stress ./runtime.test -test.run TestFutexsleep\|TestGoroutineProfile
      would reliably fail after a few hundred runs. With this commit, it
      ran for over 2 million runs and never crashed.
      
      Change-Id: I9a91ea20035b34b6e5f07ef135b144115f281f30
      Reviewed-on: https://go-review.googlesource.com/10157Reviewed-by: 's avatarRuss Cox <rsc@golang.org>
      f0dd0028
    • Austin Clements's avatar
      runtime: hold worldsema while starting the world · 277acca2
      Austin Clements authored
      Currently, startTheWorld releases worldsema before starting the
      world. Since startTheWorld can change gomaxprocs after allowing Ps to
      run, this means that gomaxprocs can change while another P holds
      worldsema.
      
      Unfortunately, the garbage collector and forEachP assume that holding
      worldsema protects against changes in gomaxprocs (which it *almost*
      does). In particular, this is causing somewhat frequent "P did not run
      fn" crashes in forEachP in the runtime tests because gomaxprocs is
      changing between the several loops that forEachP does over all the Ps.
      
      Fix this by only releasing worldsema after the world is started.
      
      This relates to issue #10618. forEachP still fails under stress
      testing, but much less frequently.
      
      Change-Id: I085d627b70cca9ebe9af28fe73b9872f1bb224ff
      Reviewed-on: https://go-review.googlesource.com/10156Reviewed-by: 's avatarRuss Cox <rsc@golang.org>
      277acca2
    • Austin Clements's avatar
      runtime: disallow preemption during startTheWorld · 9c44a41d
      Austin Clements authored
      Currently, startTheWorld clears preemptoff for the current M before
      starting the world. A few callers increment m.locks around
      startTheWorld, presumably to prevent preemption any time during
      starting the world. This is almost certainly pointless (none of the
      other callers do this), but there's no harm in making startTheWorld
      keep preemption disabled until it's all done, which definitely lets us
      drop these m.locks manipulations.
      
      Change-Id: I8a93658abd0c72276c9bafa3d2c7848a65b4691a
      Reviewed-on: https://go-review.googlesource.com/10155Reviewed-by: 's avatarRuss Cox <rsc@golang.org>
      9c44a41d
    • Austin Clements's avatar
      runtime: factor stoptheworld/starttheworld pattern · a1da255a
      Austin Clements authored
      There are several steps to stopping and starting the world and
      currently they're open-coded in several places. The garbage collector
      is the only thing that needs to stop and start the world in a
      non-trivial pattern. Replace all other uses with calls to higher-level
      functions that implement the entire pattern necessary to stop and
      start the world.
      
      This is a pure refectoring and should not change any code semantics.
      In the following commits, we'll make changes that are easier to do
      with this abstraction in place.
      
      This commit renames the old starttheworld to startTheWorldWithSema.
      This is a slight misnomer right now because the callers release
      worldsema just before calling this. However, a later commit will swap
      these and I don't want to think of another name in the mean time.
      
      Change-Id: I5dc97f87b44fb98963c49c777d7053653974c911
      Reviewed-on: https://go-review.googlesource.com/10154Reviewed-by: 's avatarRuss Cox <rsc@golang.org>
      a1da255a