1. 01 Dec, 2014 2 commits
    • Russ Cox's avatar
      runtime: fix hang in GC due to shrinkstack vs netpoll race · 2b62e1ea
      Russ Cox authored
      During garbage collection, after scanning a stack, we think about
      shrinking it to reclaim some memory. The shrinking code (called
      while the world is stopped) checked that the status was Gwaiting
      or Grunnable and then changed the state to Gcopystack, to essentially
      lock the stack so that no other GC thread is scanning it.
      The same locking happens for stack growth (and is more necessary there).
      
              oldstatus = runtime·readgstatus(gp);
              oldstatus &= ~Gscan;
              if(oldstatus == Gwaiting || oldstatus == Grunnable)
                      runtime·casgstatus(gp, oldstatus, Gcopystack); // oldstatus is Gwaiting or Grunnable
              else
                      runtime·throw("copystack: bad status, not Gwaiting or Grunnable");
      
      Unfortunately, "stop the world" doesn't stop everything. It stops all
      normal goroutine execution, but the network polling thread is still
      blocked in epoll and may wake up. If it does, and it chooses a goroutine
      to mark runnable, and that goroutine is the one whose stack is shrinking,
      then it can happen that between readgstatus and casgstatus, the status
      changes from Gwaiting to Grunnable.
      
      casgstatus assumes that if the status is not what is expected, it is a
      transient change (like from Gwaiting to Gscanwaiting and back, or like
      from Gwaiting to Gcopystack and back), and it loops until the status
      has been restored to the expected value. In this case, the status has
      changed semi-permanently from Gwaiting to Grunnable - it won't
      change again until the GC is done and the world can continue, but the
      GC is waiting for the status to change back. This wedges the program.
      
      To fix, call a special variant of casgstatus that accepts either Gwaiting
      or Grunnable as valid statuses.
      
      Without the fix bug with the extra check+throw in casgstatus, the
      program below dies in a few seconds (2-10) with GOMAXPROCS=8
      on a 2012 Retina MacBook Pro. With the fix, it runs for minutes
      and minutes.
      
      package main
      
      import (
              "io"
              "log"
              "net"
              "runtime"
      )
      
      func main() {
              const N = 100
              for i := 0; i < N; i++ {
                      l, err := net.Listen("tcp", "127.0.0.1:0")
                      if err != nil {
                              log.Fatal(err)
                      }
                      ch := make(chan net.Conn, 1)
                      go func() {
                              var err error
                              c1, err := net.Dial("tcp", l.Addr().String())
                              if err != nil {
                                      log.Fatal(err)
                              }
                              ch <- c1
                      }()
                      c2, err := l.Accept()
                      if err != nil {
                              log.Fatal(err)
                      }
                      c1 := <-ch
                      l.Close()
                      go netguy(c1, c2)
                      go netguy(c2, c1)
                      c1.Write(make([]byte, 100))
              }
              for {
                      runtime.GC()
              }
      }
      
      func netguy(r, w net.Conn) {
              buf := make([]byte, 100)
              for {
                      bigstack(1000)
                      _, err := io.ReadFull(r, buf)
                      if err != nil {
                              log.Fatal(err)
                      }
                      w.Write(buf)
              }
      }
      
      var g int
      
      func bigstack(n int) {
              var buf [100]byte
              if n > 0 {
                      bigstack(n - 1)
              }
              g = int(buf[0]) + int(buf[99])
      }
      
      Fixes #9186.
      
      LGTM=rlh
      R=austin, rlh
      CC=dvyukov, golang-codereviews, iant, khr, r
      https://golang.org/cl/179680043
      2b62e1ea
    • Keith Randall's avatar
      reflect: Fix reflect.funcLayout. The GC bitmap has two bits per · 7c1e3303
      Keith Randall authored
      pointer, not one.
      
      Fixes #9179
      
      LGTM=iant, rsc
      R=golang-codereviews, iant, rsc
      CC=golang-codereviews
      https://golang.org/cl/182160043
      7c1e3303
  2. 25 Nov, 2014 2 commits
  3. 22 Nov, 2014 2 commits
  4. 20 Nov, 2014 3 commits
  5. 19 Nov, 2014 2 commits
    • Russ Cox's avatar
      runtime: remove assumption that noptrdata data bss noptrbss are ordered and contiguous · 378c2515
      Russ Cox authored
      The assumption can be violated by external linkers reordering them or
      inserting non-Go sections in between them. I looked briefly at trying
      to write out the _go_.o in external linking mode in a way that forced
      the ordering, but no matter what there's no way to force Go's data
      and Go's bss to be next to each other. If there is any data or bss from
      non-Go objects, it's very likely to get stuck in between them.
      
      Instead, rewrite the two places we know about that make the assumption.
      I grepped for noptrdata to look for more and didn't find any.
      
      The added race test (os/exec in external linking mode) fails without
      the changes in the runtime. It crashes with an invalid pointer dereference.
      
      Fixes #9133.
      
      LGTM=dneil
      R=dneil
      CC=dvyukov, golang-codereviews, iant
      https://golang.org/cl/179980043
      378c2515
    • Russ Cox's avatar
      undo CL 131750044 / 2d6d44ceb80e · 2d53d6b5
      Russ Cox authored
      Breaks reading from stdin in parent after exec with SysProcAttr{Setpgid: true}.
      
      package main
      
      import (
              "fmt"
              "os"
              "os/exec"
              "syscall"
      )
      
      func main() {
              cmd := exec.Command("true")
              cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
              cmd.Run()
      
              fmt.Printf("Hit enter:")
              os.Stdin.Read(make([]byte, 100))
              fmt.Printf("Bye\n")
      }
      
      In go1.3, I type enter at the prompt and the program exits.
      With the CL being rolled back, the program wedges at the
      prompt.
      
      ««« original CL description
      syscall: SysProcAttr job control changes
      
      Making the child's process group the foreground process group and
      placing the child in a specific process group involves co-ordination
      between the parent and child that must be done post-fork but pre-exec.
      
      LGTM=iant
      R=golang-codereviews, gobot, iant, mikioh.mikioh
      CC=golang-codereviews
      https://golang.org/cl/131750044
      
      »»»
      
      LGTM=minux, dneil
      R=dneil, minux
      CC=golang-codereviews, iant, michael.p.macinnis
      https://golang.org/cl/174450043
      2d53d6b5
  6. 18 Nov, 2014 1 commit
  7. 17 Nov, 2014 8 commits
  8. 16 Nov, 2014 2 commits
    • Russ Cox's avatar
      runtime: fix sudog leak · b3932bab
      Russ Cox authored
      The SudoG used to sit on the stack, so it was cheap to allocated
      and didn't need to be cleaned up when finished.
      
      For the conversion to Go, we had to move sudog off the stack
      for a few reasons, so we added a cache of recently used sudogs
      to keep allocation cheap. But we didn't add any of the necessary
      cleanup before adding a SudoG to the new cache, and so the cached
      SudoGs had stale pointers inside them that have caused all sorts
      of awful, hard to debug problems.
      
      CL 155760043 made sure SudoG.elem is cleaned up.
      CL 150520043 made sure SudoG.selectdone is cleaned up.
      
      This CL makes sure SudoG.next, SudoG.prev, and SudoG.waitlink
      are cleaned up. I should have done this when I did the other two
      fields; instead I wasted a week tracking down a leak they caused.
      
      A dangling SudoG.waitlink can point into a sudogcache list that
      has been "forgotten" in order to let the GC collect it, but that
      dangling .waitlink keeps the list from being collected.
      And then the list holding the SudoG with the dangling waitlink
      can find itself in the same situation, and so on. We end up
      with lists of lists of unusable SudoGs that are still linked into
      the object graph and never collected (given the right mix of
      non-trivial selects and non-channel synchronization).
      
      More details in golang.org/issue/9110.
      
      Fixes #9110.
      
      LGTM=r
      R=r
      CC=dvyukov, golang-codereviews, iant, khr
      https://golang.org/cl/177870043
      b3932bab
    • Russ Cox's avatar
      runtime: update URL for heap dump format · 6150414c
      Russ Cox authored
      I just created that redirect, so we can change
      it once the wiki moves.
      
      LGTM=bradfitz, khr
      R=khr, bradfitz
      CC=golang-codereviews
      https://golang.org/cl/177780043
      6150414c
  9. 14 Nov, 2014 3 commits
  10. 12 Nov, 2014 4 commits
  11. 11 Nov, 2014 3 commits
    • Robert Griesemer's avatar
      spec: method selectors don't auto-deref named pointer types · 40818cfe
      Robert Griesemer authored
      Language clarification.
      
      The existing rules for selector expressions imply
      automatic dereferencing of pointers to struct fields.
      They also implied automatic dereferencing of selectors
      denoting methods. In almost all cases, such automatic
      dereferencing does indeed take place for methods but the
      reason is not the selector rules but the fact that method
      sets include both methods with T and *T receivers; so for
      a *T actual receiver, a method expecting a formal T
      receiver, also accepts a *T (and the invocation or method
      value expression is the reason for the auto-derefering).
      
      However, the rules as stated so far implied that even in
      case of a variable p of named pointer type P, a selector
      expression p.f would always be shorthand for (*p).f. This
      is true for field selectors f, but cannot be true for
      method selectors since a named pointer type always has an
      empty method set.
      
      Named pointer types may never appear as anonymous field
      types (and method receivers, for that matter), so this
      only applies to variables declared of a named pointer
      type. This is exceedingly rare and perhaps shouldn't be
      permitted in the first place (but we cannot change that).
      
      Amended the selector rules to make auto-deref of values
      of named pointer types an exception to the general rules
      and added corresponding examples with explanations.
      
      Both gc and gccgo have a bug where they do auto-deref
      pointers of named types in method selectors where they
      should not:
      
      See http://play.golang.org/p/c6VhjcIVdM , line 45.
      
      Fixes #5769.
      Fixes #8989.
      
      LGTM=r, rsc
      R=r, rsc, iant, ken
      CC=golang-codereviews
      https://golang.org/cl/168790043
      40818cfe
    • Rob Pike's avatar
      doc/gopher: add jpgs of the 5th anniversary image · 0f8cd143
      Rob Pike authored
      LGTM=adg
      R=golang-codereviews, adg
      CC=golang-codereviews
      https://golang.org/cl/172980043
      0f8cd143
    • Nigel Tao's avatar
      doc: update go1.4.html's minor library changes. · e522a477
      Nigel Tao authored
      LGTM=r
      R=r
      CC=golang-codereviews
      https://golang.org/cl/173920043
      e522a477
  12. 10 Nov, 2014 8 commits