1. 17 Feb, 2017 18 commits
    • math/bits: faster Rotate functions, added respective benchmarks · 19028bdd
      Robert Griesemer authored
      Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3.
      
      benchmark                    old ns/op     new ns/op     delta
      BenchmarkRotateLeft-8        7.87          7.00          -11.05%
      BenchmarkRotateLeft8-8       8.41          4.52          -46.25%
      BenchmarkRotateLeft16-8      8.07          4.55          -43.62%
      BenchmarkRotateLeft32-8      8.36          4.73          -43.42%
      BenchmarkRotateLeft64-8      7.93          4.78          -39.72%
      
      BenchmarkRotateRight-8       8.23          6.72          -18.35%
      BenchmarkRotateRight8-8      8.76          4.39          -49.89%
      BenchmarkRotateRight16-8     9.07          4.44          -51.05%
      BenchmarkRotateRight32-8     8.85          4.46          -49.60%
      BenchmarkRotateRight64-8     8.11          4.43          -45.38%
      
      Change-Id: I79ea1e9e6fc65f95794a91f860a911efed3aa8a1
      Reviewed-on: https://go-review.googlesource.com/37219
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • math/bits: faster OnesCount, added respective benchmarks · a12edb8d
      Robert Griesemer authored
      Also: Changed Reverse/ReverseBytes implementations to use
      the same (smaller) masks as OnesCount.
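
For reference, a mask-based population count is commonly written along the following lines. This is a sketch of the general technique, not the code from this change: the name onesCount64 and the mask constants are assumptions, and the result is checked against math/bits.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Masks selecting alternating groups of 1, 2 and 4 bits (assumed values).
const (
	m0 = 0x5555555555555555 // 01010101 ...
	m1 = 0x3333333333333333 // 00110011 ...
	m2 = 0x0f0f0f0f0f0f0f0f // 00001111 ...
)

// onesCount64 is a hypothetical helper that counts set bits by adding
// neighbouring bit groups in parallel.
func onesCount64(x uint64) int {
	x = x>>1&m0 + x&m0  // sums of adjacent bits, 2 bits wide
	x = x>>2&m1 + x&m1  // sums of adjacent pairs, 4 bits wide
	x = (x>>4 + x) & m2 // sums per byte
	x += x >> 8
	x += x >> 16
	x += x >> 32
	return int(x) & (1<<7 - 1) // the total is at most 64, so 7 bits suffice
}

func main() {
	for _, v := range []uint64{0, 1, 0xff, 0xdeadbeef, ^uint64(0)} {
		fmt.Println(v, onesCount64(v) == bits.OnesCount64(v))
	}
}
```

The same three masks can drive a bit reversal, which is presumably why sharing them between the functions is attractive.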
      
      benchmark                     old ns/op     new ns/op     delta
      BenchmarkOnesCount-8          37.0          6.26          -83.08%
      BenchmarkOnesCount8-8         7.24          1.99          -72.51%
      BenchmarkOnesCount16-8        11.3          2.47          -78.14%
      BenchmarkOnesCount32-8        18.4          3.02          -83.59%
      BenchmarkOnesCount64-8        40.0          3.78          -90.55%
      BenchmarkReverse-8            6.69          6.22          -7.03%
      BenchmarkReverse8-8           1.64          1.64          +0.00%
      BenchmarkReverse16-8          2.26          2.18          -3.54%
      BenchmarkReverse32-8          2.88          2.87          -0.35%
      BenchmarkReverse64-8          5.64          4.34          -23.05%
      BenchmarkReverseBytes-8       2.48          2.17          -12.50%
      BenchmarkReverseBytes16-8     0.63          0.95          +50.79%
      BenchmarkReverseBytes32-8     1.13          1.24          +9.73%
      BenchmarkReverseBytes64-8     2.50          2.16          -13.60%
      
      name              old time/op  new time/op  delta
      OnesCount-8       37.0ns ± 0%   6.3ns ± 0%   ~             (p=1.000 n=1+1)
      OnesCount8-8      7.24ns ± 0%  1.99ns ± 0%   ~             (p=1.000 n=1+1)
      OnesCount16-8     11.3ns ± 0%   2.5ns ± 0%   ~             (p=1.000 n=1+1)
      OnesCount32-8     18.4ns ± 0%   3.0ns ± 0%   ~             (p=1.000 n=1+1)
      OnesCount64-8     40.0ns ± 0%   3.8ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse-8         6.69ns ± 0%  6.22ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse8-8        1.64ns ± 0%  1.64ns ± 0%   ~     (all samples are equal)
      Reverse16-8       2.26ns ± 0%  2.18ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse32-8       2.88ns ± 0%  2.87ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse64-8       5.64ns ± 0%  4.34ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes-8    2.48ns ± 0%  2.17ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes16-8  0.63ns ± 0%  0.95ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes32-8  1.13ns ± 0%  1.24ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes64-8  2.50ns ± 0%  2.16ns ± 0%   ~             (p=1.000 n=1+1)
      
      Change-Id: I591b0ffc83fc3a42828256b6e5030f32c64f9497
      Reviewed-on: https://go-review.googlesource.com/37218
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • cmd/compile/internal/ssa: combine load + op on AMD64 · 21c71d77
      Ilya Tocar authored
      On AMD64, most operations can have one operand in memory.
      Combine a load and the dependent operation into one new operation
      where possible. I've seen no significant performance changes on go1,
      but this removes ~1.8kb of code from the go tool. And in the math
      package I see, e.g.:
      
      Remainder-6            70.0ns ± 0%   64.6ns ± 0%   -7.76%  (p=0.000 n=9+1
      Change-Id: I88b8602b1d55da8ba548a34eb7da4b25d59a297e
      Reviewed-on: https://go-review.googlesource.com/36793
      Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile: fix 32-bit unsigned division on 64-bit machines · a9292b83
      Keith Randall authored
      The type of an intermediate multiply was wrong.  When that
      intermediate multiply was spilled, the top 32 bits were lost.
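
The effect of losing the top 32 bits can be modelled in plain Go. The sketch below is an illustration only, not the compiler's code: magic3, div3Wide and div3Narrow are hypothetical names, and division by 3 via multiply-and-shift is just a convenient stand-in for the affected sequence.

```go
package main

import "fmt"

// magic3 is ceil(2^33 / 3). Multiplying a uint32 by it and shifting the
// 64-bit product right by 33 yields the quotient of a division by 3.
const magic3 = 0xAAAAAAAB

// div3Wide keeps the full 64-bit intermediate product.
func div3Wide(x uint32) uint32 {
	return uint32(uint64(x) * magic3 >> 33)
}

// div3Narrow models the bug: the intermediate product is narrowed to
// 32 bits (as if it had been spilled with the wrong type) before the shift.
func div3Narrow(x uint32) uint32 {
	truncated := uint64(uint32(uint64(x) * magic3))
	return uint32(truncated >> 33)
}

func main() {
	for _, x := range []uint32{9, 1000, 4294967295} {
		fmt.Println(x, x/3, div3Wide(x), div3Narrow(x))
	}
}
```

In this model the narrowed product never reaches bit 33, so div3Narrow returns 0 for every input while div3Wide agrees with x/3.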
      
      Fixes #19153
      
      Change-Id: Ib29350a4351efa405935b7f7ee3c112668e64108
      Reviewed-on: https://go-review.googlesource.com/37212
      Run-TryBot: Keith Randall <khr@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • math/bits: faster Reverse, ReverseBytes · 4498b683
      Robert Griesemer authored
      - moved from: x&m>>k | x&^m<<k to: x&m>>k | x<<k&m
        This permits use of the same constant m twice (*) which may be
        better for machines that can't use large immediate constants
        directly with an AND instruction and have to load them explicitly.
        *) CPUs don't usually have a &^ instruction, so x&^m becomes x&(^m)
      
      - simplified returns
        This improves the generated code because the compiler recognizes
        x>>k | x<<k as ROT when k is the bitsize of x.
      
      The 8-bit versions of these instructions can be significantly faster
      still if they are replaced with table lookups, as long as the table
      is in cache. If the table is not in cache, table-lookup is probably
      slower, hence the choice of an explicit register-only implementation
      for now.
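
The two mask forms and the table-lookup alternative discussed above can be compared on the 8-bit case. This is a sketch under assumptions: the masks m0, m1, m2 and the names reverse8Old, reverse8New, rev8tab and reverse8Table are hypothetical and are not taken from the change itself.

```go
package main

import "fmt"

// Assumed masks: the upper half of each 2-, 4- and 8-bit group.
const (
	m0 = 0xaa // 10101010
	m1 = 0xcc // 11001100
	m2 = 0xf0 // 11110000
)

// reverse8Old uses the earlier form, which needs both m and its complement.
func reverse8Old(x uint8) uint8 {
	x = x&m0>>1 | x&^m0<<1
	x = x&m1>>2 | x&^m1<<2
	x = x&m2>>4 | x&^m2<<4
	return x
}

// reverse8New uses the later form, which applies the same constant m twice.
func reverse8New(x uint8) uint8 {
	x = x&m0>>1 | x<<1&m0
	x = x&m1>>2 | x<<2&m1
	x = x&m2>>4 | x<<4&m2
	return x
}

// rev8tab is a 256-entry table filled once from the register-only version.
var rev8tab = func() (t [256]uint8) {
	for i := range t {
		t[i] = reverse8New(uint8(i))
	}
	return
}()

// reverse8Table trades the shifts and masks for a single memory load.
func reverse8Table(x uint8) uint8 { return rev8tab[x] }

func main() {
	same := true
	for i := 0; i < 256; i++ {
		x := uint8(i)
		if reverse8Old(x) != reverse8New(x) || reverse8New(x) != reverse8Table(x) {
			same = false
		}
	}
	fmt.Println("all three agree:", same)
}
```

Whether the table version wins in practice depends on cache behaviour, which a tight benchmark loop tends to flatter.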
      
      benchmark                     old ns/op     new ns/op     delta
      BenchmarkReverse-8            8.50          6.86          -19.29%
      BenchmarkReverse8-8           2.17          1.74          -19.82%
      BenchmarkReverse16-8          2.89          2.34          -19.03%
      BenchmarkReverse32-8          3.55          2.95          -16.90%
      BenchmarkReverse64-8          6.81          5.57          -18.21%
      BenchmarkReverseBytes-8       3.49          2.48          -28.94%
      BenchmarkReverseBytes16-8     0.93          0.62          -33.33%
      BenchmarkReverseBytes32-8     1.55          1.13          -27.10%
      BenchmarkReverseBytes64-8     2.47          2.47          +0.00%
      
      name              old time/op  new time/op  delta
      Reverse-8         8.50ns ± 0%  6.86ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse8-8        2.17ns ± 0%  1.74ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse16-8       2.89ns ± 0%  2.34ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse32-8       3.55ns ± 0%  2.95ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse64-8       6.81ns ± 0%  5.57ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes-8    3.49ns ± 0%  2.48ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes16-8  0.93ns ± 0%  0.62ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes32-8  1.55ns ± 0%  1.13ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes64-8  2.47ns ± 0%  2.47ns ± 0%   ~     (all samples are equal)
      
      Change-Id: I0064de8c7e0e568ca7885d6f7064344bef91a06d
      Reviewed-on: https://go-review.googlesource.com/37215
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile/internal/gc: remove Node.IsStatic field · c61cf5e6
      Matthew Dempsky authored
      We can immediately emit static assignment data rather than queueing
      them up to be processed during SSA building.
      
      Passes toolstash -cmp.
      
      Change-Id: I8bcea4b72eafb0cc0b849cd93e9cde9d84f30d5e
      Reviewed-on: https://go-review.googlesource.com/37024
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
    • cmd/compile: check both syms when folding address into load/store on ARM64 · 3557d546
      Cherry Zhang authored
      The rules for folding addresses into loads/stores check that sym1
      is not on the stack (because the stack offset is not known at that
      point). But sym1 could be nil, which invalidates the check. Check
      the merged sym instead.
      
      Fixes #19137.
      
      Change-Id: I8574da22ced1216bb5850403d8f08ec60a8d1005
      Reviewed-on: https://go-review.googlesource.com/37145
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
    • math/bits: fix benchmarks (make sure calls don't get optimized away) · 3a239a6a
      Robert Griesemer authored
      Sum up function results and store them in an exported (global)
      variable. This prevents the compiler from optimizing away the
      otherwise side-effect free function calls.
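
The pattern described above can be sketched as follows. The names Output, sumOnesCount and benchmarkOnesCount are hypothetical, and the benchmark is driven through testing.Benchmark so that the example runs as an ordinary program; it also uses math/bits as it exists in current Go releases.

```go
package main

import (
	"fmt"
	"math/bits"
	"testing"
)

// Output is exported package-level state. Storing the accumulated result
// here gives the benchmarked calls an observable effect.
var Output int

// sumOnesCount adds up the results of n calls so that every call
// contributes to the final value.
func sumOnesCount(n int) int {
	s := 0
	for i := 0; i < n; i++ {
		s += bits.OnesCount64(uint64(i))
	}
	return s
}

// benchmarkOnesCount runs b.N calls and publishes their sum.
func benchmarkOnesCount(b *testing.B) {
	Output = sumOnesCount(b.N)
}

func main() {
	r := testing.Benchmark(benchmarkOnesCount)
	fmt.Println(r.String())
}
```

Without the store to Output, the loop would compute a value nobody reads, which is the situation the change set out to avoid.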
      
      We now have a more realistic set of benchmark numbers...
      
      Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3.
      
      Note: These measurements are based on the same "old"
      implementation as the prior measurements (commit 7d5c003a).
      
      benchmark                     old ns/op     new ns/op     delta
      BenchmarkReverse-8            72.9          8.50          -88.34%
      BenchmarkReverse8-8           13.2          2.17          -83.56%
      BenchmarkReverse16-8          21.2          2.89          -86.37%
      BenchmarkReverse32-8          36.3          3.55          -90.22%
      BenchmarkReverse64-8          71.3          6.81          -90.45%
      BenchmarkReverseBytes-8       11.2          3.49          -68.84%
      BenchmarkReverseBytes16-8     6.24          0.93          -85.10%
      BenchmarkReverseBytes32-8     7.40          1.55          -79.05%
      BenchmarkReverseBytes64-8     10.5          2.47          -76.48%
      
      name              old time/op  new time/op  delta
      Reverse-8         72.9ns ± 0%   8.5ns ± 0%   ~     (p=1.000 n=1+1)
      Reverse8-8        13.2ns ± 0%   2.2ns ± 0%   ~     (p=1.000 n=1+1)
      Reverse16-8       21.2ns ± 0%   2.9ns ± 0%   ~     (p=1.000 n=1+1)
      Reverse32-8       36.3ns ± 0%   3.5ns ± 0%   ~     (p=1.000 n=1+1)
      Reverse64-8       71.3ns ± 0%   6.8ns ± 0%   ~     (p=1.000 n=1+1)
      ReverseBytes-8    11.2ns ± 0%   3.5ns ± 0%   ~     (p=1.000 n=1+1)
      ReverseBytes16-8  6.24ns ± 0%  0.93ns ± 0%   ~     (p=1.000 n=1+1)
      ReverseBytes32-8  7.40ns ± 0%  1.55ns ± 0%   ~     (p=1.000 n=1+1)
      ReverseBytes64-8  10.5ns ± 0%   2.5ns ± 0%   ~     (p=1.000 n=1+1)
      
      Change-Id: I8aef1334b84f6cafd25edccad7e6868b37969efb
      Reviewed-on: https://go-review.googlesource.com/37213
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • math/bits: much faster ReverseBytes, added respective benchmarks · ddb15cea
      Robert Griesemer authored
      Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3.
      
      benchmark                     old ns/op     new ns/op     delta
      BenchmarkReverseBytes-8       11.4          3.51          -69.21%
      BenchmarkReverseBytes16-8     6.87          0.64          -90.68%
      BenchmarkReverseBytes32-8     7.79          0.65          -91.66%
      BenchmarkReverseBytes64-8     11.6          0.64          -94.48%
      
      name              old time/op  new time/op  delta
      ReverseBytes-8    11.4ns ± 0%   3.5ns ± 0%   ~     (p=1.000 n=1+1)
      ReverseBytes16-8  6.87ns ± 0%  0.64ns ± 0%   ~     (p=1.000 n=1+1)
      ReverseBytes32-8  7.79ns ± 0%  0.65ns ± 0%   ~     (p=1.000 n=1+1)
      ReverseBytes64-8  11.6ns ± 0%   0.6ns ± 0%   ~     (p=1.000 n=1+1)
      
      Change-Id: I67b529652b3b613c61687e9e185e8d4ee40c51a2
      Reviewed-on: https://go-review.googlesource.com/37211
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • math/bits: much faster Reverse, added respective benchmarks · 7d5c003a
      Robert Griesemer authored
      Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3.
      
      name         old time/op  new time/op  delta
      Reverse-8    76.6ns ± 0%   8.1ns ± 0%   ~     (p=1.000 n=1+1)
      Reverse8-8   12.6ns ± 0%   0.6ns ± 0%   ~     (p=1.000 n=1+1)
      Reverse16-8  20.8ns ± 0%   0.6ns ± 0%   ~     (p=1.000 n=1+1)
      Reverse32-8  36.5ns ± 0%   0.6ns ± 0%   ~     (p=1.000 n=1+1)
      Reverse64-8  74.0ns ± 0%   6.4ns ± 0%   ~     (p=1.000 n=1+1)
      
      benchmark                old ns/op     new ns/op     delta
      BenchmarkReverse-8       76.6          8.07          -89.46%
      BenchmarkReverse8-8      12.6          0.64          -94.92%
      BenchmarkReverse16-8     20.8          0.64          -96.92%
      BenchmarkReverse32-8     36.5          0.64          -98.25%
      BenchmarkReverse64-8     74.0          6.38          -91.38%
      
      Change-Id: I6b99b10cee2f2babfe79342b50ee36a45a34da30
      Reviewed-on: https://go-review.googlesource.com/37149
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • cmd/compile: fix some types in SSA · c4b8dadb
      Cherry Zhang authored
      These seem not to really matter, but good to be correct.
      
      Change-Id: I02edb9797c3d6739725cfbe4723c75f151acd05e
      Reviewed-on: https://go-review.googlesource.com/36837
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
    • cmd/compile: redo writebarrier pass · c4ef597c
      Cherry Zhang authored
      SSA's writebarrier pass requires that WB store ops are always at
      the end of a block. If we move write barrier insertion into SSA and
      emit normal Store ops when building SSA, this requirement becomes
      impractical -- it will create too many blocks for all the Store
      ops.
      
      Redo SSA's writebarrier pass, explicitly ordering values in store
      order, so it no longer needs this requirement.
      
      Updates #17583.
      Fixes #19067.
      
      Change-Id: I66e817e526affb7e13517d4245905300a90b7170
      Reviewed-on: https://go-review.googlesource.com/36834
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
    • cmd/compile: re-enable nilcheck removal in same block · 98061fa5
      Cherry Zhang authored
      Nil check removal in the same block is disabled due to issue 18725:
      because the values are not ordered, a nilcheck may influence a
      value that is logically before it. This CL re-enables same-block
      nilcheck removal by ordering values in store order first.
      
      Updates #18725.
      
      Change-Id: I287a38525230c14c5412cbcdbc422547dabd54f6
      Reviewed-on: https://go-review.googlesource.com/35496
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
    • math/bits: expand doc strings for all functions · 81acd308
      Robert Griesemer authored
      Follow-up on https://go-review.googlesource.com/36315.
      No functionality change.
      
      For #18616.
      
      Change-Id: Id4df34dd7d0381be06eea483a11bf92f4a01f604
      Reviewed-on: https://go-review.googlesource.com/37140
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • all: fix a few typos in comments · 045ad5ba
      Koki Ide authored
      Change-Id: I0455ffaa51c661803d8013c7961910f920d3c3cc
      Reviewed-on: https://go-review.googlesource.com/37043
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • sync: make Mutex more fair · 0556e262
      Dmitry Vyukov authored
      Add new starvation mode for Mutex.
      In starvation mode ownership is directly handed off from the
      unlocking goroutine to the next waiter. Newly arriving goroutines
      don't compete for ownership.
      Unfair wait time is now limited to 1ms.
      Also fix a long-standing bug that goroutines were requeued
      at the tail of the wait queue. That led to even more unfair
      acquisition times with multiple waiters.
      
      Performance of normal mode is not considerably affected.
      
      Fixes #13086
      
      On the lockskew program provided in the issue:
      
      done in 1.207853ms
      done in 1.177451ms
      done in 1.184168ms
      done in 1.198633ms
      done in 1.185797ms
      done in 1.182502ms
      done in 1.316485ms
      done in 1.211611ms
      done in 1.182418ms
      
      name                    old time/op  new time/op   delta
      MutexUncontended-48     0.65ns ± 0%   0.65ns ± 1%     ~           (p=0.087 n=10+10)
      Mutex-48                 112ns ± 1%    114ns ± 1%   +1.69%        (p=0.000 n=10+10)
      MutexSlack-48            113ns ± 0%     87ns ± 1%  -22.65%         (p=0.000 n=8+10)
      MutexWork-48             149ns ± 0%    145ns ± 0%   -2.48%         (p=0.000 n=9+10)
      MutexWorkSlack-48        149ns ± 0%    122ns ± 3%  -18.26%         (p=0.000 n=6+10)
      MutexNoSpin-48           103ns ± 4%    105ns ± 3%     ~           (p=0.089 n=10+10)
      MutexSpin-48             490ns ± 4%    515ns ± 6%   +5.08%        (p=0.006 n=10+10)
      Cond32-48               13.4µs ± 6%   13.1µs ± 5%   -2.75%        (p=0.023 n=10+10)
      RWMutexWrite100-48      53.2ns ± 3%   41.2ns ± 3%  -22.57%        (p=0.000 n=10+10)
      RWMutexWrite10-48       45.9ns ± 2%   43.9ns ± 2%   -4.38%        (p=0.000 n=10+10)
      RWMutexWorkWrite100-48   122ns ± 2%    134ns ± 1%   +9.92%        (p=0.000 n=10+10)
      RWMutexWorkWrite10-48    206ns ± 1%    188ns ± 1%   -8.52%         (p=0.000 n=8+10)
      Cond32-24               12.1µs ± 3%   12.4µs ± 3%   +1.98%         (p=0.043 n=10+9)
      MutexUncontended-24     0.74ns ± 1%   0.75ns ± 1%     ~           (p=0.650 n=10+10)
      Mutex-24                 122ns ± 2%    124ns ± 1%   +1.31%        (p=0.007 n=10+10)
      MutexSlack-24           96.9ns ± 2%  102.8ns ± 2%   +6.11%        (p=0.000 n=10+10)
      MutexWork-24             146ns ± 1%    135ns ± 2%   -7.70%         (p=0.000 n=10+9)
      MutexWorkSlack-24        135ns ± 1%    128ns ± 2%   -5.01%         (p=0.000 n=10+9)
      MutexNoSpin-24           114ns ± 3%    110ns ± 4%   -3.84%        (p=0.000 n=10+10)
      MutexSpin-24             482ns ± 4%    475ns ± 8%     ~           (p=0.286 n=10+10)
      RWMutexWrite100-24      43.0ns ± 3%   43.1ns ± 2%     ~           (p=0.956 n=10+10)
      RWMutexWrite10-24       43.4ns ± 1%   43.2ns ± 1%     ~            (p=0.085 n=10+9)
      RWMutexWorkWrite100-24   130ns ± 3%    131ns ± 3%     ~           (p=0.747 n=10+10)
      RWMutexWorkWrite10-24    191ns ± 1%    192ns ± 1%     ~           (p=0.210 n=10+10)
      Cond32-12               11.5µs ± 2%   11.7µs ± 2%   +1.98%        (p=0.002 n=10+10)
      MutexUncontended-12     1.48ns ± 0%   1.50ns ± 1%   +1.08%        (p=0.004 n=10+10)
      Mutex-12                 141ns ± 1%    143ns ± 1%   +1.63%        (p=0.000 n=10+10)
      MutexSlack-12            121ns ± 0%    119ns ± 0%   -1.65%          (p=0.001 n=8+9)
      MutexWork-12             141ns ± 2%    150ns ± 3%   +6.36%         (p=0.000 n=9+10)
      MutexWorkSlack-12        131ns ± 0%    138ns ± 0%   +5.73%         (p=0.000 n=9+10)
      MutexNoSpin-12          87.0ns ± 1%   83.7ns ± 1%   -3.80%        (p=0.000 n=10+10)
      MutexSpin-12             364ns ± 1%    377ns ± 1%   +3.77%        (p=0.000 n=10+10)
      RWMutexWrite100-12      42.8ns ± 1%   43.9ns ± 1%   +2.41%         (p=0.000 n=8+10)
      RWMutexWrite10-12       39.8ns ± 4%   39.3ns ± 1%     ~            (p=0.433 n=10+9)
      RWMutexWorkWrite100-12   131ns ± 1%    131ns ± 0%     ~            (p=0.591 n=10+9)
      RWMutexWorkWrite10-12    173ns ± 1%    174ns ± 0%     ~            (p=0.059 n=10+8)
      Cond32-6                10.9µs ± 2%   10.9µs ± 2%     ~           (p=0.739 n=10+10)
      MutexUncontended-6      2.97ns ± 0%   2.97ns ± 0%     ~     (all samples are equal)
      Mutex-6                  122ns ± 6%    122ns ± 2%     ~           (p=0.668 n=10+10)
      MutexSlack-6             149ns ± 3%    142ns ± 3%   -4.63%        (p=0.000 n=10+10)
      MutexWork-6              136ns ± 3%    140ns ± 5%     ~           (p=0.077 n=10+10)
      MutexWorkSlack-6         152ns ± 0%    138ns ± 2%   -9.21%         (p=0.000 n=6+10)
      MutexNoSpin-6            150ns ± 1%    152ns ± 0%   +1.50%         (p=0.000 n=8+10)
      MutexSpin-6              726ns ± 0%    730ns ± 1%     ~           (p=0.069 n=10+10)
      RWMutexWrite100-6       40.6ns ± 1%   40.9ns ± 1%   +0.91%         (p=0.001 n=8+10)
      RWMutexWrite10-6        37.1ns ± 0%   37.0ns ± 1%     ~            (p=0.386 n=9+10)
      RWMutexWorkWrite100-6    133ns ± 1%    134ns ± 1%   +1.01%         (p=0.005 n=9+10)
      RWMutexWorkWrite10-6     152ns ± 0%    152ns ± 0%     ~     (all samples are equal)
      Cond32-2                7.86µs ± 2%   7.95µs ± 2%   +1.10%        (p=0.023 n=10+10)
      MutexUncontended-2      8.10ns ± 0%   9.11ns ± 4%  +12.44%         (p=0.000 n=9+10)
      Mutex-2                 32.9ns ± 9%   38.4ns ± 6%  +16.58%        (p=0.000 n=10+10)
      MutexSlack-2            93.4ns ± 1%   98.5ns ± 2%   +5.39%         (p=0.000 n=10+9)
      MutexWork-2             40.8ns ± 3%   43.8ns ± 7%   +7.38%         (p=0.000 n=10+9)
      MutexWorkSlack-2        98.6ns ± 5%  108.2ns ± 2%   +9.80%         (p=0.000 n=10+8)
      MutexNoSpin-2            399ns ± 1%    398ns ± 2%     ~             (p=0.463 n=8+9)
      MutexSpin-2             1.99µs ± 3%   1.97µs ± 1%   -0.81%          (p=0.003 n=9+8)
      RWMutexWrite100-2       37.6ns ± 5%   46.0ns ± 4%  +22.17%         (p=0.000 n=10+8)
      RWMutexWrite10-2        50.1ns ± 6%   36.8ns ±12%  -26.46%         (p=0.000 n=9+10)
      RWMutexWorkWrite100-2    136ns ± 0%    134ns ± 2%   -1.80%          (p=0.001 n=7+9)
      RWMutexWorkWrite10-2     140ns ± 1%    138ns ± 1%   -1.50%        (p=0.000 n=10+10)
      Cond32                  5.93µs ± 1%   5.91µs ± 0%     ~            (p=0.411 n=9+10)
      MutexUncontended        15.9ns ± 0%   15.8ns ± 0%   -0.63%          (p=0.000 n=8+8)
      Mutex                   15.9ns ± 0%   15.8ns ± 0%   -0.44%        (p=0.003 n=10+10)
      MutexSlack              26.9ns ± 3%   26.7ns ± 2%     ~           (p=0.084 n=10+10)
      MutexWork               47.8ns ± 0%   47.9ns ± 0%   +0.21%          (p=0.014 n=9+8)
      MutexWorkSlack          54.9ns ± 3%   54.5ns ± 3%     ~           (p=0.254 n=10+10)
      MutexNoSpin              786ns ± 2%    765ns ± 1%   -2.66%        (p=0.000 n=10+10)
      MutexSpin               3.87µs ± 1%   3.83µs ± 0%   -0.85%          (p=0.005 n=9+8)
      RWMutexWrite100         21.2ns ± 2%   21.0ns ± 1%   -0.88%         (p=0.018 n=10+9)
      RWMutexWrite10          22.6ns ± 1%   22.6ns ± 0%     ~             (p=0.471 n=9+9)
      RWMutexWorkWrite100      132ns ± 0%    132ns ± 0%     ~     (all samples are equal)
      RWMutexWorkWrite10       124ns ± 0%    123ns ± 0%     ~           (p=0.656 n=10+10)
      
      Change-Id: I66412a3a0980df1233ad7a5a0cd9723b4274528b
      Reviewed-on: https://go-review.googlesource.com/34310
      Run-TryBot: Russ Cox <rsc@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Russ Cox <rsc@golang.org>
    • syscall: only call setgroups if we need to · 79f6a5c7
      Wander Lairson Costa authored
      If the caller sets up a Credential in os/exec.Command,
      os/exec.Command.Start will end up calling setgroups(2), even if no
      supplementary groups were given.
      
      Only root can call setgroups(2) on BSD kernels, which causes Start to
      fail for non-root users when they try to set uid and gid for the new
      process.
      
      We fix this by introducing a new field to syscall.Credential named
      NoSetGroups; setgroups(2) is only called if it is false.
      The field uses inverted logic to preserve backward
      compatibility.
      
      RELNOTES=yes
      
      Change-Id: I3cff1f21c117a1430834f640ef21fd4e87e06804
      Reviewed-on: https://go-review.googlesource.com/36697
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • cmd/compile: move constant divide strength reduction to SSA rules · 708ba22a
      Keith Randall authored
      Currently the conversion from constant divides to multiplies is mostly
      done during the walk pass.  This is suboptimal because SSA can
      determine that the value being divided by is constant more often
      (e.g. after inlining).
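
As an illustration of the transformation itself (not of the SSA rules added here), an unsigned 32-bit division by a constant can be rewritten as a multiply and a shift. The names magic5 and div5 are hypothetical, and the divisor 5 is only an example.

```go
package main

import "fmt"

// magic5 is ceil(2^34 / 5). For any uint32 x, (x * magic5) >> 34 computed
// in 64 bits equals x / 5.
const magic5 = 0xCCCCCCCD

// div5 divides by 5 without a divide instruction.
func div5(x uint32) uint32 {
	return uint32(uint64(x) * magic5 >> 34)
}

func main() {
	ok := true
	samples := []uint32{0, 1, 4, 5, 24, 25, 1 << 16, 1<<31 - 1, 1 << 31, 4294967295}
	for _, x := range samples {
		if div5(x) != x/5 {
			ok = false
		}
	}
	fmt.Println("multiply-and-shift matches x/5:", ok)
}
```

The rewrite only applies once the divisor is known to be constant, which is why doing it after inlining can catch more cases.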
      
      Change-Id: If1a9b993edd71be37396b9167f77da271966f85f
      Reviewed-on: https://go-review.googlesource.com/37015
      Run-TryBot: Keith Randall <khr@golang.org>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
  2. 16 Feb, 2017 12 commits
  3. 15 Feb, 2017 10 commits
    • cmd/compile/internal/gc: skip useless loads for non-SSA params · a6b33312
      Matthew Dempsky authored
      Change-Id: I78ca43a0f0a6a162a2ade1352e2facb29432d4ac
      Reviewed-on: https://go-review.googlesource.com/37102
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile/internal/gc: document (*state).checkgoto · 862fde81
      Matthew Dempsky authored
      No behavior change.
      
      Change-Id: I595c15ee976adf21bdbabdf24edf203c9e446185
      Reviewed-on: https://go-review.googlesource.com/36958
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • internal/poll: define PollDescriptor on plan9 · 45a5f79c
      Ian Lance Taylor authored
      Fixes #19114.
      
      Change-Id: I352add53d6ee8bf78792564225099f8537ac6b46
      Reviewed-on: https://go-review.googlesource.com/37106
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      Reviewed-by: David du Colombier <0intro@gmail.com>
    • doc: update Code of Conduct wording and scope · 025dfb13
      Sarah Adams authored
      This change removes the punitive language and anonymous reporting mechanism
      from the Code of Conduct document. Read on for the rationale.
      
      More than a year has passed since the Go Code of Conduct was introduced.
      In that time, there have been a small number (<30) of reports to the Working Group.
      Some reports we handled well, with positive outcomes for all involved.
      A few reports we handled badly, resulting in hurt feelings and a bad
      experience for all involved.
      
      On reflection, the reports that had positive outcomes were ones where the
      Working Group took the role of advisor/facilitator, listening to complaints and
      providing suggestions and advice to the parties involved.
      The reports that had negative outcomes were ones where the subject of the
      report felt threatened by the Working Group and Code of Conduct.
      
      After some discussion among the Working Group, we saw that we are most
      effective as facilitators, rather than disciplinarians. The various Go spaces
      already have moderators; this change to the CoC acknowledges their authority
      and places the group in a purely advisory role. If an incident is
      reported to the group we may provide information to or make a
      suggestion to the moderators, but the Working Group need not (and
      should not) have any authority to take disciplinary action.
      
      In short, we want it to be clear that the Working Group are here to help
      resolve conflict, period.
      
      The second change made here is the removal of the anonymous reporting mechanism.
      To date, the quality of anonymous reports has been low, and with no way to
      reach out to the reporter for more information there is often very little we
      can do in response. Removing this one-way reporting mechanism strengthens the
      message that the Working Group are here to facilitate a constructive dialogue.
      
      Change-Id: Iee52aff5446accd0dae0c937bb3aa89709ad5fb4
      Reviewed-on: https://go-review.googlesource.com/37014
      Reviewed-by: Andrew Gerrand <adg@golang.org>
      Reviewed-by: Russ Cox <rsc@golang.org>
    • os: skip TestPipeThreads on Solaris · ae1d0598
      Ian Lance Taylor authored
      I don't know why it is not working.  Filed issue 19111 for this.
      
      Fixes build.
      
      Update #19111.
      
      Change-Id: I76f8d6aafba5951da2f3ad7d10960419cca7dd1f
      Reviewed-on: https://go-review.googlesource.com/37092
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • os: skip TestPipeThreads on Plan 9 · 0fe62e75
      Ian Lance Taylor authored
      It can't work since Plan 9 does not support the runtime poller.
      
      Fixes build.
      
      Change-Id: I9ec33eb66019d9364c6ff6519b61b32e59498559
      Reviewed-on: https://go-review.googlesource.com/37091
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • runtime: do not call wakep from enlistWorker, to avoid possible deadlock · 1f77db94
      Russ Cox authored
      We have seen one instance of a production job suddenly spinning to
      100% CPU and becoming unresponsive. In that one instance, a SIGQUIT
      was sent after 328 minutes of spinning, and the stacks showed a single
      goroutine in "IO wait (scan)" state.
      
      Looking for things that might get stuck if a goroutine got stuck in
      scanning a stack, we found that injectglist does:
      
      	lock(&sched.lock)
      	var n int
      	for n = 0; glist != nil; n++ {
      		gp := glist
      		glist = gp.schedlink.ptr()
      		casgstatus(gp, _Gwaiting, _Grunnable)
      		globrunqput(gp)
      	}
      	unlock(&sched.lock)
      
      and that casgstatus spins on gp.atomicstatus until the _Gscan bit goes
      away. Essentially, this code locks sched.lock and then while holding
      sched.lock, waits to lock gp.atomicstatus.
      
      The code that is doing the scan is:
      
      	if castogscanstatus(gp, s, s|_Gscan) {
      		if !gp.gcscandone {
      			scanstack(gp, gcw)
      			gp.gcscandone = true
      		}
      		restartg(gp)
      		break loop
      	}
      
      More analysis showed that scanstack can, in a rare case, end up
      calling back into code that acquires sched.lock. For example:
      
      	runtime.scanstack at proc.go:866
      	calls runtime.gentraceback at mgcmark.go:842
      	calls runtime.scanstack$1 at traceback.go:378
      	calls runtime.scanframeworker at mgcmark.go:819
      	calls runtime.scanblock at mgcmark.go:904
      	calls runtime.greyobject at mgcmark.go:1221
      	calls (*runtime.gcWork).put at mgcmark.go:1412
      	calls (*runtime.gcControllerState).enlistWorker at mgcwork.go:127
      	calls runtime.wakep at mgc.go:632
      	calls runtime.startm at proc.go:1779
      	acquires runtime.sched.lock at proc.go:1675
      
      This path was found with an automated deadlock-detecting tool.
      There are many such paths but they all go through enlistWorker -> wakep.
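
The cycle described above can be reduced to a toy model with one mutex and one flag. This is not runtime code: the model type, its fields and the stuck method are hypothetical, the two sides are replayed sequentially on one goroutine with non-blocking probes instead of real blocking, and sync.Mutex.TryLock requires Go 1.18 or later.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// model stands in for the two resources involved in the cycle.
type model struct {
	schedLock sync.Mutex // plays the role of sched.lock
	scanBit   uint32     // plays the role of the _Gscan bit
}

// stuck reports whether both sides would wait on each other forever.
// callWakep selects whether the scanning side needs schedLock.
func (m *model) stuck(callWakep bool) bool {
	// Injecting side: takes the scheduler lock first.
	m.schedLock.Lock()

	// Scanning side: sets the scan bit and starts scanning.
	atomic.StoreUint32(&m.scanBit, 1)

	// Scanning side: with the wakep call it needs the scheduler lock.
	// TryLock cannot succeed here because the lock is already held.
	scannerStuck := false
	if callWakep {
		scannerStuck = !m.schedLock.TryLock()
	}
	if !scannerStuck {
		atomic.StoreUint32(&m.scanBit, 0) // scan completes, bit is cleared
	}

	// Injecting side: would spin until the scan bit goes away.
	injectorStuck := atomic.LoadUint32(&m.scanBit) != 0

	// Reset the model so it can be reused.
	atomic.StoreUint32(&m.scanBit, 0)
	m.schedLock.Unlock()
	return scannerStuck && injectorStuck
}

func main() {
	m := &model{}
	fmt.Println("with wakep:", m.stuck(true))
	fmt.Println("without wakep:", m.stuck(false))
}
```

In the model, removing the step that needs the lock while the flag is set is enough to break the cycle.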
      
      The evidence strongly suggests that one of these paths is what caused
      the deadlock we observed. We're running those jobs with
      GOTRACEBACK=crash now to try to get more information if it happens
      again.
      
      Further refinement and analysis shows that if we drop the wakep call
      from enlistWorker, the remaining few deadlock cycles found by the tool
      are all false positives caused by not understanding the effect of calls
      to func variables.
      
      The enlistWorker -> wakep call was intended only as a performance
      optimization. It rarely executes, and if it does execute at just the
      wrong time it can (and plausibly did) cause the deadlock we saw.
      
      Comment it out, to avoid the potential deadlock.
      
      Fixes #19112.
      Unfixes #14179.
      
      Change-Id: I6f7e10b890b991c11e79fab7aeefaf70b5d5a07b
      Reviewed-on: https://go-review.googlesource.com/37093
      Run-TryBot: Russ Cox <rsc@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
    • runtime/pprof: print newly added fields of runtime.MemStats · 8833af3f
      Hana Kim authored
      in heap profile with debug mode
      
      Change-Id: I3a80d03a4aa556614626067a8fd698b3b00f4290
      Reviewed-on: https://go-review.googlesource.com/36962
      Reviewed-by: Austin Clements <austin@google.com>
    • cmd/compile/internal/ssa: display NamedValues in SSA html output. · 35a95df5
      Heschi Kreinick authored
      Change-Id: If268b42b32e6bcd6e7913bffa6e493dc78af40aa
      Reviewed-on: https://go-review.googlesource.com/36539
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Run-TryBot: Heschi Kreinick <heschi@google.com>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • cmd/go: improve stale reason for packages · 2ac32b63
      Lynn Boger authored
      This adds more information to the pkg stale reason for debugging
      purposes.
      
      Change-Id: I7b626db4520baa1127195ae859f4da9b49304636
      Reviewed-on: https://go-review.googlesource.com/36944
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>