Commits · 177dfba1120d2d5976bb5fb5a68bf20bb6ca9ada · go / golang

19 Feb, 2017 5 commits

Robert Griesemer authored Feb 18, 2017

Using some additional suggestions per "Hacker's Delight".
Added documentation and extra tests.
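
For context, "Hacker's Delight" describes counting set bits by summing them in parallel within progressively wider fields. The following is a minimal sketch of that general technique, not necessarily the code in this commit; the helper name onesCount64 and the constant names are chosen here for illustration only.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Masks selecting alternating groups of 1, 2 and 4 bits.
// The names m0, m1, m2 are illustrative assumptions.
const (
	m0 = 0x5555555555555555 // 0101...
	m1 = 0x3333333333333333 // 0011...
	m2 = 0x0f0f0f0f0f0f0f0f // 00001111...
)

// onesCount64 is a hypothetical helper that counts the set bits in x
// by adding neighbouring bit fields in parallel.
func onesCount64(x uint64) int {
	x = x>>1&m0 + x&m0  // each 2-bit field holds the count of its 2 bits
	x = x>>2&m1 + x&m1  // each 4-bit field holds the count of its 4 bits
	x = (x>>4 + x) & m2 // each byte holds the count of its 8 bits
	x += x >> 8         // fold the byte counts together
	x += x >> 16
	x += x >> 32
	return int(x) & (1<<7 - 1) // the total is at most 64, so 7 bits suffice
}

func main() {
	// Cross-check the sketch against the standard library.
	for _, x := range []uint64{0, 1, 0xff, 0x8000000000000001, ^uint64(0)} {
		fmt.Printf("%#x: sketch=%d math/bits=%d\n", x, onesCount64(x), bits.OnesCount64(x))
	}
}
```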

Measured on 1.7 GHz Intel Core i7, running macOS 10.12.3.

benchmark                  old ns/op     new ns/op     delta
BenchmarkOnesCount-4       7.34          5.38          -26.70%
BenchmarkOnesCount8-4      2.03          1.98          -2.46%
BenchmarkOnesCount16-4     2.56          2.50          -2.34%
BenchmarkOnesCount32-4     2.98          2.39          -19.80%
BenchmarkOnesCount64-4     4.22          2.96          -29.86%

Change-Id: I566b0ef766e55cf5776b1662b6016024ebe5d878
Reviewed-on: https://go-review.googlesource.com/37223
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

177dfba1

fmt: remove unused global variable byteType · d9a19f86

Martin Möhrmann authored Feb 19, 2017

Change list https://golang.org/cl/20686/ removed the last use
of the variable byteType.

Change-Id: I4ea79095136a49a9d22767b37f48f3404da05056
Reviewed-on: https://go-review.googlesource.com/37197
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

d9a19f86

cmd/compile: amd64, allow XCHG on stack pointers · cfb0d349

Keith Randall authored Feb 19, 2017

XCHG needs to allow the stack pointer as an argument because we have a
rewrite that incorporates the address of a local variable into the
instruction.

Fixes #19184

Change-Id: Ic438e6e1946332cdce3864d15abecd41b911b2a9
Reviewed-on: https://go-review.googlesource.com/37253
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>

cfb0d349

cmd/go/internal/envcmd: report PKG_CONFIG after the CGO group · f37428d8

Jaana Burcu Dogan authored Feb 18, 2017

Before the change, `go env` reports PKG_CONFIG in the middle of the
CGO env group:

    GOARCH="amd64"
    GOBIN=""
    GOEXE=""
    GOHOSTARCH="amd64"
    GOHOSTOS="darwin"
    GOOS="darwin"
    GOPATH="/Users/jbd"
    GORACE=""
    GOROOT="/Users/jbd/go"
    GOTOOLDIR="/Users/jbd/go/pkg/tool/darwin_amd64"
    GCCGO="gccgo"
    CC="clang"
    GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/lq/qcn67khn4_1b41_g48x3zchh005d21/T/go-build184491598=/tmp/go-build -gno-record-gcc-switches -fno-common"
    CXX="clang++"
    CGO_ENABLED="1"
    PKG_CONFIG="pkg-config"
    CGO_CFLAGS="-g -O2"
    CGO_CPPFLAGS=""
    CGO_CXXFLAGS="-g -O2"
    CGO_FFLAGS="-g -O2"
    CGO_LDFLAGS="-g -O2"

The change makes PKG_CONFIG the final item reported,
so it no longer breaks the CGO_* group apart.

Change-Id: I1e7ed6bdec83009ff118f85c9f0f7b78a67fdd76
Reviewed-on: https://go-review.googlesource.com/37228
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

f37428d8

fmt: support sharp flag for float and complex value printing · e97f407e

Martin Möhrmann authored Feb 15, 2017

Added an alternate form of printing floats and complex values
by specifying the sharp flag.

Output formatted using the verbs v, e, E, f, F, g and G in
combination with the sharp flag will always include a decimal point.

The alternate form specified by the sharp flag for %g and %G verbs
will not truncate trailing zeros and assume a default precision of 6.
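
As an illustration of the behaviour described above, the sketch below prints the same value with and without the sharp flag for a few of the listed verbs. The wrapper name format is an assumption made for this example; only fmt.Sprintf and fmt.Printf from the standard library are used.

```go
package main

import "fmt"

// format is a small illustrative wrapper around fmt.Sprintf.
func format(verb string, f float64) string {
	return fmt.Sprintf(verb, f)
}

func main() {
	// Compare the plain and the alternate (sharp) form for each verb.
	for _, verb := range []string{"%g", "%#g", "%.0f", "%#.0f", "%.0e", "%#.0e"} {
		fmt.Printf("%-6s -> %s\n", verb, format(verb, 1.0))
	}
}
```

With the sharp flag, %#g keeps trailing zeros at the default precision of 6, and %#.0f keeps the decimal point even though no fractional digits are printed.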

Fixes #18857.

Change-Id: I4d776239e06d7a6a90f2d8556240a359888cb7c3
Reviewed-on: https://go-review.googlesource.com/37051
Reviewed-by: Rob Pike <r@golang.org>

e97f407e

18 Feb, 2017 4 commits

net/url: document that Query returns only valid values · 1e69aefb

Kenny Grant authored Feb 18, 2017

Fixes #19110

Change-Id: I291fa4ec3c61145162acd019e3f0e5dd3d7c97e9
Reviewed-on: https://go-review.googlesource.com/37194
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

1e69aefb

math: protect benchmarked functions from being optimized away · 6cfc3b25

Martin Möhrmann authored Feb 18, 2017

Add exported global variables and store the results of benchmarked
functions in them. This prevents the current compiler optimizations
from removing the instructions that are needed to compute the return
values of the benchmarked functions.
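
The pattern described above can be sketched as follows. This is an illustration under assumptions, not the code from the commit: the names Sink, benchmarkSqrt and run are hypothetical, and testing.Benchmark is used only so that the example runs outside of `go test`.

```go
package main

import (
	"fmt"
	"math"
	"testing"
)

// Sink is an exported package-level variable (hypothetical name).
// Storing the result here makes it observable outside the benchmark,
// so the compiler cannot discard the computation as dead code.
var Sink float64

// benchmarkSqrt accumulates the results of the benchmarked function
// and stores the sum in Sink after the loop.
func benchmarkSqrt(b *testing.B) {
	x := 0.0
	for i := 0; i < b.N; i++ {
		x += math.Sqrt(float64(i))
	}
	Sink = x
}

// run executes the benchmark once and returns the iteration count.
func run() int {
	return testing.Benchmark(benchmarkSqrt).N
}

func main() {
	fmt.Println("iterations:", run())
}
```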

Change-Id: If8b08424e85f3796bb6dd73e761c653abbabcc5e
Reviewed-on: https://go-review.googlesource.com/37195
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

6cfc3b25

os: remove incorrect detection of O_CLOEXEC flag on darwin · 6ef92b6e

Martin Möhrmann authored Feb 18, 2017

The range loop below does not stop at the first '.' character
in a Darwin version string like "15.6.0".

for i = range osver {
	if osver[i] != '.' {
		continue
	}
}

Therefore, the condition i > 2 was always satisfied and
supportsCloseOnExec was always set to true.
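
To make the failure mode concrete, the sketch below reproduces the loop shape in a standalone function and contrasts it with a version that stops at the first '.'. The function names lastIndex and firstDot are hypothetical and exist only for this illustration; firstDot shows one plausible reading of the intent, not code from the repository.

```go
package main

import "fmt"

// lastIndex mirrors the loop from the commit message. Because nothing
// breaks out of the loop, i always ends up as the index of the last
// byte of osver, regardless of where the first '.' is.
func lastIndex(osver string) int {
	var i int
	for i = range osver {
		if osver[i] != '.' {
			continue
		}
		// No break here: the loop simply moves on to the next byte.
	}
	return i
}

// firstDot returns the index of the first '.' in osver, or -1 if
// there is none.
func firstDot(osver string) int {
	for i := range osver {
		if osver[i] == '.' {
			return i
		}
	}
	return -1
}

func main() {
	for _, v := range []string{"15.6.0", "9.8.0"} {
		fmt.Printf("%q: loop leaves i=%d, first '.' is at %d\n", v, lastIndex(v), firstDot(v))
	}
}
```

For any version string longer than three bytes, the loop leaves i greater than 2, which is why the check always succeeded.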

Since the minimum supported version of OS X for Go is currently 10.8
and O_CLOEXEC has been implemented since OS X 10.7, the detection code
can be removed and support for O_CLOEXEC is always assumed to exist.

Change-Id: Idd10094d8385dd4adebc8d7a6d9e9a8f29455867
Reviewed-on: https://go-review.googlesource.com/37193
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

6ef92b6e

go/doc: allow : in godoc links · 497b608f

Kenny Grant authored Dec 18, 2016

The emphasize function used a complex regexp to find URLs, which
truncated some types of URL and did not match others.
This has been simplified and adjusted to allow valid punctuation
like :: or ! in the path part and :[] in the host part.
Comments were added to clarify what this regexp allows.
The path part also matches the query and fragment, so this is now documented.
Removed news, telnet, wais, and prospero protocols.

Tests were added for:
 IPv6 URLs
 URLs surrounded by brackets
 URLs containing ::
 URLs containing :;!- in the path

In order to allow punctuation and yet preserve current behaviour,
URLs are not permitted to end in .,:;?! to allow the use of
normal punctuation surrounding URLs in comments.
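
The idea of allowing punctuation inside a URL but not at its end can be sketched with a simplified pattern. The regexp below is an assumption written for this illustration, not the one from the commit: the constant names, the protocol list and the character classes are all hypothetical, and parentheses are deliberately left out of the path class to keep the example short.

```go
package main

import (
	"fmt"
	"regexp"
)

const (
	// Protocol (required), e.g. http.
	protoPart = `(https?|ftp|file|gopher|mailto)`
	// Host (required), e.g. www.example.com or [::1]:8080.
	hostPart = `([a-zA-Z0-9_@\-.\[\]:]+)`
	// Path, query and fragment (optional). Punctuation such as .,:;?!
	// is only accepted when another path character follows it, so a
	// match can never end in punctuation from that set.
	pathPart = `([.,:;?!]*[a-zA-Z0-9$'*+&#=@~_/\-%])*`
)

var urlRx = regexp.MustCompile(protoPart + `://` + hostPart + pathPart)

// findURL returns the first URL-like match in s, or "" if none.
func findURL(s string) string {
	return urlRx.FindString(s)
}

func main() {
	for _, s := range []string{
		"see http://golang.org/pkg/foo::bar.",
		"(http://[::1]:8080/x),",
		"http://example.com/a;b!c-d?",
	} {
		fmt.Printf("%q -> %q\n", s, findURL(s))
	}
}
```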

Fixes #18139

Change-Id: I38b2d7a85fe0d171e4bf4aac420f8c2d3ced8a2f
Reviewed-on: https://go-review.googlesource.com/37192
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

497b608f

17 Feb, 2017 19 commits

math/bits: added benchmarks for Leading/TrailingZeros · a4a3d63d

Robert Griesemer authored Feb 17, 2017

BenchmarkLeadingZeros-8      	200000000	         8.80 ns/op
BenchmarkLeadingZeros8-8     	200000000	         8.21 ns/op
BenchmarkLeadingZeros16-8    	200000000	         7.49 ns/op
BenchmarkLeadingZeros32-8    	200000000	         7.80 ns/op
BenchmarkLeadingZeros64-8    	200000000	         8.67 ns/op

BenchmarkTrailingZeros-8     	1000000000	         2.05 ns/op
BenchmarkTrailingZeros8-8    	2000000000	         1.94 ns/op
BenchmarkTrailingZeros16-8   	2000000000	         1.94 ns/op
BenchmarkTrailingZeros32-8   	2000000000	         1.92 ns/op
BenchmarkTrailingZeros64-8   	2000000000	         2.03 ns/op

Change-Id: I45497bf2d6369ba6cfc88ded05aa735908af8908
Reviewed-on: https://go-review.googlesource.com/37220
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>

a4a3d63d

math/bits: faster Rotate functions, added respective benchmarks · 19028bdd

Robert Griesemer authored Feb 17, 2017

Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3.

benchmark                    old ns/op     new ns/op     delta
BenchmarkRotateLeft-8        7.87          7.00          -11.05%
BenchmarkRotateLeft8-8       8.41          4.52          -46.25%
BenchmarkRotateLeft16-8      8.07          4.55          -43.62%
BenchmarkRotateLeft32-8      8.36          4.73          -43.42%
BenchmarkRotateLeft64-8      7.93          4.78          -39.72%

BenchmarkRotateRight-8       8.23          6.72          -18.35%
BenchmarkRotateRight8-8      8.76          4.39          -49.89%
BenchmarkRotateRight16-8     9.07          4.44          -51.05%
BenchmarkRotateRight32-8     8.85          4.46          -49.60%
BenchmarkRotateRight64-8     8.11          4.43          -45.38%

Change-Id: I79ea1e9e6fc65f95794a91f860a911efed3aa8a1
Reviewed-on: https://go-review.googlesource.com/37219
Reviewed-by: Matthew Dempsky <mdempsky@google.com>

19028bdd

math/bits: faster OnesCount, added respective benchmarks · a12edb8d

Robert Griesemer authored Feb 17, 2017

Also: Changed Reverse/ReverseBytes implementations to use
the same (smaller) masks as OnesCount.

BenchmarkOnesCount-8          37.0          6.26          -83.08%
BenchmarkOnesCount8-8         7.24          1.99          -72.51%
BenchmarkOnesCount16-8        11.3          2.47          -78.14%
BenchmarkOnesCount32-8        18.4          3.02          -83.59%
BenchmarkOnesCount64-8        40.0          3.78          -90.55%
BenchmarkReverse-8            6.69          6.22          -7.03%
BenchmarkReverse8-8           1.64          1.64          +0.00%
BenchmarkReverse16-8          2.26          2.18          -3.54%
BenchmarkReverse32-8          2.88          2.87          -0.35%
BenchmarkReverse64-8          5.64          4.34          -23.05%
BenchmarkReverseBytes-8       2.48          2.17          -12.50%
BenchmarkReverseBytes16-8     0.63          0.95          +50.79%
BenchmarkReverseBytes32-8     1.13          1.24          +9.73%
BenchmarkReverseBytes64-8     2.50          2.16          -13.60%

OnesCount-8       37.0ns ± 0%   6.3ns ± 0%   ~             (p=1.000 n=1+1)
OnesCount8-8      7.24ns ± 0%  1.99ns ± 0%   ~             (p=1.000 n=1+1)
OnesCount16-8     11.3ns ± 0%   2.5ns ± 0%   ~             (p=1.000 n=1+1)
OnesCount32-8     18.4ns ± 0%   3.0ns ± 0%   ~             (p=1.000 n=1+1)
OnesCount64-8     40.0ns ± 0%   3.8ns ± 0%   ~             (p=1.000 n=1+1)
Reverse-8         6.69ns ± 0%  6.22ns ± 0%   ~             (p=1.000 n=1+1)
Reverse8-8        1.64ns ± 0%  1.64ns ± 0%   ~     (all samples are equal)
Reverse16-8       2.26ns ± 0%  2.18ns ± 0%   ~             (p=1.000 n=1+1)
Reverse32-8       2.88ns ± 0%  2.87ns ± 0%   ~             (p=1.000 n=1+1)
Reverse64-8       5.64ns ± 0%  4.34ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes-8    2.48ns ± 0%  2.17ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes16-8  0.63ns ± 0%  0.95ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes32-8  1.13ns ± 0%  1.24ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes64-8  2.50ns ± 0%  2.16ns ± 0%   ~             (p=1.000 n=1+1)

Change-Id: I591b0ffc83fc3a42828256b6e5030f32c64f9497
Reviewed-on: https://go-review.googlesource.com/37218
Reviewed-by: Matthew Dempsky <mdempsky@google.com>

a12edb8d

cmd/compile/internal/ssa: combine load + op on AMD64 · 21c71d77

Ilya Tocar authored Feb 10, 2017

On AMD64, most operations can have one operand in memory.
Combine a load and the dependent operation into one new operation
where possible. I've seen no significant performance changes on go1,
but this removes ~1.8kb of code from the go tool. And in the math package
I see e.g.:

Remainder-6            70.0ns ± 0%   64.6ns ± 0%   -7.76%  (p=0.000 n=9+1
Change-Id: I88b8602b1d55da8ba548a34eb7da4b25d59a297e
Reviewed-on: https://go-review.googlesource.com/36793
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>

21c71d77

cmd/compile: fix 32-bit unsigned division on 64-bit machines · a9292b83

Keith Randall authored Feb 17, 2017

The type of an intermediate multiply was wrong.  When that
intermediate multiply was spilled, the top 32 bits were lost.

Fixes #19153

Change-Id: Ib29350a4351efa405935b7f7ee3c112668e64108
Reviewed-on: https://go-review.googlesource.com/37212
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>

a9292b83

math/bits: faster Reverse, ReverseBytes · 4498b683

Robert Griesemer authored Feb 17, 2017

- moved from: x&m>>k | x&^m<<k to: x&m>>k | x<<k&m
  This permits use of the same constant m twice (*) which may be
  better for machines that can't use large immediate constants
  directly with an AND instruction and have to load them explicitly.
  *) CPUs don't usually have a &^ instruction, so x&^m becomes x&(^m)

- simplified returns
  This improves the generated code because the compiler recognizes
  x>>k | x<<k as ROT when k is half the bitsize of x.

The 8-bit versions of these instructions can be significantly faster
still if they are replaced with table lookups, as long as the table
is in cache. If the table is not in cache, table-lookup is probably
slower, hence the choice of an explicit register-only implementation
for now.
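
The two points above can be illustrated with a 32-bit bit reversal. This is a sketch under assumptions rather than the code from the commit: the function name reverse32 and the mask names are hypothetical. Each step uses the form x&m>>k | x<<k&m with a single mask m, and the final step is the plain x>>16 | x<<16 form.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Masks selecting the upper half of each 2-, 4-, 8- and 16-bit group.
// The names are illustrative assumptions.
const (
	m0 = 0xaaaaaaaa
	m1 = 0xcccccccc
	m2 = 0xf0f0f0f0
	m3 = 0xff00ff00
)

// reverse32 is a hypothetical helper that reverses the bits of x by
// swapping adjacent groups of 1, 2, 4, 8 and finally 16 bits.
func reverse32(x uint32) uint32 {
	x = x&m0>>1 | x<<1&m0 // swap neighbouring bits
	x = x&m1>>2 | x<<2&m1 // swap neighbouring bit pairs
	x = x&m2>>4 | x<<4&m2 // swap neighbouring nibbles
	x = x&m3>>8 | x<<8&m3 // swap neighbouring bytes
	return x>>16 | x<<16  // swap the two halves; no mask needed
}

func main() {
	// Cross-check the sketch against the standard library.
	for _, x := range []uint32{1, 0x0000ffff, 0x12345678} {
		fmt.Printf("%#08x: sketch=%#08x math/bits=%#08x\n", x, reverse32(x), bits.Reverse32(x))
	}
}
```

In Go, the operators &, << and >> share one precedence level and associate left to right, so x&m0>>1 means (x&m0)>>1.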

BenchmarkReverse-8            8.50          6.86          -19.29%
BenchmarkReverse8-8           2.17          1.74          -19.82%
BenchmarkReverse16-8          2.89          2.34          -19.03%
BenchmarkReverse32-8          3.55          2.95          -16.90%
BenchmarkReverse64-8          6.81          5.57          -18.21%
BenchmarkReverseBytes-8       3.49          2.48          -28.94%
BenchmarkReverseBytes16-8     0.93          0.62          -33.33%
BenchmarkReverseBytes32-8     1.55          1.13          -27.10%
BenchmarkReverseBytes64-8     2.47          2.47          +0.00%

Reverse-8         8.50ns ± 0%  6.86ns ± 0%   ~             (p=1.000 n=1+1)
Reverse8-8        2.17ns ± 0%  1.74ns ± 0%   ~             (p=1.000 n=1+1)
Reverse16-8       2.89ns ± 0%  2.34ns ± 0%   ~             (p=1.000 n=1+1)
Reverse32-8       3.55ns ± 0%  2.95ns ± 0%   ~             (p=1.000 n=1+1)
Reverse64-8       6.81ns ± 0%  5.57ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes-8    3.49ns ± 0%  2.48ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes16-8  0.93ns ± 0%  0.62ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes32-8  1.55ns ± 0%  1.13ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes64-8  2.47ns ± 0%  2.47ns ± 0%   ~     (all samples are equal)

Change-Id: I0064de8c7e0e568ca7885d6f7064344bef91a06d
Reviewed-on: https://go-review.googlesource.com/37215
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

4498b683

cmd/compile/internal/gc: remove Node.IsStatic field · c61cf5e6

Matthew Dempsky authored Feb 16, 2017

We can immediately emit static assignment data rather than queueing
them up to be processed during SSA building.

Passes toolstash -cmp.

Change-Id: I8bcea4b72eafb0cc0b849cd93e9cde9d84f30d5e
Reviewed-on: https://go-review.googlesource.com/37024
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>

c61cf5e6

cmd/compile: check both syms when folding address into load/store on ARM64 · 3557d546

Cherry Zhang authored Feb 17, 2017

The rules for folding addresses into loads/stores check that sym1 is
not on the stack (because the stack offset is not known at that point).
But sym1 could be nil, which invalidates the check. Check the merged
sym instead.

Fixes #19137.

Change-Id: I8574da22ced1216bb5850403d8f08ec60a8d1005
Reviewed-on: https://go-review.googlesource.com/37145
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>

3557d546

math/bits: fix benchmarks (make sure calls don't get optimized away) · 3a239a6a

Robert Griesemer authored Feb 17, 2017

Sum up function results and store them in an exported (global)
variable. This prevents the compiler from optimizing away the
otherwise side-effect free function calls.

We now have a more realistic set of benchmark numbers...

Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3.

Note: These measurements are based on the same "old"
implementation as the prior measurements (commit 7d5c003a).

benchmark                     old ns/op     new ns/op     delta
BenchmarkReverse-8            72.9          8.50          -88.34%
BenchmarkReverse8-8           13.2          2.17          -83.56%
BenchmarkReverse16-8          21.2          2.89          -86.37%
BenchmarkReverse32-8          36.3          3.55          -90.22%
BenchmarkReverse64-8          71.3          6.81          -90.45%
BenchmarkReverseBytes-8       11.2          3.49          -68.84%
BenchmarkReverseBytes16-8     6.24          0.93          -85.10%
BenchmarkReverseBytes32-8     7.40          1.55          -79.05%
BenchmarkReverseBytes64-8     10.5          2.47          -76.48%

Reverse-8         72.9ns ± 0%   8.5ns ± 0%   ~     (p=1.000 n=1+1)
Reverse8-8        13.2ns ± 0%   2.2ns ± 0%   ~     (p=1.000 n=1+1)
Reverse16-8       21.2ns ± 0%   2.9ns ± 0%   ~     (p=1.000 n=1+1)
Reverse32-8       36.3ns ± 0%   3.5ns ± 0%   ~     (p=1.000 n=1+1)
Reverse64-8       71.3ns ± 0%   6.8ns ± 0%   ~     (p=1.000 n=1+1)
ReverseBytes-8    11.2ns ± 0%   3.5ns ± 0%   ~     (p=1.000 n=1+1)
ReverseBytes16-8  6.24ns ± 0%  0.93ns ± 0%   ~     (p=1.000 n=1+1)
ReverseBytes32-8  7.40ns ± 0%  1.55ns ± 0%   ~     (p=1.000 n=1+1)
ReverseBytes64-8  10.5ns ± 0%   2.5ns ± 0%   ~     (p=1.000 n=1+1)

Change-Id: I8aef1334b84f6cafd25edccad7e6868b37969efb
Reviewed-on: https://go-review.googlesource.com/37213
Reviewed-by: Matthew Dempsky <mdempsky@google.com>

3a239a6a

math/bits: much faster ReverseBytes, added respective benchmarks · ddb15cea

Robert Griesemer authored Feb 17, 2017

Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3.

benchmark                     old ns/op     new ns/op     delta
BenchmarkReverseBytes-8       11.4          3.51          -69.21%
BenchmarkReverseBytes16-8     6.87          0.64          -90.68%
BenchmarkReverseBytes32-8     7.79          0.65          -91.66%
BenchmarkReverseBytes64-8     11.6          0.64          -94.48%

name              old time/op  new time/op  delta
ReverseBytes-8    11.4ns ± 0%   3.5ns ± 0%   ~     (p=1.000 n=1+1)
ReverseBytes16-8  6.87ns ± 0%  0.64ns ± 0%   ~     (p=1.000 n=1+1)
ReverseBytes32-8  7.79ns ± 0%  0.65ns ± 0%   ~     (p=1.000 n=1+1)
ReverseBytes64-8  11.6ns ± 0%   0.6ns ± 0%   ~     (p=1.000 n=1+1)

Change-Id: I67b529652b3b613c61687e9e185e8d4ee40c51a2
Reviewed-on: https://go-review.googlesource.com/37211
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>

ddb15cea

math/bits: much faster Reverse, added respective benchmarks · 7d5c003a

Robert Griesemer authored Feb 17, 2017

Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3.

name         old time/op  new time/op  delta
Reverse-8    76.6ns ± 0%   8.1ns ± 0%   ~     (p=1.000 n=1+1)
Reverse8-8   12.6ns ± 0%   0.6ns ± 0%   ~     (p=1.000 n=1+1)
Reverse16-8  20.8ns ± 0%   0.6ns ± 0%   ~     (p=1.000 n=1+1)
Reverse32-8  36.5ns ± 0%   0.6ns ± 0%   ~     (p=1.000 n=1+1)
Reverse64-8  74.0ns ± 0%   6.4ns ± 0%   ~     (p=1.000 n=1+1)

benchmark                old ns/op     new ns/op     delta
BenchmarkReverse-8       76.6          8.07          -89.46%
BenchmarkReverse8-8      12.6          0.64          -94.92%
BenchmarkReverse16-8     20.8          0.64          -96.92%
BenchmarkReverse32-8     36.5          0.64          -98.25%
BenchmarkReverse64-8     74.0          6.38          -91.38%

Change-Id: I6b99b10cee2f2babfe79342b50ee36a45a34da30
Reviewed-on: https://go-review.googlesource.com/37149
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>

7d5c003a

cmd/compile: fix some types in SSA · c4b8dadb

Cherry Zhang authored Feb 06, 2017

These seem not to really matter, but good to be correct.

Change-Id: I02edb9797c3d6739725cfbe4723c75f151acd05e
Reviewed-on: https://go-review.googlesource.com/36837
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>

c4b8dadb

cmd/compile: redo writebarrier pass · c4ef597c

Cherry Zhang authored Feb 01, 2017

SSA's writebarrier pass requires that WB store ops always be at the
end of a block. If we move write barrier insertion into SSA and
emit normal Store ops when building SSA, this requirement becomes
impractical -- it would create too many blocks for all the Store
ops.

Redo SSA's writebarrier pass to explicitly order values in store
order, so it no longer needs this requirement.

Updates #17583.
Fixes #19067.

Change-Id: I66e817e526affb7e13517d4245905300a90b7170
Reviewed-on: https://go-review.googlesource.com/36834
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>

c4ef597c

cmd/compile: re-enable nilcheck removal in same block · 98061fa5

Cherry Zhang authored Jan 20, 2017

Nil check removal in the same block is disabled due to issue 18725:
because the values are not ordered, a nilcheck may influence a
value that is logically before it. This CL re-enables same-block
nilcheck removal by ordering values in store order first.

Updates #18725.

Change-Id: I287a38525230c14c5412cbcdbc422547dabd54f6
Reviewed-on: https://go-review.googlesource.com/35496
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>

98061fa5

math/bits: expand doc strings for all functions · 81acd308

Robert Griesemer authored Feb 16, 2017

Follow-up on https://go-review.googlesource.com/36315.
No functionality change.

For #18616.

Change-Id: Id4df34dd7d0381be06eea483a11bf92f4a01f604
Reviewed-on: https://go-review.googlesource.com/37140
Reviewed-by: Matthew Dempsky <mdempsky@google.com>

81acd308

all: fix a few typos in comments · 045ad5ba

Koki Ide authored Feb 16, 2017

Change-Id: I0455ffaa51c661803d8013c7961910f920d3c3cc
Reviewed-on: https://go-review.googlesource.com/37043
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

045ad5ba

sync: make Mutex more fair · 0556e262

Dmitry Vyukov authored Dec 13, 2016

Add new starvation mode for Mutex.
In starvation mode ownership is directly handed off from
unlocking goroutine to the next waiter. New arriving goroutines
don't compete for ownership.
Unfair wait time is now limited to 1ms.
Also fix a long-standing bug where goroutines were requeued
at the tail of the wait queue. That led to even more unfair
acquisition times with multiple waiters.

Performance of normal mode is not considerably affected.

Fixes #13086

On the lockskew program provided in the issue:

done in 1.207853ms
done in 1.177451ms
done in 1.184168ms
done in 1.198633ms
done in 1.185797ms
done in 1.182502ms
done in 1.316485ms
done in 1.211611ms
done in 1.182418ms

name old time/op new time/op delta
MutexUncontended-48 0.65ns ± 0% 0.65ns ± 1% ~ (p=0.087 n=10+10)
Mutex-48 112ns ± 1% 114ns ± 1% +1.69% (p=0.000 n=10+10)
MutexSlack-48 113ns ± 0% 87ns ± 1% -22.65% (p=0.000 n=8+10)
MutexWork-48 149ns ± 0% 145ns ± 0% -2.48% (p=0.000 n=9+10)
MutexWorkSlack-48 149ns ± 0% 122ns ± 3% -18.26% (p=0.000 n=6+10)
MutexNoSpin-48 103ns ± 4% 105ns ± 3% ~ (p=0.089 n=10+10)
MutexSpin-48 490ns ± 4% 515ns ± 6% +5.08% (p=0.006 n=10+10)
Cond32-48 13.4µs ± 6% 13.1µs ± 5% -2.75% (p=0.023 n=10+10)
RWMutexWrite100-48 53.2ns ± 3% 41.2ns ± 3% -22.57% (p=0.000 n=10+10)
RWMutexWrite10-48 45.9ns ± 2% 43.9ns ± 2% -4.38% (p=0.000 n=10+10)
RWMutexWorkWrite100-48 122ns ± 2% 134ns ± 1% +9.92% (p=0.000 n=10+10)
RWMutexWorkWrite10-48 206ns ± 1% 188ns ± 1% -8.52% (p=0.000 n=8+10)
Cond32-24 12.1µs ± 3% 12.4µs ± 3% +1.98% (p=0.043 n=10+9)
MutexUncontended-24 0.74ns ± 1% 0.75ns ± 1% ~ (p=0.650 n=10+10)
Mutex-24 122ns ± 2% 124ns ± 1% +1.31% (p=0.007 n=10+10)
MutexSlack-24 96.9ns ± 2% 102.8ns ± 2% +6.11% (p=0.000 n=10+10)
MutexWork-24 146ns ± 1% 135ns ± 2% -7.70% (p=0.000 n=10+9)
MutexWorkSlack-24 135ns ± 1% 128ns ± 2% -5.01% (p=0.000 n=10+9)
MutexNoSpin-24 114ns ± 3% 110ns ± 4% -3.84% (p=0.000 n=10+10)
MutexSpin-24 482ns ± 4% 475ns ± 8% ~ (p=0.286 n=10+10)
RWMutexWrite100-24 43.0ns ± 3% 43.1ns ± 2% ~ (p=0.956 n=10+10)
RWMutexWrite10-24 43.4ns ± 1% 43.2ns ± 1% ~ (p=0.085 n=10+9)
RWMutexWorkWrite100-24 130ns ± 3% 131ns ± 3% ~ (p=0.747 n=10+10)
RWMutexWorkWrite10-24 191ns ± 1% 192ns ± 1% ~ (p=0.210 n=10+10)
Cond32-12 11.5µs ± 2% 11.7µs ± 2% +1.98% (p=0.002 n=10+10)
MutexUncontended-12 1.48ns ± 0% 1.50ns ± 1% +1.08% (p=0.004 n=10+10)
Mutex-12 141ns ± 1% 143ns ± 1% +1.63% (p=0.000 n=10+10)
MutexSlack-12 121ns ± 0% 119ns ± 0% -1.65% (p=0.001 n=8+9)
MutexWork-12 141ns ± 2% 150ns ± 3% +6.36% (p=0.000 n=9+10)
MutexWorkSlack-12 131ns ± 0% 138ns ± 0% +5.73% (p=0.000 n=9+10)
MutexNoSpin-12 87.0ns ± 1% 83.7ns ± 1% -3.80% (p=0.000 n=10+10)
MutexSpin-12 364ns ± 1% 377ns ± 1% +3.77% (p=0.000 n=10+10)
RWMutexWrite100-12 42.8ns ± 1% 43.9ns ± 1% +2.41% (p=0.000 n=8+10)
RWMutexWrite10-12 39.8ns ± 4% 39.3ns ± 1% ~ (p=0.433 n=10+9)
RWMutexWorkWrite100-12 131ns ± 1% 131ns ± 0% ~ (p=0.591 n=10+9)
RWMutexWorkWrite10-12 173ns ± 1% 174ns ± 0% ~ (p=0.059 n=10+8)
Cond32-6 10.9µs ± 2% 10.9µs ± 2% ~ (p=0.739 n=10+10)
MutexUncontended-6 2.97ns ± 0% 2.97ns ± 0% ~ (all samples are equal)
Mutex-6 122ns ± 6% 122ns ± 2% ~ (p=0.668 n=10+10)
MutexSlack-6 149ns ± 3% 142ns ± 3% -4.63% (p=0.000 n=10+10)
MutexWork-6 136ns ± 3% 140ns ± 5% ~ (p=0.077 n=10+10)
MutexWorkSlack-6 152ns ± 0% 138ns ± 2% -9.21% (p=0.000 n=6+10)
MutexNoSpin-6 150ns ± 1% 152ns ± 0% +1.50% (p=0.000 n=8+10)
MutexSpin-6 726ns ± 0% 730ns ± 1% ~ (p=0.069 n=10+10)
RWMutexWrite100-6 40.6ns ± 1% 40.9ns ± 1% +0.91% (p=0.001 n=8+10)
RWMutexWrite10-6 37.1ns ± 0% 37.0ns ± 1% ~ (p=0.386 n=9+10)
RWMutexWorkWrite100-6 133ns ± 1% 134ns ± 1% +1.01% (p=0.005 n=9+10)
RWMutexWorkWrite10-6 152ns ± 0% 152ns ± 0% ~ (all samples are equal)
Cond32-2 7.86µs ± 2% 7.95µs ± 2% +1.10% (p=0.023 n=10+10)
MutexUncontended-2 8.10ns ± 0% 9.11ns ± 4% +12.44% (p=0.000 n=9+10)
Mutex-2 32.9ns ± 9% 38.4ns ± 6% +16.58% (p=0.000 n=10+10)
MutexSlack-2 93.4ns ± 1% 98.5ns ± 2% +5.39% (p=0.000 n=10+9)
MutexWork-2 40.8ns ± 3% 43.8ns ± 7% +7.38% (p=0.000 n=10+9)
MutexWorkSlack-2 98.6ns ± 5% 108.2ns ± 2% +9.80% (p=0.000 n=10+8)
MutexNoSpin-2 399ns ± 1% 398ns ± 2% ~ (p=0.463 n=8+9)
MutexSpin-2 1.99µs ± 3% 1.97µs ± 1% -0.81% (p=0.003 n=9+8)
RWMutexWrite100-2 37.6ns ± 5% 46.0ns ± 4% +22.17% (p=0.000 n=10+8)
RWMutexWrite10-2 50.1ns ± 6% 36.8ns ±12% -26.46% (p=0.000 n=9+10)
RWMutexWorkWrite100-2 136ns ± 0% 134ns ± 2% -1.80% (p=0.001 n=7+9)
RWMutexWorkWrite10-2 140ns ± 1% 138ns ± 1% -1.50% (p=0.000 n=10+10)
Cond32 5.93µs ± 1% 5.91µs ± 0% ~ (p=0.411 n=9+10)
MutexUncontended 15.9ns ± 0% 15.8ns ± 0% -0.63% (p=0.000 n=8+8)
Mutex 15.9ns ± 0% 15.8ns ± 0% -0.44% (p=0.003 n=10+10)
MutexSlack 26.9ns ± 3% 26.7ns ± 2% ~ (p=0.084 n=10+10)
MutexWork 47.8ns ± 0% 47.9ns ± 0% +0.21% (p=0.014 n=9+8)
MutexWorkSlack 54.9ns ± 3% 54.5ns ± 3% ~ (p=0.254 n=10+10)
MutexNoSpin 786ns ± 2% 765ns ± 1% -2.66% (p=0.000 n=10+10)
MutexSpin 3.87µs ± 1% 3.83µs ± 0% -0.85% (p=0.005 n=9+8)
RWMutexWrite100 21.2ns ± 2% 21.0ns ± 1% -0.88% (p=0.018 n=10+9)
RWMutexWrite10 22.6ns ± 1% 22.6ns ± 0% ~ (p=0.471 n=9+9)
RWMutexWorkWrite100 132ns ± 0% 132ns ± 0% ~ (all samples are equal)
RWMutexWorkWrite10 124ns ± 0% 123ns ± 0% ~ (p=0.656 n=10+10)

Change-Id: I66412a3a0980df1233ad7a5a0cd9723b4274528b
Reviewed-on: https://go-review.googlesource.com/34310
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

0556e262

syscall: only call setgroups if we need to · 79f6a5c7

Wander Lairson Costa authored Feb 10, 2017

If the caller sets up a Credential in os/exec.Command,
os/exec.Command.Start will end up calling setgroups(2), even if no
supplementary groups were given.

Only root can call setgroups(2) on BSD kernels, which causes Start to
fail for non-root users when they try to set uid and gid for the new
process.

We fix this by introducing a new field to syscall.Credential named
NoSetGroups; setgroups(2) is only called if it is false.
The field uses inverted logic to preserve backward
compatibility.
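
A minimal sketch of how a caller might use the new field, assuming a Unix-like system where syscall.Credential is available. The helper name commandAs and the uid/gid value 1000 are hypothetical; the command is only constructed here, not started, since starting it with different credentials needs appropriate privileges.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// commandAs is a hypothetical helper that prepares a command to run
// under the given uid and gid without calling setgroups(2).
func commandAs(uid, gid uint32, name string, args ...string) *exec.Cmd {
	cmd := exec.Command(name, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Credential: &syscall.Credential{
			Uid: uid,
			Gid: gid,
			// Leaving NoSetGroups at its zero value (false) keeps the
			// old behaviour of calling setgroups(2).
			NoSetGroups: true,
		},
	}
	return cmd
}

func main() {
	cmd := commandAs(1000, 1000, "id") // 1000 is an example value
	fmt.Printf("%+v\n", *cmd.SysProcAttr.Credential)
}
```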

RELNOTES=yes

Change-Id: I3cff1f21c117a1430834f640ef21fd4e87e06804
Reviewed-on: https://go-review.googlesource.com/36697
Reviewed-by: Ian Lance Taylor <iant@golang.org>

79f6a5c7

cmd/compile: move constant divide strength reduction to SSA rules · 708ba22a

Keith Randall authored Feb 14, 2017

Currently the conversion from constant divides to multiplies is mostly
done during the walk pass.  This is suboptimal because SSA can
determine that the value being divided by is constant more often
(e.g. after inlining).
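
As background on what converting a constant divide to a multiply means, the sketch below divides a 32-bit unsigned value by 3 using a multiplication and a shift. It is a hand-written illustration of the general technique, not the compiler's rewrite rules; the function name div3 and the choice of divisor are assumptions for this example.

```go
package main

import "fmt"

// div3 computes x/3 for a 32-bit unsigned x without a divide
// instruction. The magic constant is ceil(2^33 / 3). The product
// needs more than 32 bits, so it is computed in 64 bits before the
// shift brings the quotient back down.
func div3(x uint32) uint32 {
	const magic = 0xAAAAAAAB
	return uint32(uint64(x) * magic >> 33)
}

func main() {
	// Compare against ordinary division for a few values.
	for _, x := range []uint32{0, 1, 2, 3, 100, 0xFFFFFFFE, 0xFFFFFFFF} {
		fmt.Printf("%d/3: multiply=%d divide=%d\n", x, div3(x), x/3)
	}
}
```

Truncating the intermediate product to 32 bits would discard the high bits that hold the quotient, which is why the width of that multiply matters.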

Change-Id: If1a9b993edd71be37396b9167f77da271966f85f
Reviewed-on: https://go-review.googlesource.com/37015
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>

708ba22a

16 Feb, 2017 12 commits

cmd/compile: simplify needwritebarrier · 794f1ebf

Matthew Dempsky authored Feb 16, 2017

Currently, whether we need a write barrier is simply a property of the
pointer slot being written to.

The only optimization we currently apply using the value being written
is that pointers to stack variables can omit write barriers because
they're only written to stack slots... but we already omit write
barriers for all writes to the stack anyway.

Passes toolstash -cmp.

Change-Id: I7f16b71ff473899ed96706232d371d5b2b7ae789
Reviewed-on: https://go-review.googlesource.com/37109
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

794f1ebf

math: fix typos in Bessel function docs · 211102c8

Shenghou Ma authored Jan 28, 2017

While we're at it, also document Yn(0, 0) = -Inf for completeness.

Fixes #18823.

Change-Id: Ib6db68f76d29cc2373c12ebdf3fab129cac8c167
Reviewed-on: https://go-review.googlesource.com/35970
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

211102c8

math/bits: added package for bit-level counting and manipulation · 661e2179

Robert Griesemer authored Feb 03, 2017

Initial platform-independent implementation.

For #18616.

Change-Id: I4585c55b963101af9059c06c1b8a866cb384754c
Reviewed-on: https://go-review.googlesource.com/36315
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

661e2179

cmd/compile/internal/syntax: better errors and recovery for invalid character literals · 1693e7b6

Robert Griesemer authored Feb 16, 2017

Fixes #15611.

Change-Id: I352b145026466cafef8cf87addafbd30716bda24
Reviewed-on: https://go-review.googlesource.com/37138
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>

1693e7b6

runtime: use balanced tree for addr lookup in semaphore implementation · 990124da

Russ Cox authored Feb 12, 2017

CL 36792 fixed #17953, a linear scan caused by n goroutines piling into
two different locks that hashed to the same bucket in the semaphore table.
In that CL, n goroutines contending for 2 unfortunately chosen locks
went from O(n²) to O(n).

This CL fixes a different linear scan, when n goroutines are contending for
n/2 different locks that all hash to the same bucket in the semaphore table.
In this CL, n goroutines contending for n/2 unfortunately chosen locks
goes from O(n²) to O(n log n). This case is much less likely, but any linear
scan eventually hurts, so we might as well fix it while the problem is fresh
in our minds.

The new test in this CL checks for both linear scans.

The effect of this CL on the sync benchmarks is negligible
(but it fixes the new test).

name old time/op new time/op delta
Cond1-48 576ns ±10% 575ns ±13% ~ (p=0.679 n=71+71)
Cond2-48 1.59µs ± 8% 1.61µs ± 9% ~ (p=0.107 n=73+69)
Cond4-48 4.56µs ± 7% 4.55µs ± 7% ~ (p=0.670 n=74+72)
Cond8-48 9.87µs ± 9% 9.90µs ± 7% ~ (p=0.507 n=69+73)
Cond16-48 20.4µs ± 7% 20.4µs ±10% ~ (p=0.588 n=69+71)
Cond32-48 45.4µs ±10% 45.4µs ±14% ~ (p=0.944 n=73+73)
UncontendedSemaphore-48 19.7ns ±12% 19.7ns ± 8% ~ (p=0.589 n=65+63)
ContendedSemaphore-48 55.4ns ±26% 54.9ns ±32% ~ (p=0.441 n=75+75)
MutexUncontended-48 0.63ns ± 0% 0.63ns ± 0% ~ (all equal)
Mutex-48 210ns ± 6% 213ns ±10% +1.30% (p=0.035 n=70+74)
MutexSlack-48 210ns ± 7% 211ns ± 9% ~ (p=0.184 n=71+72)
MutexWork-48 299ns ± 5% 300ns ± 5% ~ (p=0.678 n=73+75)
MutexWorkSlack-48 302ns ± 6% 300ns ± 5% ~ (p=0.149 n=74+72)
MutexNoSpin-48 135ns ± 6% 135ns ±10% ~ (p=0.788 n=67+75)
MutexSpin-48 693ns ± 5% 689ns ± 6% ~ (p=0.092 n=65+74)
Once-48 0.22ns ±25% 0.22ns ±24% ~ (p=0.882 n=74+73)
Pool-48 5.88ns ±36% 5.79ns ±24% ~ (p=0.655 n=69+69)
PoolOverflow-48 4.79µs ±18% 4.87µs ±20% ~ (p=0.233 n=75+75)
SemaUncontended-48 0.80ns ± 1% 0.82ns ± 8% +2.46% (p=0.000 n=60+74)
SemaSyntNonblock-48 103ns ± 4% 102ns ± 5% -1.11% (p=0.003 n=75+75)
SemaSyntBlock-48 104ns ± 4% 104ns ± 5% ~ (p=0.231 n=71+75)
SemaWorkNonblock-48 128ns ± 4% 129ns ± 6% +1.51% (p=0.000 n=63+75)
SemaWorkBlock-48 129ns ± 8% 130ns ± 7% ~ (p=0.072 n=75+74)
RWMutexUncontended-48 2.35ns ± 1% 2.35ns ± 0% ~ (p=0.144 n=70+55)
RWMutexWrite100-48 139ns ±18% 141ns ±21% ~ (p=0.071 n=75+73)
RWMutexWrite10-48 145ns ± 9% 145ns ± 8% ~ (p=0.553 n=75+75)
RWMutexWorkWrite100-48 297ns ±13% 297ns ±15% ~ (p=0.519 n=75+74)
RWMutexWorkWrite10-48 588ns ± 7% 585ns ± 5% ~ (p=0.173 n=73+70)
WaitGroupUncontended-48 0.87ns ± 0% 0.87ns ± 0% ~ (all equal)
WaitGroupAddDone-48 63.2ns ± 4% 62.7ns ± 4% -0.82% (p=0.027 n=72+75)
WaitGroupAddDoneWork-48 109ns ± 5% 109ns ± 4% ~ (p=0.233 n=75+75)
WaitGroupWait-48 0.17ns ± 0% 0.16ns ±16% -8.55% (p=0.000 n=56+75)
WaitGroupWaitWork-48 1.78ns ± 1% 2.08ns ± 5% +16.92% (p=0.000 n=74+70)
WaitGroupActuallyWait-48 52.0ns ± 3% 50.6ns ± 5% -2.70% (p=0.000 n=71+69)

https://perf.golang.org/search?q=upload:20170215.1

Change-Id: Ia29a8bd006c089e401ec4297c3038cca656bcd0a
Reviewed-on: https://go-review.googlesource.com/37103
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

990124da

cmd/compile/internal/gc: drop unused src.XPos params in SSA builder · fc456c7f

Matthew Dempsky authored Feb 16, 2017

Passes toolstash -cmp.

Change-Id: I037278404ebf762482557e2b6867cbc595074a83
Reviewed-on: https://go-review.googlesource.com/37023
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>

fc456c7f

runtime: run mutexevent profiling without holding semaRoot lock · 58d76217

Russ Cox authored Feb 13, 2017

Suggested by Dmitry in CL 36792 review.
Clearly safe since there are many different semaRoots
that could all have profiled sudogs calling mutexevent.

Change-Id: I45eed47a5be3e513b2dad63b60afcd94800e16d1
Reviewed-on: https://go-review.googlesource.com/37104
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

58d76217

sync: deflake TestWaitGroupMisuse2 · 83f95b85

Russ Cox authored Feb 13, 2017

Also runs 100X faster on average, because it takes so many
fewer attempts to trigger the failure.

Fixes #11443.

Change-Id: I8c39ee48bb3ff6c36fa63083e04076771b65a80d
Reviewed-on: https://go-review.googlesource.com/36841
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

83f95b85

doc: document go1.8 · 863035ef

Chris Broadfoot authored Feb 16, 2017

Change-Id: Ie2144d001c6b4b2293d07b2acf62d7e3cd0b46a7
Reviewed-on: https://go-review.googlesource.com/37130
Reviewed-by: Russ Cox <rsc@golang.org>

863035ef

cmd/link: delay calculating pe file parameters after Linkmode is set · 0ad247c6

Alex Brainman authored Feb 14, 2017

For #10776.

Change-Id: Id64a7e35c7cdcd9be16cbe3358402fa379090e36
Reviewed-on: https://go-review.googlesource.com/36975
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

0ad247c6

cmd/link: set pe section and file alignment to 0 during external linking · e31144f1

Alex Brainman authored Feb 05, 2017

This is what gcc does when it generates object files.
It is also easier to count everything when it starts from 0.
Make the Go linker do the same.

gcc also does not output IMAGE_OPTIONAL_HEADER or
PE64_IMAGE_OPTIONAL_HEADER for object files.
Perhaps we should do the same, but not in this CL.

For #10776.

Change-Id: I9789c337648623b6cfaa7d18d1ac9cef32e180dc
Reviewed-on: https://go-review.googlesource.com/36974
Reviewed-by: Ian Lance Taylor <iant@golang.org>

e31144f1

debug/pe: add test to check dwarf info · 64c02460

Alex Brainman authored Feb 02, 2017

For #10776.

Change-Id: I7931558257c1f6b895e4d44b46d320a54de0d677
Reviewed-on: https://go-review.googlesource.com/36973
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>

64c02460