- 19 Feb, 2017 5 commits
-
-
Robert Griesemer authored
Using some additional suggestions per "Hacker's Delight". Added documentation and extra tests. Measured on 1.7 GHz Intel Core i7, running macOS 10.12.3. benchmark old ns/op new ns/op delta BenchmarkOnesCount-4 7.34 5.38 -26.70% BenchmarkOnesCount8-4 2.03 1.98 -2.46% BenchmarkOnesCount16-4 2.56 2.50 -2.34% BenchmarkOnesCount32-4 2.98 2.39 -19.80% BenchmarkOnesCount64-4 4.22 2.96 -29.86% Change-Id: I566b0ef766e55cf5776b1662b6016024ebe5d878 Reviewed-on: https://go-review.googlesource.com/37223Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Martin Möhrmann authored
Change list https://golang.org/cl/20686/ removed the last use of the variable byteType. Change-Id: I4ea79095136a49a9d22767b37f48f3404da05056 Reviewed-on: https://go-review.googlesource.com/37197 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Keith Randall authored
XCHG needs to allow the stack pointer as an argument because we have a rewrite that incorporates the address of a local variable into the instruction. Fixes #19184 Change-Id: Ic438e6e1946332cdce3864d15abecd41b911b2a9 Reviewed-on: https://go-review.googlesource.com/37253 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
-
Jaana Burcu Dogan authored
Before the change, `go env` reports PKG_CONFIG in between the CGO env group: GOARCH="amd64" GOBIN="" GOEXE="" GOHOSTARCH="amd64" GOHOSTOS="darwin" GOOS="darwin" GOPATH="/Users/jbd" GORACE="" GOROOT="/Users/jbd/go" GOTOOLDIR="/Users/jbd/go/pkg/tool/darwin_amd64" GCCGO="gccgo" CC="clang" GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/lq/qcn67khn4_1b41_g48x3zchh005d21/T/go-build184491598=/tmp/go-build -gno-record-gcc-switches -fno-common" CXX="clang++" CGO_ENABLED="1" PKG_CONFIG="pkg-config" CGO_CFLAGS="-g -O2" CGO_CPPFLAGS="" CGO_CXXFLAGS="-g -O2" CGO_FFLAGS="-g -O2" CGO_LDFLAGS="-g -O2" The change makes PKG_CONFIG to be reported as the final item, and not breaking the CGO_* group apart. Change-Id: I1e7ed6bdec83009ff118f85c9f0f7b78a67fdd76 Reviewed-on: https://go-review.googlesource.com/37228Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Martin Möhrmann authored
Added an alternate form of printing floats and complex values by specifying the sharp flag. Output formatted using the the verbs v, e, E, f, F, g and G in combination with the sharp flag will always include a decimal point. The alternate form specified by the sharp flag for %g and %G verbs will not truncate trailing zeros and assume a default precision of 6. Fixes #18857. Change-Id: I4d776239e06d7a6a90f2d8556240a359888cb7c3 Reviewed-on: https://go-review.googlesource.com/37051Reviewed-by: Rob Pike <r@golang.org>
-
- 18 Feb, 2017 4 commits
-
-
Kenny Grant authored
Fixes #19110 Change-Id: I291fa4ec3c61145162acd019e3f0e5dd3d7c97e9 Reviewed-on: https://go-review.googlesource.com/37194Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Martin Möhrmann authored
Add exported global variables and store the results of benchmarked functions in them. This prevents the current compiler optimizations from removing the instructions that are needed to compute the return values of the benchmarked functions. Change-Id: If8b08424e85f3796bb6dd73e761c653abbabcc5e Reviewed-on: https://go-review.googlesource.com/37195Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Martin Möhrmann authored
The below range loop will not stop when encountering the first '.' character in a Darwin version string like "15.6.0". for i = range osver { if osver[i] != '.' { continue } } } Therefore, the condition i > 2 was always satisfied and supportsCloseOnExec was always set to true. Since the minimum supported version of OSX for go is currently 10.8 and O_CLOEXEC is implemented from OSX 10.7 on the detection code can be removed and support for O_CLOEXEC is always assumed to exist. Change-Id: Idd10094d8385dd4adebc8d7a6d9e9a8f29455867 Reviewed-on: https://go-review.googlesource.com/37193Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Kenny Grant authored
The emphasize function used a complex regexp to find URLs, which truncated some types of URL and did not match others. This has been simplified and adjusted to allow valid punctuation like :: or ! in the path part and :[] in the host part. Comments were added to clarify what this regexp allows. The path part matches query and fragment also so document this. Removed news, telnet, wais, and prospero protocols. Tests were added for: IPV6 URLs URLs surrounded by brackets URLs containing :: URLs containing :;!- in the path In order to allow punctuation and yet preserve current behaviour, URLs are not permitted to end in .,:;?! to allow the use of normal punctuation surrounding URLs in comments. Fixes #18139 Change-Id: I38b2d7a85fe0d171e4bf4aac420f8c2d3ced8a2f Reviewed-on: https://go-review.googlesource.com/37192Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 17 Feb, 2017 19 commits
-
-
Robert Griesemer authored
BenchmarkLeadingZeros-8 200000000 8.80 ns/op BenchmarkLeadingZeros8-8 200000000 8.21 ns/op BenchmarkLeadingZeros16-8 200000000 7.49 ns/op BenchmarkLeadingZeros32-8 200000000 7.80 ns/op BenchmarkLeadingZeros64-8 200000000 8.67 ns/op BenchmarkTrailingZeros-8 1000000000 2.05 ns/op BenchmarkTrailingZeros8-8 2000000000 1.94 ns/op BenchmarkTrailingZeros16-8 2000000000 1.94 ns/op BenchmarkTrailingZeros32-8 2000000000 1.92 ns/op BenchmarkTrailingZeros64-8 2000000000 2.03 ns/op Change-Id: I45497bf2d6369ba6cfc88ded05aa735908af8908 Reviewed-on: https://go-review.googlesource.com/37220 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
Robert Griesemer authored
Measured on 2.3 GHz Intel Core i7, running maxOS 10.12.3. benchmark old ns/op new ns/op delta BenchmarkRotateLeft-8 7.87 7.00 -11.05% BenchmarkRotateLeft8-8 8.41 4.52 -46.25% BenchmarkRotateLeft16-8 8.07 4.55 -43.62% BenchmarkRotateLeft32-8 8.36 4.73 -43.42% BenchmarkRotateLeft64-8 7.93 4.78 -39.72% BenchmarkRotateRight-8 8.23 6.72 -18.35% BenchmarkRotateRight8-8 8.76 4.39 -49.89% BenchmarkRotateRight16-8 9.07 4.44 -51.05% BenchmarkRotateRight32-8 8.85 4.46 -49.60% BenchmarkRotateRight64-8 8.11 4.43 -45.38% Change-Id: I79ea1e9e6fc65f95794a91f860a911efed3aa8a1 Reviewed-on: https://go-review.googlesource.com/37219Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
Robert Griesemer authored
Also: Changed Reverse/ReverseBytes implementations to use the same (smaller) masks as OnesCount. BenchmarkOnesCount-8 37.0 6.26 -83.08% BenchmarkOnesCount8-8 7.24 1.99 -72.51% BenchmarkOnesCount16-8 11.3 2.47 -78.14% BenchmarkOnesCount32-8 18.4 3.02 -83.59% BenchmarkOnesCount64-8 40.0 3.78 -90.55% BenchmarkReverse-8 6.69 6.22 -7.03% BenchmarkReverse8-8 1.64 1.64 +0.00% BenchmarkReverse16-8 2.26 2.18 -3.54% BenchmarkReverse32-8 2.88 2.87 -0.35% BenchmarkReverse64-8 5.64 4.34 -23.05% BenchmarkReverseBytes-8 2.48 2.17 -12.50% BenchmarkReverseBytes16-8 0.63 0.95 +50.79% BenchmarkReverseBytes32-8 1.13 1.24 +9.73% BenchmarkReverseBytes64-8 2.50 2.16 -13.60% OnesCount-8 37.0ns ± 0% 6.3ns ± 0% ~ (p=1.000 n=1+1) OnesCount8-8 7.24ns ± 0% 1.99ns ± 0% ~ (p=1.000 n=1+1) OnesCount16-8 11.3ns ± 0% 2.5ns ± 0% ~ (p=1.000 n=1+1) OnesCount32-8 18.4ns ± 0% 3.0ns ± 0% ~ (p=1.000 n=1+1) OnesCount64-8 40.0ns ± 0% 3.8ns ± 0% ~ (p=1.000 n=1+1) Reverse-8 6.69ns ± 0% 6.22ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 1.64ns ± 0% 1.64ns ± 0% ~ (all samples are equal) Reverse16-8 2.26ns ± 0% 2.18ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 2.88ns ± 0% 2.87ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 5.64ns ± 0% 4.34ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes-8 2.48ns ± 0% 2.17ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 0.63ns ± 0% 0.95ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 1.13ns ± 0% 1.24ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 2.50ns ± 0% 2.16ns ± 0% ~ (p=1.000 n=1+1) Change-Id: I591b0ffc83fc3a42828256b6e5030f32c64f9497 Reviewed-on: https://go-review.googlesource.com/37218Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
Ilya Tocar authored
On AMD64 Most operation can have one operand in memory. Combine load and dependand operation into one new operation, where possible. I've seen no significant performance changes on go1, but this allows to remove ~1.8kb code from go tool. And in math package I see e. g.: Remainder-6 70.0ns ± 0% 64.6ns ± 0% -7.76% (p=0.000 n=9+1 Change-Id: I88b8602b1d55da8ba548a34eb7da4b25d59a297e Reviewed-on: https://go-review.googlesource.com/36793 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
-
Keith Randall authored
The type of an intermediate multiply was wrong. When that intermediate multiply was spilled, the top 32 bits were lost. Fixes #19153 Change-Id: Ib29350a4351efa405935b7f7ee3c112668e64108 Reviewed-on: https://go-review.googlesource.com/37212 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
-
Robert Griesemer authored
- moved from: x&m>>k | x&^m<<k to: x&m>>k | x<<k&m This permits use of the same constant m twice (*) which may be better for machines that can't use large immediate constants directly with an AND instruction and have to load them explicitly. *) CPUs don't usually have a &^ instruction, so x&^m becomes x&(^m) - simplified returns This improves the generated code because the compiler recognizes x>>k | x<<k as ROT when k is the bitsize of x. The 8-bit versions of these instructions can be significantly faster still if they are replaced with table lookups, as long as the table is in cache. If the table is not in cache, table-lookup is probably slower, hence the choice of an explicit register-only implementation for now. BenchmarkReverse-8 8.50 6.86 -19.29% BenchmarkReverse8-8 2.17 1.74 -19.82% BenchmarkReverse16-8 2.89 2.34 -19.03% BenchmarkReverse32-8 3.55 2.95 -16.90% BenchmarkReverse64-8 6.81 5.57 -18.21% BenchmarkReverseBytes-8 3.49 2.48 -28.94% BenchmarkReverseBytes16-8 0.93 0.62 -33.33% BenchmarkReverseBytes32-8 1.55 1.13 -27.10% BenchmarkReverseBytes64-8 2.47 2.47 +0.00% Reverse-8 8.50ns ± 0% 6.86ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 2.17ns ± 0% 1.74ns ± 0% ~ (p=1.000 n=1+1) Reverse16-8 2.89ns ± 0% 2.34ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 3.55ns ± 0% 2.95ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 6.81ns ± 0% 5.57ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes-8 3.49ns ± 0% 2.48ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 0.93ns ± 0% 0.62ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 1.55ns ± 0% 1.13ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 2.47ns ± 0% 2.47ns ± 0% ~ (all samples are equal) Change-Id: I0064de8c7e0e568ca7885d6f7064344bef91a06d Reviewed-on: https://go-review.googlesource.com/37215 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Matthew Dempsky authored
We can immediately emit static assignment data rather than queueing them up to be processed during SSA building. Passes toolstash -cmp. Change-Id: I8bcea4b72eafb0cc0b849cd93e9cde9d84f30d5e Reviewed-on: https://go-review.googlesource.com/37024 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
-
Cherry Zhang authored
The rules for folding addresses into load/stores checks sym1 is not on stack (because the stack offset is not known at that point). But sym1 could be nil, which invalidates the check. Check merged sym instead. Fixes #19137. Change-Id: I8574da22ced1216bb5850403d8f08ec60a8d1005 Reviewed-on: https://go-review.googlesource.com/37145 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
-
Robert Griesemer authored
Sum up function results and store them in an exported (global) variable. This prevents the compiler from optimizing away the otherwise side-effect free function calls. We now have more realistic set of benchmark numbers... Measured on 2.3 GHz Intel Core i7, running maxOS 10.12.3. Note: These measurements are based on the same "old" implementation as the prior measurements (commit 7d5c003a). benchmark old ns/op new ns/op delta BenchmarkReverse-8 72.9 8.50 -88.34% BenchmarkReverse8-8 13.2 2.17 -83.56% BenchmarkReverse16-8 21.2 2.89 -86.37% BenchmarkReverse32-8 36.3 3.55 -90.22% BenchmarkReverse64-8 71.3 6.81 -90.45% BenchmarkReverseBytes-8 11.2 3.49 -68.84% BenchmarkReverseBytes16-8 6.24 0.93 -85.10% BenchmarkReverseBytes32-8 7.40 1.55 -79.05% BenchmarkReverseBytes64-8 10.5 2.47 -76.48% Reverse-8 72.9ns ± 0% 8.5ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 13.2ns ± 0% 2.2ns ± 0% ~ (p=1.000 n=1+1) Reverse16-8 21.2ns ± 0% 2.9ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 36.3ns ± 0% 3.5ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 71.3ns ± 0% 6.8ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes-8 11.2ns ± 0% 3.5ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 6.24ns ± 0% 0.93ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 7.40ns ± 0% 1.55ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 10.5ns ± 0% 2.5ns ± 0% ~ (p=1.000 n=1+1) Change-Id: I8aef1334b84f6cafd25edccad7e6868b37969efb Reviewed-on: https://go-review.googlesource.com/37213Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
Robert Griesemer authored
Measured on 2.3 GHz Intel Core i7, running maxOS 10.12.3. benchmark old ns/op new ns/op delta BenchmarkReverseBytes-8 11.4 3.51 -69.21% BenchmarkReverseBytes16-8 6.87 0.64 -90.68% BenchmarkReverseBytes32-8 7.79 0.65 -91.66% BenchmarkReverseBytes64-8 11.6 0.64 -94.48% name old time/op new time/op delta ReverseBytes-8 11.4ns ± 0% 3.5ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes16-8 6.87ns ± 0% 0.64ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes32-8 7.79ns ± 0% 0.65ns ± 0% ~ (p=1.000 n=1+1) ReverseBytes64-8 11.6ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Change-Id: I67b529652b3b613c61687e9e185e8d4ee40c51a2 Reviewed-on: https://go-review.googlesource.com/37211 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
Robert Griesemer authored
Measured on 2.3 GHz Intel Core i7, running maxOS 10.12.3. name old time/op new time/op delta Reverse-8 76.6ns ± 0% 8.1ns ± 0% ~ (p=1.000 n=1+1) Reverse8-8 12.6ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Reverse16-8 20.8ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Reverse32-8 36.5ns ± 0% 0.6ns ± 0% ~ (p=1.000 n=1+1) Reverse64-8 74.0ns ± 0% 6.4ns ± 0% ~ (p=1.000 n=1+1) benchmark old ns/op new ns/op delta BenchmarkReverse-8 76.6 8.07 -89.46% BenchmarkReverse8-8 12.6 0.64 -94.92% BenchmarkReverse16-8 20.8 0.64 -96.92% BenchmarkReverse32-8 36.5 0.64 -98.25% BenchmarkReverse64-8 74.0 6.38 -91.38% Change-Id: I6b99b10cee2f2babfe79342b50ee36a45a34da30 Reviewed-on: https://go-review.googlesource.com/37149 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
Cherry Zhang authored
These seem not to really matter, but good to be correct. Change-Id: I02edb9797c3d6739725cfbe4723c75f151acd05e Reviewed-on: https://go-review.googlesource.com/36837 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
-
Cherry Zhang authored
SSA's writebarrier pass requires WB store ops are always at the end of a block. If we move write barrier insertion into SSA and emits normal Store ops when building SSA, this requirement becomes impractical -- it will create too many blocks for all the Store ops. Redo SSA's writebarrier pass, explicitly order values in store order, so it no longer needs this requirement. Updates #17583. Fixes #19067. Change-Id: I66e817e526affb7e13517d4245905300a90b7170 Reviewed-on: https://go-review.googlesource.com/36834 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
-
Cherry Zhang authored
Nil check removal in the same block is disabled due to issue 18725: because the values are not ordered, a nilcheck may influence a value that is logically before it. This CL re-enables same-block nilcheck removal by ordering values in store order first. Updates #18725. Change-Id: I287a38525230c14c5412cbcdbc422547dabd54f6 Reviewed-on: https://go-review.googlesource.com/35496 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
-
Robert Griesemer authored
Follow-up on https://go-review.googlesource.com/36315. No functionality change. For #18616. Change-Id: Id4df34dd7d0381be06eea483a11bf92f4a01f604 Reviewed-on: https://go-review.googlesource.com/37140Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
Koki Ide authored
Change-Id: I0455ffaa51c661803d8013c7961910f920d3c3cc Reviewed-on: https://go-review.googlesource.com/37043Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Dmitry Vyukov authored
Add new starvation mode for Mutex. In starvation mode ownership is directly handed off from unlocking goroutine to the next waiter. New arriving goroutines don't compete for ownership. Unfair wait time is now limited to 1ms. Also fix a long standing bug that goroutines were requeued at the tail of the wait queue. That lead to even more unfair acquisition times with multiple waiters. Performance of normal mode is not considerably affected. Fixes #13086 On the provided in the issue lockskew program: done in 1.207853ms done in 1.177451ms done in 1.184168ms done in 1.198633ms done in 1.185797ms done in 1.182502ms done in 1.316485ms done in 1.211611ms done in 1.182418ms name old time/op new time/op delta MutexUncontended-48 0.65ns ± 0% 0.65ns ± 1% ~ (p=0.087 n=10+10) Mutex-48 112ns ± 1% 114ns ± 1% +1.69% (p=0.000 n=10+10) MutexSlack-48 113ns ± 0% 87ns ± 1% -22.65% (p=0.000 n=8+10) MutexWork-48 149ns ± 0% 145ns ± 0% -2.48% (p=0.000 n=9+10) MutexWorkSlack-48 149ns ± 0% 122ns ± 3% -18.26% (p=0.000 n=6+10) MutexNoSpin-48 103ns ± 4% 105ns ± 3% ~ (p=0.089 n=10+10) MutexSpin-48 490ns ± 4% 515ns ± 6% +5.08% (p=0.006 n=10+10) Cond32-48 13.4µs ± 6% 13.1µs ± 5% -2.75% (p=0.023 n=10+10) RWMutexWrite100-48 53.2ns ± 3% 41.2ns ± 3% -22.57% (p=0.000 n=10+10) RWMutexWrite10-48 45.9ns ± 2% 43.9ns ± 2% -4.38% (p=0.000 n=10+10) RWMutexWorkWrite100-48 122ns ± 2% 134ns ± 1% +9.92% (p=0.000 n=10+10) RWMutexWorkWrite10-48 206ns ± 1% 188ns ± 1% -8.52% (p=0.000 n=8+10) Cond32-24 12.1µs ± 3% 12.4µs ± 3% +1.98% (p=0.043 n=10+9) MutexUncontended-24 0.74ns ± 1% 0.75ns ± 1% ~ (p=0.650 n=10+10) Mutex-24 122ns ± 2% 124ns ± 1% +1.31% (p=0.007 n=10+10) MutexSlack-24 96.9ns ± 2% 102.8ns ± 2% +6.11% (p=0.000 n=10+10) MutexWork-24 146ns ± 1% 135ns ± 2% -7.70% (p=0.000 n=10+9) MutexWorkSlack-24 135ns ± 1% 128ns ± 2% -5.01% (p=0.000 n=10+9) MutexNoSpin-24 114ns ± 3% 110ns ± 4% -3.84% (p=0.000 n=10+10) MutexSpin-24 482ns ± 4% 475ns ± 8% ~ (p=0.286 n=10+10) RWMutexWrite100-24 43.0ns ± 3% 43.1ns ± 2% ~ (p=0.956 n=10+10) RWMutexWrite10-24 43.4ns ± 1% 43.2ns ± 1% ~ (p=0.085 n=10+9) RWMutexWorkWrite100-24 130ns ± 3% 131ns ± 3% ~ (p=0.747 n=10+10) RWMutexWorkWrite10-24 191ns ± 1% 192ns ± 1% ~ (p=0.210 n=10+10) Cond32-12 11.5µs ± 2% 11.7µs ± 2% +1.98% (p=0.002 n=10+10) MutexUncontended-12 1.48ns ± 0% 1.50ns ± 1% +1.08% (p=0.004 n=10+10) Mutex-12 141ns ± 1% 143ns ± 1% +1.63% (p=0.000 n=10+10) MutexSlack-12 121ns ± 0% 119ns ± 0% -1.65% (p=0.001 n=8+9) MutexWork-12 141ns ± 2% 150ns ± 3% +6.36% (p=0.000 n=9+10) MutexWorkSlack-12 131ns ± 0% 138ns ± 0% +5.73% (p=0.000 n=9+10) MutexNoSpin-12 87.0ns ± 1% 83.7ns ± 1% -3.80% (p=0.000 n=10+10) MutexSpin-12 364ns ± 1% 377ns ± 1% +3.77% (p=0.000 n=10+10) RWMutexWrite100-12 42.8ns ± 1% 43.9ns ± 1% +2.41% (p=0.000 n=8+10) RWMutexWrite10-12 39.8ns ± 4% 39.3ns ± 1% ~ (p=0.433 n=10+9) RWMutexWorkWrite100-12 131ns ± 1% 131ns ± 0% ~ (p=0.591 n=10+9) RWMutexWorkWrite10-12 173ns ± 1% 174ns ± 0% ~ (p=0.059 n=10+8) Cond32-6 10.9µs ± 2% 10.9µs ± 2% ~ (p=0.739 n=10+10) MutexUncontended-6 2.97ns ± 0% 2.97ns ± 0% ~ (all samples are equal) Mutex-6 122ns ± 6% 122ns ± 2% ~ (p=0.668 n=10+10) MutexSlack-6 149ns ± 3% 142ns ± 3% -4.63% (p=0.000 n=10+10) MutexWork-6 136ns ± 3% 140ns ± 5% ~ (p=0.077 n=10+10) MutexWorkSlack-6 152ns ± 0% 138ns ± 2% -9.21% (p=0.000 n=6+10) MutexNoSpin-6 150ns ± 1% 152ns ± 0% +1.50% (p=0.000 n=8+10) MutexSpin-6 726ns ± 0% 730ns ± 1% ~ (p=0.069 n=10+10) RWMutexWrite100-6 40.6ns ± 1% 40.9ns ± 1% +0.91% (p=0.001 n=8+10) RWMutexWrite10-6 37.1ns ± 0% 37.0ns ± 1% ~ (p=0.386 n=9+10) RWMutexWorkWrite100-6 133ns ± 1% 134ns ± 1% +1.01% (p=0.005 n=9+10) RWMutexWorkWrite10-6 152ns ± 0% 152ns ± 0% ~ (all samples are equal) Cond32-2 7.86µs ± 2% 7.95µs ± 2% +1.10% (p=0.023 n=10+10) MutexUncontended-2 8.10ns ± 0% 9.11ns ± 4% +12.44% (p=0.000 n=9+10) Mutex-2 32.9ns ± 9% 38.4ns ± 6% +16.58% (p=0.000 n=10+10) MutexSlack-2 93.4ns ± 1% 98.5ns ± 2% +5.39% (p=0.000 n=10+9) MutexWork-2 40.8ns ± 3% 43.8ns ± 7% +7.38% (p=0.000 n=10+9) MutexWorkSlack-2 98.6ns ± 5% 108.2ns ± 2% +9.80% (p=0.000 n=10+8) MutexNoSpin-2 399ns ± 1% 398ns ± 2% ~ (p=0.463 n=8+9) MutexSpin-2 1.99µs ± 3% 1.97µs ± 1% -0.81% (p=0.003 n=9+8) RWMutexWrite100-2 37.6ns ± 5% 46.0ns ± 4% +22.17% (p=0.000 n=10+8) RWMutexWrite10-2 50.1ns ± 6% 36.8ns ±12% -26.46% (p=0.000 n=9+10) RWMutexWorkWrite100-2 136ns ± 0% 134ns ± 2% -1.80% (p=0.001 n=7+9) RWMutexWorkWrite10-2 140ns ± 1% 138ns ± 1% -1.50% (p=0.000 n=10+10) Cond32 5.93µs ± 1% 5.91µs ± 0% ~ (p=0.411 n=9+10) MutexUncontended 15.9ns ± 0% 15.8ns ± 0% -0.63% (p=0.000 n=8+8) Mutex 15.9ns ± 0% 15.8ns ± 0% -0.44% (p=0.003 n=10+10) MutexSlack 26.9ns ± 3% 26.7ns ± 2% ~ (p=0.084 n=10+10) MutexWork 47.8ns ± 0% 47.9ns ± 0% +0.21% (p=0.014 n=9+8) MutexWorkSlack 54.9ns ± 3% 54.5ns ± 3% ~ (p=0.254 n=10+10) MutexNoSpin 786ns ± 2% 765ns ± 1% -2.66% (p=0.000 n=10+10) MutexSpin 3.87µs ± 1% 3.83µs ± 0% -0.85% (p=0.005 n=9+8) RWMutexWrite100 21.2ns ± 2% 21.0ns ± 1% -0.88% (p=0.018 n=10+9) RWMutexWrite10 22.6ns ± 1% 22.6ns ± 0% ~ (p=0.471 n=9+9) RWMutexWorkWrite100 132ns ± 0% 132ns ± 0% ~ (all samples are equal) RWMutexWorkWrite10 124ns ± 0% 123ns ± 0% ~ (p=0.656 n=10+10) Change-Id: I66412a3a0980df1233ad7a5a0cd9723b4274528b Reviewed-on: https://go-review.googlesource.com/34310 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
-
Wander Lairson Costa authored
If the caller set ups a Credential in os/exec.Command, os/exec.Command.Start will end up calling setgroups(2), even if no supplementary groups were given. Only root can call setgroups(2) on BSD kernels, which causes Start to fail for non-root users when they try to set uid and gid for the new process. We fix by introducing a new field to syscall.Credential named NoSetGroups, and setgroups(2) is only called if it is false. We make this field with inverted logic to preserve backward compatibility. RELNOTES=yes Change-Id: I3cff1f21c117a1430834f640ef21fd4e87e06804 Reviewed-on: https://go-review.googlesource.com/36697Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Keith Randall authored
Currently the conversion from constant divides to multiplies is mostly done during the walk pass. This is suboptimal because SSA can determine that the value being divided by is constant more often (e.g. after inlining). Change-Id: If1a9b993edd71be37396b9167f77da271966f85f Reviewed-on: https://go-review.googlesource.com/37015 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
-
- 16 Feb, 2017 12 commits
-
-
Matthew Dempsky authored
Currently, whether we need a write barrier is simply a property of the pointer slot being written to. The only optimization we currently apply using the value being written is that pointers to stack variables can omit write barriers because they're only written to stack slots... but we already omit write barriers for all writes to the stack anyway. Passes toolstash -cmp. Change-Id: I7f16b71ff473899ed96706232d371d5b2b7ae789 Reviewed-on: https://go-review.googlesource.com/37109Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Shenghou Ma authored
While we're at it, also document Yn(0, 0) = -Inf for completeness. Fixes #18823. Change-Id: Ib6db68f76d29cc2373c12ebdf3fab129cac8c167 Reviewed-on: https://go-review.googlesource.com/35970Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Robert Griesemer authored
Initial platform-independent implementation. For #18616. Change-Id: I4585c55b963101af9059c06c1b8a866cb384754c Reviewed-on: https://go-review.googlesource.com/36315Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
-
Robert Griesemer authored
Fixes #15611. Change-Id: I352b145026466cafef8cf87addafbd30716bda24 Reviewed-on: https://go-review.googlesource.com/37138 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
Russ Cox authored
CL 36792 fixed #17953, a linear scan caused by n goroutines piling into two different locks that hashed to the same bucket in the semaphore table. In that CL, n goroutines contending for 2 unfortunately chosen locks went from O(n²) to O(n). This CL fixes a different linear scan, when n goroutines are contending for n/2 different locks that all hash to the same bucket in the semaphore table. In this CL, n goroutines contending for n/2 unfortunately chosen locks goes from O(n²) to O(n log n). This case is much less likely, but any linear scan eventually hurts, so we might as well fix it while the problem is fresh in our minds. The new test in this CL checks for both linear scans. The effect of this CL on the sync benchmarks is negligible (but it fixes the new test). name old time/op new time/op delta Cond1-48 576ns ±10% 575ns ±13% ~ (p=0.679 n=71+71) Cond2-48 1.59µs ± 8% 1.61µs ± 9% ~ (p=0.107 n=73+69) Cond4-48 4.56µs ± 7% 4.55µs ± 7% ~ (p=0.670 n=74+72) Cond8-48 9.87µs ± 9% 9.90µs ± 7% ~ (p=0.507 n=69+73) Cond16-48 20.4µs ± 7% 20.4µs ±10% ~ (p=0.588 n=69+71) Cond32-48 45.4µs ±10% 45.4µs ±14% ~ (p=0.944 n=73+73) UncontendedSemaphore-48 19.7ns ±12% 19.7ns ± 8% ~ (p=0.589 n=65+63) ContendedSemaphore-48 55.4ns ±26% 54.9ns ±32% ~ (p=0.441 n=75+75) MutexUncontended-48 0.63ns ± 0% 0.63ns ± 0% ~ (all equal) Mutex-48 210ns ± 6% 213ns ±10% +1.30% (p=0.035 n=70+74) MutexSlack-48 210ns ± 7% 211ns ± 9% ~ (p=0.184 n=71+72) MutexWork-48 299ns ± 5% 300ns ± 5% ~ (p=0.678 n=73+75) MutexWorkSlack-48 302ns ± 6% 300ns ± 5% ~ (p=0.149 n=74+72) MutexNoSpin-48 135ns ± 6% 135ns ±10% ~ (p=0.788 n=67+75) MutexSpin-48 693ns ± 5% 689ns ± 6% ~ (p=0.092 n=65+74) Once-48 0.22ns ±25% 0.22ns ±24% ~ (p=0.882 n=74+73) Pool-48 5.88ns ±36% 5.79ns ±24% ~ (p=0.655 n=69+69) PoolOverflow-48 4.79µs ±18% 4.87µs ±20% ~ (p=0.233 n=75+75) SemaUncontended-48 0.80ns ± 1% 0.82ns ± 8% +2.46% (p=0.000 n=60+74) SemaSyntNonblock-48 103ns ± 4% 102ns ± 5% -1.11% (p=0.003 n=75+75) SemaSyntBlock-48 104ns ± 4% 104ns ± 5% ~ (p=0.231 n=71+75) SemaWorkNonblock-48 128ns ± 4% 129ns ± 6% +1.51% (p=0.000 n=63+75) SemaWorkBlock-48 129ns ± 8% 130ns ± 7% ~ (p=0.072 n=75+74) RWMutexUncontended-48 2.35ns ± 1% 2.35ns ± 0% ~ (p=0.144 n=70+55) RWMutexWrite100-48 139ns ±18% 141ns ±21% ~ (p=0.071 n=75+73) RWMutexWrite10-48 145ns ± 9% 145ns ± 8% ~ (p=0.553 n=75+75) RWMutexWorkWrite100-48 297ns ±13% 297ns ±15% ~ (p=0.519 n=75+74) RWMutexWorkWrite10-48 588ns ± 7% 585ns ± 5% ~ (p=0.173 n=73+70) WaitGroupUncontended-48 0.87ns ± 0% 0.87ns ± 0% ~ (all equal) WaitGroupAddDone-48 63.2ns ± 4% 62.7ns ± 4% -0.82% (p=0.027 n=72+75) WaitGroupAddDoneWork-48 109ns ± 5% 109ns ± 4% ~ (p=0.233 n=75+75) WaitGroupWait-48 0.17ns ± 0% 0.16ns ±16% -8.55% (p=0.000 n=56+75) WaitGroupWaitWork-48 1.78ns ± 1% 2.08ns ± 5% +16.92% (p=0.000 n=74+70) WaitGroupActuallyWait-48 52.0ns ± 3% 50.6ns ± 5% -2.70% (p=0.000 n=71+69) https://perf.golang.org/search?q=upload:20170215.1 Change-Id: Ia29a8bd006c089e401ec4297c3038cca656bcd0a Reviewed-on: https://go-review.googlesource.com/37103 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Matthew Dempsky authored
Passes toolstash -cmp. Change-Id: I037278404ebf762482557e2b6867cbc595074a83 Reviewed-on: https://go-review.googlesource.com/37023 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
-
Russ Cox authored
Suggested by Dmitry in CL 36792 review. Clearly safe since there are many different semaRoots that could all have profiled sudogs calling mutexevent. Change-Id: I45eed47a5be3e513b2dad63b60afcd94800e16d1 Reviewed-on: https://go-review.googlesource.com/37104 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
-
Russ Cox authored
Also runs 100X faster on average, because it takes so many fewer attempts to trigger the failure. Fixes #11443. Change-Id: I8c39ee48bb3ff6c36fa63083e04076771b65a80d Reviewed-on: https://go-review.googlesource.com/36841 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
-
Chris Broadfoot authored
Change-Id: Ie2144d001c6b4b2293d07b2acf62d7e3cd0b46a7 Reviewed-on: https://go-review.googlesource.com/37130Reviewed-by: Russ Cox <rsc@golang.org>
-
Alex Brainman authored
For #10776. Change-Id: Id64a7e35c7cdcd9be16cbe3358402fa379090e36 Reviewed-on: https://go-review.googlesource.com/36975Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Alex Brainman authored
This is what gcc does when it generates object files. And it is easier to count everything, when it starts from 0. Make go linker do the same. gcc also does not output IMAGE_OPTIONAL_HEADER or PE64_IMAGE_OPTIONAL_HEADER for object files. Perhaps we should do the same, but not in this CL. For #10776. Change-Id: I9789c337648623b6cfaa7d18d1ac9cef32e180dc Reviewed-on: https://go-review.googlesource.com/36974Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Alex Brainman authored
For #10776. Change-Id: I7931558257c1f6b895e4d44b46d320a54de0d677 Reviewed-on: https://go-review.googlesource.com/36973 Run-TryBot: Alex Brainman <alex.brainman@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
-