- 17 Mar, 2017 7 commits
-
-
Josh Bleecher Snyder authored
This is a follow-up to CL 38167. Pure code movement. Change-Id: I13e58f7eac6718c77076d89e13fc721a5205ec57 Reviewed-on: https://go-review.googlesource.com/38322 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Josh Bleecher Snyder authored
This makes ssa.Func, ssa.Cache, and ssa.Config fulfill the roles laid out for them in CL 38160. The only non-trivial change in this CL is how cached values and blocks get IDs. Prior to this CL, their IDs were assigned as part of resetting the cache, and only modified IDs were reset. This required knowing how many values and blocks were modified, which required a tight coupling between ssa.Func and ssa.Config. To eliminate that coupling, we now zero values and blocks during reset, and assign their IDs when they are used. Since unused values and blocks have ID == 0, we can efficiently find the last used value/block, to avoid zeroing everything. Bulk zeroing is efficient, but not efficient enough to obviate the need to avoid zeroing everything every time. As a happy side-effect, ssa.Func.Free is no longer necessary. DebugHashMatch and friends now belong in func.go. They have been left in place for clarity and review. I will move them in a subsequent CL. Passes toolstash -cmp. No compiler performance impact. No change in 'go test cmd/compile/internal/ssa' execution time. Change-Id: I2eb7af58da067ef6a36e815a6f386cfe8634d098 Reviewed-on: https://go-review.googlesource.com/38167 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
-
Josh Bleecher Snyder authored
Minor cleanup only. No reason to go through String() when it is just as easy to do a direct string comparison. Eliminates a surprising number of allocations. name old alloc/op new alloc/op delta Template 40.9MB ± 0% 40.9MB ± 0% ~ (p=0.190 n=10+10) Unicode 30.3MB ± 0% 30.3MB ± 0% ~ (p=0.218 n=10+10) GoTypes 116MB ± 0% 116MB ± 0% -0.09% (p=0.000 n=10+10) SSA 871MB ± 0% 869MB ± 0% -0.14% (p=0.000 n=10+9) Flate 26.2MB ± 0% 26.2MB ± 0% -0.15% (p=0.002 n=10+10) GoParser 32.5MB ± 0% 32.5MB ± 0% ~ (p=0.165 n=10+10) Reflect 80.5MB ± 0% 80.4MB ± 0% -0.12% (p=0.003 n=9+10) Tar 27.3MB ± 0% 27.3MB ± 0% -0.13% (p=0.008 n=10+9) XML 43.1MB ± 0% 43.1MB ± 0% ~ (p=0.218 n=10+10) name old allocs/op new allocs/op delta Template 402k ± 1% 400k ± 1% -0.64% (p=0.002 n=10+10) Unicode 322k ± 1% 321k ± 1% ~ (p=0.075 n=10+10) GoTypes 1.19M ± 0% 1.18M ± 0% -0.90% (p=0.000 n=10+10) SSA 7.94M ± 0% 7.81M ± 0% -1.66% (p=0.000 n=10+9) Flate 246k ± 0% 242k ± 1% -1.42% (p=0.000 n=10+10) GoParser 325k ± 1% 323k ± 1% -0.84% (p=0.000 n=10+10) Reflect 1.02M ± 0% 1.01M ± 0% -0.99% (p=0.000 n=10+10) Tar 259k ± 0% 257k ± 1% -0.72% (p=0.009 n=10+10) XML 406k ± 1% 403k ± 1% -0.69% (p=0.001 n=10+10) Change-Id: Ia129a4cd272027d627e1f3b27e9f07f93e3aa27e Reviewed-on: https://go-review.googlesource.com/38230Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Josh Bleecher Snyder authored
Passes toolstash -cmp. Updates #15756 Change-Id: Ia071dbbd7f2ee0f8433d8c37af4f7b588016244e Reviewed-on: https://go-review.googlesource.com/38231Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
Josh Bleecher Snyder authored
Stack barriers were removed in CL 36620. Change-Id: If124d65a73a7b344a42be2a4b386a14d7a0a428b Reviewed-on: https://go-review.googlesource.com/38169Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com> Reviewed-by: David Chase <drchase@google.com>
-
Robert Griesemer authored
This matches the error message of cmd/compile (for assignments). Change-Id: I42a428f5d72f034e7b7e97b090a929e317e812af Reviewed-on: https://go-review.googlesource.com/38315 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Alan Donovan <adonovan@google.com>
-
Robert Griesemer authored
See https://go-review.googlesource.com/#/c/38313/ for background. It turns out that only a few tests checked for this. The new error message is shorter and very clear. Change-Id: I8ab4ad59fb023c8b54806339adc23aefd7dc7b07 Reviewed-on: https://go-review.googlesource.com/38314 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
- 16 Mar, 2017 15 commits
-
-
Jeremy Jackins authored
This is an evolution of https://go-review.googlesource.com/33616, as discussed via email with Robert (gri): $ cat foobar.go package main func main() { a := "foo", "bar" } before: ./foobar.go:4:4: assignment count mismatch: want 1 values, got 2 after: ./foobar.go:4:4: assignment count mismatch: cannot assign 2 values to 1 variables We could likely also eliminate the "assignment count mismatch" prefix now without losing any information, but that string is matched by a number of tests. Change-Id: Ie6fc8a7bbd0ebe841d53e66e5c2f49868decf761 Reviewed-on: https://go-review.googlesource.com/38313Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Keith Randall authored
name old time/op new time/op delta LeadingZeros-4 2.00ns ± 0% 1.34ns ± 1% -33.02% (p=0.000 n=8+10) LeadingZeros16-4 1.62ns ± 0% 1.57ns ± 0% -3.09% (p=0.001 n=8+9) LeadingZeros32-4 2.14ns ± 0% 1.48ns ± 0% -30.84% (p=0.002 n=8+10) LeadingZeros64-4 2.06ns ± 1% 1.33ns ± 0% -35.08% (p=0.000 n=8+8) 8-bit args is a special case - the Go code is really fast because it is just a single table lookup. So I've disabled that for now. Intrinsics were actually slower: LeadingZeros8-4 1.22ns ± 3% 1.58ns ± 1% +29.56% (p=0.000 n=10+10) Update #18616 Change-Id: Ia9c289b9ba59c583ea64060470315fd637e814cf Reviewed-on: https://go-review.googlesource.com/38311 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>
-
Steve Francia authored
Updates #17802 Change-Id: I65ea0f4cde973604c04051e7eb25d12e4facecd3 Reviewed-on: https://go-review.googlesource.com/36626Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Chris Broadfoot <cbro@golang.org>
-
Aliaksandr Valialkin authored
Avoid memory allocations by returning pre-calculated strings for decimal ints in the range 0..99. Benchmark results: name old time/op new time/op delta FormatInt-4 2.45µs ± 1% 2.40µs ± 1% -1.86% (p=0.000 n=8+9) AppendInt-4 1.67µs ± 1% 1.65µs ± 0% -0.92% (p=0.000 n=10+10) FormatUint-4 676ns ± 3% 669ns ± 1% ~ (p=0.146 n=10+10) AppendUint-4 467ns ± 2% 474ns ± 0% +1.58% (p=0.000 n=10+10) FormatIntSmall-4 29.6ns ± 2% 3.3ns ± 0% -88.98% (p=0.000 n=10+9) AppendIntSmall-4 16.0ns ± 1% 8.5ns ± 0% -46.98% (p=0.000 n=10+9) name old alloc/op new alloc/op delta FormatInt-4 576B ± 0% 576B ± 0% ~ (all equal) AppendInt-4 0.00B 0.00B ~ (all equal) FormatUint-4 224B ± 0% 224B ± 0% ~ (all equal) AppendUint-4 0.00B 0.00B ~ (all equal) FormatIntSmall-4 2.00B ± 0% 0.00B -100.00% (p=0.000 n=10+10) AppendIntSmall-4 0.00B 0.00B ~ (all equal) name old allocs/op new allocs/op delta FormatInt-4 37.0 ± 0% 35.0 ± 0% -5.41% (p=0.000 n=10+10) AppendInt-4 0.00 0.00 ~ (all equal) FormatUint-4 6.00 ± 0% 6.00 ± 0% ~ (all equal) AppendUint-4 0.00 0.00 ~ (all equal) FormatIntSmall-4 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10) AppendIntSmall-4 0.00 0.00 ~ (all equal) Fixes #19445 Change-Id: Ib1f8922f2e0b13743c847ee9e703d1dab77f705c Reviewed-on: https://go-review.googlesource.com/37963Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Keith Randall authored
Update #18616 Change-Id: I0c2d643cbbeb131b4c9b12194697afa4af48e1d2 Reviewed-on: https://go-review.googlesource.com/38166 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>
-
Cherry Zhang authored
A copy-paste error in CL 38150. Fix build. Change-Id: Ib2afc83564ebe7dab934d45522803e1a191dea18 Reviewed-on: https://go-review.googlesource.com/38292 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org>
-
Matthew Dempsky authored
Fixes #19576. Change-Id: I11034fb08e989f6eb7d54bde873b92804223598d Reviewed-on: https://go-review.googlesource.com/38291 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Cherry Zhang authored
Remove size AuxInt in Store, and alignment in Move/Zero. We still pass size AuxInt to Move/Zero, as it is used for partial Move/Zero lowering (e.g. cmd/compile/internal/ssa/gen/386.rules:288). SizeAndAlign is gone. Passes "toolstash -cmp" on std. Change-Id: I1ca34652b65dd30de886940e789fcf41d521475d Reviewed-on: https://go-review.googlesource.com/38150 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
-
Cherry Zhang authored
The old writebarrier implementation fails to handle single-block loop where a memory Phi value depends on the write barrier store in the same block. The new implementation (CL 36834) doesn't have this problem. Add a test to ensure it. Fix #19067. Change-Id: Iab13c6817edc12be8a048d18699b4450fa7ed712 Reviewed-on: https://go-review.googlesource.com/36940Reviewed-by: David Chase <drchase@google.com>
-
Cherry Zhang authored
Now that the write barrier insertion is moved to SSA, the SSA building code can be simplified. Updates #17583. Change-Id: I5cacc034b11aa90b0abe6f8dd97e4e3994e2bc25 Reviewed-on: https://go-review.googlesource.com/36840 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
-
Cherry Zhang authored
When the compiler insert write barriers, the frontend makes conservative decisions at an early stage. This sometimes have false positives because of the lack of information, for example, writes on stack. SSA's writebarrier pass identifies writes on stack and eliminates write barriers for them. This CL moves write barrier insertion into SSA. The frontend no longer makes decisions about write barriers, and simply does normal assignments and emits normal Store ops when building SSA. SSA writebarrier pass inserts write barrier for Stores when needed. There, it has better information about the store because Phi and Copy propagation are done at that time. This CL only changes StoreWB to Store in gc/ssa.go. A followup CL simplifies SSA building code. Updates #17583. Change-Id: I4592d9bc0067503befc169c50b4e6f4765673bec Reviewed-on: https://go-review.googlesource.com/36839 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
-
Cherry Zhang authored
For SSA Store/Move/Zero ops, attach the type of the value being stored to the op as the Aux field. This type will be used for write barrier insertion (in a followup CL). Since SSA passes do not accurately propagate types of values (because of type casting), we can't simply use type of the store's arguments for write barrier insertion. Passes "toolstash -cmp" on std. Updates #17583. Change-Id: I051d5e5c482931640d1d7d879b2a6bb91f2e0056 Reviewed-on: https://go-review.googlesource.com/36838 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
-
Daniel Martí authored
Found by github.com/mvdan/unparam. Change-Id: I20145440ff1bcd27fcf15a740354c52f313e536c Reviewed-on: https://go-review.googlesource.com/37894 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
-
Carlos Eduardo Seo authored
This change adds a better implementation of IndexByte for ppc64x. Improvement for bytes·IndexByte: benchmark old ns/op new ns/op delta BenchmarkIndexByte/10-16 12.5 8.48 -32.16% BenchmarkIndexByte/32-16 34.4 9.85 -71.37% BenchmarkIndexByte/4K-16 3089 217 -92.98% BenchmarkIndexByte/4M-16 3154810 207051 -93.44% BenchmarkIndexByte/64M-16 50564811 5579093 -88.97% benchmark old MB/s new MB/s speedup BenchmarkIndexByte/10-16 800.41 1179.64 1.47x BenchmarkIndexByte/32-16 930.60 3249.10 3.49x BenchmarkIndexByte/4K-16 1325.71 18832.53 14.21x BenchmarkIndexByte/4M-16 1329.49 20257.29 15.24x BenchmarkIndexByte/64M-16 1327.19 12028.63 9.06x Improvement for strings·IndexByte: benchmark old ns/op new ns/op delta BenchmarkIndexByte-16 25.9 7.69 -70.31% Fixes #19030 Change-Id: Ifb82bbb3d643ec44b98eaa2d08a07f47e5c2fd11 Reviewed-on: https://go-review.googlesource.com/37670 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
-
Keith Randall authored
Implement math/bits.TrailingZerosX using intrinsics. Generally reorganize the intrinsic spec a bit. The instrinsics data structure is now built at init time. This will make doing the other functions in math/bits easier. Update sys.CtzX to return int instead of uint{64,32} so it matches math/bits.TrailingZerosX. Improve the intrinsics a bit for amd64. We don't need the CMOV for <64 bit versions. Update #18616 Change-Id: Ic1c5339c943f961d830ae56f12674d7b29d4ff39 Reviewed-on: https://go-review.googlesource.com/38155 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>
-
- 15 Mar, 2017 10 commits
-
-
Martin Möhrmann authored
- changes tests to check that the real and imaginary part of the go complex division result is equal to the result gcc produces for c99 - changes complex division code to satisfy new complex division test - adds float functions isNan, isFinite, isInf, abs and copysign in the runtime package Fixes #14644. name old time/op new time/op delta Complex128DivNormal-4 21.8ns ± 6% 13.9ns ± 6% -36.37% (p=0.000 n=20+20) Complex128DivNisNaN-4 14.1ns ± 1% 15.0ns ± 1% +5.86% (p=0.000 n=20+19) Complex128DivDisNaN-4 12.5ns ± 1% 16.7ns ± 1% +33.79% (p=0.000 n=19+20) Complex128DivNisInf-4 10.1ns ± 1% 13.0ns ± 1% +28.25% (p=0.000 n=20+19) Complex128DivDisInf-4 11.0ns ± 1% 20.9ns ± 1% +90.69% (p=0.000 n=16+19) ComplexAlgMap-4 86.7ns ± 1% 86.8ns ± 2% ~ (p=0.804 n=20+20) Change-Id: I261f3b4a81f6cc858bc7ff48f6fd1b39c300abf0 Reviewed-on: https://go-review.googlesource.com/37441Reviewed-by: Robert Griesemer <gri@golang.org>
-
Austin Clements authored
Currently, when printing tracebacks of other threads during GOTRACEBACK=crash, if the thread is on the system stack we print only the header for the user goroutine and fail to print its stack. This happens because we passed the g0 to traceback instead of curg. The g0 never has anything set in its gobuf, so traceback doesn't print anything. Fix this by passing _g_.m.curg to traceback instead of the g0. Fixes #19494. Change-Id: Idfabf94d6a725e9cdf94a3923dead6455ef3b217 Reviewed-on: https://go-review.googlesource.com/38012 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Austin Clements authored
GOTRACEBACK=crash works by bouncing a SIGQUIT around the process sched.mcount times. However, sched.mcount includes the extra Ms allocated by oneNewExtraM for cgo callbacks. Hence, if there are any extra Ms that don't have real OS threads, we'll try to send SIGQUIT more times than there are threads to catch it. Since nothing will catch these extra signals, we'll fall back to blocking for five seconds before aborting the process. Avoid this five second delay by subtracting out the number of extra Ms when sending SIGQUITs. Of course, in a cgo binary, it's still possible for the SIGQUIT to go to a cgo thread and cause some other failure mode. This does not fix that. Change-Id: I4fbf3c52dd721812796c4c1dcb2ab4cb7026d965 Reviewed-on: https://go-review.googlesource.com/38182 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Josh Bleecher Snyder authored
This CL introduces yet another compiler pass, which checks for correct control flow constructs prior to converting from AST to SSA form. It cannot be integrated with walk, since walk rewrites switch and select statements on the fly. To reduce code duplication, this CL also does some minor refactoring. With this pass in place, the AST to SSA converter can now stop generating SSA for any known-dead code. This minor savings pays for the minor cost of the new pass. Performance is almost a wash: name old time/op new time/op delta Template 206ms ± 4% 205ms ± 4% ~ (p=0.108 n=43+43) Unicode 84.0ms ± 4% 84.0ms ± 4% ~ (p=0.979 n=43+43) GoTypes 550ms ± 3% 553ms ± 3% ~ (p=0.065 n=40+41) Compiler 2.57s ± 4% 2.58s ± 2% ~ (p=0.103 n=44+41) SSA 3.94s ± 3% 3.93s ± 2% ~ (p=0.833 n=44+42) Flate 126ms ± 6% 125ms ± 4% ~ (p=0.941 n=43+39) GoParser 147ms ± 4% 148ms ± 3% ~ (p=0.164 n=42+39) Reflect 359ms ± 3% 357ms ± 5% ~ (p=0.241 n=43+44) Tar 106ms ± 5% 106ms ± 7% ~ (p=0.853 n=40+43) XML 202ms ± 3% 203ms ± 3% ~ (p=0.488 n=42+41) name old user-ns/op new user-ns/op delta Template 240M ± 4% 239M ± 4% ~ (p=0.844 n=42+43) Unicode 107M ± 5% 107M ± 4% ~ (p=0.332 n=40+43) GoTypes 735M ± 3% 731M ± 4% ~ (p=0.141 n=43+44) Compiler 3.51G ± 3% 3.52G ± 3% ~ (p=0.208 n=42+43) SSA 5.72G ± 4% 5.72G ± 3% ~ (p=0.928 n=44+42) Flate 151M ± 7% 150M ± 8% ~ (p=0.662 n=44+43) GoParser 181M ± 5% 181M ± 4% ~ (p=0.379 n=41+44) Reflect 447M ± 4% 445M ± 4% ~ (p=0.344 n=43+43) Tar 125M ± 7% 124M ± 6% ~ (p=0.353 n=43+43) XML 248M ± 4% 250M ± 6% ~ (p=0.158 n=44+44) name old alloc/op new alloc/op delta Template 40.3MB ± 0% 40.2MB ± 0% -0.27% (p=0.000 n=10+10) Unicode 30.3MB ± 0% 30.2MB ± 0% -0.10% (p=0.015 n=10+10) GoTypes 114MB ± 0% 114MB ± 0% -0.06% (p=0.000 n=7+9) Compiler 480MB ± 0% 481MB ± 0% +0.07% (p=0.000 n=10+10) SSA 864MB ± 0% 862MB ± 0% -0.25% (p=0.000 n=9+10) Flate 25.9MB ± 0% 25.9MB ± 0% ~ (p=0.123 n=10+10) GoParser 32.1MB ± 0% 32.1MB ± 0% ~ (p=0.631 n=10+10) Reflect 79.9MB ± 0% 79.6MB ± 0% -0.39% (p=0.000 n=10+9) Tar 27.1MB ± 0% 27.0MB ± 0% -0.18% (p=0.003 n=10+10) XML 42.6MB ± 0% 42.6MB ± 0% ~ (p=0.143 n=10+10) name old allocs/op new allocs/op delta Template 401k ± 0% 401k ± 1% ~ (p=0.353 n=10+10) Unicode 322k ± 0% 322k ± 0% ~ (p=0.739 n=10+10) GoTypes 1.18M ± 0% 1.18M ± 0% +0.25% (p=0.001 n=7+8) Compiler 4.51M ± 0% 4.53M ± 0% +0.37% (p=0.000 n=10+10) SSA 7.91M ± 0% 7.93M ± 0% +0.20% (p=0.000 n=9+10) Flate 244k ± 0% 245k ± 0% ~ (p=0.123 n=10+10) GoParser 323k ± 1% 324k ± 1% +0.40% (p=0.035 n=10+10) Reflect 1.01M ± 0% 1.02M ± 0% +0.37% (p=0.000 n=10+9) Tar 258k ± 1% 258k ± 1% ~ (p=0.661 n=10+9) XML 403k ± 0% 405k ± 0% +0.47% (p=0.004 n=10+10) Updates #15756 Updates #19250 Change-Id: I647bfbb745c35630447eb79dfcaa994b490ce942 Reviewed-on: https://go-review.googlesource.com/38159 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
Josh Bleecher Snyder authored
Fixes #19555 Change-Id: I7aa0551a90f6bb630c0ba721f3525a8a9cf793fd Reviewed-on: https://go-review.googlesource.com/38164 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
-
Bryan C. Mills authored
Add subbenchmarks for BenchmarkZip64Test with different sizes to tease apart construction costs vs. steady-state throughput. Results remain comparable with the non-parallel version with -cpu=1: benchmark old ns/op new ns/op delta BenchmarkCompressedZipGarbage 26832835 27506953 +2.51% BenchmarkCompressedZipGarbage-6 27172377 4321534 -84.10% BenchmarkZip64Test 196758732 197765510 +0.51% BenchmarkZip64Test-6 193850605 192625458 -0.63% benchmark old allocs new allocs delta BenchmarkCompressedZipGarbage 44 44 +0.00% BenchmarkCompressedZipGarbage-6 44 44 +0.00% benchmark old bytes new bytes delta BenchmarkCompressedZipGarbage 5592 5664 +1.29% BenchmarkCompressedZipGarbage-6 5592 21946 +292.45% updates #18177 Change-Id: Icfa359d9b1a8df5e085dacc07d2b9221b284764c Reviewed-on: https://go-review.googlesource.com/36719Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Cherry Zhang authored
Put call stubs at the beginning (instead of the end). So the trampoline pass knows the addresses of the stubs, and it can insert trampolines when necessary. Fixes #19425. Change-Id: I1e06529ef837a6130df58917315610d45a6819ca Reviewed-on: https://go-review.googlesource.com/38131 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
-
Josh Bleecher Snyder authored
The line between ssa.Func and ssa.Config has blurred. Concurrent compilation in the backend will require more precision. This CL lays out an (aspirational) organization. The implementation will come in follow-up CLs, once the organization is settled. ssa.Config holds basic compiler configuration, mostly arch-specific information. It is configured once, early on, and is readonly, so it is safe for concurrent use. ssa.Func is a single-shot object used for compiling a single Func. It is not concurrency-safe and not re-usable. ssa.Cache is a multi-use object used to avoid expensive allocations during compilation. Each ssa.Func is given an ssa.Cache to use. ssa.Cache is not concurrency-safe. Change-Id: Id02809b6f3541541cac6c27bbb598834888ce1cc Reviewed-on: https://go-review.googlesource.com/38160Reviewed-by: Keith Randall <khr@golang.org>
-
David Chase authored
Previously we always issued a spill right after the op that was being spilled. This CL pushes spills father away from the generator, hopefully pushing them into unlikely branches. For example: x = ... if unlikely { call ... } ... use x ... Used to compile to x = ... spill x if unlikely { call ... restore x } It now compiles to x = ... if unlikely { spill x call ... restore x } This is particularly useful for code which appends, as the only call is an unlikely call to growslice. It also helps for the spills needed around write barrier calls. The basic algorithm is walk down the dominator tree following a path where the block still dominates all of the restores. We're looking for a block that: 1) dominates all restores 2) has the value being spilled in a register 3) has a loop depth no deeper than the value being spilled The walking-down code is iterative. I was forced to limit it to searching 100 blocks so it doesn't become O(n^2). Maybe one day we'll find a better way. I had to delete most of David's code which pushed spills out of loops. I suspect this CL subsumes most of the cases that his code handled. Generally positive performance improvements, but hard to tell for sure with all the noise. (compilebench times are unchanged.) name old time/op new time/op delta BinaryTree17-12 2.91s ±15% 2.80s ±12% ~ (p=0.063 n=10+10) Fannkuch11-12 3.47s ± 0% 3.30s ± 4% -4.91% (p=0.000 n=9+10) FmtFprintfEmpty-12 48.0ns ± 1% 47.4ns ± 1% -1.32% (p=0.002 n=9+9) FmtFprintfString-12 85.6ns ±11% 79.4ns ± 3% -7.27% (p=0.005 n=10+10) FmtFprintfInt-12 91.8ns ±10% 85.9ns ± 4% ~ (p=0.203 n=10+9) FmtFprintfIntInt-12 135ns ±13% 127ns ± 1% -5.72% (p=0.025 n=10+9) FmtFprintfPrefixedInt-12 167ns ± 1% 168ns ± 2% ~ (p=0.580 n=9+10) FmtFprintfFloat-12 249ns ±11% 230ns ± 1% -7.32% (p=0.000 n=10+10) FmtManyArgs-12 504ns ± 7% 506ns ± 1% ~ (p=0.198 n=9+9) GobDecode-12 6.95ms ± 1% 7.04ms ± 1% +1.37% (p=0.001 n=10+10) GobEncode-12 6.32ms ±13% 6.04ms ± 1% ~ (p=0.063 n=10+10) Gzip-12 233ms ± 1% 235ms ± 0% +1.01% (p=0.000 n=10+9) Gunzip-12 40.1ms ± 1% 39.6ms ± 0% -1.12% (p=0.000 n=10+8) HTTPClientServer-12 227µs ± 9% 221µs ± 5% ~ (p=0.114 n=9+8) JSONEncode-12 16.1ms ± 2% 15.8ms ± 1% -2.09% (p=0.002 n=9+8) JSONDecode-12 61.8ms ±11% 57.9ms ± 1% -6.30% (p=0.000 n=10+9) Mandelbrot200-12 4.30ms ± 3% 4.28ms ± 1% ~ (p=0.203 n=10+8) GoParse-12 3.18ms ± 2% 3.18ms ± 2% ~ (p=0.579 n=10+10) RegexpMatchEasy0_32-12 76.7ns ± 1% 77.5ns ± 1% +0.92% (p=0.002 n=9+8) RegexpMatchEasy0_1K-12 239ns ± 3% 239ns ± 1% ~ (p=0.204 n=10+10) RegexpMatchEasy1_32-12 71.4ns ± 1% 70.6ns ± 0% -1.15% (p=0.000 n=10+9) RegexpMatchEasy1_1K-12 383ns ± 2% 390ns ±10% ~ (p=0.181 n=8+9) RegexpMatchMedium_32-12 114ns ± 0% 113ns ± 1% -0.88% (p=0.000 n=9+8) RegexpMatchMedium_1K-12 36.3µs ± 1% 36.8µs ± 1% +1.59% (p=0.000 n=10+8) RegexpMatchHard_32-12 1.90µs ± 1% 1.90µs ± 1% ~ (p=0.341 n=10+10) RegexpMatchHard_1K-12 59.4µs ±11% 57.8µs ± 1% ~ (p=0.968 n=10+9) Revcomp-12 461ms ± 1% 462ms ± 1% ~ (p=1.000 n=9+9) Template-12 67.5ms ± 1% 66.3ms ± 1% -1.77% (p=0.000 n=10+8) TimeParse-12 314ns ± 3% 309ns ± 0% -1.56% (p=0.000 n=9+8) TimeFormat-12 340ns ± 2% 331ns ± 1% -2.79% (p=0.000 n=10+10) The go binary is 0.2% larger. Not really sure why the size would change. Change-Id: Ia5116e53a3aeb025ef350ffc51c14ae5cc17871c Reviewed-on: https://go-review.googlesource.com/34822Reviewed-by: David Chase <drchase@google.com>
-
Philip Hofer authored
Interface wrapper functions now get compiled eagerly in some cases. Consequently, they may be present in multiple translation units. Mark them as DUPOK, just like closures. Fixes #19548 Fixes #19550 Change-Id: Ibe74adb5a62dbf6447db37fde22dcbb3479969ef Reviewed-on: https://go-review.googlesource.com/38156Reviewed-by: David Chase <drchase@google.com>
-
- 14 Mar, 2017 8 commits
-
-
Cherry Zhang authored
Fixes #19515. Change-Id: I4bcce152cef52d00fbb5ab4daf72a6e742bae27c Reviewed-on: https://go-review.googlesource.com/38158Reviewed-by: Keith Randall <khr@golang.org>
-
Matthew Dempsky authored
In the SSA CFG, TEXT, RET, and JMP instructions correspond to Blocks, not Values. Rework liveness analysis so that progeffects only cares about Progs that result from Values, and handle Blocks separately. Passes toolstash-check -all. Change-Id: Ic23719c75b0421fdb51382a08dac18c3ba042b32 Reviewed-on: https://go-review.googlesource.com/38085 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
-
Daniel Martí authored
POSIX Shell only supports = to compare variables inside '[' tests. But this is Bash, where == is an alias for =. In practice they're the same, but the current form is inconsisnent and breaks POSIX for no good reason. Change-Id: I38fa7a5a90658dc51acc2acd143049e510424ed8 Reviewed-on: https://go-review.googlesource.com/38031Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Josh Bleecher Snyder authored
The fmtmode and fmtpkgpfx globals stand in the way of making the compiler more concurrent (#15756). This CL removes them. The natural way to eliminate a global is to explicitly thread it as a parameter through all function calls. However, most of the functions in gc/fmt.go get called indirectly, by way of fmt format strings, so there's nowhere natural to add a parameter. Since there are only a few fmtmode modes, use named types to distinguish between modes. For example, fmtNodeErr, fmtNodeDbg, and fmtNodeTypeId are all gc.Node, but they print in different modes. Varying the type allows us to thread mode through fmt. Handle fmtpkgpfx by converting it to a printing mode, FTypeIdName, and using the same type-based approach. To avoid a loss of readability and danger of bugs from introducing conversions at all call sites, instead add a helper that systematically modifies the args. The only remaining gc/fmt.go global is dumpdepth. Since that is used for debugging only, it that can be handled with a global mutex, or some similarly basic, if inefficient, protection. Passes toolstash -cmp. No compiler performance impact. For future reference, other options for threading state that were considered and rejected: * Wrapping values in structs, such as: type fmtNode struct { n *Node mode fmtMode } This reduces the proliferation of types, and supports easily adding extra local parameters. However, putting such a struct into an interface{} allocates. This is unacceptable in this particular area of code. * Passing state via precision, such as: fmt.Fprintf("%*v", mode, n) where mode is the state encoded as an integer. This avoids extra allocations, but it is out of keeping with the intended semantics of precision, and is less readable. * Modify the fmt package to support setting/getting context via fmt.State. Unavailable due to Go 1 compatibility, and probably the wrong solution anyway. * Give up on package fmt. This would be a huge readability regression and cause high code churn. * Attempt a de-novo rewrite that circumvents these problems. Too high a risk of bugs, with insufficient reward for the effort, particularly since long term plans call for elimination of gc.Node. Change-Id: Iea2440d5a34a938e64273707de27e3a897cb41d1 Reviewed-on: https://go-review.googlesource.com/38147 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>
-
Michael Stapelberg authored
Given the following test cases: $ cat left_too_many.go package main func main() { a, err := make([]int, 1) } $ cat right_too_many.go package main func main() { a := "foo", "bar" } Before this change, the error messages are: ./left_too_many.go:4: assignment count mismatch: 2 = 1 ./right_too_many.go:4: assignment count mismatch: 1 = 2 After this change, the error messages are: ./left_too_many.go:4: assignment count mismatch: want 2 values, got 1 ./right_too_many.go:4: assignment count mismatch: want 1 values, got 2 Change-Id: I9ad346f122406bc9a785bf690ed7b3de76a422da Reviewed-on: https://go-review.googlesource.com/33616Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Matthew Dempsky authored
Change-Id: I0f7eec2e0c15a355422d5ae7289508a5bd33b971 Reviewed-on: https://go-review.googlesource.com/38171 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
philhofer authored
With this change, code like h := sha1.New() h.Write(buf) sum := h.Sum() gets compiled into static calls rather than interface calls, because the compiler is able to prove that 'h' is really a *sha1.digest. The InterCall re-write rule hits a few dozen times during make.bash, and hundreds of times during all.bash. The most common pattern identified by the compiler is a constructor like func New() Interface { return &impl{...} } where the constructor gets inlined into the caller, and the result is used immediately. Examples include {sha1,md5,crc32,crc64,...}.New, base64.NewEncoder, base64.NewDecoder, errors.New, net.Pipe, and so on. Some existing benchmarks that change on darwin/amd64: Crc64/ISO4KB-8 2.67µs ± 1% 2.66µs ± 0% -0.36% (p=0.015 n=10+10) Crc64/ISO1KB-8 694ns ± 0% 690ns ± 1% -0.59% (p=0.001 n=10+10) Adler32KB-8 473ns ± 1% 471ns ± 0% -0.39% (p=0.010 n=10+9) On architectures like amd64, the reduction in code size appears to contribute more to benchmark improvements than just removing the indirect call, since that branch gets predicted accurately when called in a loop. Updates #19361 Change-Id: I57d4dc21ef40a05ec0fbd55a9bb0eb74cdc67a3d Reviewed-on: https://go-review.googlesource.com/38139 Run-TryBot: Philip Hofer <phofer@umich.edu> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
-
Matthew Dempsky authored
Changes to ${GOARCH}Ops.go files were mechanically produced using github.com/mdempsky/ssa-symops, a one-off tool that inserts "SymEffect: X" elements by pattern matching against the Op names. Change-Id: Ibf3e481ffd588647f2a31662d72114b740ccbfcf Reviewed-on: https://go-review.googlesource.com/38084 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
-