Commits · 6b5a0b5c1674e82749664b124daf0e6b15af04dd · go / golang

09 Mar, 2018 4 commits

cmd/trace: set cname for span slices · 6b5a0b5c

Hana Kim authored Mar 06, 2018

Define a set of color names available in trace viewer

https://user-images.githubusercontent.com/4999471/37063995-5d0bad48-2169-11e8-92be-9cb363e21c38.png

Change-Id: I312fcbc5430d7512b4c39ddc79a769259bad8c22
Reviewed-on: https://go-review.googlesource.com/99055Reviewed-by: Heschi Kreinick <heschi@google.com>

6b5a0b5c

cmd/trace: remove unrelated arrows in task-oriented traceview · 476f71c9

Hana Kim authored Mar 02, 2018

Also grey out instants that represent events occurred outside the
task's span. Furthermore, if the unrelated instants represent user
annotation events but not for the task of the interest, skip rendering
completely.

This helps users to focus on the task-related events better.

UI screen shot:
https://gist.github.com/hyangah/1df5d2c8f429fd933c481e9636b89b55#file-golang-org_cl_99035

Change-Id: I2b5aef41584c827f8c1e915d0d8e5c95fe2b4b65
Reviewed-on: https://go-review.googlesource.com/99035
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>

476f71c9

go/printer: simplify handling of line directives · 53c41639

Robert Griesemer authored Mar 09, 2018

Strangely enough, the existing implementation used adjusted (by line
directives) source positions to determine layout and thus required
position corrections when printing a line directive.

Instead, just use the unadjusted, absolute source positions and then
printing a line directive doesn't require any adjustments, only some
care to make sure it remains in column 1 as before.

The new code doesn't need to parse line directives anymore and simply
ensures that comments with the //line prefix and starting in column 1
remain in that position. That is a slight change from the old behavior
(which ignored incorrect line directives, e.g. because they had an
invalid line number) but unlikely to show up in real code.

This is prep work for handling of line directives that also specify
columns (which now won't require much special handling anymore).

For #24143.

Change-Id: I07eb2e1b35b37337e632e3dbf5b70c783c615f8a
Reviewed-on: https://go-review.googlesource.com/99621Reviewed-by: Matthew Dempsky <mdempsky@google.com>

53c41639

encoding/csv: disallow quote for use as Comma · 2cc15b18

Joe Tsai authored Mar 08, 2018

'"' has special semantic meaning that conflicts with using it as Comma.

Change-Id: Ife25ba43ca25dba2ea184c1bb7579a230d376059
Reviewed-on: https://go-review.googlesource.com/99696
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

2cc15b18

08 Mar, 2018 27 commits

runtime: explain and enforce that _panic values live on the stack · 5d22cebb

Austin Clements authored Mar 08, 2018

It's a bit mysterious that _defer.sp is a uintptr that gets
stack-adjusted explicitly while _panic.argp is an unsafe.Pointer that
doesn't, but turns out to be critically important when a deferred
function grows the stack before doing a recover.

Add a comment explaining that this works because _panic values live on
the stack. Enforce this by marking _panic go:notinheap.

Change-Id: I9ca49e84ee1f86d881552c55dccd0662b530836b
Reviewed-on: https://go-review.googlesource.com/99735
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>

5d22cebb

runtime: ensure abort actually crashes the process · 60a9e5d6

Austin Clements authored Jan 18, 2018

On all non-x86 arches, runtime.abort simply reads from nil.
Unfortunately, if this happens on a user stack, the signal handler
will dutifully turn this into a panicmem, which lets user defers run
and which user code can even recover from.

To fix this, add an explicit check to the signal handler that turns
faults in abort into hard crashes directly in the signal handler. This
has the added benefit of giving a register dump at the abort point.

Change-Id: If26a7f13790745ee3867db7f53b72d8281176d70
Reviewed-on: https://go-review.googlesource.com/93661
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>

60a9e5d6

runtime: call abort instead of raw INT $3 or bad MOV · c950a90d

Austin Clements authored Jan 17, 2018

Everything except for amd64, amd64p32, and 386 currently defines and
uses an abort function. This CL makes these match. The next CL will
recognize the abort function to make this more useful.

Change-Id: I7c155871ea48919a9220417df0630005b444f488
Reviewed-on: https://go-review.googlesource.com/93660
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>

c950a90d

runtime: make throw safer to call · 7f1b2738

Austin Clements authored Jan 12, 2018

Currently, throw may grow the stack, which means whenever we call it
from a context where it's not safe to grow the stack, we first have to
switch to the system stack. This is pretty easy to get wrong.

Fix this by making throw switch to the system stack so it doesn't grow
the stack and is hence safe to call without a system stack switch at
the call site.

The only thing this complicates is badsystemstack itself, which would
now go into an infinite loop before printing anything (previously it
would also go into an infinite loop, but would at least print the
error first). Fix this by making badsystemstack do a direct write and
then crash hard.

Change-Id: Ic5b4a610df265e47962dcfa341cabac03c31c049
Reviewed-on: https://go-review.googlesource.com/93659
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>

7f1b2738

runtime: move unrecoverable panic handling to the system stack · 9d59234c

Austin Clements authored Jan 18, 2018

Currently parts of unrecoverable panic handling (notably, printing
panic messages) can happen on the user stack. This may grow the stack,
which is generally fine, but if we're handling a runtime panic, it's
better to do as little as possible in case the runtime is in an
inconsistent state.

Hence, this commit rearranges the handling of unrecoverable panics so
that it's done entirely on the system stack.

This is mostly a matter of shuffling code a bit so everything can move
into a systemstack block. The one slight subtlety is in the "panic
during panic" case, where we now depend on startpanic_m's caller to
print the stack rather than startpanic_m itself. To make this work,
startpanic_m now returns a boolean indicating that the caller should
avoid trying to print any panic messages and get right to the stack
trace. Since the caller is already in a position to do this, this
actually simplifies things a little.

Change-Id: Id72febe8c0a9fb31d9369b600a1816d65a49bfed
Reviewed-on: https://go-review.googlesource.com/93658
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>

9d59234c

cmd/compile: simplify OpSlicemask optimization · da022da9

Austin Clements authored Jan 11, 2018

The previous CL introduced isConstDelta. Use it to simplify the
OpSlicemask optimization in the prove pass. This passes toolstash
-cmp.

Change-Id: If2aa762db4cdc0cd1c581a536340530a9831081b
Reviewed-on: https://go-review.googlesource.com/87481Reviewed-by: Keith Randall <khr@golang.org>

da022da9

cmd/compile: add fence-post implications to prove · 6436270d

Austin Clements authored Jan 11, 2018

This adds four new deductions to the prove pass, all related to adding
or subtracting one from a value. This is the first hint of actual
arithmetic relations in the prove pass.

The most effective of these is

   x-1 >= w && x > min  ⇒  x > w

This helps eliminate bounds checks in code like

  if x > 0 {
    // do something with s[x-1]
  }

Altogether, these deductions prove an additional 260 branches in std
and cmd. Furthermore, they will let us eliminate some tricky
compiler-inserted panics in the runtime that are interfering with
static analysis.

Fixes #23354.

Change-Id: I7088223e0e0cd6ff062a75c127eb4bb60e6dce02
Reviewed-on: https://go-review.googlesource.com/87480Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>

6436270d

cmd/compile: derive unsigned limits from signed limits in prove · 941fc129

Austin Clements authored Jan 11, 2018

This adds a few simple deductions to the prove pass' fact table to
derive unsigned concrete limits from signed concrete limits where
possible.

This tweak lets the pass prove 70 additional branch conditions in std
and cmd.

This is based on a comment from the recently-deleted factsTable.get:
"// TODO: also use signed data if lim.min >= 0".

Change-Id: Ib4340249e7733070f004a0aa31254adf5df8a392
Reviewed-on: https://go-review.googlesource.com/87479Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
Reviewed-by: Keith Randall <khr@golang.org>

941fc129

cmd/compile: make prove pass use unsatisfiability · 669db2ce

Austin Clements authored Jan 10, 2018

Currently the prove pass uses implication queries. For each block, it
collects the set of branch conditions leading to that block, and
queries this fact table for whether any of these facts imply the
block's own branch condition (or its inverse). This works remarkably
well considering it doesn't do any deduction on these facts, but it
has various downsides:

1. It requires an implementation both of adding facts to the table and
determining implications. These are very nearly duals of each
other, but require separate implementations. Likewise, the process
of asserting facts of dominating branch conditions is very nearly
the dual of the process of querying implied branch conditions.

2. It leads to less effective use of derived facts. For example, the
prove pass currently derives facts about the relations between len
and cap, but can't make use of these unless a branch condition is
in the exact form of a derived fact. If one of these derived facts
contradicts another fact, it won't notice or make use of this.

This CL changes the approach of the prove pass to instead use
*contradiction* instead of implication. Rather than ever querying a
branch condition, it simply adds branch conditions to the fact table.
If this leads to a contradiction (specifically, it makes the fact set
unsatisfiable), that branch is impossible and can be cut. As a result,

1. We can eliminate the code for determining implications
(factsTable.get disappears entirely). Also, there is now a single
implementation of visiting and asserting branch conditions, since
we don't have to flip them around to treat them as facts in one
place and queries in another.

2. Derived facts can be used effectively. It doesn't matter *why* the
fact table is unsatisfiable; a contradiction in any of the facts is
enough.

3. As an added benefit, it's now quite easy to avoid traversing beyond
provably-unreachable blocks. In contrast, the current
implementation always visits all blocks.

The prove pass already has nearly all of the mechanism necessary to
compute unsatisfiability, which means this both simplifies the code
and makes it more powerful.

The only complication is that the current implication procedure has a
hack for dealing with the 0 <= Args[0] condition of OpIsInBounds and
OpIsSliceInBounds. We replace this with asserting the appropriate fact
when we process one of these conditions. This seems much cleaner
anyway, and works because we can now take advantage of derived facts.

This has no measurable effect on compiler performance.

Effectiveness:

There is exactly one condition in all of std and cmd that this fails
to prove that the old implementation could: (int64(^uint(0)>>1) < x)
in encoding/gob. This can never be true because x is an int, and it's
basically coincidence that the old code gets this. (For example, it
fails to prove the similar (x < ^int64(^uint(0)>>1)) condition that
immediately precedes it, and even though the conditions are logically
unrelated, it wouldn't get the second one if it hadn't first processed
the first!)

It does, however, prove a few dozen additional branches. These come
from facts that are added to the fact table about the relations
between len and cap. These were almost never queried directly before,
but could lead to contradictions, which the unsat-based approach is
able to use.

There are exactly two branches in std and cmd that this implementation
proves in the *other* direction. This sounds scary, but is okay
because both occur in already-unreachable blocks, so it doesn't matter
what we chose. Because the fact table logic is sound but incomplete,
it fails to prove that the block isn't reachable, even though it is
able to prove that both outgoing branches are impossible. We could
turn these blocks into BlockExit blocks, but it doesn't seem worth the
trouble of the extra proof effort for something that happens twice in
all of std and cmd.

Tests:

This CL updates test/prove.go to change the expected messages because
it can no longer give a "reason" why it proved or disproved a
condition. It also adds a new test of a branch it couldn't prove
before.

It mostly guts test/sliceopt.go, removing everything related to slice
bounds optimizations and moving a few relevant tests to test/prove.go.
Much of this test is actually unreachable. The new prove pass figures
this out and doesn't try to prove anything about the unreachable
parts. The output on the unreachable parts is already suspect because
anything can be proved at that point, so it's really just a regression
test for an algorithm the compiler no longer uses.

This is a step toward fixing #23354. That issue is quite easy to fix
once we can use derived facts effectively.

Change-Id: Ia48a1b9ee081310579fe474e4a61857424ff8ce8
Reviewed-on: https://go-review.googlesource.com/87478Reviewed-by: Keith Randall <khr@golang.org>

669db2ce

cmd/compile: simplify limit logic in prove · 2e9cf5f6

Austin Clements authored Jan 10, 2018

This replaces the open-coded intersection of limits in the prove pass
with a general limit intersection operation. This should get identical
results except in one case where it's more precise: when handling an
equality relation, if the value is *outside* the existing range, this
will reduce the range to empty rather than resetting it. This will be
important to a follow-up CL where we can take advantage of empty
ranges.

For #23354.

Change-Id: I3d3d75924f61b1da1cb604b3a9d189b26fb3a14e
Reviewed-on: https://go-review.googlesource.com/87477
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>

2e9cf5f6

cmd/compile: more String methods for prove types · 44e20b64

Austin Clements authored Jan 09, 2018

These aid in debugging.

Change-Id: Ieb38c996765f780f6103f8c3292639d408c25123
Reviewed-on: https://go-review.googlesource.com/87476
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>

44e20b64

cmd/compile: minor comment improvements/corrections · 491f409a

Austin Clements authored Jan 09, 2018

Change-Id: Ie0934f1528d58d4971cdef726d3e2d23cf3935d3
Reviewed-on: https://go-review.googlesource.com/87475
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>

491f409a

Revert "cmd/compile: cleanup nodpc and nodfp" · b55eedd1

Matthew Dempsky authored Mar 08, 2018

This reverts commit dcac984b.

Reason for revert: broke LR architectures (arm64, ppc64, s390x)

Change-Id: I531d311c9053e81503c8c78d6cf044b318fc828b
Reviewed-on: https://go-review.googlesource.com/99695
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>

b55eedd1

math/big: allocate less in Float.Sqrt · 010579c2

Alberto Donizetti authored Jan 14, 2018

The Newton sqrtInverse procedure we use to compute Float.Sqrt should
not allocate a number of times proportional to the number of Newton
iterations we need to reach the desired precision.

At the beginning the function the target precision is known, so even
if we do want to perform the early steps at low precisions (to save
time), it's still possible to pre-allocate larger backing arrays, both
for the temp variables in the loop and the variable that'll hold the
final result.

There's one complication. At the following line:

u.Sub(three, u)

the Sub method will allocate, because the receiver aliases one of the
arguments, and the large backing array we initially allocated for u
will be replaced by a smaller one allocated by Sub. We can work around
this by introducing a second temp variable u2 that we use to hold the
Sub call result.

Overall, the sqrtInverse procedure still allocates a number of times
proportional to the number of Newton steps, because unfortunately a
few of the Mul calls in the Newton function allocate; but at least we
allocate less in the function itself.

FloatSqrt/256-4 1.97µs ± 1% 1.84µs ± 1% -6.61% (p=0.000 n=8+8)
FloatSqrt/1000-4 4.80µs ± 3% 4.28µs ± 1% -10.78% (p=0.000 n=8+8)
FloatSqrt/10000-4 40.0µs ± 1% 38.3µs ± 1% -4.15% (p=0.000 n=8+8)
FloatSqrt/100000-4 955µs ± 1% 932µs ± 0% -2.49% (p=0.000 n=8+7)
FloatSqrt/1000000-4 79.8ms ± 1% 79.4ms ± 1% ~ (p=0.105 n=8+8)

name old alloc/op new alloc/op delta
FloatSqrt/256-4 816B ± 0% 512B ± 0% -37.25% (p=0.000 n=8+8)
FloatSqrt/1000-4 2.50kB ± 0% 1.47kB ± 0% -41.03% (p=0.000 n=8+8)
FloatSqrt/10000-4 23.5kB ± 0% 18.2kB ± 0% -22.62% (p=0.000 n=8+8)
FloatSqrt/100000-4 251kB ± 0% 173kB ± 0% -31.26% (p=0.000 n=8+8)
FloatSqrt/1000000-4 4.61MB ± 0% 2.86MB ± 0% -37.90% (p=0.000 n=8+8)

name old allocs/op new allocs/op delta
FloatSqrt/256-4 12.0 ± 0% 8.0 ± 0% -33.33% (p=0.000 n=8+8)
FloatSqrt/1000-4 19.0 ± 0% 9.0 ± 0% -52.63% (p=0.000 n=8+8)
FloatSqrt/10000-4 35.0 ± 0% 14.0 ± 0% -60.00% (p=0.000 n=8+8)
FloatSqrt/100000-4 55.0 ± 0% 23.0 ± 0% -58.18% (p=0.000 n=8+8)
FloatSqrt/1000000-4 122 ± 0% 75 ± 0% -38.52% (p=0.000 n=8+8)

Change-Id: I950dbf61a40267a6cca82ae72524c3024bcb149c
Reviewed-on: https://go-review.googlesource.com/87659Reviewed-by: Robert Griesemer <gri@golang.org>

010579c2

math/big: speedup nat.setBytes for bigger slices · d2a5263a

isharipo authored Mar 06, 2018

Set up to _S (number of bytes in Uint) bytes at time
by using BigEndian.Uint32 and BigEndian.Uint64.

The performance improves for slices bigger than _S bytes.
This is the case for 128/256bit arith that initializes
it's objects from bytes.

name               old time/op  new time/op  delta
NatSetBytes/8-4    29.8ns ± 1%  11.4ns ± 0%  -61.63%  (p=0.000 n=9+8)
NatSetBytes/24-4    109ns ± 1%    56ns ± 0%  -48.75%  (p=0.000 n=9+8)
NatSetBytes/128-4   420ns ± 2%   110ns ± 1%  -73.83%  (p=0.000 n=10+10)
NatSetBytes/7-4    26.2ns ± 1%  21.3ns ± 2%  -18.63%  (p=0.000 n=8+9)
NatSetBytes/23-4    106ns ± 1%    67ns ± 1%  -36.93%  (p=0.000 n=9+10)
NatSetBytes/127-4   410ns ± 2%   121ns ± 0%  -70.46%  (p=0.000 n=9+8)

Found this optimization opportunity by looking at ethereum_corevm
community benchmark cpuprofile.

name        old time/op  new time/op  delta
OpDiv256-4   715ns ± 1%   596ns ± 1%  -16.57%  (p=0.008 n=5+5)
OpDiv128-4   373ns ± 1%   314ns ± 1%  -15.83%  (p=0.008 n=5+5)
OpDiv64-4    301ns ± 0%   285ns ± 1%   -5.12%  (p=0.008 n=5+5)

Change-Id: I8e5a680ae6284c8233d8d7431d51253a8a740b57
Reviewed-on: https://go-review.googlesource.com/98775
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

d2a5263a

cmd/compile: cleanup nodpc and nodfp · dcac984b

Matthew Dempsky authored Mar 08, 2018

Instead of creating a new &nodfp expression for every recover() call,
or a new nodpc variable for every function instrumented by the race
detector, this CL introduces two new uintptr-typed pseudo-variables
callerSP and callerPC. These pseudo-variables act just like calls to
the runtime's getcallersp() and getcallerpc() functions.

For consistency, change runtime.gorecover's builtin stub's parameter
type from "*int32" to "uintptr".

Passes toolstash-check, but toolstash-check -race fails because of
register allocator changes.

Change-Id: I985d644653de2dac8b7b03a28829ad04dfd4f358
Reviewed-on: https://go-review.googlesource.com/99416
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

dcac984b

cmd/compile: remove two out-of-phase calls to walk · 6a5cfa8b

Matthew Dempsky authored Mar 08, 2018

All calls to walkstmt/walkexpr/etc should be rooted from funccompile,
whereas transformclosure and fninit are called by main.

Passes toolstash-check.

Change-Id: Ic880e2d2d83af09618ce4daa8e7716f6b389e53e
Reviewed-on: https://go-review.googlesource.com/99418
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

6a5cfa8b

cmd/compile: remove state.exitCode · 8b766e5d

Matthew Dempsky authored Mar 08, 2018

We're holding onto the function's complete AST anyway, so might as
well grab the exit code from there.

Passes toolstash-check.

Change-Id: I851b5dfdb53f991e9cd9488d25d0d2abc2a8379f
Reviewed-on: https://go-review.googlesource.com/99417
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

8b766e5d

cmd/compile: fuse escape analysis parameter tagging loops · e3127f02

Matthew Dempsky authored Mar 07, 2018

Simplifies the code somewhat and allows removing Param.Field.

Passes toolstash-check.

Change-Id: Id854416aea8afd27ce4830ff0f5ff940f7353792
Reviewed-on: https://go-review.googlesource.com/99336
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>

e3127f02

net/http: panic when a nil handler is passed to (*ServeMux)HandleFunc · 7d654af5

Kunpei Sakai authored Mar 08, 2018

Fixes #24297

Change-Id: I759e88655632fda97dced240b3f13392b2785d0a
Reviewed-on: https://go-review.googlesource.com/99575Reviewed-by: Andrew Bonventre <andybons@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Andrew Bonventre <andybons@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7d654af5

time: add support for parsing timezones denoted by sign and offset · 9f2c611f

Michael Kasch authored Mar 02, 2018

IANA Zoneinfo does not provide names for all timezones. Some are denoted
by a sign and an offset only. E.g: Europe/Turkey is currently +03 or
America/La_Paz which is -04 (https://data.iana.org/time-zones/releases/tzdata2018c.tar.gz)

Fixes #24071

Change-Id: I9c230a719945e1263c5b52bab82084d22861be3e
Reviewed-on: https://go-review.googlesource.com/98157Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

9f2c611f

runtime: use systemstack around throw in sysSigaction · 3d69ef37

Ian Lance Taylor authored Mar 08, 2018

Try to fix the build on ppc64-linux and ppc64le-linux, avoiding:

--- FAIL: TestInlinedRoutineRecords (2.12s)
	dwarf_test.go:97: build: # command-line-arguments
		runtime.systemstack: nosplit stack overflow
			752	assumed on entry to runtime.sigtrampgo (nosplit)
			480	after runtime.sigtrampgo (nosplit) uses 272
			400	after runtime.sigfwdgo (nosplit) uses 80
			264	after runtime.setsig (nosplit) uses 136
			208	after runtime.sigaction (nosplit) uses 56
			136	after runtime.sysSigaction (nosplit) uses 72
			88	after runtime.throw (nosplit) uses 48
			16	after runtime.dopanic (nosplit) uses 72
			-16	after runtime.systemstack (nosplit) uses 32

	dwarf_test.go:98: build error: exit status 2
--- FAIL: TestAbstractOriginSanity (10.22s)
	dwarf_test.go:97: build: # command-line-arguments
		runtime.systemstack: nosplit stack overflow
			752	assumed on entry to runtime.sigtrampgo (nosplit)
			480	after runtime.sigtrampgo (nosplit) uses 272
			400	after runtime.sigfwdgo (nosplit) uses 80
			264	after runtime.setsig (nosplit) uses 136
			208	after runtime.sigaction (nosplit) uses 56
			136	after runtime.sysSigaction (nosplit) uses 72
			88	after runtime.throw (nosplit) uses 48
			16	after runtime.dopanic (nosplit) uses 72
			-16	after runtime.systemstack (nosplit) uses 32

	dwarf_test.go:98: build error: exit status 2
FAIL
FAIL	cmd/link/internal/ld	13.404s

Change-Id: I4840604adb0e9f68a8d8e24f2f2a1a17d1634a58
Reviewed-on: https://go-review.googlesource.com/99415Reviewed-by: Austin Clements <austin@google.com>

3d69ef37

test/codegen: port 2^n muls tests to codegen harness · 3772b2e1

Alberto Donizetti authored Mar 08, 2018

And delete them from the asm_test.go file.

Change-Id: I124c8c352299646ec7db0968cdb0fe59a3b5d83d
Reviewed-on: https://go-review.googlesource.com/99475
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>

3772b2e1

math/big: optimize addVW and subVW on arm64 · 0585d41c

erifan01 authored Oct 16, 2017

The biggest hot spot of the existing implementation is "load" operations, which lead to poor performance.
By unrolling the cycle 4 times and 2 times, and using "LDP", "STP" instructions,
this CL can reduce the "load" cost and improve performance.

Benchmarks:

name old time/op new time/op delta
AddVV/1-8 21.5ns ± 0% 21.5ns ± 0% ~ (all equal)
AddVV/2-8 13.5ns ± 0% 13.5ns ± 0% ~ (all equal)
AddVV/3-8 15.5ns ± 0% 15.5ns ± 0% ~ (all equal)
AddVV/4-8 17.5ns ± 0% 17.5ns ± 0% ~ (all equal)
AddVV/5-8 19.5ns ± 0% 19.5ns ± 0% ~ (all equal)
AddVV/10-8 29.5ns ± 0% 29.5ns ± 0% ~ (all equal)
AddVV/100-8 217ns ± 0% 217ns ± 0% ~ (all equal)
AddVV/1000-8 2.02µs ± 0% 2.02µs ± 0% ~ (all equal)
AddVV/10000-8 20.3µs ± 0% 20.3µs ± 0% ~ (p=0.603 n=5+5)
AddVV/100000-8 223µs ± 7% 228µs ± 8% ~ (p=0.548 n=5+5)
AddVW/1-8 9.32ns ± 0% 9.26ns ± 0% -0.64% (p=0.008 n=5+5)
AddVW/2-8 19.8ns ± 3% 10.5ns ± 0% -46.92% (p=0.008 n=5+5)
AddVW/3-8 11.5ns ± 0% 11.0ns ± 0% -4.35% (p=0.008 n=5+5)
AddVW/4-8 13.0ns ± 0% 12.0ns ± 0% -7.69% (p=0.008 n=5+5)
AddVW/5-8 14.5ns ± 0% 12.5ns ± 0% -13.79% (p=0.008 n=5+5)
AddVW/10-8 22.0ns ± 0% 15.5ns ± 0% -29.55% (p=0.008 n=5+5)
AddVW/100-8 167ns ± 0% 81ns ± 0% -51.44% (p=0.008 n=5+5)
AddVW/1000-8 1.52µs ± 0% 0.64µs ± 0% -57.58% (p=0.008 n=5+5)
AddVW/10000-8 15.1µs ± 0% 7.2µs ± 0% -52.55% (p=0.008 n=5+5)
AddVW/100000-8 150µs ± 0% 71µs ± 0% -52.95% (p=0.008 n=5+5)
SubVW/1-8 9.32ns ± 0% 9.26ns ± 0% -0.64% (p=0.008 n=5+5)
SubVW/2-8 19.7ns ± 2% 10.5ns ± 0% -46.70% (p=0.008 n=5+5)
SubVW/3-8 11.5ns ± 0% 11.0ns ± 0% -4.35% (p=0.008 n=5+5)
SubVW/4-8 13.0ns ± 0% 12.0ns ± 0% -7.69% (p=0.008 n=5+5)
SubVW/5-8 14.5ns ± 0% 12.5ns ± 0% -13.79% (p=0.008 n=5+5)
SubVW/10-8 22.0ns ± 0% 15.5ns ± 0% -29.55% (p=0.008 n=5+5)
SubVW/100-8 167ns ± 0% 81ns ± 0% -51.44% (p=0.008 n=5+5)
SubVW/1000-8 1.52µs ± 0% 0.64µs ± 0% -57.58% (p=0.008 n=5+5)
SubVW/10000-8 15.1µs ± 0% 7.2µs ± 0% -52.49% (p=0.008 n=5+5)
SubVW/100000-8 150µs ± 0% 71µs ± 0% -52.91% (p=0.008 n=5+5)
AddMulVVW/1-8 32.4ns ± 1% 32.6ns ± 1% ~ (p=0.119 n=5+5)
AddMulVVW/2-8 57.0ns ± 0% 57.0ns ± 0% ~ (p=0.643 n=5+5)
AddMulVVW/3-8 90.8ns ± 0% 90.7ns ± 0% ~ (p=0.524 n=5+5)
AddMulVVW/4-8 118ns ± 0% 118ns ± 1% ~ (p=1.000 n=4+5)
AddMulVVW/5-8 144ns ± 1% 144ns ± 0% ~ (p=0.794 n=5+4)
AddMulVVW/10-8 294ns ± 1% 296ns ± 0% +0.48% (p=0.040 n=5+5)
AddMulVVW/100-8 2.73µs ± 0% 2.73µs ± 0% ~ (p=0.278 n=5+5)
AddMulVVW/1000-8 26.0µs ± 0% 26.5µs ± 0% +2.14% (p=0.008 n=5+5)
AddMulVVW/10000-8 297µs ± 0% 297µs ± 0% +0.24% (p=0.008 n=5+5)
AddMulVVW/100000-8 3.15ms ± 1% 3.13ms ± 0% ~ (p=0.690 n=5+5)
DecimalConversion-8 311µs ± 2% 309µs ± 2% ~ (p=0.310 n=5+5)
FloatString/100-8 2.55µs ± 2% 2.54µs ± 2% ~ (p=1.000 n=5+5)
FloatString/1000-8 58.1µs ± 0% 58.1µs ± 0% ~ (p=0.151 n=5+5)
FloatString/10000-8 4.59ms ± 0% 4.59ms ± 0% ~ (p=0.151 n=5+5)
FloatString/100000-8 446ms ± 0% 446ms ± 0% +0.01% (p=0.016 n=5+5)
FloatAdd/10-8 183ns ± 0% 183ns ± 0% ~ (p=0.333 n=4+5)
FloatAdd/100-8 187ns ± 1% 192ns ± 2% ~ (p=0.056 n=5+5)
FloatAdd/1000-8 369ns ± 0% 371ns ± 0% +0.54% (p=0.016 n=4+5)
FloatAdd/10000-8 1.88µs ± 0% 1.88µs ± 0% -0.14% (p=0.000 n=4+5)
FloatAdd/100000-8 17.2µs ± 0% 17.1µs ± 0% -0.37% (p=0.008 n=5+5)
FloatSub/10-8 147ns ± 0% 147ns ± 0% ~ (all equal)
FloatSub/100-8 145ns ± 0% 146ns ± 0% ~ (p=0.238 n=5+4)
FloatSub/1000-8 241ns ± 0% 241ns ± 0% ~ (p=0.333 n=5+4)
FloatSub/10000-8 1.06µs ± 0% 1.06µs ± 0% ~ (p=0.444 n=5+5)
FloatSub/100000-8 9.50µs ± 0% 9.48µs ± 0% -0.14% (p=0.008 n=5+5)
ParseFloatSmallExp-8 28.4µs ± 2% 28.5µs ± 1% ~ (p=0.690 n=5+5)
ParseFloatLargeExp-8 125µs ± 1% 124µs ± 1% ~ (p=0.095 n=5+5)
GCD10x10/WithoutXY-8 277ns ± 2% 278ns ± 3% ~ (p=0.937 n=5+5)
GCD10x10/WithXY-8 2.08µs ± 3% 2.15µs ± 3% ~ (p=0.056 n=5+5)
GCD10x100/WithoutXY-8 592ns ± 3% 613ns ± 4% ~ (p=0.056 n=5+5)
GCD10x100/WithXY-8 3.40µs ± 2% 3.42µs ± 4% ~ (p=0.841 n=5+5)
GCD10x1000/WithoutXY-8 1.37µs ± 2% 1.35µs ± 3% ~ (p=0.460 n=5+5)
GCD10x1000/WithXY-8 7.34µs ± 2% 7.33µs ± 4% ~ (p=0.841 n=5+5)
GCD10x10000/WithoutXY-8 8.52µs ± 0% 8.51µs ± 1% ~ (p=0.421 n=5+5)
GCD10x10000/WithXY-8 27.5µs ± 2% 27.2µs ± 1% ~ (p=0.151 n=5+5)
GCD10x100000/WithoutXY-8 78.3µs ± 1% 78.5µs ± 1% ~ (p=0.690 n=5+5)
GCD10x100000/WithXY-8 231µs ± 0% 229µs ± 1% -1.11% (p=0.016 n=5+5)
GCD100x100/WithoutXY-8 1.86µs ± 2% 1.86µs ± 2% ~ (p=0.881 n=5+5)
GCD100x100/WithXY-8 27.1µs ± 2% 27.2µs ± 1% ~ (p=0.421 n=5+5)
GCD100x1000/WithoutXY-8 4.44µs ± 2% 4.41µs ± 1% ~ (p=0.310 n=5+5)
GCD100x1000/WithXY-8 36.3µs ± 1% 36.2µs ± 1% ~ (p=0.310 n=5+5)
GCD100x10000/WithoutXY-8 22.6µs ± 2% 22.5µs ± 1% ~ (p=0.690 n=5+5)
GCD100x10000/WithXY-8 145µs ± 1% 145µs ± 1% ~ (p=1.000 n=5+5)
GCD100x100000/WithoutXY-8 195µs ± 0% 196µs ± 1% ~ (p=0.548 n=5+5)
GCD100x100000/WithXY-8 1.10ms ± 0% 1.10ms ± 0% -0.30% (p=0.016 n=5+5)
GCD1000x1000/WithoutXY-8 25.0µs ± 1% 25.2µs ± 2% ~ (p=0.222 n=5+5)
GCD1000x1000/WithXY-8 520µs ± 0% 520µs ± 1% ~ (p=0.151 n=5+5)
GCD1000x10000/WithoutXY-8 57.0µs ± 1% 56.9µs ± 1% ~ (p=0.690 n=5+5)
GCD1000x10000/WithXY-8 1.21ms ± 0% 1.21ms ± 1% ~ (p=0.881 n=5+5)
GCD1000x100000/WithoutXY-8 358µs ± 0% 359µs ± 1% ~ (p=0.548 n=5+5)
GCD1000x100000/WithXY-8 8.73ms ± 0% 8.73ms ± 0% ~ (p=0.548 n=5+5)
GCD10000x10000/WithoutXY-8 686µs ± 0% 687µs ± 0% ~ (p=0.548 n=5+5)
GCD10000x10000/WithXY-8 15.9ms ± 0% 15.9ms ± 0% ~ (p=0.841 n=5+5)
GCD10000x100000/WithoutXY-8 2.08ms ± 0% 2.08ms ± 0% ~ (p=1.000 n=5+5)
GCD10000x100000/WithXY-8 86.7ms ± 0% 86.7ms ± 0% ~ (p=1.000 n=5+5)
GCD100000x100000/WithoutXY-8 51.1ms ± 0% 51.0ms ± 0% ~ (p=0.151 n=5+5)
GCD100000x100000/WithXY-8 1.23s ± 0% 1.23s ± 0% ~ (p=0.841 n=5+5)
Hilbert-8 2.41ms ± 1% 2.42ms ± 2% ~ (p=0.690 n=5+5)
Binomial-8 4.86µs ± 1% 4.86µs ± 1% ~ (p=0.889 n=5+5)
QuoRem-8 7.09µs ± 0% 7.08µs ± 0% -0.09% (p=0.024 n=5+5)
Exp-8 161ms ± 0% 161ms ± 0% -0.08% (p=0.032 n=5+5)
Exp2-8 161ms ± 0% 161ms ± 0% ~ (p=1.000 n=5+5)
Bitset-8 40.7ns ± 0% 40.6ns ± 0% ~ (p=0.095 n=4+5)
BitsetNeg-8 159ns ± 4% 148ns ± 0% -6.92% (p=0.016 n=5+4)
BitsetOrig-8 378ns ± 1% 378ns ± 1% ~ (p=0.937 n=5+5)
BitsetNegOrig-8 647ns ± 5% 647ns ± 4% ~ (p=1.000 n=5+5)
ModSqrt225_Tonelli-8 7.26ms ± 0% 7.27ms ± 0% ~ (p=1.000 n=5+5)
ModSqrt224_3Mod4-8 2.24ms ± 0% 2.24ms ± 0% ~ (p=0.690 n=5+5)
ModSqrt5430_Tonelli-8 62.8s ± 1% 62.5s ± 0% ~ (p=0.063 n=5+4)
ModSqrt5430_3Mod4-8 20.8s ± 0% 20.8s ± 0% ~ (p=0.310 n=5+5)
Sqrt-8 101µs ± 1% 101µs ± 0% -0.35% (p=0.032 n=5+5)
IntSqr/1-8 32.3ns ± 1% 32.5ns ± 1% ~ (p=0.421 n=5+5)
IntSqr/2-8 157ns ± 5% 156ns ± 5% ~ (p=0.651 n=5+5)
IntSqr/3-8 292ns ± 2% 291ns ± 3% ~ (p=0.881 n=5+5)
IntSqr/5-8 738ns ± 6% 740ns ± 5% ~ (p=0.841 n=5+5)
IntSqr/8-8 1.82µs ± 4% 1.83µs ± 4% ~ (p=0.730 n=5+5)
IntSqr/10-8 2.92µs ± 1% 2.93µs ± 1% ~ (p=0.643 n=5+5)
IntSqr/20-8 6.28µs ± 2% 6.28µs ± 2% ~ (p=1.000 n=5+5)
IntSqr/30-8 13.8µs ± 2% 13.9µs ± 3% ~ (p=1.000 n=5+5)
IntSqr/50-8 37.8µs ± 4% 37.9µs ± 4% ~ (p=0.690 n=5+5)
IntSqr/80-8 95.9µs ± 1% 95.8µs ± 1% ~ (p=0.841 n=5+5)
IntSqr/100-8 148µs ± 1% 148µs ± 1% ~ (p=0.310 n=5+5)
IntSqr/200-8 586µs ± 1% 586µs ± 1% ~ (p=0.841 n=5+5)
IntSqr/300-8 1.32ms ± 0% 1.31ms ± 0% ~ (p=0.222 n=5+5)
IntSqr/500-8 2.48ms ± 0% 2.48ms ± 0% ~ (p=0.556 n=5+4)
IntSqr/800-8 4.68ms ± 0% 4.68ms ± 0% ~ (p=0.548 n=5+5)
IntSqr/1000-8 7.57ms ± 0% 7.56ms ± 0% ~ (p=0.421 n=5+5)
Mul-8 311ms ± 0% 311ms ± 0% ~ (p=0.548 n=5+5)
Exp3Power/0x10-8 559ns ± 1% 560ns ± 1% ~ (p=0.984 n=5+5)
Exp3Power/0x40-8 641ns ± 1% 634ns ± 1% ~ (p=0.063 n=5+5)
Exp3Power/0x100-8 1.39µs ± 2% 1.40µs ± 2% ~ (p=0.381 n=5+5)
Exp3Power/0x400-8 8.27µs ± 1% 8.26µs ± 0% ~ (p=0.571 n=5+5)
Exp3Power/0x1000-8 59.9µs ± 0% 59.7µs ± 0% -0.23% (p=0.008 n=5+5)
Exp3Power/0x4000-8 816µs ± 0% 816µs ± 0% ~ (p=1.000 n=5+5)
Exp3Power/0x10000-8 7.77ms ± 0% 7.77ms ± 0% ~ (p=0.841 n=5+5)
Exp3Power/0x40000-8 73.4ms ± 0% 73.4ms ± 0% ~ (p=0.690 n=5+5)
Exp3Power/0x100000-8 665ms ± 0% 664ms ± 0% -0.14% (p=0.008 n=5+5)
Exp3Power/0x400000-8 5.98s ± 0% 5.98s ± 0% -0.09% (p=0.008 n=5+5)
Fibo-8 116ms ± 0% 116ms ± 0% -0.25% (p=0.008 n=5+5)
NatSqr/1-8 115ns ± 3% 116ns ± 2% ~ (p=0.238 n=5+5)
NatSqr/2-8 237ns ± 1% 237ns ± 1% ~ (p=0.683 n=5+5)
NatSqr/3-8 367ns ± 3% 368ns ± 3% ~ (p=0.817 n=5+5)
NatSqr/5-8 807ns ± 3% 812ns ± 3% ~ (p=0.913 n=5+5)
NatSqr/8-8 1.93µs ± 2% 1.93µs ± 3% ~ (p=0.651 n=5+5)
NatSqr/10-8 2.98µs ± 2% 2.99µs ± 2% ~ (p=0.690 n=5+5)
NatSqr/20-8 6.49µs ± 2% 6.46µs ± 2% ~ (p=0.548 n=5+5)
NatSqr/30-8 14.4µs ± 2% 14.3µs ± 2% ~ (p=0.690 n=5+5)
NatSqr/50-8 38.6µs ± 2% 38.7µs ± 2% ~ (p=0.841 n=5+5)
NatSqr/80-8 96.1µs ± 2% 95.8µs ± 2% ~ (p=0.548 n=5+5)
NatSqr/100-8 149µs ± 1% 149µs ± 1% ~ (p=0.841 n=5+5)
NatSqr/200-8 593µs ± 1% 590µs ± 1% ~ (p=0.421 n=5+5)
NatSqr/300-8 1.32ms ± 0% 1.32ms ± 1% ~ (p=0.222 n=5+5)
NatSqr/500-8 2.49ms ± 0% 2.49ms ± 0% ~ (p=0.690 n=5+5)
NatSqr/800-8 4.69ms ± 0% 4.69ms ± 0% ~ (p=1.000 n=5+5)
NatSqr/1000-8 7.59ms ± 0% 7.58ms ± 0% ~ (p=0.841 n=5+5)
ScanPi-8 322µs ± 0% 321µs ± 0% ~ (p=0.095 n=5+5)
StringPiParallel-8 71.4µs ± 5% 68.8µs ± 4% ~ (p=0.151 n=5+5)
Scan/10/Base2-8 1.10µs ± 0% 1.09µs ± 0% -0.36% (p=0.032 n=5+5)
Scan/100/Base2-8 7.78µs ± 0% 7.79µs ± 0% +0.14% (p=0.008 n=5+5)
Scan/1000/Base2-8 78.8µs ± 0% 79.0µs ± 0% +0.24% (p=0.008 n=5+5)
Scan/10000/Base2-8 1.22ms ± 0% 1.22ms ± 0% ~ (p=0.056 n=5+5)
Scan/100000/Base2-8 55.1ms ± 0% 55.0ms ± 0% -0.15% (p=0.008 n=5+5)
Scan/10/Base8-8 514ns ± 0% 515ns ± 0% ~ (p=0.079 n=5+5)
Scan/100/Base8-8 2.89µs ± 0% 2.89µs ± 0% +0.15% (p=0.008 n=5+5)
Scan/1000/Base8-8 31.0µs ± 0% 31.1µs ± 0% +0.12% (p=0.008 n=5+5)
Scan/10000/Base8-8 740µs ± 0% 740µs ± 0% ~ (p=0.222 n=5+5)
Scan/100000/Base8-8 50.6ms ± 0% 50.5ms ± 0% -0.06% (p=0.016 n=4+5)
Scan/10/Base10-8 492ns ± 1% 490ns ± 1% ~ (p=0.310 n=5+5)
Scan/100/Base10-8 2.67µs ± 0% 2.67µs ± 0% ~ (p=0.056 n=5+5)
Scan/1000/Base10-8 28.7µs ± 0% 28.7µs ± 0% ~ (p=1.000 n=5+5)
Scan/10000/Base10-8 717µs ± 0% 716µs ± 0% ~ (p=0.222 n=5+5)
Scan/100000/Base10-8 50.2ms ± 0% 50.3ms ± 0% +0.05% (p=0.008 n=5+5)
Scan/10/Base16-8 442ns ± 1% 442ns ± 0% ~ (p=0.468 n=5+5)
Scan/100/Base16-8 2.46µs ± 0% 2.45µs ± 0% ~ (p=0.159 n=5+5)
Scan/1000/Base16-8 27.2µs ± 0% 27.2µs ± 0% ~ (p=0.841 n=5+5)
Scan/10000/Base16-8 721µs ± 0% 722µs ± 0% ~ (p=0.548 n=5+5)
Scan/100000/Base16-8 52.6ms ± 0% 52.6ms ± 0% +0.07% (p=0.008 n=5+5)
String/10/Base2-8 244ns ± 1% 242ns ± 1% ~ (p=0.103 n=5+5)
String/100/Base2-8 1.48µs ± 0% 1.48µs ± 1% ~ (p=0.786 n=5+5)
String/1000/Base2-8 13.3µs ± 1% 13.3µs ± 0% ~ (p=0.222 n=5+5)
String/10000/Base2-8 132µs ± 1% 132µs ± 1% ~ (p=1.000 n=5+5)
String/100000/Base2-8 1.30ms ± 1% 1.30ms ± 1% ~ (p=1.000 n=5+5)
String/10/Base8-8 167ns ± 1% 168ns ± 1% ~ (p=0.135 n=5+5)
String/100/Base8-8 623ns ± 1% 626ns ± 1% ~ (p=0.151 n=5+5)
String/1000/Base8-8 5.24µs ± 1% 5.24µs ± 0% ~ (p=1.000 n=5+5)
String/10000/Base8-8 50.0µs ± 1% 50.0µs ± 1% ~ (p=1.000 n=5+5)
String/100000/Base8-8 492µs ± 1% 489µs ± 1% ~ (p=0.056 n=5+5)
String/10/Base10-8 503ns ± 1% 501ns ± 0% ~ (p=0.183 n=5+5)
String/100/Base10-8 1.96µs ± 0% 1.97µs ± 0% ~ (p=0.389 n=5+5)
String/1000/Base10-8 12.4µs ± 1% 12.4µs ± 1% ~ (p=0.841 n=5+5)
String/10000/Base10-8 56.7µs ± 1% 56.6µs ± 0% ~ (p=1.000 n=5+5)
String/100000/Base10-8 25.6ms ± 0% 25.6ms ± 0% ~ (p=0.222 n=5+5)
String/10/Base16-8 147ns ± 0% 148ns ± 2% ~ (p=1.000 n=4+5)
String/100/Base16-8 505ns ± 0% 505ns ± 1% ~ (p=0.778 n=5+5)
String/1000/Base16-8 3.94µs ± 0% 3.94µs ± 0% ~ (p=0.841 n=5+5)
String/10000/Base16-8 37.4µs ± 1% 37.2µs ± 1% ~ (p=0.095 n=5+5)
String/100000/Base16-8 367µs ± 1% 367µs ± 0% ~ (p=1.000 n=5+5)
LeafSize/0-8 6.64ms ± 0% 6.65ms ± 0% ~ (p=0.690 n=5+5)
LeafSize/1-8 72.5µs ± 1% 72.4µs ± 1% ~ (p=0.841 n=5+5)
LeafSize/2-8 72.6µs ± 1% 72.6µs ± 1% ~ (p=1.000 n=5+5)
LeafSize/3-8 377µs ± 0% 377µs ± 0% ~ (p=0.421 n=5+5)
LeafSize/4-8 71.2µs ± 1% 71.3µs ± 0% ~ (p=0.278 n=5+5)
LeafSize/5-8 469µs ± 0% 469µs ± 0% ~ (p=0.310 n=5+5)
LeafSize/6-8 376µs ± 0% 376µs ± 0% ~ (p=0.841 n=5+5)
LeafSize/7-8 244µs ± 0% 244µs ± 0% ~ (p=0.841 n=5+5)
LeafSize/8-8 71.9µs ± 1% 72.1µs ± 1% ~ (p=0.548 n=5+5)
LeafSize/9-8 536µs ± 0% 536µs ± 0% ~ (p=0.151 n=5+5)
LeafSize/10-8 470µs ± 0% 471µs ± 0% +0.10% (p=0.032 n=5+5)
LeafSize/11-8 458µs ± 0% 458µs ± 0% ~ (p=0.881 n=5+5)
LeafSize/12-8 376µs ± 0% 376µs ± 0% ~ (p=0.548 n=5+5)
LeafSize/13-8 341µs ± 0% 342µs ± 0% ~ (p=0.222 n=5+5)
LeafSize/14-8 246µs ± 0% 245µs ± 0% ~ (p=0.167 n=5+5)
LeafSize/15-8 168µs ± 0% 168µs ± 0% ~ (p=0.548 n=5+5)
LeafSize/16-8 72.1µs ± 1% 72.2µs ± 1% ~ (p=0.690 n=5+5)
LeafSize/32-8 81.5µs ± 1% 81.4µs ± 1% ~ (p=1.000 n=5+5)
LeafSize/64-8 133µs ± 1% 134µs ± 1% ~ (p=0.690 n=5+5)
ProbablyPrime/n=0-8 44.3ms ± 0% 44.2ms ± 0% -0.28% (p=0.008 n=5+5)
ProbablyPrime/n=1-8 64.8ms ± 0% 64.7ms ± 0% -0.15% (p=0.008 n=5+5)
ProbablyPrime/n=5-8 147ms ± 0% 147ms ± 0% -0.11% (p=0.008 n=5+5)
ProbablyPrime/n=10-8 250ms ± 0% 250ms ± 0% ~ (p=0.056 n=5+5)
ProbablyPrime/n=20-8 456ms ± 0% 455ms ± 0% -0.05% (p=0.008 n=5+5)
ProbablyPrime/Lucas-8 23.6ms ± 0% 23.5ms ± 0% -0.29% (p=0.008 n=5+5)
ProbablyPrime/MillerRabinBase2-8 20.6ms ± 0% 20.6ms ± 0% ~ (p=0.690 n=5+5)
FloatSqrt/64-8 2.01µs ± 1% 2.02µs ± 1% ~ (p=0.421 n=5+5)
FloatSqrt/128-8 4.43µs ± 2% 4.38µs ± 2% ~ (p=0.222 n=5+5)
FloatSqrt/256-8 6.64µs ± 1% 6.68µs ± 2% ~ (p=0.516 n=5+5)
FloatSqrt/1000-8 31.9µs ± 0% 31.8µs ± 0% ~ (p=0.095 n=5+5)
FloatSqrt/10000-8 595µs ± 0% 594µs ± 0% ~ (p=0.056 n=5+5)
FloatSqrt/100000-8 17.9ms ± 0% 17.9ms ± 0% ~ (p=0.151 n=5+5)
FloatSqrt/1000000-8 1.52s ± 0% 1.52s ± 0% ~ (p=0.841 n=5+5)

name old speed new speed delta
AddVV/1-8 2.97GB/s ± 0% 2.97GB/s ± 0% ~ (p=0.971 n=4+4)
AddVV/2-8 9.47GB/s ± 0% 9.47GB/s ± 0% +0.01% (p=0.016 n=5+5)
AddVV/3-8 12.4GB/s ± 0% 12.4GB/s ± 0% ~ (p=0.548 n=5+5)
AddVV/4-8 14.6GB/s ± 0% 14.6GB/s ± 0% ~ (p=1.000 n=5+5)
AddVV/5-8 16.4GB/s ± 0% 16.4GB/s ± 0% ~ (p=1.000 n=5+5)
AddVV/10-8 21.7GB/s ± 0% 21.7GB/s ± 0% ~ (p=0.548 n=5+5)
AddVV/100-8 29.4GB/s ± 0% 29.4GB/s ± 0% ~ (p=1.000 n=5+5)
AddVV/1000-8 31.7GB/s ± 0% 31.7GB/s ± 0% ~ (p=0.524 n=5+4)
AddVV/10000-8 31.5GB/s ± 0% 31.5GB/s ± 0% ~ (p=0.690 n=5+5)
AddVV/100000-8 28.8GB/s ± 7% 28.1GB/s ± 8% ~ (p=0.548 n=5+5)
AddVW/1-8 859MB/s ± 0% 864MB/s ± 0% +0.61% (p=0.008 n=5+5)
AddVW/2-8 809MB/s ± 2% 1520MB/s ± 0% +87.78% (p=0.008 n=5+5)
AddVW/3-8 2.08GB/s ± 0% 2.18GB/s ± 0% +4.54% (p=0.008 n=5+5)
AddVW/4-8 2.46GB/s ± 0% 2.66GB/s ± 0% +8.33% (p=0.016 n=4+5)
AddVW/5-8 2.76GB/s ± 0% 3.20GB/s ± 0% +16.03% (p=0.008 n=5+5)
AddVW/10-8 3.63GB/s ± 0% 5.15GB/s ± 0% +41.83% (p=0.008 n=5+5)
AddVW/100-8 4.79GB/s ± 0% 9.87GB/s ± 0% +106.12% (p=0.008 n=5+5)
AddVW/1000-8 5.27GB/s ± 0% 12.42GB/s ± 0% +135.74% (p=0.008 n=5+5)
AddVW/10000-8 5.31GB/s ± 0% 11.19GB/s ± 0% +110.71% (p=0.008 n=5+5)
AddVW/100000-8 5.32GB/s ± 0% 11.32GB/s ± 0% +112.56% (p=0.008 n=5+5)
SubVW/1-8 859MB/s ± 0% 864MB/s ± 0% +0.61% (p=0.008 n=5+5)
SubVW/2-8 812MB/s ± 2% 1520MB/s ± 0% +87.09% (p=0.008 n=5+5)
SubVW/3-8 2.08GB/s ± 0% 2.18GB/s ± 0% +4.55% (p=0.008 n=5+5)
SubVW/4-8 2.46GB/s ± 0% 2.66GB/s ± 0% +8.33% (p=0.008 n=5+5)
SubVW/5-8 2.75GB/s ± 0% 3.20GB/s ± 0% +16.03% (p=0.008 n=5+5)
SubVW/10-8 3.63GB/s ± 0% 5.15GB/s ± 0% +41.82% (p=0.008 n=5+5)
SubVW/100-8 4.79GB/s ± 0% 9.87GB/s ± 0% +106.13% (p=0.008 n=5+5)
SubVW/1000-8 5.27GB/s ± 0% 12.42GB/s ± 0% +135.74% (p=0.008 n=5+5)
SubVW/10000-8 5.31GB/s ± 0% 11.17GB/s ± 0% +110.44% (p=0.008 n=5+5)
SubVW/100000-8 5.32GB/s ± 0% 11.31GB/s ± 0% +112.35% (p=0.008 n=5+5)
AddMulVVW/1-8 1.97GB/s ± 1% 1.96GB/s ± 1% ~ (p=0.151 n=5+5)
AddMulVVW/2-8 2.24GB/s ± 0% 2.25GB/s ± 0% ~ (p=0.095 n=5+5)
AddMulVVW/3-8 2.11GB/s ± 0% 2.12GB/s ± 0% ~ (p=0.548 n=5+5)
AddMulVVW/4-8 2.17GB/s ± 1% 2.17GB/s ± 1% ~ (p=0.548 n=5+5)
AddMulVVW/5-8 2.22GB/s ± 1% 2.21GB/s ± 1% ~ (p=0.421 n=5+5)
AddMulVVW/10-8 2.17GB/s ± 1% 2.16GB/s ± 0% ~ (p=0.095 n=5+5)
AddMulVVW/100-8 2.35GB/s ± 0% 2.35GB/s ± 0% ~ (p=0.421 n=5+5)
AddMulVVW/1000-8 2.47GB/s ± 0% 2.41GB/s ± 0% -2.09% (p=0.008 n=5+5)
AddMulVVW/10000-8 2.16GB/s ± 0% 2.15GB/s ± 0% -0.23% (p=0.008 n=5+5)
AddMulVVW/100000-8 2.03GB/s ± 1% 2.04GB/s ± 0% ~ (p=0.690 n=5+5)

name old alloc/op new alloc/op delta
FloatString/100-8 400B ± 0% 400B ± 0% ~ (all equal)
FloatString/1000-8 3.22kB ± 0% 3.22kB ± 0% ~ (all equal)
FloatString/10000-8 55.6kB ± 0% 55.5kB ± 0% ~ (p=0.206 n=5+5)
FloatString/100000-8 627kB ± 0% 627kB ± 0% ~ (all equal)
FloatAdd/10-8 0.00B 0.00B ~ (all equal)
FloatAdd/100-8 0.00B 0.00B ~ (all equal)
FloatAdd/1000-8 0.00B 0.00B ~ (all equal)
FloatAdd/10000-8 0.00B 0.00B ~ (all equal)
FloatAdd/100000-8 0.00B 0.00B ~ (all equal)
FloatSub/10-8 0.00B 0.00B ~ (all equal)
FloatSub/100-8 0.00B 0.00B ~ (all equal)
FloatSub/1000-8 0.00B 0.00B ~ (all equal)
FloatSub/10000-8 0.00B 0.00B ~ (all equal)
FloatSub/100000-8 0.00B 0.00B ~ (all equal)
FloatSqrt/64-8 416B ± 0% 416B ± 0% ~ (all equal)
FloatSqrt/128-8 720B ± 0% 720B ± 0% ~ (all equal)
FloatSqrt/256-8 816B ± 0% 816B ± 0% ~ (all equal)
FloatSqrt/1000-8 2.50kB ± 0% 2.50kB ± 0% ~ (all equal)
FloatSqrt/10000-8 23.5kB ± 0% 23.5kB ± 0% ~ (all equal)
FloatSqrt/100000-8 251kB ± 0% 251kB ± 0% ~ (all equal)
FloatSqrt/1000000-8 4.61MB ± 0% 4.61MB ± 0% ~ (all equal)

name old allocs/op new allocs/op delta
FloatString/100-8 8.00 ± 0% 8.00 ± 0% ~ (all equal)
FloatString/1000-8 10.0 ± 0% 10.0 ± 0% ~ (all equal)
FloatString/10000-8 42.0 ± 0% 42.0 ± 0% ~ (all equal)
FloatString/100000-8 346 ± 0% 346 ± 0% ~ (all equal)
FloatAdd/10-8 0.00 0.00 ~ (all equal)
FloatAdd/100-8 0.00 0.00 ~ (all equal)
FloatAdd/1000-8 0.00 0.00 ~ (all equal)
FloatAdd/10000-8 0.00 0.00 ~ (all equal)
FloatAdd/100000-8 0.00 0.00 ~ (all equal)
FloatSub/10-8 0.00 0.00 ~ (all equal)
FloatSub/100-8 0.00 0.00 ~ (all equal)
FloatSub/1000-8 0.00 0.00 ~ (all equal)
FloatSub/10000-8 0.00 0.00 ~ (all equal)
FloatSub/100000-8 0.00 0.00 ~ (all equal)
FloatSqrt/64-8 9.00 ± 0% 9.00 ± 0% ~ (all equal)
FloatSqrt/128-8 13.0 ± 0% 13.0 ± 0% ~ (all equal)
FloatSqrt/256-8 12.0 ± 0% 12.0 ± 0% ~ (all equal)
FloatSqrt/1000-8 19.0 ± 0% 19.0 ± 0% ~ (all equal)
FloatSqrt/10000-8 35.0 ± 0% 35.0 ± 0% ~ (all equal)
FloatSqrt/100000-8 55.0 ± 0% 55.0 ± 0% ~ (all equal)
FloatSqrt/1000000-8 122 ± 0% 122 ± 0% ~ (all equal)

Change-Id: I6888d84c037d91f9e2199f3492ea3f6a0ed77b24
Reviewed-on: https://go-review.googlesource.com/77832Reviewed-by: Vlad Krasnov <vlad@cloudflare.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

0585d41c

cmd/asm, cmd/internal/obj/ppc64: avoid unnecessary load zeros · 5b14c7b3

Lynn Boger authored Feb 19, 2018

When instructions add, and, or, xor, and movd have
constant operands in some cases more instructions are
generated than necessary by the assembler.

This adds more opcode/operand combinations to the optab
and improves the code generation for the cases where the
size and sign of the constant allows the use of 1
instructions instead of 2.

Example of previous code:
	oris r3, r0, 0
	ori  r3, r3, 65533

now:
	ori r3, r0, 65533

This does not significantly reduce the overall binary size
because the improvement depends on the constant value.
Some procedures show a 1-2% reduction in size. This improvement
could also be significant in cases where the extra instructions
occur in a critical loop.

Testcase ppc64enc.s was added to cmd/asm/internal/asm/testdata
with the variations affected by this change.

Updates #23845

Change-Id: I7fdf2320c95815d99f2755ba77d0c6921cd7fad7
Reviewed-on: https://go-review.googlesource.com/95135
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>

5b14c7b3

encoding/csv: avoid mangling invalid UTF-8 in Writer · 0add9a4d

Joe Tsai authored Mar 07, 2018

In the situation where a quoted field is necessary, avoid processing
each UTF-8 rune one-by-one, which causes mangling of invalid sequences
into utf8.RuneError, causing a loss of information.
Instead, search only for the escaped characters, handle those specially
and copy everything else in between verbatim.

This symmetrically matches the behavior of Reader.

Fixes #24298

Change-Id: I9276f64891084ce8487678f663fad711b4095dbb
Reviewed-on: https://go-review.googlesource.com/99297
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>

0add9a4d

cmd/compile: mark anonymous receiver parameters as non-escaping · 88466e93

Matthew Dempsky authored Mar 07, 2018

This was already done for normal parameters, and the same logic
applies for receiver parameters too.

Updates #24305.

Change-Id: Ia2a46f68d14e8fb62004ff0da1db0f065a95a1b7
Reviewed-on: https://go-review.googlesource.com/99335
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

88466e93

07 Mar, 2018 9 commits

cmd/cover: don't crash on non-gofmt'ed input · 8b8625a3

Ian Lance Taylor authored Mar 06, 2018

Without the change to cover.go, the new test fails with

panic: overlapping edits: [4946,4950)->"", [4947,4947)->"thisNameMustBeVeryLongToCauseOverflowOfCounterIncrementStatementOntoNextLineForTest.Count[112]++;"

The original code inserts "else{", deletes "else", and then positions
a new block just after the "}" that must come before the "else".
That works on gofmt'ed code, but fails if the code looks like "}else".
When there is no space between the "{" and the "else", the new block
is inserted into a location that we are deleting, leading to the
"overlapping edits" mentioned above.

This CL fixes this case by not deleting the "else" but just using the
one that is already there. That requires adjust the block offset to
come after the "{" that we insert.

Fixes #23927

Change-Id: I40ef592490878765bbce6550ddb439e43ac525b2
Reviewed-on: https://go-review.googlesource.com/98935
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>

8b8625a3

runtime: get traceback from VDSO code · 419c0645

Ian Lance Taylor authored Feb 26, 2018

Currently if a profiling signal arrives while executing within a VDSO
the profiler will report _ExternalCode, which is needlessly confusing
for a pure Go program. Change the VDSO calling code to record the
caller's PC/SP, so that we can do a traceback from that point. If that
fails for some reason, report _VDSO rather than _ExternalCode, which
should at least point in the right direction.

This adds some instructions to the code that calls the VDSO, but the
slowdown is reasonably negligible:

name                                  old time/op  new time/op  delta
ClockVDSOAndFallbackPaths/vDSO-8      40.5ns ± 2%  41.3ns ± 1%  +1.85%  (p=0.002 n=10+10)
ClockVDSOAndFallbackPaths/Fallback-8  41.9ns ± 1%  43.5ns ± 1%  +3.84%  (p=0.000 n=9+9)
TimeNow-8                             41.5ns ± 3%  41.5ns ± 2%    ~     (p=0.723 n=10+10)

Fixes #24142

Change-Id: Iacd935db3c4c782150b3809aaa675a71799b1c9c
Reviewed-on: https://go-review.googlesource.com/97315
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>

419c0645

runtime: change from rt_sigaction to sigaction · c2f28de7

Ian Lance Taylor authored Mar 07, 2018

This normalizes the Linux code to act like other targets. The size
argument to the rt_sigaction system call is pushed to a single
function, sysSigaction.

This is intended as a simplification step for CL 93875 for #14327.

Change-Id: I594788e235f0da20e16e8a028e27ac8c883907c4
Reviewed-on: https://go-review.googlesource.com/99077
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>

c2f28de7

cmd/dist: skip rebuild before running tests when on the build systems · d8c9ef9e

Brad Fitzpatrick authored Mar 07, 2018

Updates #24300

Change-Id: I7752dab67e15a6dfe5fffe5b5ccbf3373bbc2c13
Reviewed-on: https://go-review.googlesource.com/99296Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

d8c9ef9e

math/big: implement addMulVVW on arm64 · fd3d2793

Vlad Krasnov authored Nov 06, 2017

The lack of proper addMulVVW implementation for arm64 hurts RSA performance.

This assembly implementation is optimized for arm64 based servers.

name old time/op new time/op delta
pkg:math/big goos:linux goarch:arm64
AddMulVVW/1 55.2ns ± 0% 11.9ns ± 1% -78.37% (p=0.000 n=8+10)
AddMulVVW/2 67.0ns ± 0% 11.2ns ± 0% -83.28% (p=0.000 n=7+10)
AddMulVVW/3 93.2ns ± 0% 13.2ns ± 0% -85.84% (p=0.000 n=10+10)
AddMulVVW/4 126ns ± 0% 13ns ± 1% -89.82% (p=0.000 n=10+10)
AddMulVVW/5 151ns ± 0% 17ns ± 0% -88.87% (p=0.000 n=10+9)
AddMulVVW/10 323ns ± 0% 25ns ± 0% -92.20% (p=0.000 n=10+10)
AddMulVVW/100 3.28µs ± 0% 0.14µs ± 0% -95.82% (p=0.000 n=10+10)
AddMulVVW/1000 31.7µs ± 0% 1.3µs ± 0% -96.00% (p=0.000 n=10+8)
AddMulVVW/10000 313µs ± 0% 13µs ± 0% -95.98% (p=0.000 n=10+10)
AddMulVVW/100000 3.24ms ± 0% 0.13ms ± 1% -96.13% (p=0.000 n=9+9)
pkg:crypto/rsa goos:linux goarch:arm64
RSA2048Decrypt 44.7ms ± 0% 4.0ms ± 6% -91.08% (p=0.000 n=8+10)
RSA2048Sign 46.3ms ± 0% 5.0ms ± 0% -89.29% (p=0.000 n=9+10)
3PrimeRSA2048Decrypt 22.3ms ± 0% 2.4ms ± 0% -89.26% (p=0.000 n=10+10)

Change-Id: I295f0bd5c51a4442d02c44ece1f6026d30dff0bc
Reviewed-on: https://go-review.googlesource.com/76270Reviewed-by: Vlad Krasnov <vlad@cloudflare.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Vlad Krasnov <vlad@cloudflare.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

fd3d2793

cmd/go: skip TestVetWithOnlyCgoFiles when cgo is disabled · b1335037

David du Colombier authored Mar 07, 2018

CL 99175 added TestVetWithOnlyCgoFiles. However, this
test is failing on platforms where cgo is disabled,
because no file can be built.

This change fixes TestVetWithOnlyCgoFiles by skipping
this test when cgo is disabled.

Fixes #24304.

Change-Id: Ibb38fcd3e0ed1a791782145d3f2866f12117c6fe
Reviewed-on: https://go-review.googlesource.com/99275Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

b1335037

runtime/cgo: make sure nil is undefined before defining it · 7a2a96d6

Elias Naur authored Mar 07, 2018

While working on standalone builds of gomobile bindings, I ran into
errors on the form:

gcc_darwin_arm.c:30:31: error: ambiguous expansion of macro 'nil' [-Werror,-Wambiguous-macro]
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS11.2.sdk/usr/include/MacTypes.h:94:15: note: expanding this definition of 'nil'

Fix it by undefining nil before defining it in libcgo.h.

Change-Id: I8e9660a68c6c351e592684d03d529f0d182c0493
Reviewed-on: https://go-review.googlesource.com/99215
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>

7a2a96d6

cmd/go: run vet on packages with only cgo files · 709da955

Ian Lance Taylor authored Mar 07, 2018

CgoFiles is not included in GoFiles, so we need to check both.

Fixes #24193

Change-Id: I6a67bd912e3d9a4be0eae8fa8db6fa8a07fb5df3
Reviewed-on: https://go-review.googlesource.com/99175
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

709da955

cmd/compile: prevent untyped types from reaching walk · a3b3284d

Matthew Dempsky authored Mar 02, 2018

We already require expressions to have already been typechecked before
reaching walk. Moreover, all untyped expressions should have been
converted to their default type by walk.

However, in practice, we've been somewhat sloppy and inconsistent
about ensuring this. In particular, a lot of AST rewrites ended up
leaving untyped bool expressions scattered around. These likely aren't
harmful in practice, but it seems worth cleaning up.

The two most common cases addressed by this CL are:

1) When generating OIF and OFOR nodes, we would often typecheck the
conditional expression, but not apply defaultlit to force it to the
expression's default type.

2) When rewriting string comparisons into more fundamental primitives,
we were simply overwriting r.Type with the desired type, which didn't
propagate the type to nested subexpressions. These are fixed by
utilizing finishcompare, which correctly handles this (and is already
used by other comparison lowering rewrites).

Lastly, walkexpr is extended to assert that it's not called on untyped
expressions.

Fixes #23834.

Change-Id: Icbd29648a293555e4015d3b06a95a24ccbd3f790
Reviewed-on: https://go-review.googlesource.com/98337Reviewed-by: Robert Griesemer <gri@golang.org>

a3b3284d