Commits · 8c954d57801d8ea855003425fbbbf78de8733e6a · go / golang

23 Jul, 2015 7 commits

[dev.ssa] cmd/compile: speed up cse · 8c954d57

Josh Bleecher Snyder authored Jul 15, 2015

By walking only the current set of partitions
at any given point, the cse pass ended up doing
lots of extraneous, effectively O(n^2) work.

Using a regular for loop allows each cse pass to
make as much progress as possible by processing
each new class as it is introduced.
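
The effect of the loop change can be illustrated with a small sketch. Go's range evaluates its slice operand once, so classes appended during a pass are not visited until the next pass, while an index loop that re-reads len picks them up immediately. This is not the cse code: countVisitsRange, countVisitsIndexed, and the rule that visiting the newest class creates one more class are assumptions made up for the demonstration.

```go
package main

import "fmt"

// countVisitsRange repeats full passes over classes until a pass adds nothing.
// Hypothetical rule: visiting the newest class k creates class k+1 while k+1 < n.
func countVisitsRange(n int) (visits int) {
	classes := []int{0}
	for changed := true; changed; {
		changed = false
		for _, k := range classes { // operand evaluated once: new classes wait for the next pass
			visits++
			if k == len(classes)-1 && k+1 < n {
				classes = append(classes, k+1)
				changed = true
			}
		}
	}
	return visits
}

// countVisitsIndexed does the same work with a regular for loop.
func countVisitsIndexed(n int) (visits int) {
	classes := []int{0}
	for changed := true; changed; {
		changed = false
		for i := 0; i < len(classes); i++ { // len is re-read: new classes are visited in this pass
			k := classes[i]
			visits++
			if k == len(classes)-1 && k+1 < n {
				classes = append(classes, k+1)
				changed = true
			}
		}
	}
	return visits
}

func main() {
	fmt.Println(countVisitsRange(100), countVisitsIndexed(100)) // 5050 200
}
```

With n classes the range version makes n(n+1)/2 visits and the indexed version 2n, which is the quadratic-versus-linear gap described above.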

This can and should be optimized further,
but it already reduces cse time on test/slice3.go by 75%.

The overall time to compile test/slice3.go is still
dominated by the O(n^2) work in the liveness pass.
However, Keith is rewriting regalloc anyway.

Change-Id: I8be020b2f69352234587eeadeba923481bf43fcc
Reviewed-on: https://go-review.googlesource.com/12244
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile: don't combine phi vars from different blocks in CSE · 00437ebe

Josh Bleecher Snyder authored Jul 21, 2015

Here is a concrete case in which this goes wrong.

func f_ssa() int {
	var n int
Next:
	for j := 0; j < 3; j++ {
		for i := 0; i < 10; i++ {
			if i == 6 {
				continue Next
			}
			n = i
		}
		n += j + j + j + j + j + j + j + j + j + j // j * 10
	}
	return n
}

What follows is the function printout before and after CSE.

Note blocks b8 and b10 in the before case.

b8 is the inner loop's condition: i < 10.
b10 is the inner loop's increment: i++.
v82 is i. On entry to b8, it is either 0 (v19) the first time,
or the result of incrementing v82, by way of v29.

The CSE pass considered v82 and v49 to be common subexpressions,
and eliminated v82 in favor of v49.

In the after case, v82 is now dead and will shortly be eliminated.
As a result, v29 is also dead, and we have lost the increment.
The loop runs forever.
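
One way to picture the fix is as a change to the key used to group candidate values. The sketch below is a hypothetical model, not the cse implementation: the Value type and the naiveKey and blockAwareKey helpers are invented here, and a real pass also refines classes by comparing arguments. It only shows that two phis with the same opcode and type should start in different classes when they live in different blocks.

```go
package main

import "fmt"

// Value is a pared-down, hypothetical model of an SSA value.
type Value struct {
	ID    int
	Op    string
	Type  string
	Block int // ID of the block that contains the value
}

// naiveKey groups values by opcode and type only.
func naiveKey(v Value) string {
	return v.Op + "/" + v.Type
}

// blockAwareKey also separates phis that live in different blocks,
// because a phi's meaning depends on the predecessors of its own block.
func blockAwareKey(v Value) string {
	if v.Op == "Phi" {
		return fmt.Sprintf("%s/%s/b%d", v.Op, v.Type, v.Block)
	}
	return naiveKey(v)
}

func main() {
	v49 := Value{ID: 49, Op: "Phi", Type: "int", Block: 4} // j, outer loop header
	v82 := Value{ID: 82, Op: "Phi", Type: "int", Block: 8} // i, inner loop header

	fmt.Println(naiveKey(v49) == naiveKey(v82))           // true: candidates for merging
	fmt.Println(blockAwareKey(v49) == blockAwareKey(v82)) // false: kept apart
}
```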

BEFORE CSE

f_ssa <nil>
  b1:
    v1 = Arg <mem>
    v2 = SP <uint64>
    v4 = Addr <*int> {~r0} v2
    v13 = Zero <mem> [8] v4 v1
    v14 = Const <int>
    v15 = Const <int>
    v17 = Const <int> [3]
    v19 = Const <int>
    v21 = Const <int> [10]
    v24 = Const <int> [6]
    v28 = Const <int> [1]
    v43 = Const <int> [1]
    Plain -> b3
  b2: <- b7
    Exit v47
  b3: <- b1
    Plain -> b4
  b4: <- b3 b6
    v49 = Phi <int> v15 v44
    v68 = Phi <int> v14 v67
    v81 = Phi <mem> v13 v81
    v18 = Less <bool> v49 v17
    If v18 -> b5 b7
  b5: <- b4
    Plain -> b8
  b6: <- b12 b11
    v67 = Phi <int> v66 v41
    v44 = Add <int> v49 v43
    Plain -> b4
  b7: <- b4
    v47 = Store <mem> v4 v68 v81
    Plain -> b2
  b8: <- b5 b10
    v66 = Phi <int> v68 v82
    v82 = Phi <int> v19 v29
    v22 = Less <bool> v82 v21
    If v22 -> b9 b11
  b9: <- b8
    v25 = Eq <bool> v82 v24
    If v25 -> b12 b13
  b10: <- b13
    v29 = Add <int> v82 v28
    Plain -> b8
  b11: <- b8
    v32 = Add <int> v49 v49
    v33 = Add <int> v32 v49
    v34 = Add <int> v33 v49
    v35 = Add <int> v34 v49
    v36 = Add <int> v35 v49
    v37 = Add <int> v36 v49
    v38 = Add <int> v37 v49
    v39 = Add <int> v38 v49
    v40 = Add <int> v39 v49
    v41 = Add <int> v66 v40
    Plain -> b6
  b12: <- b9
    Plain -> b6
  b13: <- b9
    Plain -> b10

AFTER CSE

f_ssa <nil>
  b1:
    v1 = Arg <mem>
    v2 = SP <uint64>
    v4 = Addr <*int> {~r0} v2
    v13 = Zero <mem> [8] v4 v1
    v14 = Const <int>
    v15 = Const <int>
    v17 = Const <int> [3]
    v19 = Const <int>
    v21 = Const <int> [10]
    v24 = Const <int> [6]
    v28 = Const <int> [1]
    v43 = Const <int> [1]
    Plain -> b3
  b2: <- b7
    Exit v47
  b3: <- b1
    Plain -> b4
  b4: <- b3 b6
    v49 = Phi <int> v19 v44
    v68 = Phi <int> v19 v67
    v81 = Phi <mem> v13 v81
    v18 = Less <bool> v49 v17
    If v18 -> b5 b7
  b5: <- b4
    Plain -> b8
  b6: <- b12 b11
    v67 = Phi <int> v66 v41
    v44 = Add <int> v49 v43
    Plain -> b4
  b7: <- b4
    v47 = Store <mem> v4 v68 v81
    Plain -> b2
  b8: <- b5 b10
    v66 = Phi <int> v68 v49
    v82 = Phi <int> v19 v29
    v22 = Less <bool> v49 v21
    If v22 -> b9 b11
  b9: <- b8
    v25 = Eq <bool> v49 v24
    If v25 -> b12 b13
  b10: <- b13
    v29 = Add <int> v49 v43
    Plain -> b8
  b11: <- b8
    v32 = Add <int> v49 v49
    v33 = Add <int> v32 v49
    v34 = Add <int> v33 v49
    v35 = Add <int> v34 v49
    v36 = Add <int> v35 v49
    v37 = Add <int> v36 v49
    v38 = Add <int> v37 v49
    v39 = Add <int> v38 v49
    v40 = Add <int> v39 v49
    v41 = Add <int> v66 v40
    Plain -> b6
  b12: <- b9
    Plain -> b6
  b13: <- b9
    Plain -> b10

Change-Id: I16fc4ec527ec63f24f7d0d79d1a4a59bf37269de
Reviewed-on: https://go-review.googlesource.com/12444
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile/internal/ssa: implement multiplies · be1eb57a

Keith Randall authored Jul 22, 2015

Use width-and-signed-specific multiply opcodes.
Implement OMUL.
A few other cleanups.

Fixes #11467

Change-Id: Ib0fe80a1a9b7208dbb8a2b6b652a478847f5d244
Reviewed-on: https://go-review.googlesource.com/12540
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>

[dev.ssa] cmd/compile: speed up liveness analysis · d5297f72

Josh Bleecher Snyder authored Jul 23, 2015

This reduces the wall time to run test/slice3.go
on my laptop from >10m to ~20s.

This could perhaps be further reduced by using
a worklist of blocks and/or implementing the
suggestion in the comment in this CL, but at this
point, it's fast enough that there is no need.
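
For reference, the worklist idea mentioned above can be sketched as follows. This is not what the CL does and not the compiler's data structures: the block type, the bitmask encoding of variables, and the liveIn function are assumptions chosen to keep the example short. A block is recomputed only when the live-in set of one of its successors has changed.

```go
package main

import "fmt"

// block is a hypothetical basic block. Bit i of use/def stands for variable i.
type block struct {
	use   uint64 // variables read before being written in the block
	def   uint64 // variables written in the block
	succs []int  // indices of successor blocks
}

// liveIn returns the live-in set of every block, using a worklist.
func liveIn(blocks []block) []uint64 {
	preds := make([][]int, len(blocks))
	for i, b := range blocks {
		for _, s := range b.succs {
			preds[s] = append(preds[s], i)
		}
	}
	in := make([]uint64, len(blocks))
	queued := make([]bool, len(blocks))
	work := make([]int, 0, len(blocks))
	for i := range blocks { // start with every block queued; later blocks pop first
		work = append(work, i)
		queued[i] = true
	}
	for len(work) > 0 {
		b := work[len(work)-1]
		work = work[:len(work)-1]
		queued[b] = false

		var out uint64 // live-out is the union of the successors' live-in sets
		for _, s := range blocks[b].succs {
			out |= in[s]
		}
		newIn := blocks[b].use | (out &^ blocks[b].def)
		if newIn == in[b] {
			continue
		}
		in[b] = newIn
		for _, p := range preds[b] { // only predecessors can be affected
			if !queued[p] {
				queued[p] = true
				work = append(work, p)
			}
		}
	}
	return in
}

func main() {
	const x, y = 1, 2 // bit 0 is x, bit 1 is y
	blocks := []block{
		{def: x | y, succs: []int{1}},  // b0: x, y = ...
		{use: x, succs: []int{2, 3}},   // b1: if x < ...
		{use: x, def: x, succs: []int{1}}, // b2: x = x + 1
		{use: y},                       // b3: return y
	}
	fmt.Println(liveIn(blocks)) // [0 3 3 2]
}
```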

Change-Id: I741119e0c8310051d7185459f78be8b89237b85b
Reviewed-on: https://go-review.googlesource.com/12564
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile: add some common binary ops · e61e7c96

Josh Bleecher Snyder authored Jul 23, 2015

Change-Id: I1af486a69960b9b66d5c2c9bbfcf7db6ef075d8c
Reviewed-on: https://go-review.googlesource.com/12563
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile: minor cleanup · e0ac5c53

Josh Bleecher Snyder authored Jul 21, 2015

Change-Id: Ib33f3b1cfa09f410675d275e214d8ddc246c53c3
Reviewed-on: https://go-review.googlesource.com/12548
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile: implement control flow handling · 61aa0953

Josh Bleecher Snyder authored Jul 20, 2015

Add label and goto checks and improve test coverage.

Implement OSWITCH and OSELECT.

Implement OBREAK and OCONTINUE.

Allow generation of code in dead blocks.

Change-Id: Ibebb7c98b4b2344f46d38db7c9dce058c56beaac
Reviewed-on: https://go-review.googlesource.com/12445
Reviewed-by: Keith Randall <khr@golang.org>

22 Jul, 2015 2 commits

[dev.ssa] cmd/compile/internal/ssa/gen: generalize strength reduction. · 3e7e519c

Alexandru Moșoi authored Jul 17, 2015

Handle multiplication by -1, 0, 3, 5, 9, and all powers of two.
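
As an illustration of what these reductions compute, here is a minimal sketch in ordinary Go rather than in rewrite rules. The mulConst helper is hypothetical; it only shows the arithmetic identity behind each case and reports whether the constant is one of those listed above.

```go
package main

import (
	"fmt"
	"math/bits"
)

// mulConst computes x*c without a multiply for the listed constants.
// The second result reports whether c was one of the reducible constants.
func mulConst(x, c int64) (int64, bool) {
	switch {
	case c == -1:
		return -x, true
	case c == 0:
		return 0, true
	case c == 3:
		return x + x<<1, true // x + 2x
	case c == 5:
		return x + x<<2, true // x + 4x
	case c == 9:
		return x + x<<3, true // x + 8x
	case c > 0 && c&(c-1) == 0: // exactly one bit set: a power of two
		return x << uint(bits.TrailingZeros64(uint64(c))), true
	}
	return x * c, false // fall back to a real multiply
}

func main() {
	for _, c := range []int64{-1, 0, 3, 5, 9, 64} {
		got, _ := mulConst(7, c)
		fmt.Println(c, got, got == 7*c)
	}
}
```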

Change-Id: I8e87e7670dae389aebf6f446d7a56950cacb59e0
Reviewed-on: https://go-review.googlesource.com/12350
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile/internal/ssa/gen: implement OMINUS · 954d5ada

Alexandru Moșoi authored Jul 21, 2015

Change-Id: Ibc645d6cf229ecc18af3549dd3750be9d7451abe
Reviewed-on: https://go-review.googlesource.com/12472
Reviewed-by: Keith Randall <khr@golang.org>

21 Jul, 2015 11 commits

[dev.ssa] cmd/compile: don't generate zero values for ssa ops · 8fb63581

Josh Bleecher Snyder authored Jul 21, 2015

Shorter code, easier to read, no pointless empty slices.

Change-Id: Id410364b4f6924b5665188af3373a5e914117c38
Reviewed-on: https://go-review.googlesource.com/12480
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile: fix build · ac1935b3

Josh Bleecher Snyder authored Jul 21, 2015

Bad rebase in CL 12439.

Change-Id: I7ad359519c6274be37456b655f19bf0ca6ac6692
Reviewed-on: https://go-review.googlesource.com/12449
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>

[dev.ssa] cmd/compile: implement "if SETEQ" branches · a402b58e

Josh Bleecher Snyder authored Jul 20, 2015

Change-Id: I814fd0c2f1a622cca7dfd1b771f81de309a1904c
Reviewed-on: https://go-review.googlesource.com/12441
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile: call through to expr for expression statements · 2574e4ac

Josh Bleecher Snyder authored Jul 16, 2015

Change-Id: I8625eff33f5a49dbaaec060c3fa067d7531193c4
Reviewed-on: https://go-review.googlesource.com/12313
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile: fix stackalloc handling of zero-aligned variables · 67bfd695

Josh Bleecher Snyder authored Jul 20, 2015

Prior to this fix, a zero-aligned variable such as a flags
variable would reset n to 0.
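
One plausible way such a reset can happen is a power-of-two rounding helper called with an alignment of 0. The sketch below assumes that n is a running stack offset and that rounding uses the usual mask trick; alignUnsafe and alignSafe are hypothetical names, not the functions in the compiler.

```go
package main

import "fmt"

// alignUnsafe rounds n up to a multiple of a, where a is a power of two.
// With a == 0 the mask a-1 is all ones, so every bit of n is cleared.
func alignUnsafe(n, a int64) int64 {
	return (n + a - 1) &^ (a - 1)
}

// alignSafe leaves n untouched when there is no alignment requirement.
func alignSafe(n, a int64) int64 {
	if a == 0 {
		return n
	}
	return (n + a - 1) &^ (a - 1)
}

func main() {
	fmt.Println(alignUnsafe(20, 8)) // 24: rounded up as intended
	fmt.Println(alignUnsafe(24, 0)) // 0: the offset is lost
	fmt.Println(alignSafe(24, 0))   // 24: the offset is preserved
}
```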

While we're here, log the stack layout so that debugging
and reading the generated assembly is easier.

Change-Id: I18ef83ea95b6ea877c83f2e595e14c48c9ad7d84
Reviewed-on: https://go-review.googlesource.com/12439
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile: mark LoadReg8 and StoreReg8 of flags as unimplemented · 26f135d7

Josh Bleecher Snyder authored Jul 20, 2015

It is not clear to me what the right implementation is.
LoadReg8 and StoreReg8 are introduced during regalloc,
so after the amd64 rewrites. But implementing them
in genValue seems silly.

Change-Id: Ia708209c4604867bddcc0e5d75ecd17cf32f52c3
Reviewed-on: https://go-review.googlesource.com/12437
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile: implement genValue for AMD64SETxx · a794074d

Josh Bleecher Snyder authored Jul 20, 2015

Change-Id: I591f2c0465263dcdeef46920aabf1bbb8e7ac5c0
Reviewed-on: https://go-review.googlesource.com/12436
Reviewed-by: Keith Randall <khr@golang.org>

Revert "[dev.ssa] cmd/compile: don't Compile if Unimplemented" · 983bc8d1

Josh Bleecher Snyder authored Jul 17, 2015

This reverts commit 766bcc92.

Change-Id: I55413c1aa80d82c856a3ea89b4ffccf80fb58013
Reviewed-on: https://go-review.googlesource.com/12361
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile/internal/ssa: use width and sign specific opcodes · 67fdb0de

Keith Randall authored Jul 19, 2015

Bake the bit width and signedness into opcodes.
Pro: Rewrite rules become easier.  Less chance for confusion.
Con: Lots more opcodes.

Let me know what you think.  I'm leaning towards this, but I could be
convinced otherwise if people think this is too ugly.

Update #11467

Change-Id: Icf1b894268cdf73515877bb123839800d97b9df9
Reviewed-on: https://go-review.googlesource.com/12362
Reviewed-by: Alan Donovan <adonovan@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>

[dev.ssa] cmd/compile: fix test verb · 8043f450

Josh Bleecher Snyder authored Jul 20, 2015

The verb doesn't do anything, but if/when we move
these to the test directory, having it be right
will be one fewer thing to remember.

Change-Id: Ibf0280d7cc14bf48927e25215de6b91c111983d9
Reviewed-on: https://go-review.googlesource.com/12438
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

[dev.ssa] cmd/compile: refactor out zero value creation · 21bd483c

Josh Bleecher Snyder authored Jul 20, 2015

This will be used in a subsequent commit.

Change-Id: I43eca21f4692d99e164c9f6be0760597c46e6a26
Reviewed-on: https://go-review.googlesource.com/12440
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

20 Jul, 2015 2 commits

[dev.ssa] test: gofmt {goto,label,label1}.go · ffbf209a

Josh Bleecher Snyder authored Jul 20, 2015

Change-Id: I971d0c93632e39aad4e2ba1862f085df820baf8b
Reviewed-on: https://go-review.googlesource.com/12431
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

[dev.ssa] cmd/compile: handle OpCopy loops in rewrite · f421735b

Josh Bleecher Snyder authored Jul 17, 2015

Change-Id: Icbaad6e5cbfc5430a651538fe90c0a9ee664faf4
Reviewed-on: https://go-review.googlesource.com/12360
Reviewed-by: Keith Randall <khr@golang.org>

17 Jul, 2015 1 commit

[dev.ssa] cmd/compile/internal/ssa/gen: Fix *64 strength reduction · c1593da8

Keith Randall authored Jul 16, 2015

*64 is <<6, not <<5.

Change-Id: I2eb7e113d5003b2c77fbd3abc3defc4d98976a5e
Reviewed-on: https://go-review.googlesource.com/12323
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

16 Jul, 2015 4 commits

[dev.ssa] cmd/compile/internal/ssa: compute outarg size correctly · 3dcc424b

Keith Randall authored Jul 14, 2015

Keep track of the outargs size needed at each call.
Compute the size of the outargs section of the stack frame. It's just
the max of the outargs size at all the callsites in the function.
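
The rule can be sketched in a few lines. The callSite type and outArgsSize function below are hypothetical names for illustration. The maximum (rather than the sum) suffices on the assumption that only one call's outgoing arguments occupy the area at a time, so every call site can reuse the same space.

```go
package main

import "fmt"

// callSite is a hypothetical record of one call in a function body.
type callSite struct {
	callee  string
	argSize int64 // bytes of outgoing arguments and results for this call
}

// outArgsSize returns the size of the shared outargs section:
// the largest requirement among all call sites, or 0 if there are no calls.
func outArgsSize(calls []callSite) int64 {
	var size int64
	for _, c := range calls {
		if c.argSize > size {
			size = c.argSize
		}
	}
	return size
}

func main() {
	calls := []callSite{
		{callee: "f", argSize: 8},
		{callee: "g", argSize: 32},
		{callee: "h", argSize: 16},
	}
	fmt.Println(outArgsSize(calls)) // 32
}
```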

Change-Id: I3d0640f654f01307633b1a5f75bab16e211ea6c0
Reviewed-on: https://go-review.googlesource.com/12178
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>

[dev.ssa] cmd/compile: implement lowering of constant bools · 8adc905a

Josh Bleecher Snyder authored Jul 16, 2015

Change-Id: Ia56ee9798eefe123d4da04138a6a559d2c25ddf3
Reviewed-on: https://go-review.googlesource.com/12312
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

[dev.ssa] cmd/compile: don't Compile if Unimplemented · 766bcc92

Josh Bleecher Snyder authored Jul 16, 2015

If we've already hit an Unimplemented, there may be important
SSA invariants that do not hold and which could cause
ssa.Compile to hang or spin.

While we're here, make detected dependency cycles stop execution.

Change-Id: Ic7d4eea659e1fe3f2c9b3e8a4eee5567494f46ad
Reviewed-on: https://go-review.googlesource.com/12310
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

[dev.ssa] cmd/compile/internal/ssa: implement ODOT · cd7e0594

Keith Randall authored Jul 16, 2015

Implement ODOT.  Similar to ArrayIndex, StructSelect selects a field
out of a larger Value.

We may need more ways to rewrite StructSelect, but StructSelect/Load
is the typical way it is used.

Change-Id: Ida7b8aab3298f4754eaf9fee733974cf8736e45d
Reviewed-on: https://go-review.googlesource.com/12265
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

15 Jul, 2015 2 commits

[dev.ssa] cmd/compile/internal: Implement Lengauer-Tarjan for dominators · 078ba138

Todd Neal authored Jul 05, 2015

Implements the simple Lengauer-Tarjan algorithm for dominator
and post-dominator calculation.
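
For readers unfamiliar with the term, the sketch below shows what a dominator computation produces. It deliberately uses the simple iterative set-intersection method, not Lengauer-Tarjan, and it is not the code from this CL. The dominators function, the bitmask encoding, and the limits (block 0 is the entry, every block is reachable, fewer than 64 blocks) are all assumptions of the example.

```go
package main

import "fmt"

// dominators returns, for each block b, a bitmask of the blocks that dominate b.
// succs[b] lists the successors of block b; block 0 is the entry.
// Assumes fewer than 64 blocks, all reachable from the entry.
func dominators(succs [][]int) []uint64 {
	n := len(succs)
	preds := make([][]int, n)
	for b, ss := range succs {
		for _, s := range ss {
			preds[s] = append(preds[s], b)
		}
	}
	all := uint64(1)<<uint(n) - 1
	dom := make([]uint64, n)
	dom[0] = 1 // the entry is dominated only by itself
	for b := 1; b < n; b++ {
		dom[b] = all // start from "everything" and shrink
	}
	for changed := true; changed; {
		changed = false
		for b := 1; b < n; b++ {
			d := all
			for _, p := range preds[b] {
				d &= dom[p] // a dominator of b must dominate every predecessor
			}
			d |= 1 << uint(b) // every block dominates itself
			if d != dom[b] {
				dom[b] = d
				changed = true
			}
		}
	}
	return dom
}

func main() {
	// Diamond: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.
	diamond := [][]int{{1, 2}, {3}, {3}, {}}
	fmt.Println(dominators(diamond)) // [1 3 5 9]: block 3 is dominated by 0 and 3 only
}
```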

benchmark                            old ns/op     new ns/op     delta
BenchmarkDominatorsLinear-8          1403862       1292741       -7.92%
BenchmarkDominatorsFwdBack-8         1270633       1428285       +12.41%
BenchmarkDominatorsManyPred-8        225932354     1530886       -99.32%
BenchmarkDominatorsMaxPred-8         445994225     1393612       -99.69%
BenchmarkDominatorsMaxPredVal-8      447235248     1246899       -99.72%
BenchmarkNilCheckDeep1-8             829           1259          +51.87%
BenchmarkNilCheckDeep10-8            2199          2397          +9.00%
BenchmarkNilCheckDeep100-8           57325         29405         -48.70%
BenchmarkNilCheckDeep1000-8          6625837       2933151       -55.73%
BenchmarkNilCheckDeep10000-8         763559787     319105541     -58.21%

benchmark                            old MB/s     new MB/s     speedup
BenchmarkDominatorsLinear-8          7.12         7.74         1.09x
BenchmarkDominatorsFwdBack-8         7.87         7.00         0.89x
BenchmarkDominatorsManyPred-8        0.04         6.53         163.25x
BenchmarkDominatorsMaxPred-8         0.02         7.18         359.00x
BenchmarkDominatorsMaxPredVal-8      0.02         8.02         401.00x
BenchmarkNilCheckDeep1-8             1.21         0.79         0.65x
BenchmarkNilCheckDeep10-8            4.55         4.17         0.92x
BenchmarkNilCheckDeep100-8           1.74         3.40         1.95x
BenchmarkNilCheckDeep1000-8          0.15         0.34         2.27x
BenchmarkNilCheckDeep10000-8         0.01         0.03         3.00x

Change-Id: Icec3d774422a9bc64914779804c8c0ab73aa72bf
Reviewed-on: https://go-review.googlesource.com/11971
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile: implement OIND · b383de2e

Todd Neal authored Jul 14, 2015

Change-Id: I15aee8095e6388822e2222f1995fe2278ac956ca
Reviewed-on: https://go-review.googlesource.com/12129
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>

14 Jul, 2015 2 commits

[dev.ssa] cmd/compile/internal/ssa: ensure Phi ops are scheduled first · 4e204b42

Keith Randall authored Jul 14, 2015

Phi ops should always be scheduled first. They have the semantics
of all happening simultaneously at the start of the block. The regalloc
phase assumes all the phis will appear first.
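
The ordering constraint on its own can be expressed as a stable partition of a block's values. This is a sketch under assumptions: the value type and phisFirst are hypothetical, and a real scheduler also has to respect data dependencies among the non-phi values, which this example does not model.

```go
package main

import (
	"fmt"
	"sort"
)

// value is a hypothetical stand-in for an SSA value inside one block.
type value struct {
	id int
	op string
}

// phisFirst moves all Phi values to the front of the block,
// keeping the relative order within the phis and within the rest.
func phisFirst(vals []value) {
	sort.SliceStable(vals, func(i, j int) bool {
		return vals[i].op == "Phi" && vals[j].op != "Phi"
	})
}

func main() {
	vals := []value{{22, "Less"}, {66, "Phi"}, {29, "Add"}, {82, "Phi"}}
	phisFirst(vals)
	fmt.Println(vals) // [{66 Phi} {82 Phi} {22 Less} {29 Add}]
}
```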

Change-Id: I30291e1fa384a0819205218f1d1ec3aef6d538dd
Reviewed-on: https://go-review.googlesource.com/12154
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>

[dev.ssa] cmd/compile: handle OLITERAL nil expressions · 337b7e7e

Brad Fitzpatrick authored Jul 13, 2015

Change-Id: I02b8fb277b486eaf0916ddcd8f28c062d4022d4b
Reviewed-on: https://go-review.googlesource.com/12150
Reviewed-by: Keith Randall <khr@golang.org>

13 Jul, 2015 5 commits

[dev.ssa] cmd/compile/internal/gc: Implement ODOT and ODOTPTR in addr. · c3c84a25

Keith Randall authored Jul 13, 2015

Change-Id: If8a9d5901fa2141d16b1c8d001761ea62bc23207
Reviewed-on: https://go-review.googlesource.com/12141
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

[dev.ssa] cmd/compile: treat unsafe.Pointer as a pointer · b06961b4

Brad Fitzpatrick authored Jul 13, 2015

Change-Id: I3f3ac3055c93858894b8852603d79592bbc1696b
Reviewed-on: https://go-review.googlesource.com/12140
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile: support zero type for *T · a92bd662

Brad Fitzpatrick authored Jul 13, 2015

Change-Id: I4c9bcea01e2c4333c2a3592b66f1da9f424747a4
Reviewed-on: https://go-review.googlesource.com/12130
Reviewed-by: Keith Randall <khr@golang.org>

[dev.ssa] cmd/compile/internal/gc: fix tests on non-amd64 · 50e59bb9

Brad Fitzpatrick authored Jul 13, 2015

Change-Id: Ibd6a59db2d5feea41a21fbea5c1a7fdd49238aa8
Reviewed-on: https://go-review.googlesource.com/12131
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>

[dev.ssa] cmd/compile: OANDAND, OOROR · e8167111

Brad Fitzpatrick authored Jul 10, 2015

Joint hacking with josharian. Hints from matloob and Todd Neal.

Now with tests, and OROR.

Change-Id: Iff8826fde475691fb72a3eea7396a640b6274af9
Reviewed-on: https://go-review.googlesource.com/12041
Reviewed-by: Keith Randall <khr@golang.org>

12 Jul, 2015 4 commits

[dev.ssa] cmd/compile/internal/gc: handle _ label correctly · 7e4c06da

Keith Randall authored Jul 12, 2015

An empty label statement can just be ignored, as it cannot
be the target of any gotos.

Tests are already in test/fixedbugs/issue7538*.go

Fixes #11589
Fixes #11593

Change-Id: Iadcd639e7200ce16aa40fd7fa3eaf82522513e82
Reviewed-on: https://go-review.googlesource.com/12093
Reviewed-by: Daniel Morsing <daniel.morsing@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

[dev.ssa] cmd/compile/internal/gc: implement more no-op statements · 4c521ac8

Daniel Morsing authored Jul 12, 2015

Change-Id: I26c268f46dcffe39912b8c92ce9abb875310934f
Reviewed-on: https://go-review.googlesource.com/12100
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

[dev.ssa] cmd/compile/internal/ssa: comment why replacing phi with copy is ok · accf9b59

Keith Randall authored Jul 11, 2015

Change-Id: I3e2e8862f2fde4349923016b97e8330b0d494e0e
Reviewed-on: https://go-review.googlesource.com/12092
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>

[dev.ssa] cmd/compile: implement ONOT · d9c72d73

Brad Fitzpatrick authored Jul 10, 2015

Co-hacking with josharian at Gophercon.

Change-Id: Ia59dfab676c6ed598c2c25483439cd1395a4ea87
Reviewed-on: https://go-review.googlesource.com/12029
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
