Commits · a4a82241529ece5d5c7580b7b2df1b616c51b832 · go / golang

07 Dec, 2009 5 commits

use a bootstrap array to avoid allocation for short vectors · a4a82241
Robert Griesemer authored Dec 07, 2009
```
R=r
https://golang.org/cl/165078
```
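
The bootstrap-array idea can be sketched as follows. This is a minimal illustration, not the code from CL 165078: the type `smallVec`, its field names, and the capacity of 8 are assumptions made for the example. The slice starts out pointing into an array embedded in the struct, so a short vector needs no separate allocation for its elements.

```go
package main

import "fmt"

// smallVec is a hypothetical vector type; the name and the capacity of 8
// are assumptions made for this sketch.
type smallVec struct {
	bootstrap [8]int // backing store used while the vector is short
	elems     []int  // slices bootstrap at first, the heap after it fills
}

// push appends x, starting out in the bootstrap array.
func (v *smallVec) push(x int) {
	if v.elems == nil {
		v.elems = v.bootstrap[:0]
	}
	// append reallocates only once the bootstrap capacity is exceeded.
	v.elems = append(v.elems, x)
}

// usesBootstrap reports whether the elements still live in the bootstrap array.
func (v *smallVec) usesBootstrap() bool {
	return len(v.elems) > 0 && &v.elems[0] == &v.bootstrap[0]
}

func main() {
	var v smallVec
	for i := 0; i < 8; i++ {
		v.push(i)
	}
	fmt.Println(v.usesBootstrap()) // the first 8 elements fit in the array
	v.push(8)
	fmt.Println(v.usesBootstrap()) // the 9th element forces a new backing array
}
```

The trade-off is a larger struct in exchange for skipping the first allocation, which pays off when most vectors stay short.
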
Remove copyBytes completely in favor of copy. · 8c22dd24
Christopher Wedgwood authored Dec 07, 2009
```
R=r, rsc
https://golang.org/cl/165068
```

pick off special one-byte case in copy. worth 2x in benchmarks (38ns->16ns). · 20c1ec26

Rob Pike authored Dec 07, 2009

the one-item case could be generalized easily with no cost. worth considering.

R=rsc
CC=golang-dev, cw
https://golang.org/cl/167044
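
A rough sketch of what "picking off" the one-byte case might look like. The function `copyBytes` below is a hypothetical stand-in written for illustration; the routine actually changed in CL 167044 is not shown in this log, and the 38ns->16ns figure is not reproduced here.

```go
package main

import "fmt"

// copyBytes is a hypothetical copy routine with a fast path for a single
// byte. It returns the number of bytes copied, like the built-in copy.
func copyBytes(dst, src []byte) int {
	n := len(src)
	if len(dst) < n {
		n = len(dst)
	}
	switch n {
	case 0:
		return 0
	case 1:
		dst[0] = src[0] // special case: one store, no general copy loop
		return 1
	}
	return copy(dst, src) // general case
}

func main() {
	dst := make([]byte, 4)
	fmt.Println(copyBytes(dst, []byte("x")))     // one byte copied
	fmt.Println(copyBytes(dst, []byte("hello"))) // limited by len(dst)
}
```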


the AST walker currently provides no way to find out how the · 80e17d67

Roger Peppe authored Dec 07, 2009

nodes in the tree are nested with respect to one another.
a simple change to the Visitor interface makes it possible
to do this (for example to maintain a current node-depth, or a
knowledge of the name of the current function).

Visit(nil) is called at the end of a node's children;
this makes possible the channel-based interface below,
amongst other possibilities.

It is still just as simple to get the original behaviour - just
return the same Visitor from Visit.

Here are a couple of possible Visitor types.

// closure-based
type FVisitor func(n interface{}) FVisitor
func (f FVisitor) Visit(n interface{}) Visitor {
	return f(n);
}

// channel-based
type CVisitor chan Visit;
type Visit struct {
	node interface{};
	reply chan CVisitor;
};
func (v CVisitor) Visit(n interface{}) Visitor {
	if n == nil {
		close(v);
	} else {
		reply := make(chan CVisitor);
		v <- Visit{n, reply};
		r := <-reply;
		if r == nil {
			return nil;
		}
		return r;
	}
	return nil;
}

R=gri
CC=rsc
https://golang.org/cl/166047
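
The node-depth use case mentioned above can be sketched against today's `go/ast` package, where `Visit` takes an `ast.Node` rather than the `interface{}` used in this commit and `ast.Walk` calls `Visit(nil)` after a node's children. The type `depthVisitor` and the helper `walkDepth` are names invented for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// depthVisitor tracks the current nesting depth and the deepest level seen.
type depthVisitor struct {
	depth, max int
}

// Visit is called with a node on the way down and with nil once that
// node's children have all been visited.
func (v *depthVisitor) Visit(n ast.Node) ast.Visitor {
	if n == nil {
		v.depth-- // leaving the node whose children just finished
		return nil
	}
	v.depth++
	if v.depth > v.max {
		v.max = v.depth
	}
	return v // keep using the same visitor for the children
}

// walkDepth parses src and returns the maximum depth reached and the
// depth left over after the walk (which should be zero).
func walkDepth(src string) (max, final int, err error) {
	f, err := parser.ParseFile(token.NewFileSet(), "example.go", src, 0)
	if err != nil {
		return 0, 0, err
	}
	v := &depthVisitor{}
	ast.Walk(v, f)
	return v.max, v.depth, nil
}

func main() {
	max, final, err := walkDepth("package p\nfunc f() { if true { _ = 1 } }\n")
	if err != nil {
		panic(err)
	}
	fmt.Println("max depth:", max, "final depth:", final)
}
```

Because every non-nil `Visit` that returns a visitor is paired with a later `Visit(nil)`, the counter returns to zero at the end of the walk.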


changes necessary to get the new chameneosredux onto shootout.alioth.debian.org. · ea98e4b5

Roger Peppe authored Dec 07, 2009

it's now there: http://shootout.alioth.debian.org/u32q/benchmark.php?test=chameneosredux&lang=all&box=1!

R=r, rsc
CC=golang-dev
https://golang.org/cl/167043


06 Dec, 2009 4 commits

save a few ns by inlining (which mostly simplifies things anyway). · f91cd447

Rob Pike authored Dec 06, 2009

a couple of cleanups.
don't keep big buffers in the free list.

R=rsc
CC=golang-dev
https://golang.org/cl/166078


unexport Fmt. it's not needed outside this package any more · 353ef80f

Rob Pike authored Dec 06, 2009

cleans up godoc's output for package fmt substantially.

R=rsc
CC=golang-dev
https://golang.org/cl/165070


Make printing faster by avoiding mallocs and some other advances. · 4c0e51cd

Rob Pike authored Dec 06, 2009

Roughly 33% faster for simple cases, probably more for complex ones.

Before:

mallocs per Sprintf(""): 4
mallocs per Sprintf("xxx"): 6
mallocs per Sprintf("%x"): 10
mallocs per Sprintf("%x %x"): 12

Now:

mallocs per Sprintf(""): 2
mallocs per Sprintf("xxx"): 3
mallocs per Sprintf("%x"): 5
mallocs per Sprintf("%x %x"): 7

Speed improves because of avoiding mallocs and also by sharing a bytes.Buffer
between print.go and format.go rather than copying the data back after each
printed item.

Before:

fmt_test.BenchmarkSprintfEmpty	1000000	      1346 ns/op
fmt_test.BenchmarkSprintfString	500000	      3461 ns/op
fmt_test.BenchmarkSprintfInt	500000	      3671 ns/op

Now:

fmt_test.BenchmarkSprintfEmpty	 2000000	       995 ns/op
fmt_test.BenchmarkSprintfString	 1000000	      2745 ns/op
fmt_test.BenchmarkSprintfInt	 1000000	      2391 ns/op
fmt_test.BenchmarkSprintfIntInt	  500000	      3751 ns/op

I believe there is more to get but this is a good milestone.

R=rsc
CC=golang-dev, hong
https://golang.org/cl/166076
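
A "mallocs per Sprintf" count like the ones above can be taken in current Go with `testing.AllocsPerRun`. This is a sketch of one way to measure it, not the harness used for this commit; the helper `mallocsPerSprintf` and the argument values are assumptions, and the counts on a modern toolchain will differ from the 2009 figures.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the formatted string alive so the call cannot be optimized away.
var sink string

// mallocsPerSprintf reports the average number of heap allocations made by
// one fmt.Sprintf call with the given format and arguments.
func mallocsPerSprintf(format string, args ...interface{}) float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = fmt.Sprintf(format, args...)
	})
}

func main() {
	fmt.Printf("mallocs per Sprintf(%q): %v\n", "", mallocsPerSprintf(""))
	fmt.Printf("mallocs per Sprintf(%q): %v\n", "xxx", mallocsPerSprintf("xxx"))
	fmt.Printf("mallocs per Sprintf(%q): %v\n", "%x", mallocsPerSprintf("%x", 7))
	fmt.Printf("mallocs per Sprintf(%q): %v\n", "%x %x", mallocsPerSprintf("%x %x", 7, 112))
}
```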


runtime: disable pointer scan optimization · ed6fd1bc
Russ Cox authored Dec 06, 2009
```
  * broken by reflect, gob

TBR=r
https://golang.org/cl/166077
```

05 Dec, 2009 10 commits

Fix syscall.Statfs and syscall.Fstatfs for 386 GNU/Linux. · 44c1eb6b

Ian Lance Taylor authored Dec 05, 2009

For 386 we use the [f]statfs64 system call, which takes three
parameters: the filename, the size of the statfs64 structure,
and a pointer to the structure itself.

R=rsc
https://golang.org/cl/166073


test/bench: use range in reverse-complement · 864c6bcb

Russ Cox authored Dec 05, 2009

1.9s	gcc reverse-complement.c

reverse-complement.go
4.5s / 3.5s	original, with/without bounds checks
3.5s / 3.3s	bounds check reduction
3.3s / 2.8s	smarter garbage collector
2.6s / 2.3s	assembler bytes.IndexByte
2.5s / 2.1s	even smarter garbage collector
2.3s / 2.1s	fix optimizer unnecessary spill bug
2.0s / 1.9s	change loop to range (this CL)

R=r
https://golang.org/cl/166072
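
The kind of rewrite named in the last row, an indexed loop changed to `range`, looks roughly like this. The loop in reverse-complement.go itself is not shown in this log, so the newline-counting functions below are invented purely to show the two forms side by side.

```go
package main

import "fmt"

// countNewlinesIndex walks the buffer with an explicit index; each buf[i]
// is an indexing operation that may carry a bounds check.
func countNewlinesIndex(buf []byte) int {
	n := 0
	for i := 0; i < len(buf); i++ {
		if buf[i] == '\n' {
			n++
		}
	}
	return n
}

// countNewlinesRange lets range hand out each element directly.
func countNewlinesRange(buf []byte) int {
	n := 0
	for _, c := range buf {
		if c == '\n' {
			n++
		}
	}
	return n
}

func main() {
	buf := []byte(">seq\nACGT\nTTGA\n")
	fmt.Println(countNewlinesIndex(buf), countNewlinesRange(buf))
}
```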


gc/runtime: pass type structure to makeslice. · 864c757a

Russ Cox authored Dec 05, 2009

  * inform garbage collector about memory with no pointers in it

1.9s	gcc reverse-complement.c

reverse-complement.go
4.5s / 3.5s	original, with/without bounds checks
3.5s / 3.3s	bounds check reduction
3.3s / 2.8s	smarter garbage collector
2.6s / 2.3s	assembler bytes.IndexByte
2.5s / 2.1s	even smarter garbage collector (this CL)

R=r
https://golang.org/cl/165064


gc: walk pointer in range on slice/array · 6f14cada
Russ Cox authored Dec 05, 2009
```
R=ken2
https://golang.org/cl/166071
```
6g/8g optimizer fix: throw functions now in runtime · 7c4aeec8
Russ Cox authored Dec 05, 2009
```
R=ken2
https://golang.org/cl/166070
```
test/bench: dead code in reverse-complement · e2b23e42
Russ Cox authored Dec 05, 2009
```
R=r
https://golang.org/cl/165065
```
gotest: stop if the // gotest commands fail · d7402cea
Russ Cox authored Dec 05, 2009
```
R=r
https://golang.org/cl/166067
```

net: more fiddling with the udp test. · 2807621d

Russ Cox authored Dec 05, 2009

  i don't know why the timeout needs
  to be so big.

R=r
https://golang.org/cl/165063


libmach: fix disassembly of MOVLQSX · d539d079
Russ Cox authored Dec 05, 2009
```
R=r
https://golang.org/cl/166068
```
gotest: ignore *_test.pb.go · 01f0f16e
Russ Cox authored Dec 05, 2009
```
R=r
https://golang.org/cl/166064
```

04 Dec, 2009 21 commits

Add syscall.Rename for NaCl. Fixes NaCl build. · 9e0b68d1
Ian Lance Taylor authored Dec 04, 2009
```
R=rsc
https://golang.org/cl/165062
```

runtime: shift the index for the sort by one. · e79bcf8b

Adam Langley authored Dec 04, 2009

Makes the code look cleaner, even if it's a little harder to figure
out from the sort invariants.

R=rsc
CC=golang-dev
https://golang.org/cl/165061


Add os.Rename. · 0b5cc316
Ian Lance Taylor authored Dec 04, 2009
```
R=rsc
https://golang.org/cl/166058
```
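
For reference, a short usage sketch of `os.Rename` as it exists in current Go. The helper `renameDemo` and the file names are made up for the example, and it relies on `os.MkdirTemp` and `os.WriteFile`, which are much newer than this commit.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// renameDemo creates a file in a fresh temporary directory, renames it, and
// reports whether the old and new paths exist afterwards.
func renameDemo() (oldExists, newExists bool, err error) {
	dir, err := os.MkdirTemp("", "rename-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir) // clean up the temporary directory

	oldPath := filepath.Join(dir, "old.txt")
	newPath := filepath.Join(dir, "new.txt")
	if err := os.WriteFile(oldPath, []byte("hello\n"), 0o644); err != nil {
		return false, false, err
	}
	if err := os.Rename(oldPath, newPath); err != nil {
		return false, false, err
	}
	_, oldErr := os.Stat(oldPath)
	_, newErr := os.Stat(newPath)
	return oldErr == nil, newErr == nil, nil
}

func main() {
	oldExists, newExists, err := renameDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println("old exists:", oldExists, "new exists:", newExists)
}
```
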

Remove global chanlock. · d1740bb3

Adam Langley authored Dec 04, 2009

On a microbenchmark that ping-pongs on lots of channels, this makes
the multithreaded case about 20% faster and the uniprocessor case
about 1% slower. (Due to cache effects, I expect.)

R=rsc, agl
CC=golang-dev
https://golang.org/cl/166043
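
The microbenchmark itself is not included in the commit message. A workload of the kind described, many goroutine pairs ping-ponging over their own channels, might be sketched like this; `pingPong`, `pingPongMany`, and the pair and round counts are assumptions for illustration, and no timing is measured.

```go
package main

import (
	"fmt"
	"sync"
)

// pingPong bounces a counter between two goroutines over a pair of
// unbuffered channels and returns its final value.
func pingPong(rounds int) int {
	ping, pong := make(chan int), make(chan int)
	go func() {
		for v := range ping {
			pong <- v + 1
		}
	}()
	v := 0
	for i := 0; i < rounds; i++ {
		ping <- v
		v = <-pong
	}
	close(ping) // lets the partner goroutine exit
	return v
}

// pingPongMany runs several independent games concurrently, so that many
// channels are in use at once, and returns the sum of their results.
func pingPongMany(pairs, rounds int) int {
	results := make([]int, pairs)
	var wg sync.WaitGroup
	for p := 0; p < pairs; p++ {
		wg.Add(1)
		go func(p int) {
			defer wg.Done()
			results[p] = pingPong(rounds)
		}(p)
	}
	wg.Wait()
	total := 0
	for _, r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(pingPongMany(100, 1000))
}
```

With one global lock, every one of these channel operations would contend on the same lock; with per-channel locking, unrelated pairs no longer serialize against each other.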


bytes: asm for bytes.IndexByte · d6b3f37e

Russ Cox authored Dec 04, 2009

PERFORMANCE DIFFERENCE

SUMMARY

                                                   amd64           386
2.2 GHz AMD Opteron 8214 HE (Linux)             3.0x faster    8.2x faster
3.60 GHz Intel Xeon (Linux)                     2.2x faster    6.3x faster
2.53 GHz Intel Core2 Duo E7200 (Linux)          1.5x faster    4.4x faster
2.66 GHz Intel Xeon 5150 (Mac Pro, OS X)        1.5x SLOWER    3.0x faster
2.33 GHz Intel Xeon E5435 (Linux)               1.5x SLOWER    3.0x faster
2.33 GHz Intel Core2 T7600 (MacBook Pro, OS X)  1.4x SLOWER    3.0x faster
1.83 GHz Intel Core2 T5600 (Mac Mini, OS X)        none*       3.0x faster

* but yesterday I consistently saw 1.4x SLOWER.

DETAILS

2.2 GHz AMD Opteron 8214 HE (Linux)

amd64 (3x faster)

IndexByte4K            500000           3733 ns/op     1097.24 MB/s
IndexByte4M               500        4328042 ns/op      969.10 MB/s
IndexByte64M               50       67866160 ns/op      988.84 MB/s

IndexBytePortable4K    200000          11161 ns/op      366.99 MB/s
IndexBytePortable4M       100       11795880 ns/op      355.57 MB/s
IndexBytePortable64M       10      188675000 ns/op      355.68 MB/s

386 (8.2x faster)

IndexByte4K            500000           3734 ns/op     1096.95 MB/s
IndexByte4M               500        4209954 ns/op      996.28 MB/s
IndexByte64M               50       68031980 ns/op      986.43 MB/s

IndexBytePortable4K     50000          30670 ns/op      133.55 MB/s
IndexBytePortable4M        50       31868220 ns/op      131.61 MB/s
IndexBytePortable64M        2      508851500 ns/op      131.88 MB/s

3.60 GHz Intel Xeon (Linux)

amd64 (2.2x faster)

IndexByte4K            500000           4612 ns/op      888.12 MB/s
IndexByte4M               500        4835250 ns/op      867.44 MB/s
IndexByte64M               20       77388450 ns/op      867.17 MB/s

IndexBytePortable4K    200000          10306 ns/op      397.44 MB/s
IndexBytePortable4M       100       11201460 ns/op      374.44 MB/s
IndexBytePortable64M       10      179456800 ns/op      373.96 MB/s

386 (6.3x faster)

IndexByte4K            500000           4631 ns/op      884.47 MB/s
IndexByte4M               500        4846388 ns/op      865.45 MB/s
IndexByte64M               20       78691200 ns/op      852.81 MB/s

IndexBytePortable4K    100000          28989 ns/op      141.29 MB/s
IndexBytePortable4M        50       31183180 ns/op      134.51 MB/s
IndexBytePortable64M        5      498347200 ns/op      134.66 MB/s

2.53 GHz Intel Core2 Duo E7200  (Linux)

amd64 (1.5x faster)

IndexByte4K            500000           6502 ns/op      629.96 MB/s
IndexByte4M               500        6692208 ns/op      626.74 MB/s
IndexByte64M               10      107410400 ns/op      624.79 MB/s

IndexBytePortable4K    200000           9721 ns/op      421.36 MB/s
IndexBytePortable4M       100       10013680 ns/op      418.86 MB/s
IndexBytePortable64M       10      160460800 ns/op      418.23 MB/s

386 (4.4x faster)

IndexByte4K            500000           6505 ns/op      629.67 MB/s
IndexByte4M               500        6694078 ns/op      626.57 MB/s
IndexByte64M               10      107397600 ns/op      624.86 MB/s

IndexBytePortable4K    100000          28835 ns/op      142.05 MB/s
IndexBytePortable4M        50       29562680 ns/op      141.88 MB/s
IndexBytePortable64M        5      473221400 ns/op      141.81 MB/s

2.66 GHz Intel Xeon 5150  (Mac Pro, OS X)

amd64 (1.5x SLOWER)

IndexByte4K            200000           9290 ns/op      440.90 MB/s
IndexByte4M               200        9568925 ns/op      438.33 MB/s
IndexByte64M               10      154473600 ns/op      434.44 MB/s

IndexBytePortable4K    500000           6202 ns/op      660.43 MB/s
IndexBytePortable4M       500        6583614 ns/op      637.08 MB/s
IndexBytePortable64M       20      107166250 ns/op      626.21 MB/s

386 (3x faster)

IndexByte4K            200000           9301 ns/op      440.38 MB/s
IndexByte4M               200        9568025 ns/op      438.37 MB/s
IndexByte64M               10      154391000 ns/op      434.67 MB/s

IndexBytePortable4K    100000          27526 ns/op      148.80 MB/s
IndexBytePortable4M       100       28302490 ns/op      148.20 MB/s
IndexBytePortable64M        5      454170200 ns/op      147.76 MB/s

2.33 GHz Intel Xeon E5435  (Linux)

amd64 (1.5x SLOWER)

IndexByte4K            200000          10601 ns/op      386.38 MB/s
IndexByte4M               100       10827240 ns/op      387.38 MB/s
IndexByte64M               10      173175500 ns/op      387.52 MB/s

IndexBytePortable4K    500000           7082 ns/op      578.37 MB/s
IndexBytePortable4M       500        7391792 ns/op      567.43 MB/s
IndexBytePortable64M       20      122618550 ns/op      547.30 MB/s

386 (3x faster)

IndexByte4K            200000          11074 ns/op      369.88 MB/s
IndexByte4M               100       10902620 ns/op      384.71 MB/s
IndexByte64M               10      181292800 ns/op      370.17 MB/s

IndexBytePortable4K     50000          31725 ns/op      129.11 MB/s
IndexBytePortable4M        50       32564880 ns/op      128.80 MB/s
IndexBytePortable64M        2      545926000 ns/op      122.93 MB/s

2.33 GHz Intel Core2 T7600 (MacBook Pro, OS X)

amd64 (1.4x SLOWER)

IndexByte4K            200000          11120 ns/op      368.35 MB/s
IndexByte4M               100       11531950 ns/op      363.71 MB/s
IndexByte64M               10      184819000 ns/op      363.11 MB/s

IndexBytePortable4K    500000           7419 ns/op      552.10 MB/s
IndexBytePortable4M       200        8018710 ns/op      523.06 MB/s
IndexBytePortable64M       10      127614900 ns/op      525.87 MB/s

386 (3x faster)

IndexByte4K            200000          11114 ns/op      368.54 MB/s
IndexByte4M               100       11443530 ns/op      366.52 MB/s
IndexByte64M               10      185212000 ns/op      362.34 MB/s

IndexBytePortable4K     50000          32891 ns/op      124.53 MB/s
IndexBytePortable4M        50       33930580 ns/op      123.61 MB/s
IndexBytePortable64M        2      545400500 ns/op      123.05 MB/s

1.83 GHz Intel Core2 T5600  (Mac Mini, OS X)

amd64 (no difference)

IndexByte4K            200000          13497 ns/op      303.47 MB/s
IndexByte4M               100       13890650 ns/op      301.95 MB/s
IndexByte64M                5      222358000 ns/op      301.81 MB/s

IndexBytePortable4K    200000          13584 ns/op      301.53 MB/s
IndexBytePortable4M       100       13913280 ns/op      301.46 MB/s
IndexBytePortable64M       10      222572600 ns/op      301.51 MB/s

386 (3x faster)

IndexByte4K            200000          13565 ns/op      301.95 MB/s
IndexByte4M               100       13882640 ns/op      302.13 MB/s
IndexByte64M                5      221411600 ns/op      303.10 MB/s

IndexBytePortable4K     50000          39978 ns/op      102.46 MB/s
IndexBytePortable4M        50       41038160 ns/op      102.20 MB/s
IndexBytePortable64M        2      656362500 ns/op      102.24 MB/s

R=r
CC=golang-dev
https://golang.org/cl/166055


spec: document that built-ins cannot be used as func values · 2a5f0c67
Russ Cox authored Dec 04, 2009
```
R=gri
CC=golang-dev
https://golang.org/cl/164088
```
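
A small illustration of the rule being documented: a built-in such as `len` can only appear in a call, so to pass it around it has to be wrapped in an ordinary function. The helper `apply` is a name invented for this example.

```go
package main

import "fmt"

// apply calls f on s; f must be an ordinary function value.
func apply(f func([]int) int, s []int) int {
	return f(s)
}

func main() {
	// f := len // rejected by the compiler: a built-in is not a func value

	// Wrapping the built-in in a function literal gives a real value.
	lengthOf := func(s []int) int { return len(s) }
	fmt.Println(apply(lengthOf, []int{1, 2, 3})) // prints 3
}
```
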

make Native Client support build again, · 609eeee8

Russ Cox authored Dec 04, 2009

add README explaining how to try the
web demos.

Fixes #339.

R=r
CC=barry.d.silverman, bss, vadim
https://golang.org/cl/165057


testing: compute MB/s in benchmarks · 11384eec
Russ Cox authored Dec 04, 2009
```
R=r
https://golang.org/cl/166060
```
avoid an allocation inside bytes.Buffer by providing a static array. · 4ed57173
Rob Pike authored Dec 04, 2009
```
R=rsc
https://golang.org/cl/165058
```
8l: fix print line number format, buffer overflow · f2c7a201
Russ Cox authored Dec 04, 2009
```
R=ken2
https://golang.org/cl/165059
```
net: turn off empty packet test by default · 3b858fb8
Russ Cox authored Dec 04, 2009
```
Fixes #374.

R=r
https://golang.org/cl/166053
```
gc: check for assignment to private fields during initialization · 9da6666a
Russ Cox authored Dec 04, 2009
```
R=ken2
https://golang.org/cl/165055
```
6g code gen bug · 62be24d9
Ken Thompson authored Dec 04, 2009
```
R=rsc
https://golang.org/cl/166052
```
Add Count, Cycle, ZipWith, GroupBy, Repeat, RepeatTimes, Unique to exp/iterable. · f3d63bea
Michael Elkins authored Dec 04, 2009
```
Modify iterFunc to take chan<- instead of just chan.

R=rsc, dsymonds1
CC=golang-dev, r
https://golang.org/cl/160064
```
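
The point of taking `chan<-` instead of `chan` is that the generator function can only send on the channel it is given. The sketch below is a hypothetical reconstruction for illustration: this `iterFunc` uses `int` elements and an `Iter` method chosen for the example, and is not the definition from exp/iterable.

```go
package main

import "fmt"

// iterFunc is a generator that writes its values to a send-only channel.
type iterFunc func(out chan<- int)

// Iter runs f in its own goroutine and returns the receiving end.
func (f iterFunc) Iter() <-chan int {
	ch := make(chan int)
	go func() {
		f(ch) // a bidirectional channel converts implicitly to chan<- int
		close(ch)
	}()
	return ch
}

// collect drains an iterFunc into a slice.
func collect(f iterFunc) []int {
	var out []int
	for v := range f.Iter() {
		out = append(out, v)
	}
	return out
}

func main() {
	squares := iterFunc(func(out chan<- int) {
		for i := 1; i <= 3; i++ {
			out <- i * i
			// v := <-out // would not compile: out is send-only
		}
	})
	fmt.Println(collect(squares)) // prints [1 4 9]
}
```
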
crypto/rsa: fix shadowing error. · e93132c9
Adam Langley authored Dec 04, 2009
```
Fixes bug 375.

R=rsc
https://golang.org/cl/165045
```
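
The commit message does not show the offending code, so the following is only a generic illustration of a shadowing error in Go, not the crypto/rsa bug itself. Inside a nested block, `:=` declares fresh variables that hide the outer ones; the function names are invented for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseShadowed shows the bug: inside the if block, := declares a new n and
// err that shadow the named results, so the outer n is never set.
func parseShadowed(s string) (n int, err error) {
	if s != "" {
		n, err := strconv.Atoi(s)
		if err != nil {
			return 0, err
		}
		_ = n // the inner n is dropped when the block ends
	}
	return n, err
}

// parseFixed assigns with = so the named results themselves are updated.
func parseFixed(s string) (n int, err error) {
	if s != "" {
		n, err = strconv.Atoi(s)
	}
	return n, err
}

func main() {
	a, _ := parseShadowed("42")
	b, _ := parseFixed("42")
	fmt.Println(a, b) // prints 0 42
}
```
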
runtime: fix Caller crash on 386. · cf37254b
Russ Cox authored Dec 04, 2009
```
Fixes #176.

R=r
https://golang.org/cl/166044
```
faq: add question about translation · 6301fb41
Russ Cox authored Dec 04, 2009
```
R=jini, r
https://golang.org/cl/163092
```
codereview: do not gofmt deleted files · 9a86cc67
Russ Cox authored Dec 04, 2009
```
R=r
https://golang.org/cl/164083
```
Make.conf: fix if $HOME has spaces · aaa2374b
Russ Cox authored Dec 04, 2009
```
R=r
https://golang.org/cl/164086
```

runtime: malloc fixes · 7e5055ce

Russ Cox authored Dec 04, 2009

  * throw away dead code
  * add mlookup counter
  * add malloc counter
  * set up for blocks with no pointers

Fixes #367.

R=r
https://golang.org/cl/165050


The String() method requires global state that makes it not work outside of this package, · 10a349a7
Rob Pike authored Dec 04, 2009
```
so make it a local method (_String()).

R=rsc
CC=golang-dev
https://golang.org/cl/165049
```