- 30 May, 2012 11 commits
-
-
Joel Sing authored
On NetBSD a cgo-enabled binary has more than 32 sections, so bump NSECTS to let such binaries link successfully. R=golang-dev, rsc CC=golang-dev https://golang.org/cl/6261052
-
Jan Ziak authored
R=rsc CC=golang-dev https://golang.org/cl/6243059
-
Marcel van Lohuizen authored
R=r CC=golang-dev https://golang.org/cl/6202063
-
Russ Cox authored
R=golang-dev, bradfitz CC=golang-dev https://golang.org/cl/6244063
-
Russ Cox authored
I added the nl->op == OLITERAL case during the recent performance round, and while it helps for small integer constants, it hurts for floating point constants. In the Mandelbrot benchmark it causes 2*Zr*Zi to compile like Zr*2*Zi:

	0x000000000042663d <+249>:	movsd  %xmm6,%xmm0
	0x0000000000426641 <+253>:	movsd  $2,%xmm1
	0x000000000042664a <+262>:	mulsd  %xmm1,%xmm0
	0x000000000042664e <+266>:	mulsd  %xmm5,%xmm0

instead of:

	0x0000000000426835 <+276>:	movsd  $2,%xmm0
	0x000000000042683e <+285>:	mulsd  %xmm6,%xmm0
	0x0000000000426842 <+289>:	mulsd  %xmm5,%xmm0

It is unclear why that has such a dramatic performance effect in a tight loop, but it's obviously slightly better code, so go with it.

benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    5957470000   5973924000   +0.28%
BenchmarkFannkuch11      3811295000   3869128000   +1.52%
BenchmarkGobDecode         26001900     25670500   -1.27%
BenchmarkGobEncode         12051430     11948590   -0.85%
BenchmarkGzip                177432       174821   -1.47%
BenchmarkGunzip               10967        10756   -1.92%
BenchmarkJSONEncode        78924750     79746900   +1.04%
BenchmarkJSONDecode       313606400    307081600   -2.08%
BenchmarkMandelbrot200     13670860      8200725  -40.01% !!!
BenchmarkRevcomp25M      1179194000   1206539000   +2.32%
BenchmarkTemplate         447931200    443948200   -0.89%
BenchmarkMD5Hash1K             2856         2873   +0.60%
BenchmarkMD5Hash8K            22083        22029   -0.24%

benchmark                  old MB/s     new MB/s  speedup
BenchmarkGobDecode            29.52        29.90    1.01x
BenchmarkGobEncode            63.69        64.24    1.01x
BenchmarkJSONEncode           24.59        24.33    0.99x
BenchmarkJSONDecode            6.19         6.32    1.02x
BenchmarkRevcomp25M          215.54       210.66    0.98x
BenchmarkTemplate              4.33         4.37    1.01x
BenchmarkMD5Hash1K           358.54       356.31    0.99x
BenchmarkMD5Hash8K           370.95       371.86    1.00x

R=ken2
CC=golang-dev
https://golang.org/cl/6261051
-
Nigel Tao authored
filterPaeth takes []byte arguments instead of byte arguments, which avoids some redundant computation of the previous pixel in the inner loop. Also eliminate a bounds check in decoding the up filter.

benchmark                       old ns/op    new ns/op    delta
BenchmarkDecodeGray               3139636      2812531  -10.42%
BenchmarkDecodeNRGBAGradient     12341520     10971680  -11.10%
BenchmarkDecodeNRGBAOpaque       10740780      9612455  -10.51%
BenchmarkDecodePaletted           1819535      1818913   -0.03%
BenchmarkDecodeRGB                8974695      8178070   -8.88%

R=rsc
CC=golang-dev
https://golang.org/cl/6243061
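A minimal sketch of the idea, not the actual image/png code: when the Paeth filter is reversed over a whole row slice, the left (a) and upper-left (c) neighbours can be carried from one iteration to the next instead of being looked up again for every byte. The names unfilterPaethRow and bpp are hypothetical; the predictor itself follows the PNG specification.

```go
package main

import "fmt"

func abs(x int) int {
	if x < 0 {
		return -x
	}
	return x
}

// paeth is the per-byte Paeth predictor from the PNG specification:
// a is the byte to the left, b the byte above, c the byte above-left.
func paeth(a, b, c uint8) uint8 {
	p := int(a) + int(b) - int(c)
	pa, pb, pc := abs(p-int(a)), abs(p-int(b)), abs(p-int(c))
	if pa <= pb && pa <= pc {
		return a
	}
	if pb <= pc {
		return b
	}
	return c
}

// unfilterPaethRow (hypothetical name) reverses the Paeth filter in place
// on cur, given the already-decoded previous row prev. It walks each byte
// lane separately so that a and c are simply carried forward.
func unfilterPaethRow(cur, prev []byte, bpp int) {
	for i := 0; i < bpp; i++ {
		var a, c uint8 // left and above-left are zero at the row start
		for j := i; j < len(cur); j += bpp {
			b := prev[j]
			a = cur[j] + paeth(a, b, c) // uint8 arithmetic wraps mod 256
			cur[j] = a
			c = b
		}
	}
}

func main() {
	prev := []byte{10, 20}
	cur := []byte{5, 3}
	unfilterPaethRow(cur, prev, 1)
	fmt.Println(cur) // prints [15 23]
}
```

Working on slices keeps the neighbour values in local variables, which is presumably where the saving described above comes from.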
-
Alex Brainman authored
R=golang-dev CC=golang-dev https://golang.org/cl/6256069
-
Rémy Oudompheng authored
A block with a finalizer might also be profiled. The special bit is needed to unregister the block from the profile. It will be unset only when the block is freed. Fixes #3668. R=golang-dev, rsc CC=golang-dev, remy https://golang.org/cl/6249066
-
Andrew Balholm authored
Also escape "\r" as "&#13;" when rendering HTML. Pass 2 additional tests. R=nigeltao CC=golang-dev https://golang.org/cl/6260046
-
Alex Brainman authored
Fixes #3543. R=golang-dev, kardianos, rsc CC=golang-dev, hectorchu, vcc.163 https://golang.org/cl/6245063
-
Nigel Tao authored
$GOROOT/src/pkg/exp/html/testdata/go1.html is an execution of the $GOROOT/doc/go1.html template by godoc.

Sample numbers on my linux,amd64 desktop:

BenchmarkParser	500	4699198 ns/op	16.63 MB/s
--- BENCH: BenchmarkParser
	parse_test.go:409: 1 iterations, 14653 mallocs per iteration
	parse_test.go:409: 100 iterations, 14651 mallocs per iteration
	parse_test.go:409: 500 iterations, 14651 mallocs per iteration
BenchmarkRawLevelTokenizer	2000	904957 ns/op	86.37 MB/s
--- BENCH: BenchmarkRawLevelTokenizer
	token_test.go:657: 1 iterations, 28 mallocs per iteration
	token_test.go:657: 100 iterations, 28 mallocs per iteration
	token_test.go:657: 2000 iterations, 28 mallocs per iteration
BenchmarkLowLevelTokenizer	2000	1134300 ns/op	68.91 MB/s
--- BENCH: BenchmarkLowLevelTokenizer
	token_test.go:657: 1 iterations, 41 mallocs per iteration
	token_test.go:657: 100 iterations, 41 mallocs per iteration
	token_test.go:657: 2000 iterations, 41 mallocs per iteration
BenchmarkHighLevelTokenizer	1000	2096179 ns/op	37.29 MB/s
--- BENCH: BenchmarkHighLevelTokenizer
	token_test.go:657: 1 iterations, 6616 mallocs per iteration
	token_test.go:657: 100 iterations, 6616 mallocs per iteration
	token_test.go:657: 1000 iterations, 6616 mallocs per iteration

R=rsc
CC=andybalholm, golang-dev, r
https://golang.org/cl/6257067
-
- 29 May, 2012 23 commits
-
-
Brad Fitzpatrick authored
R=golang-dev, r CC=golang-dev https://golang.org/cl/6259052
-
Rémy Oudompheng authored
Fixes #3345. R=golang-dev, r, rsc, dave CC=golang-dev, remy https://golang.org/cl/6214061
-
Rob Pike authored
The check for Stringer etc. can only fire if the test is not a builtin, so avoid the expensive check if we know there's no chance. Also put in a fast path for pad, which saves a more modest amount.

benchmark                      old ns/op    new ns/op    delta
BenchmarkSprintfEmpty                148          152   +2.70%
BenchmarkSprintfString               585          497  -15.04%
BenchmarkSprintfInt                  441          396  -10.20%
BenchmarkSprintfIntInt               718          603  -16.02%
BenchmarkSprintfPrefixedInt          676          621   -8.14%
BenchmarkSprintfFloat               1003          953   -4.99%
BenchmarkManyArgs                   2945         2312  -21.49%
BenchmarkScanInts                1704152      1734441   +1.78%
BenchmarkScanRecursiveInt        1837397      1828920   -0.46%

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/6245068
-
Russ Cox authored
Also convert table to use tagged literal. R=golang-dev, bradfitz CC=golang-dev https://golang.org/cl/6258061
-
Brad Fitzpatrick authored
R=golang-dev, rsc CC=golang-dev https://golang.org/cl/6249065
-
Peter Kleiweg authored
Two fixes for indentation problems:

1. Properly recognize multi-line strings. These start with `, not ".
2. Don't indent a line if the beginning of the line is the end of a multi-line string. This happened for instance when inserting a closing bracket after a multi-line string.

R=sameer
CC=golang-dev
https://golang.org/cl/6157044
-
Robert Griesemer authored
Fixes #3682. R=rsc CC=golang-dev https://golang.org/cl/6256067
-
Brad Fitzpatrick authored
This prevents clients from seeing RSTs and missing the response body. TCP stacks vary. The included test failed on Darwin before but passed on Linux. R=golang-dev, rsc CC=golang-dev https://golang.org/cl/6256066
-
Russ Cox authored
-
Dmitriy Vyukov authored
See time/sleep_test.go for repro. R=golang-dev, r, rsc CC=golang-dev, patrick.allen.higgins https://golang.org/cl/6250072
-
Brad Fitzpatrick authored
It was only being used for (*Stmt).Exec, not Query, and not for the same two methods on *DB. This unifies (*Stmt).Exec's old inline code into the old subsetArgs function, renaming it in the process (changing the old word "subset" to "driver", mostly converted earlier). Fixes #3640. R=golang-dev, rsc CC=golang-dev https://golang.org/cl/6258045
-
Russ Cox authored
It's sad to introduce a new macro, but rnd shows up consistently in profiles, and the function call overwhelms the two arithmetic instructions it performs. R=r CC=golang-dev https://golang.org/cl/6260051
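For illustration only, and written in Go rather than the C of the original change: assuming rnd rounds a value up to a multiple of a power-of-two alignment, the work it does amounts to one add and one mask, which is why call overhead can dominate. The name roundUp and the power-of-two restriction are assumptions of this sketch, not details taken from the commit.

```go
package main

import "fmt"

// roundUp (hypothetical name) rounds x up to the next multiple of n.
// It is only correct when n is a power of two, an assumption of this
// sketch: the mask n-1 then covers exactly the low bits to clear.
func roundUp(x, n uintptr) uintptr {
	return (x + n - 1) &^ (n - 1)
}

func main() {
	fmt.Println(roundUp(13, 8)) // prints 16
	fmt.Println(roundUp(16, 8)) // prints 16
}
```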
-
Rob Pike authored
Moving panic out of line speeds up fannkuch almost a factor of two. Changes to bitwhacking code affect mandelbrot badly. R=golang-dev, bradfitz, rsc, r CC=golang-dev https://golang.org/cl/6258056
-
Russ Cox authored
It's broken and seems to be exp/types's fault. Update #3682. R=golang-dev, r CC=golang-dev https://golang.org/cl/6243068
-
Joel Sing authored
R=golang-dev, rsc CC=golang-dev https://golang.org/cl/6254055
-
Mikio Hara authored
R=rsc CC=golang-dev https://golang.org/cl/6242067
-
Russ Cox authored
Rename _Block to block, don't bother making it compute count. Add benchmarks. R=agl, agl CC=golang-dev https://golang.org/cl/6243053
-
Mikio Hara authored
breaks public API document style

««« original CL description
net: fix comment on FileListener

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/6248054
»»»

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6242066
-
Peter Kleiweg authored
Fixes #3509. Fixes #2767. R=golang-dev, sameer CC=golang-dev https://golang.org/cl/6139066
-
Akshat Kumar authored
Plan 9 versions for amd64 have 2 megabyte pages. This also fixes the logic for 32-bit vs 64-bit Plan 9, making 64-bit the default, and adds logic to generate a symbol table. R=golang-dev, rsc, rminnich, ality, 0intro CC=golang-dev, john https://golang.org/cl/6218046
-
Russ Cox authored
The old code generated for a bounds check was

	CMP
	JLT ok
	CALL panicindex
ok:
	...

The new code is (once the linker finishes with it):

	CMP
	JGE panic
	...
panic:
	CALL panicindex

which moves the calls out of line, putting more useful code in each cache line. This matters especially in tight loops, such as in Fannkuch. The benefit is more modest elsewhere, but real.

From test/bench/go1, amd64:

benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    6096092000   6088808000   -0.12%
BenchmarkFannkuch11      6151404000   4020463000  -34.64%
BenchmarkGobDecode         28990050     28894630   -0.33%
BenchmarkGobEncode         12406310     12136730   -2.17%
BenchmarkGzip                179923       179903   -0.01%
BenchmarkGunzip               11219        11130   -0.79%
BenchmarkJSONEncode        86429350     86515900   +0.10%
BenchmarkJSONDecode       334593800    315728400   -5.64%
BenchmarkRevcomp25M      1219763000   1180767000   -3.20%
BenchmarkTemplate         492947600    483646800   -1.89%

And 386:

benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    6354902000   6243000000   -1.76%
BenchmarkFannkuch11      8043769000   7326965000   -8.91%
BenchmarkGobDecode         19010800     18941230   -0.37%
BenchmarkGobEncode         14077500     13792460   -2.02%
BenchmarkGzip                194087       193619   -0.24%
BenchmarkGunzip               12495        12457   -0.30%
BenchmarkJSONEncode       125636400    125451400   -0.15%
BenchmarkJSONDecode       696648600    685032800   -1.67%
BenchmarkRevcomp25M      2058088000   2052545000   -0.27%
BenchmarkTemplate         602140000    589876800   -2.04%

To implement this, two new instruction forms:

	JLT target      // same as always
	JLT $0, target  // branch expected not taken
	JLT $1, target  // branch expected taken

The linker could also emit the prediction prefixes, but it does not: expected taken branches are reversed so that the expected case is not taken (as in example above), and the default expectation for such a jump is not taken already.

R=golang-dev, gri, r, dave
CC=golang-dev
https://golang.org/cl/6248049
-
Sameer Ajmani authored
R=golang-dev, bradfitz CC=golang-dev https://golang.org/cl/6260049
-
Andrew Balholm authored
Implement the (3-per-family) Noah's Ark clause (i.e. don't put more than three identical elements on the list of active formatting elements). Also, when running tests, sort attributes by name before dumping them. Pass 4 additional tests with Noah's Ark clause (including one that needs attributes to be sorted). Pass 5 additional, unrelated tests because of sorting attributes. R=nigeltao, rsc CC=golang-dev https://golang.org/cl/6247056
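A rough sketch of the Noah's Ark clause as the HTML parsing algorithm describes it, not the exp/html implementation: before pushing an element onto the list of active formatting elements, count the identical entries after the last marker, and if there are already three, drop the earliest one. The element type, sameElement, and pushFormatting are hypothetical names; namespaces are ignored here for brevity.

```go
package main

import "fmt"

// element is a simplified stand-in for a parser node (hypothetical type).
// A marker entry separates scopes in the list of active formatting elements.
type element struct {
	marker bool
	tag    string
	attrs  map[string]string
}

// sameElement reports whether two elements have the same tag name and
// the same set of attributes. Namespace comparison is omitted.
func sameElement(a, b element) bool {
	if a.tag != b.tag || len(a.attrs) != len(b.attrs) {
		return false
	}
	for k, v := range a.attrs {
		if w, ok := b.attrs[k]; !ok || w != v {
			return false
		}
	}
	return true
}

// pushFormatting appends e to the list, first applying the clause:
// at most three identical elements may follow the last marker.
func pushFormatting(list []element, e element) []element {
	count, earliest := 0, -1
	for i := len(list) - 1; i >= 0; i-- {
		if list[i].marker {
			break
		}
		if sameElement(list[i], e) {
			count++
			earliest = i // scanning backwards, so this ends at the earliest match
		}
	}
	if count >= 3 {
		list = append(list[:earliest], list[earliest+1:]...)
	}
	return append(list, e)
}

func main() {
	var list []element
	for i := 0; i < 5; i++ {
		list = pushFormatting(list, element{tag: "b"})
	}
	fmt.Println(len(list)) // prints 3
}
```

Comparing attributes as a set is also why a test dump benefits from sorting attributes by name: two elements that are equal under this rule can otherwise print in different orders.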
-
- 28 May, 2012 6 commits
-
-
Mikio Hara authored
R=golang-dev, bradfitz CC=golang-dev https://golang.org/cl/6248054
-
Mikio Hara authored
R=golang-dev, bradfitz CC=golang-dev https://golang.org/cl/6256059
-
Brad Fitzpatrick authored
R=golang-dev, dsymonds, r CC=golang-dev https://golang.org/cl/6242062
-
Brad Fitzpatrick authored
CanonicalHeaderKey didn't allocate, but it did use unnecessary CPU in the hot path, deciding it didn't need to allocate. I considered using constants for all these common header keys but I didn't think it would be prettier. "Content-Length" looks better than contentLength or hdrContentLength, etc. R=golang-dev, dave CC=golang-dev https://golang.org/cl/6255053
-
Brad Fitzpatrick authored
R=golang-dev, r CC=golang-dev https://golang.org/cl/6248053
-
Brad Fitzpatrick authored
Fixes #3535 R=golang-dev, dsymonds CC=golang-dev https://golang.org/cl/6245060
-