- 27 Feb, 2018 8 commits
-
-
Josh Bleecher Snyder authored
Minor improvements, noticed while investigating other things. Shorten the prologue. Make branch direction better for static branch prediction; the most common case by far is switching stacks (g==curg). Change-Id: Ib2211d3efecb60446355cda56194221ccb78057d Reviewed-on: https://go-review.googlesource.com/97377 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Joe Tsai authored
When a var or const declaration contains a mixture of exported and unexported identifiers, replace the unexported identifiers with underscore. Otherwise, the LHS and the RHS may mismatch or the declaration may mismatch with an iota from above. Fixes #22426 Change-Id: Icd5fb81b4ece647232a9f7d05cb140227091e9cb Reviewed-on: https://go-review.googlesource.com/94877 Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>
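To see why dropping the unexported names outright would break such a declaration, here is a minimal sketch. The identifiers are hypothetical and not taken from the CL; the point is only that an underscore keeps every remaining name at its original iota position.

```go
package main

import "fmt"

// Hypothetical original source, where the middle constant is unexported:
//
//	const (
//		First = iota
//		second
//		Third
//	)
//
// Rendering it with the unexported name replaced by an underscore keeps
// the remaining identifiers at their original iota positions.
const (
	First = iota // 0
	_            // stands in for the unexported "second"
	Third        // 2, as in the original declaration
)

// Dropping the unexported line instead would silently renumber Third.
const (
	WrongFirst = iota // 0
	WrongThird        // 1, no longer the original value
)

func main() {
	fmt.Println(First, Third)           // 0 2
	fmt.Println(WrongFirst, WrongThird) // 0 1
}
```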
-
erifan01 authored
Improve performance by reducing unnecessary function calls.

Benchmarks:

name    old time/op  new time/op  delta
Cosh-8  229ns ± 0%   138ns ± 0%   -39.74%  (p=0.008 n=5+5)
Sinh-8  231ns ± 0%   139ns ± 0%   -39.83%  (p=0.008 n=5+5)

Change-Id: Icab5485849bbfaafca8429d06b67c558101f4f3c Reviewed-on: https://go-review.googlesource.com/85477 Reviewed-by: Robert Griesemer <gri@golang.org>
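The message does not say which calls were removed. As an assumption for illustration only, one common way to save a call in a hyperbolic cosine is to compute the exponential once and reuse its reciprocal. The function names below are hypothetical, and the sketch ignores the overflow handling a real implementation needs for large arguments.

```go
package main

import (
	"fmt"
	"math"
)

// coshTwoCalls evaluates the textbook formula with two calls to math.Exp.
func coshTwoCalls(x float64) float64 {
	return (math.Exp(x) + math.Exp(-x)) * 0.5
}

// coshOneCall reuses a single exponential, since exp(-x) == 1/exp(x).
// Assumption: |x| is moderate, so neither exp(x) nor 1/exp(x) overflows.
func coshOneCall(x float64) float64 {
	ex := math.Exp(x)
	return (ex + 1/ex) * 0.5
}

func main() {
	for _, x := range []float64{0, 0.5, 1, 5} {
		fmt.Println(x, coshTwoCalls(x), coshOneCall(x))
	}
}
```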
-
Josh Bleecher Snyder authored
Change-Id: I855268a4c0d07ad602ec90f5da66422d3d87c5f2 Reviewed-on: https://go-review.googlesource.com/94595 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>
-
Giovanni Bajo authored
Bit-test rules failed to match when matching the highest bit of a word because operands in SSA are signed int64. Fix the rules by treating the operands as unsigned (and correctly handling 32-bit operands as well). Tests will be added in the next CL. Change-Id: I491c4e88e7e2f87e9bb72bd0d9fa5d4025b90736 Reviewed-on: https://go-review.googlesource.com/94765 Reviewed-by: Keith Randall <khr@golang.org>
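The failure mode can be reproduced outside the compiler. The sketch below is not the SSA rule code; the two helper names are hypothetical. It shows why a single-bit check written for signed values misses bit 63, and how reinterpreting the same bits as unsigned fixes it.

```go
package main

import (
	"fmt"
	"math"
)

// isPowerOfTwoSigned mirrors a check written for signed constants.
// It rejects 1<<63 because that bit pattern is negative as an int64.
func isPowerOfTwoSigned(c int64) bool {
	return c > 0 && c&(c-1) == 0
}

// isPowerOfTwoUnsigned reinterprets the same bits as unsigned, so the
// highest bit of a 64-bit word is accepted like any other single bit.
func isPowerOfTwoUnsigned(c int64) bool {
	u := uint64(c)
	return u != 0 && u&(u-1) == 0
}

func main() {
	top := int64(math.MinInt64) // bit pattern 1<<63
	fmt.Println(isPowerOfTwoSigned(top), isPowerOfTwoUnsigned(top)) // false true
}
```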
-
Giovanni Bajo authored
Spotted while working on #18943; it triggers once during bootstrap. Change-Id: Ia4330ccc6395627c233a8eb4dcc0e3e2a770bea7 Reviewed-on: https://go-review.googlesource.com/94764 Reviewed-by: Keith Randall <khr@golang.org>
-
Chad Rosier authored
This reduces the go tool binary on arm64 by 12k. go1 results on Amberwing: name old time/op new time/op delta RegexpMatchEasy0_32 249ns ± 0% 249ns ± 0% ~ (p=0.087 n=10+10) RegexpMatchEasy0_1K 584ns ± 0% 584ns ± 0% ~ (all equal) RegexpMatchEasy1_32 246ns ± 0% 246ns ± 0% ~ (p=1.000 n=10+10) RegexpMatchEasy1_1K 806ns ± 0% 806ns ± 0% ~ (p=0.706 n=10+9) RegexpMatchMedium_32 314ns ± 0% 314ns ± 0% ~ (all equal) RegexpMatchMedium_1K 52.1µs ± 0% 52.1µs ± 0% ~ (p=0.245 n=10+8) RegexpMatchHard_32 2.75µs ± 1% 2.75µs ± 1% ~ (p=0.690 n=10+10) RegexpMatchHard_1K 78.9µs ± 0% 78.9µs ± 1% ~ (p=0.295 n=9+9) FmtFprintfEmpty 58.5ns ± 0% 58.5ns ± 0% ~ (all equal) FmtFprintfString 112ns ± 0% 112ns ± 0% ~ (all equal) FmtFprintfInt 117ns ± 0% 116ns ± 0% -0.85% (p=0.000 n=10+10) FmtFprintfIntInt 181ns ± 0% 181ns ± 0% ~ (all equal) FmtFprintfPrefixedInt 222ns ± 0% 224ns ± 0% +0.90% (p=0.000 n=9+10) FmtFprintfFloat 318ns ± 1% 322ns ± 0% ~ (p=0.059 n=10+8) FmtManyArgs 736ns ± 1% 735ns ± 0% ~ (p=0.206 n=9+9) Gzip 437ms ± 0% 436ms ± 0% -0.25% (p=0.000 n=10+10) HTTPClientServer 89.8µs ± 1% 90.2µs ± 2% ~ (p=0.393 n=10+10) JSONEncode 20.1ms ± 1% 20.2ms ± 1% ~ (p=0.065 n=9+10) JSONDecode 94.2ms ± 1% 93.9ms ± 1% -0.42% (p=0.043 n=10+10) GobDecode 12.7ms ± 1% 12.8ms ± 2% +0.94% (p=0.019 n=10+10) GobEncode 12.1ms ± 0% 12.1ms ± 0% ~ (p=0.052 n=10+10) Mandelbrot200 5.06ms ± 0% 5.05ms ± 0% -0.04% (p=0.000 n=9+10) TimeParse 450ns ± 3% 446ns ± 0% ~ (p=0.238 n=10+9) TimeFormat 485ns ± 1% 483ns ± 1% ~ (p=0.073 n=10+10) Template 90.4ms ± 0% 90.7ms ± 0% +0.29% (p=0.000 n=8+10) GoParse 6.01ms ± 0% 6.03ms ± 0% +0.35% (p=0.000 n=10+10) BinaryTree17 11.7s ± 0% 11.7s ± 0% ~ (p=0.481 n=10+10) Revcomp 669ms ± 0% 669ms ± 0% ~ (p=0.315 n=10+10) Fannkuch11 3.40s ± 0% 3.37s ± 0% -0.92% (p=0.000 n=10+10) [Geo mean] 67.9µs 67.9µs +0.02% name old speed new speed delta RegexpMatchEasy0_32 128MB/s ± 0% 128MB/s ± 0% -0.08% (p=0.003 n=8+10) RegexpMatchEasy0_1K 1.75GB/s ± 0% 1.75GB/s ± 0% ~ (p=0.642 n=8+10) RegexpMatchEasy1_32 
130MB/s ± 0% 130MB/s ± 0% ~ (p=0.690 n=10+9) RegexpMatchEasy1_1K 1.27GB/s ± 0% 1.27GB/s ± 0% ~ (p=0.661 n=10+9) RegexpMatchMedium_32 3.18MB/s ± 0% 3.18MB/s ± 0% ~ (all equal) RegexpMatchMedium_1K 19.7MB/s ± 0% 19.6MB/s ± 0% ~ (p=0.190 n=10+9) RegexpMatchHard_32 11.6MB/s ± 0% 11.6MB/s ± 1% ~ (p=0.669 n=10+10) RegexpMatchHard_1K 13.0MB/s ± 0% 13.0MB/s ± 0% ~ (p=0.718 n=9+9) Gzip 44.4MB/s ± 0% 44.5MB/s ± 0% +0.24% (p=0.000 n=10+10) JSONEncode 96.5MB/s ± 1% 96.1MB/s ± 1% ~ (p=0.065 n=9+10) JSONDecode 20.6MB/s ± 1% 20.7MB/s ± 1% +0.42% (p=0.041 n=10+10) GobDecode 60.6MB/s ± 1% 60.0MB/s ± 2% -0.92% (p=0.016 n=10+10) GobEncode 63.4MB/s ± 0% 63.6MB/s ± 0% ~ (p=0.055 n=10+10) Template 21.5MB/s ± 0% 21.4MB/s ± 0% -0.30% (p=0.000 n=9+10) GoParse 9.64MB/s ± 0% 9.61MB/s ± 0% -0.36% (p=0.000 n=10+10) Revcomp 380MB/s ± 0% 380MB/s ± 0% ~ (p=0.323 n=10+10) [Geo mean] 56.0MB/s 55.9MB/s -0.07% Change-Id: Ia732fa57fbcf4767d72382516d9f16705d177736 Reviewed-on: https://go-review.googlesource.com/96435 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
-
Josh Bleecher Snyder authored
Moving tighten after lowering benefits from the removal of values by lowering and lowered CSE. It lets us make better decisions about which values are rematerializable and which generate flags. Empirically, it lowers stack usage (by avoiding spills) and generates slightly smaller and faster binaries. Fixes #19853 Fixes #21041 name old time/op new time/op delta Template 195ms ± 4% 193ms ± 4% -1.33% (p=0.000 n=92+97) Unicode 94.1ms ± 9% 92.5ms ± 8% -1.66% (p=0.002 n=97+95) GoTypes 572ms ± 5% 566ms ± 7% -0.92% (p=0.001 n=95+98) Compiler 2.56s ± 4% 2.52s ± 3% -1.41% (p=0.000 n=94+97) SSA 6.52s ± 2% 6.47s ± 3% -0.82% (p=0.000 n=96+94) Flate 117ms ± 5% 116ms ± 7% -0.72% (p=0.018 n=97+97) GoParser 148ms ± 6% 146ms ± 4% -0.97% (p=0.002 n=98+95) Reflect 370ms ± 7% 363ms ± 6% -1.79% (p=0.000 n=99+98) Tar 175ms ± 6% 173ms ± 6% -1.11% (p=0.001 n=94+95) XML 204ms ± 6% 201ms ± 5% -1.49% (p=0.000 n=97+96) [Geo mean] 363ms 359ms -1.22% name old user-time/op new user-time/op delta Template 251ms ± 5% 245ms ± 5% -2.40% (p=0.000 n=97+93) Unicode 131ms ±10% 128ms ± 9% -1.93% (p=0.001 n=100+99) GoTypes 760ms ± 4% 752ms ± 4% -0.96% (p=0.000 n=97+95) Compiler 3.51s ± 3% 3.48s ± 2% -1.04% (p=0.000 n=96+95) SSA 9.57s ± 4% 9.52s ± 2% -0.50% (p=0.004 n=97+96) Flate 149ms ± 6% 147ms ± 6% -1.46% (p=0.000 n=98+96) GoParser 184ms ± 5% 181ms ± 7% -1.84% (p=0.000 n=98+97) Reflect 469ms ± 6% 461ms ± 6% -1.69% (p=0.000 n=100+98) Tar 219ms ± 8% 217ms ± 7% -0.90% (p=0.035 n=96+96) XML 255ms ± 5% 251ms ± 6% -1.48% (p=0.000 n=98+98) [Geo mean] 476ms 469ms -1.42% name old alloc/op new alloc/op delta Template 37.8MB ± 0% 37.8MB ± 0% -0.17% (p=0.000 n=100+100) Unicode 28.8MB ± 0% 28.8MB ± 0% -0.02% (p=0.000 n=100+95) GoTypes 112MB ± 0% 112MB ± 0% -0.20% (p=0.000 n=100+97) Compiler 466MB ± 0% 464MB ± 0% -0.27% (p=0.000 n=100+100) SSA 1.49GB ± 0% 1.49GB ± 0% -0.08% (p=0.000 n=100+99) Flate 24.4MB ± 0% 24.3MB ± 0% -0.25% (p=0.000 n=98+99) GoParser 30.7MB ± 0% 30.6MB ± 0% -0.26% (p=0.000 n=99+100) Reflect 
76.4MB ± 0% 76.4MB ± 0% ~ (p=0.253 n=100+100) Tar 38.9MB ± 0% 38.8MB ± 0% -0.20% (p=0.000 n=100+97) XML 41.5MB ± 0% 41.4MB ± 0% -0.19% (p=0.000 n=100+98) [Geo mean] 77.5MB 77.4MB -0.16% name old allocs/op new allocs/op delta Template 381k ± 0% 381k ± 0% -0.15% (p=0.000 n=100+100) Unicode 342k ± 0% 342k ± 0% -0.01% (p=0.000 n=100+98) GoTypes 1.19M ± 0% 1.18M ± 0% -0.24% (p=0.000 n=100+100) Compiler 4.52M ± 0% 4.50M ± 0% -0.29% (p=0.000 n=100+100) SSA 12.3M ± 0% 12.3M ± 0% -0.11% (p=0.000 n=100+100) Flate 234k ± 0% 234k ± 0% -0.26% (p=0.000 n=99+96) GoParser 318k ± 0% 317k ± 0% -0.21% (p=0.000 n=99+100) Reflect 974k ± 0% 974k ± 0% -0.03% (p=0.000 n=100+100) Tar 392k ± 0% 391k ± 0% -0.17% (p=0.000 n=100+99) XML 404k ± 0% 403k ± 0% -0.24% (p=0.000 n=99+99) [Geo mean] 794k 792k -0.17% name old object-bytes new object-bytes delta Template 393kB ± 0% 392kB ± 0% -0.19% (p=0.008 n=5+5) Unicode 207kB ± 0% 207kB ± 0% ~ (all equal) GoTypes 1.23MB ± 0% 1.22MB ± 0% -0.11% (p=0.008 n=5+5) Compiler 4.34MB ± 0% 4.33MB ± 0% -0.15% (p=0.008 n=5+5) SSA 9.85MB ± 0% 9.85MB ± 0% -0.07% (p=0.008 n=5+5) Flate 235kB ± 0% 234kB ± 0% -0.59% (p=0.008 n=5+5) GoParser 297kB ± 0% 296kB ± 0% -0.22% (p=0.008 n=5+5) Reflect 1.03MB ± 0% 1.03MB ± 0% -0.00% (p=0.008 n=5+5) Tar 332kB ± 0% 331kB ± 0% -0.15% (p=0.008 n=5+5) XML 413kB ± 0% 412kB ± 0% -0.19% (p=0.008 n=5+5) [Geo mean] 728kB 727kB -0.17% Change-Id: I9b5cdb668ed102a001897a05e833105acba220a2 Reviewed-on: https://go-review.googlesource.com/95995 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
-
- 26 Feb, 2018 22 commits
-
-
Keith Randall authored
Allow the compiler to generate code like CMPQ 16(AX), $7. This is tricky because such a comparison is difficult to spill during flagalloc: the same memory state might not be available at the restore locations. Solve this problem by decomposing the compare+load back into its parts if it needs to be spilled. The big win is that the write barrier test goes from:

    MOVL runtime.writeBarrier(SB), CX
    TESTL CX, CX
    JNE 60

to

    CMPL runtime.writeBarrier(SB), $0
    JNE 59

It's one instruction and one byte smaller. Fixes #19485 Fixes #15245 Update #22460 Binaries are about 0.15% smaller. Change-Id: I4fd8d1111b6b9924d52f9a0901ca1b2e5cce0836 Reviewed-on: https://go-review.googlesource.com/86035 Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
-
Kunpei Sakai authored
Previously, finishcompare just used SetTypecheck, but this didn't recursively update any subexpressions of untyped bool type. This CL changes it to call typecheck, which handles them correctly. Also cleaned up outdated code for simplifying logic. Updates #23834 Change-Id: Ic7f92d2a77c2eb74024ee97815205371761c1c90 Reviewed-on: https://go-review.googlesource.com/97035 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
Kunpei Sakai authored
CL generated mechanically with github.com/mdempsky/unconvert. Also updated cmd/compile/internal/ssa/gen/*.rules manually. Change-Id: If721ef73cf0771ae83ce7e2d11623fc8d9155768 Reviewed-on: https://go-review.googlesource.com/97075 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Ilya Tocar authored
Currently we generate NEGQ for DIV{Q,L,W}. By generating NEGL and NEGW, we will reduce code size, because NEGL doesn't require a REX prefix. This also guarantees that the upper 32 bits are zeroed, so we can revert CL 85736 and remove zero-extensions of DIVL results. Also add a test for redundant zero-extension elimination. Fixes #23310 Change-Id: Ic58c3104c255a71371a06e09d10a975bbe5df587 Reviewed-on: https://go-review.googlesource.com/96815 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
Yury Smolsky authored
FileHeader.Name also reflects this fact. Fixes #24018 Change-Id: Id0860a9b23c264ac4c6ddd65ba20e0f1f36e4865 Reviewed-on: https://go-review.googlesource.com/97057 Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
-
motemen authored
The output of go with -x flag is formatted in a manner that file paths under current directory are modified to start with a dot (.), but when the directory path ends with a slash (/), the formatting goes wrong. Fixes #23982 Change-Id: I8f8d15dd52bee882a9c6357eb9eabdc3eaa887c3 GitHub-Last-Rev: 1493f38bafdf2c40f16392b794fd1a12eb12a151 GitHub-Pull-Request: golang/go#23985 Reviewed-on: https://go-review.googlesource.com/95755 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Hana Kim authored
This is for debugging the reported flaky tests. Update #24081 Change-Id: Ica046928f675d69e38251a47a6f225efedce920c Reviewed-on: https://go-review.googlesource.com/96855 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com>
-
Robert Griesemer authored
Change https://go-review.googlesource.com/79575 fixed the computation of recursive method sets by separating the method set computation from type computation. However, it didn't track an embedded method's scope and as a result, some methods' signatures were typed in the wrong context. This change tracks embedded methods together with their scope and uses that scope for the correct context setup when typing those method signatures. Fixes #23914. Change-Id: If3677dceddb43e9db2f9fb3c7a4a87d2531fbc2a Reviewed-on: https://go-review.googlesource.com/96376 Reviewed-by: Alan Donovan <adonovan@google.com>
-
Ian Lance Taylor authored
Fixes #24115 Change-Id: I89d3d5a9c0916fd2e21fe5930549c4129de8ab48 Reviewed-on: https://go-review.googlesource.com/96983 Reviewed-by: Giovanni Bajo <rasky@develer.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Rens Rikkerink authored
When using the special import "C", the "cgo" build constraint is implied for the Go file, potentially triggering unclear "undefined" error messages. Explain this explicitly in the documentation. Updates #24068 Change-Id: Ib656ceccd52c749ffe7fb2d3db9ac144f17abb32 GitHub-Last-Rev: 5a13f00a9b917e51246a5fbb642c4e9ed55aa21d GitHub-Pull-Request: golang/go#24072 Reviewed-on: https://go-review.googlesource.com/96655 Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Robert Griesemer authored
Extend cmd/internal/src.PosBase to track column information, and adjust the meaning of the PosBase position to mean the position at which the PosBase's relative (line, col) position starts (rather than indicating the position of the //line directive). Because this semantic change is made in the compiler's noder, it doesn't affect the logic of src.PosBase, only its test setup (where PosBases are constructed with corrected incoming positions). In short, src.PosBase now matches syntax.PosBase with respect to the semantics of src.PosBase.pos. For #22662. Change-Id: I5b1451cb88fff3f149920c2eec08b6167955ce27 Reviewed-on: https://go-review.googlesource.com/96535 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
Robert Griesemer authored
For line directives that have a line and a column number, an omitted filename means that the filename has not changed (per the issue below). For line directives without a column number, an omitted filename means the empty filename (to preserve the existing behavior). For #22662. Change-Id: I32cd9037550485da5445a34bb104706eccce1df1 Reviewed-on: https://go-review.googlesource.com/96476 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
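The filename rule described above can be written down as a small decision function. This is only a sketch of the rule as stated in the message, not the compiler's code; directiveFilename and its parameters are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// directiveFilename is a hypothetical helper that returns the filename in
// effect after a line directive.
//   prev   - filename in effect before the directive
//   given  - filename written in the directive ("" if omitted)
//   hasCol - whether the directive carries a column number
func directiveFilename(prev, given string, hasCol bool) string {
	if given == "" && hasCol {
		// e.g. "//line :10:5": the filename has not changed.
		return prev
	}
	// e.g. "//line :10" yields the empty filename (existing behavior);
	// otherwise the directive's own filename is used.
	return given
}

func main() {
	fmt.Printf("%q\n", directiveFilename("a.go", "", true))     // "a.go"
	fmt.Printf("%q\n", directiveFilename("a.go", "", false))    // ""
	fmt.Printf("%q\n", directiveFilename("a.go", "b.go", true)) // "b.go"
}
```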
-
Robert Griesemer authored
For dependency reasons, the data structure implementing source positions in the compiler is in cmd/internal/src. It contains highly compiler-specific details (e.g. inlining index). This change introduces a parallel but simpler position representation, defined in the syntax package, which removes that package's dependency on cmd/internal/src, and also removes the need to deal with certain filename-specific operations (defined by the needs of the compiler) in the syntax package. As a result, the syntax package becomes again a compiler-independent, stand-alone package that at some point might replace (or augment) the existing top-level go/* syntax-related packages. Additionally, line directives that update column numbers are now correctly tracked through the syntax package, with additional tests added. (The respective changes also need to be made in cmd/internal/src; i.e., the compiler accepts but still ignores column numbers in line directives.) This change comes at the cost of a new position translation step, but that step is cheap because it only needs to do real work if the position base changed (i.e., if there is a new file, or new line directive).
There is no noticeable impact on overall compiler performance measured with `compilebench -count 5 -alloc`: name old time/op new time/op delta Template 220ms ± 8% 228ms ±18% ~ (p=0.548 n=5+5) Unicode 119ms ±11% 113ms ± 5% ~ (p=0.056 n=5+5) GoTypes 684ms ± 6% 677ms ± 3% ~ (p=0.841 n=5+5) Compiler 3.19s ± 7% 3.01s ± 1% ~ (p=0.095 n=5+5) SSA 7.92s ± 8% 7.79s ± 1% ~ (p=0.690 n=5+5) Flate 141ms ± 7% 139ms ± 4% ~ (p=0.548 n=5+5) GoParser 173ms ±12% 171ms ± 4% ~ (p=1.000 n=5+5) Reflect 417ms ± 5% 411ms ± 3% ~ (p=0.548 n=5+5) Tar 205ms ± 5% 198ms ± 2% ~ (p=0.690 n=5+5) XML 232ms ± 4% 229ms ± 4% ~ (p=0.690 n=5+5) StdCmd 28.7s ± 5% 28.2s ± 2% ~ (p=0.421 n=5+5) name old user-time/op new user-time/op delta Template 269ms ± 4% 265ms ± 5% ~ (p=0.421 n=5+5) Unicode 153ms ± 7% 149ms ± 3% ~ (p=0.841 n=5+5) GoTypes 850ms ± 7% 862ms ± 4% ~ (p=0.841 n=5+5) Compiler 4.01s ± 5% 3.86s ± 0% ~ (p=0.190 n=5+4) SSA 10.9s ± 4% 10.8s ± 2% ~ (p=0.548 n=5+5) Flate 166ms ± 7% 167ms ± 6% ~ (p=1.000 n=5+5) GoParser 204ms ± 8% 206ms ± 7% ~ (p=0.841 n=5+5) Reflect 514ms ± 5% 508ms ± 4% ~ (p=0.548 n=5+5) Tar 245ms ± 6% 244ms ± 3% ~ (p=0.690 n=5+5) XML 280ms ± 4% 278ms ± 4% ~ (p=0.841 n=5+5) name old alloc/op new alloc/op delta Template 37.9MB ± 0% 37.9MB ± 0% ~ (p=0.841 n=5+5) Unicode 28.8MB ± 0% 28.8MB ± 0% ~ (p=0.841 n=5+5) GoTypes 113MB ± 0% 113MB ± 0% ~ (p=0.151 n=5+5) Compiler 468MB ± 0% 468MB ± 0% -0.01% (p=0.032 n=5+5) SSA 1.50GB ± 0% 1.50GB ± 0% ~ (p=0.548 n=5+5) Flate 24.4MB ± 0% 24.4MB ± 0% ~ (p=1.000 n=5+5) GoParser 30.7MB ± 0% 30.7MB ± 0% ~ (p=1.000 n=5+5) Reflect 76.5MB ± 0% 76.5MB ± 0% ~ (p=0.548 n=5+5) Tar 38.9MB ± 0% 38.9MB ± 0% ~ (p=0.222 n=5+5) XML 41.6MB ± 0% 41.6MB ± 0% ~ (p=0.548 n=5+5) name old allocs/op new allocs/op delta Template 382k ± 0% 382k ± 0% +0.01% (p=0.008 n=5+5) Unicode 343k ± 0% 343k ± 0% ~ (p=0.841 n=5+5) GoTypes 1.19M ± 0% 1.19M ± 0% +0.01% (p=0.008 n=5+5) Compiler 4.53M ± 0% 4.53M ± 0% +0.03% (p=0.008 n=5+5) SSA 12.4M ± 0% 12.4M ± 0% +0.00% (p=0.008 n=5+5) Flate 
235k ± 0% 235k ± 0% ~ (p=0.079 n=5+5) GoParser 318k ± 0% 318k ± 0% ~ (p=0.730 n=5+5) Reflect 978k ± 0% 978k ± 0% ~ (p=1.000 n=5+5) Tar 393k ± 0% 393k ± 0% ~ (p=0.056 n=5+5) XML 405k ± 0% 405k ± 0% ~ (p=0.548 n=5+5) name old text-bytes new text-bytes delta HelloSize 672kB ± 0% 672kB ± 0% ~ (all equal) CmdGoSize 7.12MB ± 0% 7.12MB ± 0% ~ (all equal) name old data-bytes new data-bytes delta HelloSize 133kB ± 0% 133kB ± 0% ~ (all equal) CmdGoSize 390kB ± 0% 390kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.07MB ± 0% 1.07MB ± 0% ~ (all equal) CmdGoSize 11.2MB ± 0% 11.2MB ± 0% ~ (all equal) Passes toolstash compare. For #22662. Change-Id: I19edb53dd9675af57f7122cb7dba2a6d8bdcc3da Reviewed-on: https://go-review.googlesource.com/94515 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
-
Brad Fitzpatrick authored
Despite the existing test that locks in the allocation behavior, people really want a benchmark. So:

BenchmarkBuildString_Builder/1Write_NoGrow-4      20000000  60.4 ns/op   48 B/op  1 allocs/op
BenchmarkBuildString_Builder/3Write_NoGrow-4      10000000   230 ns/op  336 B/op  3 allocs/op
BenchmarkBuildString_Builder/3Write_Grow-4        20000000   102 ns/op  112 B/op  1 allocs/op
BenchmarkBuildString_ByteBuffer/1Write_NoGrow-4   10000000   125 ns/op  160 B/op  2 allocs/op
BenchmarkBuildString_ByteBuffer/3Write_NoGrow-4    5000000   339 ns/op  400 B/op  3 allocs/op
BenchmarkBuildString_ByteBuffer/3Write_Grow-4      5000000   316 ns/op  336 B/op  3 allocs/op

I don't think these allocate-as-fast-as-you-can benchmarks are very interesting because they're effectively just GC benchmarks, but sure. If one wants to see that there's 1 fewer allocation, there it is. The ns/op and B/op numbers will change as the built string size changes. Updates #18990 Change-Id: Ifccf535bd396217434a0e6989e195105f90132ae Reviewed-on: https://go-review.googlesource.com/96980 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Alan Donovan <adonovan@google.com>
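For readers unfamiliar with the Grow and NoGrow variants named in the benchmark, the following sketch shows the difference using the public strings.Builder API. The buildString helper is hypothetical and is not the benchmark code from the CL.

```go
package main

import (
	"fmt"
	"strings"
)

// buildString concatenates parts with a strings.Builder. When grow is true,
// the total size is reserved up front so the writes need not reallocate.
func buildString(parts []string, grow bool) string {
	var b strings.Builder
	if grow {
		n := 0
		for _, p := range parts {
			n += len(p)
		}
		b.Grow(n)
	}
	for _, p := range parts {
		b.WriteString(p)
	}
	return b.String()
}

func main() {
	parts := []string{"a", "bc", "def"}
	fmt.Println(buildString(parts, false)) // abcdef
	fmt.Println(buildString(parts, true))  // abcdef
}
```

Both variants return the same string; only the number of intermediate allocations differs.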
-
Tobias Klauser authored
Error returns for linux/arm syscalls have been handled for a long time. Remove another list of unimplemented syscalls, following CL 96315. The root-only check in TestSyscallNoError was already shown to be sufficient as part of CL 84485. NetBSD and OpenBSD do not implement the sendfile syscall (yet), so add a link to golang.org/issue/5847 Change-Id: I07efc3c3203537a4142707385f31b59dc0ecca42 Reviewed-on: https://go-review.googlesource.com/97115 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Tobias Klauser authored
On Darwin and FreeBSD, supportsCloseOnExec is defined in its own file, even though it is set to true as on other Unices. Drop the separate definitions but keep the accompanying comments. Change-Id: Iab1d20e1b2590800f141d54b55a099c9cd7ae57e Reviewed-on: https://go-review.googlesource.com/97155 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Alex Brainman authored
Fixes #23123 Change-Id: Ia4ac947cc49ef3d150ef60a095b86552dcef397d Reviewed-on: https://go-review.googlesource.com/84435 Reviewed-by: Giovanni Bajo <rasky@develer.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Giovanni Bajo <rasky@develer.com>
-
Tobias Klauser authored
net, internal/poll, net/internal/socktest: use SOCK_{CLOEXEC,NONBLOCK} accept4/socket flags on OpenBSD The SOCK_CLOEXEC and SOCK_NONBLOCK flags to the socket syscall and the accept4 syscall have been supported since OpenBSD 5.7. Follows CL 40895 and CL 94295 Change-Id: Icaf35ace2ef5e73279a70d4f1a9fbf3be9371e6c Reviewed-on: https://go-review.googlesource.com/97196 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Kevin Burke authored
Change-Id: If9fe04894851d60a682346415c2e5523b2f04929 Reviewed-on: https://go-review.googlesource.com/96981 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
-
Kunpei Sakai authored
Change-Id: Ic318c25b21298ec123eb27c814c79f637887713c Reviewed-on: https://go-review.googlesource.com/97135 Run-TryBot: Kunpei Sakai <namusyaka@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Giovanni Bajo authored
Unlike in bash, double quotes cannot be used to group arguments in the Windows shell, so they were being printed as literals by the echo command. Since a literal '>' is present in the string, it is sufficient to escape it correctly with '^'. Change-Id: Icc8c92b3dc8d813825adadbe3d921a38d44a1a94 Reviewed-on: https://go-review.googlesource.com/97056 Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
-
unknown authored
There are a few places where the integer value is used. Use the equivalent constants to aid with readability. Change-Id: I023b1dbe605340544c056d0e0d9d6d5a7d7d0edc GitHub-Last-Rev: c1c90bcd251901f9f2a305ce5ddd0d85009a3d49 GitHub-Pull-Request: golang/go#24123 Reviewed-on: https://go-review.googlesource.com/96984 Reviewed-by: Andrew Bonventre <andybons@golang.org>
-
- 25 Feb, 2018 1 commit
-
-
Agniva De Sarker authored
CL 14624 introduced this label. At that time, the switch-case had a break-to-label statement, which made the label necessary. Now the code no longer has a break statement and returns directly, so the label is no longer needed. Change-Id: Idde0fcc4d2db2d76424679f5acfe33ab8573bce4 Reviewed-on: https://go-review.googlesource.com/96935 Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
-
- 24 Feb, 2018 3 commits
-
-
Lubomir I. Ivanov (VMware) authored
newUserFromSid() is extended so that the retrieval of the user home path based on a user SID becomes possible. (1) The primary method it uses is to look up the following key in the Windows registry: HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList\[SID] If the key does not exist the user might not have logged in yet. If (1) fails it falls back to (2). (2) The second method the function uses is to look at the default home path for users (e.g. WINAPI's GetProfilesDirectory()) and append the username to that. The procedure is along the lines of: c:\Users + \ + <username> The function newUser() now requires the following arguments: uid, gid, dir, username, domain This is done to avoid multiple calls to usid.String() and usid.LookupAccount("") in the case of a newUserFromSid() call stack. The functions current() and newUserFromSid() both call newUser() supplying the arguments in question. The helpers lookupUsernameAndDomain() and findHomeDirInRegistry() are added. This commit also updates: - go/build/deps_test.go, so that the test now includes the "internal/syscall/windows/registry" import. - os/user/user_test.go, so that User.HomeDir is tested on Windows. GitHub-Last-Rev: 25423e2a3820121f4c42321e7a77a3977f409724 GitHub-Pull-Request: golang/go#23822 Change-Id: I6c3ad1c4ce3e7bc0d1add024951711f615b84ee5 Reviewed-on: https://go-review.googlesource.com/93935 Reviewed-by: Alex Brainman <alex.brainman@gmail.com> Run-TryBot: Alex Brainman <alex.brainman@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
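The two-step lookup described above can be sketched in portable Go. This is not the os/user implementation: homeDir, registryLookup, and errNoProfile are hypothetical names, and the registry read is replaced by a caller-supplied function so the fallback logic can run on any platform.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoProfile is a hypothetical error for a missing ProfileList entry.
var errNoProfile = errors.New("profile key not found")

// homeDir sketches the fallback order. registryLookup is a hypothetical
// stand-in for reading the ProfileList registry key of the given SID.
func homeDir(sid, username, profilesDir string, registryLookup func(sid string) (string, error)) string {
	// (1) Prefer the path recorded in the registry for this SID.
	if dir, err := registryLookup(sid); err == nil && dir != "" {
		return dir
	}
	// (2) Otherwise append the username to the default profiles directory.
	return profilesDir + `\` + username
}

func main() {
	found := func(string) (string, error) { return `D:\Profiles\alice`, nil }
	missing := func(string) (string, error) { return "", errNoProfile }

	fmt.Println(homeDir("S-1-5-21-1", "alice", `C:\Users`, found))
	fmt.Println(homeDir("S-1-5-21-2", "bob", `C:\Users`, missing))
}
```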
-
Daniel Martí authored
With its new -linecomment flag, it is now possible to use stringer on values whose strings aren't valid identifiers. This is the case with tokens and operators in Go. Operator already had inline comments with each operator's string representation; only minor modifications were needed. The inline comments were added to each of the token names, using the same strategy. Comments that were previously inline or part of the string arrays were moved to the line immediately before the name they correspond to. Finally, declare tokStrFast as a function that uses the generated arrays directly. Avoiding the branch and strconv call means that we avoid a performance regression in the scanner, perhaps due to the lack of mid-stack inlining. Performance is not affected. Measured with 'go test -run StdLib -fast' on an X1 Carbon Gen2 (i5-4300U @ 1.90GHz, 8GB RAM, SSD), the best of 5 runs before and after the changes are: parsed 1709399 lines (3763 files) in 1.707402159s (1001169 lines/s) allocated 449.282Mb (263.137Mb/s) parsed 1709329 lines (3765 files) in 1.706663154s (1001562 lines/s) allocated 449.290Mb (263.256Mb/s) Change-Id: Idcc4f83393fcadd6579700e3602c09496ea2625b Reviewed-on: https://go-review.googlesource.com/95357 Reviewed-by: Robert Griesemer <gri@golang.org>
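As an illustration of the line-comment convention, the sketch below declares a hypothetical Op type whose constants carry their string form as trailing comments. In a real package the String method would be produced by running stringer with -type=Op -linecomment; here a hand-written table stands in for the generated code so the example runs on its own.

```go
package main

import "fmt"

// Op is a hypothetical operator type, not the compiler's Operator.
type Op int

// With stringer's -linecomment flag, the trailing comment on each constant
// becomes its string form instead of the identifier name.
const (
	Add Op = iota // +
	Sub           // -
	Shl           // <<
)

// opNames is a hand-written stand-in for the table stringer would generate.
var opNames = [...]string{Add: "+", Sub: "-", Shl: "<<"}

// String returns the line-comment text for known values and a numeric
// fallback for anything out of range.
func (o Op) String() string {
	if o < 0 || int(o) >= len(opNames) {
		return fmt.Sprintf("Op(%d)", int(o))
	}
	return opNames[o]
}

func main() {
	fmt.Println(Add, Sub, Shl) // + - <<
}
```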
-
Ilya Tocar authored
Use MULX/ADOX/ADCX instructions to speed up addMulVVW when they are available. addMulVVW is a hotspot in rsa. This is faster than the ADD/ADC/IMUL version because ADOX/ADCX only modify the carry/overflow flag, so they can be interleaved with each other and with MULX, which doesn't modify flags at all. Increasing the unroll factor to e.g. 16 makes rsa 1% faster, but 3PrimeRSA2048Decrypt performance falls back to baseline. Updates #20058 AddMulVVW/1-8 3.28ns ± 2% 3.26ns ± 3% ~ (p=0.107 n=10+10) AddMulVVW/2-8 4.26ns ± 2% 4.24ns ± 3% ~ (p=0.327 n=9+9) AddMulVVW/3-8 5.07ns ± 2% 5.26ns ± 2% +3.73% (p=0.000 n=10+10) AddMulVVW/4-8 6.40ns ± 2% 6.50ns ± 2% +1.61% (p=0.000 n=10+10) AddMulVVW/5-8 6.77ns ± 2% 6.86ns ± 1% +1.38% (p=0.001 n=9+9) AddMulVVW/10-8 12.2ns ± 2% 10.6ns ± 3% -13.65% (p=0.000 n=10+10) AddMulVVW/100-8 79.7ns ± 2% 52.4ns ± 1% -34.17% (p=0.000 n=10+10) AddMulVVW/1000-8 695ns ± 1% 491ns ± 2% -29.39% (p=0.000 n=9+10) AddMulVVW/10000-8 7.26µs ± 2% 5.92µs ± 6% -18.42% (p=0.000 n=10+10) AddMulVVW/100000-8 72.6µs ± 2% 62.2µs ± 2% -14.31% (p=0.000 n=10+10) crypto/rsa speed-up is smaller, but still noticeable: RSA2048Decrypt-8 1.61ms ± 1% 1.38ms ± 1% -14.13% (p=0.000 n=10+10) RSA2048Sign-8 1.93ms ± 1% 1.70ms ± 1% -11.86% (p=0.000 n=10+10) 3PrimeRSA2048Decrypt-8 932µs ± 0% 828µs ± 0% -11.15% (p=0.000 n=10+10) Results on crypto/tls: HandshakeServer/RSA-8 901µs ± 1% 777µs ± 0% -13.70% (p=0.000 n=10+8) HandshakeServer/ECDHE-P256-RSA-8 1.01ms ± 1% 0.90ms ± 0% -11.53% (p=0.000 n=10+9) Full math/big benchmarks: name old time/op new time/op delta AddVV/1-8 3.74ns ± 6% 3.55ns ± 2% ~ (p=0.082 n=10+8) AddVV/2-8 3.96ns ± 2% 3.98ns ± 5% ~ (p=0.794 n=10+9) AddVV/3-8 4.97ns ± 2% 4.94ns ± 1% ~ (p=0.081 n=10+9) AddVV/4-8 5.59ns ± 2% 5.59ns ± 2% ~ (p=0.809 n=10+10) AddVV/5-8 6.63ns ± 1% 6.62ns ± 1% ~ (p=0.560 n=9+10) AddVV/10-8 8.11ns ± 1% 8.11ns ± 2% ~ (p=0.402 n=10+10) AddVV/100-8 46.9ns ± 2% 46.8ns ± 1% ~ (p=0.809 n=10+10) AddVV/1000-8 389ns ± 1% 391ns ± 4% ~ (p=0.809 n=10+10) AddVV/10000-8
5.05µs ± 5% 4.98µs ± 2% ~ (p=0.113 n=9+10)
AddVV/100000-8 55.3µs ± 3% 55.2µs ± 3% ~ (p=0.796 n=10+10)
AddVW/1-8 3.04ns ± 3% 3.02ns ± 3% ~ (p=0.538 n=10+10)
AddVW/2-8 3.57ns ± 2% 3.61ns ± 2% +1.12% (p=0.032 n=9+9)
AddVW/3-8 3.77ns ± 1% 3.79ns ± 2% ~ (p=0.719 n=10+10)
AddVW/4-8 4.69ns ± 1% 4.69ns ± 2% ~ (p=0.920 n=10+9)
AddVW/5-8 4.58ns ± 1% 4.58ns ± 1% ~ (p=0.812 n=10+10)
AddVW/10-8 7.62ns ± 2% 7.63ns ± 1% ~ (p=0.926 n=10+10)
AddVW/100-8 41.1ns ± 2% 42.4ns ± 3% +3.34% (p=0.000 n=10+10)
AddVW/1000-8 386ns ± 2% 389ns ± 4% ~ (p=0.514 n=10+10)
AddVW/10000-8 3.88µs ± 3% 3.87µs ± 3% ~ (p=0.448 n=10+10)
AddVW/100000-8 41.2µs ± 3% 41.7µs ± 3% ~ (p=0.148 n=10+10)
AddMulVVW/1-8 3.28ns ± 2% 3.26ns ± 3% ~ (p=0.107 n=10+10)
AddMulVVW/2-8 4.26ns ± 2% 4.24ns ± 3% ~ (p=0.327 n=9+9)
AddMulVVW/3-8 5.07ns ± 2% 5.26ns ± 2% +3.73% (p=0.000 n=10+10)
AddMulVVW/4-8 6.40ns ± 2% 6.50ns ± 2% +1.61% (p=0.000 n=10+10)
AddMulVVW/5-8 6.77ns ± 2% 6.86ns ± 1% +1.38% (p=0.001 n=9+9)
AddMulVVW/10-8 12.2ns ± 2% 10.6ns ± 3% -13.65% (p=0.000 n=10+10)
AddMulVVW/100-8 79.7ns ± 2% 52.4ns ± 1% -34.17% (p=0.000 n=10+10)
AddMulVVW/1000-8 695ns ± 1% 491ns ± 2% -29.39% (p=0.000 n=9+10)
AddMulVVW/10000-8 7.26µs ± 2% 5.92µs ± 6% -18.42% (p=0.000 n=10+10)
AddMulVVW/100000-8 72.6µs ± 2% 62.2µs ± 2% -14.31% (p=0.000 n=10+10)
DecimalConversion-8 108µs ±19% 104µs ± 4% ~ (p=0.460 n=10+8)
FloatString/100-8 926ns ±14% 908ns ± 5% ~ (p=0.398 n=9+9)
FloatString/1000-8 25.7µs ± 1% 25.7µs ± 1% ~ (p=0.739 n=10+10)
FloatString/10000-8 2.13ms ± 1% 2.12ms ± 1% ~ (p=0.353 n=10+10)
FloatString/100000-8 207ms ± 1% 206ms ± 2% ~ (p=0.912 n=10+10)
FloatAdd/10-8 61.3ns ± 3% 61.9ns ± 3% ~ (p=0.183 n=10+10)
FloatAdd/100-8 62.0ns ± 2% 62.9ns ± 4% ~ (p=0.118 n=10+10)
FloatAdd/1000-8 84.7ns ± 2% 84.4ns ± 1% ~ (p=0.591 n=10+10)
FloatAdd/10000-8 305ns ± 2% 306ns ± 1% ~ (p=0.443 n=10+10)
FloatAdd/100000-8 2.45µs ± 1% 2.46µs ± 1% ~ (p=0.782 n=10+10)
FloatSub/10-8 56.8ns ± 4% 56.5ns ± 5% ~ (p=0.423 n=10+10)
FloatSub/100-8 57.3ns ± 4% 57.1ns ± 5% ~ (p=0.540 n=10+10)
FloatSub/1000-8 66.8ns ± 4% 66.6ns ± 1% ~ (p=0.868 n=10+10)
FloatSub/10000-8 199ns ± 1% 198ns ± 1% ~ (p=0.287 n=10+9)
FloatSub/100000-8 1.47µs ± 2% 1.47µs ± 2% ~ (p=0.920 n=10+9)
ParseFloatSmallExp-8 8.74µs ±10% 9.48µs ±10% +8.51% (p=0.010 n=9+10)
ParseFloatLargeExp-8 39.2µs ±25% 39.6µs ±12% ~ (p=0.529 n=10+10)
GCD10x10/WithoutXY-8 173ns ±23% 177ns ±20% ~ (p=0.698 n=10+10)
GCD10x10/WithXY-8 736ns ±12% 728ns ±16% ~ (p=0.838 n=10+10)
GCD10x100/WithoutXY-8 325ns ±16% 326ns ±14% ~ (p=0.912 n=10+10)
GCD10x100/WithXY-8 1.14µs ±13% 1.16µs ± 6% ~ (p=0.287 n=10+9)
GCD10x1000/WithoutXY-8 851ns ±25% 820ns ±12% ~ (p=0.592 n=10+10)
GCD10x1000/WithXY-8 2.89µs ±17% 2.85µs ± 5% ~ (p=1.000 n=10+9)
GCD10x10000/WithoutXY-8 6.66µs ±12% 6.82µs ±19% ~ (p=0.529 n=10+10)
GCD10x10000/WithXY-8 18.0µs ± 5% 17.2µs ±19% ~ (p=0.315 n=7+10)
GCD10x100000/WithoutXY-8 77.8µs ±18% 73.3µs ±11% ~ (p=0.315 n=10+9)
GCD10x100000/WithXY-8 186µs ±14% 204µs ±29% ~ (p=0.218 n=10+10)
GCD100x100/WithoutXY-8 1.09µs ± 1% 1.09µs ± 2% ~ (p=0.117 n=9+10)
GCD100x100/WithXY-8 7.93µs ± 1% 7.97µs ± 1% +0.52% (p=0.006 n=10+10)
GCD100x1000/WithoutXY-8 2.00µs ± 3% 2.04µs ± 6% ~ (p=0.053 n=9+10)
GCD100x1000/WithXY-8 9.23µs ± 1% 9.29µs ± 1% +0.63% (p=0.009 n=10+10)
GCD100x10000/WithoutXY-8 10.2µs ±11% 9.7µs ± 6% ~ (p=0.278 n=10+9)
GCD100x10000/WithXY-8 33.3µs ± 4% 33.6µs ± 4% ~ (p=0.481 n=10+10)
GCD100x100000/WithoutXY-8 106µs ±17% 105µs ±13% ~ (p=0.853 n=10+10)
GCD100x100000/WithXY-8 289µs ±17% 276µs ± 8% ~ (p=0.353 n=10+10)
GCD1000x1000/WithoutXY-8 12.2µs ± 1% 12.1µs ± 1% -0.45% (p=0.007 n=10+10)
GCD1000x1000/WithXY-8 131µs ± 1% 132µs ± 0% +0.93% (p=0.000 n=9+7)
GCD1000x10000/WithoutXY-8 20.6µs ± 2% 20.6µs ± 1% ~ (p=0.326 n=10+9)
GCD1000x10000/WithXY-8 238µs ± 1% 237µs ± 1% ~ (p=0.356 n=9+10)
GCD1000x100000/WithoutXY-8 117µs ± 8% 114µs ±11% ~ (p=0.190 n=10+10)
GCD1000x100000/WithXY-8 1.51ms ± 1% 1.50ms ± 1% ~ (p=0.053 n=9+10)
GCD10000x10000/WithoutXY-8 220µs ± 1% 218µs ± 1% -0.86% (p=0.000 n=10+10)
GCD10000x10000/WithXY-8 3.04ms ± 0% 3.05ms ± 0% +0.33% (p=0.001 n=9+10)
GCD10000x100000/WithoutXY-8 513µs ± 0% 511µs ± 0% -0.38% (p=0.000 n=10+10)
GCD10000x100000/WithXY-8 15.1ms ± 0% 15.0ms ± 0% ~ (p=0.053 n=10+9)
GCD100000x100000/WithoutXY-8 10.4ms ± 1% 10.4ms ± 2% ~ (p=0.258 n=9+9)
GCD100000x100000/WithXY-8 205ms ± 1% 205ms ± 1% ~ (p=0.481 n=10+10)
Hilbert-8 1.25ms ±15% 1.24ms ±17% ~ (p=0.853 n=10+10)
Binomial-8 3.03µs ±24% 2.90µs ±16% ~ (p=0.481 n=10+10)
QuoRem-8 1.95µs ± 1% 1.95µs ± 2% ~ (p=0.117 n=9+10)
Exp-8 5.12ms ± 2% 3.99ms ± 1% -22.02% (p=0.000 n=10+9)
Exp2-8 5.14ms ± 2% 3.98ms ± 0% -22.55% (p=0.000 n=10+9)
Bitset-8 16.4ns ± 2% 16.5ns ± 2% ~ (p=0.311 n=9+10)
BitsetNeg-8 46.3ns ± 4% 45.8ns ± 4% ~ (p=0.272 n=10+10)
BitsetOrig-8 250ns ±19% 247ns ±14% ~ (p=0.671 n=10+10)
BitsetNegOrig-8 416ns ±14% 429ns ±14% ~ (p=0.353 n=10+10)
ModSqrt225_Tonelli-8 400µs ± 0% 320µs ± 0% -19.88% (p=0.000 n=9+7)
ModSqrt224_3Mod4-8 123µs ± 1% 97µs ± 0% -21.21% (p=0.000 n=9+10)
ModSqrt5430_Tonelli-8 1.87s ± 0% 1.39s ± 1% -25.70% (p=0.000 n=9+10)
ModSqrt5430_3Mod4-8 630ms ± 2% 465ms ± 1% -26.12% (p=0.000 n=10+10)
Sqrt-8 25.8µs ± 1% 25.9µs ± 0% +0.66% (p=0.002 n=10+8)
IntSqr/1-8 11.3ns ± 1% 11.3ns ± 2% ~ (p=0.360 n=9+10)
IntSqr/2-8 26.6ns ± 1% 27.4ns ± 2% +2.87% (p=0.000 n=8+9)
IntSqr/3-8 36.5ns ± 6% 36.6ns ± 5% ~ (p=0.589 n=10+10)
IntSqr/5-8 57.2ns ± 2% 57.8ns ± 1% +0.92% (p=0.045 n=10+9)
IntSqr/8-8 112ns ± 1% 93ns ± 1% -16.60% (p=0.000 n=10+10)
IntSqr/10-8 148ns ± 1% 129ns ± 5% -12.85% (p=0.000 n=10+10)
IntSqr/20-8 642ns ±28% 692ns ±21% ~ (p=0.105 n=10+10)
IntSqr/30-8 1.03µs ±18% 1.06µs ±15% ~ (p=0.422 n=10+8)
IntSqr/50-8 2.33µs ±14% 2.14µs ±20% ~ (p=0.063 n=10+10)
IntSqr/80-8 4.06µs ±13% 3.72µs ±14% -8.31% (p=0.029 n=10+10)
IntSqr/100-8 5.79µs ±10% 5.20µs ±18% -10.15% (p=0.004 n=10+10)
IntSqr/200-8 17.1µs ± 1% 12.9µs ± 3% -24.44% (p=0.000 n=10+10)
IntSqr/300-8 35.9µs ± 0% 26.6µs ± 1% -25.75% (p=0.000 n=10+10)
IntSqr/500-8 84.9µs ± 0% 71.7µs ± 1% -15.49% (p=0.000 n=10+10)
IntSqr/800-8 170µs ± 1% 142µs ± 2% -16.73% (p=0.000 n=10+10)
IntSqr/1000-8 258µs ± 1% 218µs ± 1% -15.65% (p=0.000 n=10+10)
Mul-8 10.4ms ± 1% 8.3ms ± 0% -20.05% (p=0.000 n=10+9)
Exp3Power/0x10-8 311ns ±15% 321ns ±24% ~ (p=0.447 n=10+10)
Exp3Power/0x40-8 358ns ±21% 346ns ±37% ~ (p=0.591 n=10+10)
Exp3Power/0x100-8 611ns ±19% 570ns ±27% ~ (p=0.393 n=10+10)
Exp3Power/0x400-8 1.31µs ±26% 1.34µs ±19% ~ (p=0.853 n=10+10)
Exp3Power/0x1000-8 6.76µs ±23% 6.22µs ±16% ~ (p=0.095 n=10+9)
Exp3Power/0x4000-8 37.6µs ±14% 36.4µs ±21% ~ (p=0.247 n=10+10)
Exp3Power/0x10000-8 345µs ±14% 310µs ±11% -9.99% (p=0.005 n=10+10)
Exp3Power/0x40000-8 2.77ms ± 1% 2.34ms ± 1% -15.47% (p=0.000 n=10+10)
Exp3Power/0x100000-8 25.1ms ± 1% 21.3ms ± 1% -15.26% (p=0.000 n=10+10)
Exp3Power/0x400000-8 225ms ± 1% 190ms ± 1% -15.61% (p=0.000 n=10+10)
Fibo-8 23.4ms ± 1% 23.3ms ± 0% ~ (p=0.052 n=10+10)
NatSqr/1-8 58.4ns ±24% 59.8ns ±38% ~ (p=0.739 n=10+10)
NatSqr/2-8 122ns ±21% 122ns ±16% ~ (p=0.896 n=10+10)
NatSqr/3-8 140ns ±28% 148ns ±30% ~ (p=0.288 n=10+10)
NatSqr/5-8 193ns ±29% 210ns ±34% ~ (p=0.469 n=10+10)
NatSqr/8-8 317ns ±21% 296ns ±25% ~ (p=0.393 n=10+10)
NatSqr/10-8 362ns ± 8% 373ns ±30% ~ (p=0.617 n=9+10)
NatSqr/20-8 1.24µs ±16% 1.06µs ±29% -14.57% (p=0.019 n=10+10)
NatSqr/30-8 1.90µs ±32% 1.71µs ±10% ~ (p=0.176 n=10+9)
NatSqr/50-8 4.22µs ±19% 3.67µs ± 7% -13.03% (p=0.017 n=10+9)
NatSqr/80-8 7.33µs ±20% 6.50µs ±15% -11.26% (p=0.009 n=10+10)
NatSqr/100-8 9.84µs ±18% 9.33µs ± 8% ~ (p=0.280 n=10+10)
NatSqr/200-8 21.4µs ± 7% 20.0µs ±14% ~ (p=0.075 n=10+10)
NatSqr/300-8 38.0µs ± 2% 31.3µs ±10% -17.63% (p=0.000 n=10+10)
NatSqr/500-8 102µs ± 5% 101µs ± 4% ~ (p=0.780 n=9+10)
NatSqr/800-8 190µs ± 3% 166µs ± 6% -12.29% (p=0.000 n=10+10)
NatSqr/1000-8 277µs ± 2% 245µs ± 6% -11.64% (p=0.000 n=10+10)
ScanPi-8 144µs ±23% 149µs ±24% ~ (p=0.579 n=10+10)
StringPiParallel-8 25.6µs ± 0% 25.8µs ± 0% +0.69% (p=0.000 n=9+10)
Scan/10/Base2-8 305ns ± 1% 309ns ± 1% +1.32% (p=0.000 n=10+9)
Scan/100/Base2-8 1.95µs ± 1% 1.98µs ± 1% +1.10% (p=0.000 n=10+10)
Scan/1000/Base2-8 19.5µs ± 1% 19.7µs ± 1% +1.39% (p=0.000 n=10+10)
Scan/10000/Base2-8 270µs ± 1% 272µs ± 1% +0.58% (p=0.024 n=9+9)
Scan/100000/Base2-8 10.3ms ± 0% 10.3ms ± 0% +0.16% (p=0.022 n=9+10)
Scan/10/Base8-8 146ns ± 4% 154ns ± 4% +5.57% (p=0.000 n=9+9)
Scan/100/Base8-8 748ns ± 1% 759ns ± 1% +1.51% (p=0.000 n=9+10)
Scan/1000/Base8-8 7.88µs ± 1% 8.00µs ± 1% +1.64% (p=0.000 n=10+10)
Scan/10000/Base8-8 155µs ± 1% 155µs ± 1% ~ (p=0.968 n=10+9)
Scan/100000/Base8-8 9.11ms ± 0% 9.11ms ± 0% ~ (p=0.604 n=9+10)
Scan/10/Base10-8 140ns ± 5% 149ns ± 5% +6.39% (p=0.000 n=9+10)
Scan/100/Base10-8 680ns ± 0% 688ns ± 1% +1.08% (p=0.000 n=9+10)
Scan/1000/Base10-8 7.09µs ± 1% 7.16µs ± 1% +0.98% (p=0.019 n=10+10)
Scan/10000/Base10-8 149µs ± 3% 150µs ± 3% ~ (p=0.143 n=10+10)
Scan/100000/Base10-8 9.16ms ± 0% 9.16ms ± 0% ~ (p=0.661 n=10+9)
Scan/10/Base16-8 134ns ± 5% 135ns ± 3% ~ (p=0.505 n=9+9)
Scan/100/Base16-8 560ns ± 1% 563ns ± 0% +0.67% (p=0.000 n=10+8)
Scan/1000/Base16-8 6.28µs ± 1% 6.26µs ± 1% ~ (p=0.448 n=10+10)
Scan/10000/Base16-8 161µs ± 1% 162µs ± 1% +0.74% (p=0.008 n=9+9)
Scan/100000/Base16-8 9.64ms ± 0% 9.64ms ± 0% ~ (p=0.436 n=10+10)
String/10/Base2-8 116ns ±12% 118ns ±13% ~ (p=0.645 n=10+10)
String/100/Base2-8 871ns ±23% 860ns ±22% ~ (p=0.699 n=10+10)
String/1000/Base2-8 10.0µs ±20% 10.0µs ±23% ~ (p=0.853 n=10+10)
String/10000/Base2-8 110µs ±21% 120µs ±25% ~ (p=0.436 n=10+10)
String/100000/Base2-8 768µs ±11% 733µs ±16% ~ (p=0.393 n=10+10)
String/10/Base8-8 51.3ns ± 1% 51.0ns ± 3% ~ (p=0.286 n=9+9)
String/100/Base8-8 284ns ± 9% 272ns ±12% ~ (p=0.267 n=9+10)
String/1000/Base8-8 3.06µs ± 9% 3.04µs ±10% ~ (p=0.739 n=10+10)
String/10000/Base8-8 36.1µs ±14% 35.1µs ± 9% ~ (p=0.447 n=10+9)
String/100000/Base8-8 371µs ±12% 373µs ±16% ~ (p=0.739 n=10+10)
String/10/Base10-8 167ns ±11% 165ns ± 9% ~ (p=0.781 n=10+10)
String/100/Base10-8 727ns ± 1% 740ns ± 2% +1.70% (p=0.001 n=10+10)
String/1000/Base10-8 5.30µs ±18% 5.37µs ±14% ~ (p=0.631 n=10+10)
String/10000/Base10-8 45.0µs ±14% 44.6µs ±10% ~ (p=0.720 n=9+10)
String/100000/Base10-8 5.10ms ± 1% 5.05ms ± 3% ~ (p=0.211 n=9+10)
String/10/Base16-8 47.7ns ± 6% 47.7ns ± 6% ~ (p=0.985 n=10+10)
String/100/Base16-8 221ns ±10% 234ns ±27% ~ (p=0.541 n=10+10)
String/1000/Base16-8 2.23µs ±11% 2.12µs ± 8% -4.81% (p=0.029 n=9+8)
String/10000/Base16-8 28.3µs ±21% 28.5µs ±14% ~ (p=0.796 n=10+10)
String/100000/Base16-8 291µs ±16% 293µs ±15% ~ (p=0.931 n=9+9)
LeafSize/0-8 2.43ms ± 1% 2.49ms ± 1% +2.56% (p=0.000 n=10+10)
LeafSize/1-8 49.7µs ± 9% 46.3µs ±16% -6.78% (p=0.017 n=10+9)
LeafSize/2-8 48.4µs ±18% 46.3µs ±19% ~ (p=0.436 n=10+10)
LeafSize/3-8 81.7µs ± 3% 80.9µs ± 3% ~ (p=0.278 n=10+9)
LeafSize/4-8 47.0µs ± 7% 47.9µs ±13% ~ (p=0.905 n=9+10)
LeafSize/5-8 96.8µs ± 1% 97.3µs ± 2% ~ (p=0.515 n=8+10)
LeafSize/6-8 82.5µs ± 4% 80.9µs ± 2% -1.92% (p=0.019 n=10+10)
LeafSize/7-8 67.2µs ±13% 66.6µs ± 9% ~ (p=0.842 n=10+9)
LeafSize/8-8 46.0µs ±28% 45.1µs ±12% ~ (p=0.739 n=10+10)
LeafSize/9-8 111µs ± 1% 111µs ± 1% ~ (p=0.739 n=10+10)
LeafSize/10-8 98.8µs ± 4% 97.9µs ± 3% ~ (p=0.278 n=10+9)
LeafSize/11-8 96.8µs ± 1% 96.4µs ± 1% ~ (p=0.211 n=9+10)
LeafSize/12-8 81.0µs ± 4% 81.3µs ± 3% ~ (p=0.579 n=10+10)
LeafSize/13-8 79.7µs ± 5% 79.2µs ± 3% ~ (p=0.661 n=10+9)
LeafSize/14-8 67.6µs ±12% 65.8µs ± 7% ~ (p=0.447 n=10+9)
LeafSize/15-8 63.9µs ±17% 66.3µs ±14% ~ (p=0.481 n=10+10)
LeafSize/16-8 44.0µs ±28% 46.0µs ±27% ~ (p=0.481 n=10+10)
LeafSize/32-8 46.2µs ±13% 43.5µs ±18% ~ (p=0.156 n=9+10)
LeafSize/64-8 53.3µs ±10% 53.0µs ±19% ~ (p=0.730 n=9+9)
ProbablyPrime/n=0-8 3.60ms ± 1% 3.39ms ± 1% -5.87% (p=0.000 n=10+9)
ProbablyPrime/n=1-8 4.42ms ± 1% 4.08ms ± 1% -7.69% (p=0.000 n=10+10)
ProbablyPrime/n=5-8 7.57ms ± 2% 6.79ms ± 1% -10.24% (p=0.000 n=10+10)
ProbablyPrime/n=10-8 11.6ms ± 2% 10.2ms ± 1% -11.69% (p=0.000 n=10+10)
ProbablyPrime/n=20-8 19.4ms ± 2% 16.9ms ± 2% -12.89% (p=0.000 n=10+10)
ProbablyPrime/Lucas-8 2.81ms ± 2% 2.72ms ± 1% -3.22% (p=0.000 n=10+9)
ProbablyPrime/MillerRabinBase2-8 797µs ± 1% 680µs ± 1% -14.64% (p=0.000 n=10+10)

name old speed new speed delta
AddVV/1-8 17.1GB/s ± 6% 18.0GB/s ± 2% ~ (p=0.122 n=10+8)
AddVV/2-8 32.4GB/s ± 2% 32.2GB/s ± 4% ~ (p=0.661 n=10+9)
AddVV/3-8 38.6GB/s ± 2% 38.9GB/s ± 1% ~ (p=0.113 n=10+9)
AddVV/4-8 45.8GB/s ± 2% 45.8GB/s ± 2% ~ (p=0.796 n=10+10)
AddVV/5-8 48.1GB/s ± 2% 48.3GB/s ± 1% ~ (p=0.315 n=10+10)
AddVV/10-8 78.9GB/s ± 1% 78.9GB/s ± 2% ~ (p=0.353 n=10+10)
AddVV/100-8 136GB/s ± 2% 137GB/s ± 1% ~ (p=0.971 n=10+10)
AddVV/1000-8 164GB/s ± 1% 164GB/s ± 4% ~ (p=0.853 n=10+10)
AddVV/10000-8 126GB/s ± 6% 129GB/s ± 2% ~ (p=0.063 n=10+10)
AddVV/100000-8 116GB/s ± 3% 116GB/s ± 3% ~ (p=0.796 n=10+10)
AddVW/1-8 2.64GB/s ± 3% 2.64GB/s ± 3% ~ (p=0.579 n=10+10)
AddVW/2-8 4.49GB/s ± 2% 4.44GB/s ± 2% -1.09% (p=0.040 n=9+9)
AddVW/3-8 6.36GB/s ± 1% 6.34GB/s ± 2% ~ (p=0.684 n=10+10)
AddVW/4-8 6.83GB/s ± 1% 6.82GB/s ± 2% ~ (p=0.905 n=10+9)
AddVW/5-8 8.75GB/s ± 1% 8.73GB/s ± 1% ~ (p=0.796 n=10+10)
AddVW/10-8 10.5GB/s ± 2% 10.5GB/s ± 1% ~ (p=0.971 n=10+10)
AddVW/100-8 19.5GB/s ± 2% 18.9GB/s ± 2% -3.22% (p=0.000 n=10+10)
AddVW/1000-8 20.7GB/s ± 2% 20.6GB/s ± 4% ~ (p=0.631 n=10+10)
AddVW/10000-8 20.6GB/s ± 3% 20.7GB/s ± 3% ~ (p=0.481 n=10+10)
AddVW/100000-8 19.4GB/s ± 2% 19.2GB/s ± 3% ~ (p=0.165 n=10+10)
AddMulVVW/1-8 19.5GB/s ± 2% 19.7GB/s ± 3% ~ (p=0.123 n=10+10)
AddMulVVW/2-8 30.1GB/s ± 2% 30.2GB/s ± 3% ~ (p=0.297 n=9+9)
AddMulVVW/3-8 37.9GB/s ± 2% 36.5GB/s ± 2% -3.63% (p=0.000 n=10+10)
AddMulVVW/4-8 40.0GB/s ± 2% 39.4GB/s ± 2% -1.58% (p=0.001 n=10+10)
AddMulVVW/5-8 47.3GB/s ± 2% 46.6GB/s ± 1% -1.35% (p=0.001 n=9+9)
AddMulVVW/10-8 52.3GB/s ± 2% 60.6GB/s ± 3% +15.76% (p=0.000 n=10+10)
AddMulVVW/100-8 80.3GB/s ± 2% 122.1GB/s ± 1% +51.92% (p=0.000 n=10+10)
AddMulVVW/1000-8 92.0GB/s ± 1% 130.3GB/s ± 2% +41.61% (p=0.000 n=9+10)
AddMulVVW/10000-8 88.2GB/s ± 2% 108.2GB/s ± 5% +22.66% (p=0.000 n=10+10)
AddMulVVW/100000-8 88.2GB/s ± 2% 102.9GB/s ± 2% +16.69% (p=0.000 n=10+10)

Change-Id: Ic98e30c91d437d845fed03e07e976c3fdbf02b36 Reviewed-on: https://go-review.googlesource.com/74851 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Adam Langley <agl@golang.org>
-
- 23 Feb, 2018 6 commits
-
-
Joe Tsai authored
The Info-ZIP Unix1 extra field is specified as such:

>>>
Value    Size   Description
-----    ----   -----------
0x5855   Short  tag for this extra block type ("UX")
TSize    Short  total data size for this block
AcTime   Long   time of last access (GMT/UTC)
ModTime  Long   time of last modification (GMT/UTC)
<<<

The previous handling was incorrect in that it read the AcTime field instead of the ModTime field.

The test-osx.zip test unfortunately locked in the wrong behavior. Manually parsing that ZIP file shows that the encoded MS-DOS date and time are 0x4b5f and 0xa97d, which correspond to 2017-10-31 21:11:58. This matches the correct mod time (off by 1 second due to MS-DOS timestamp resolution).

Fixes #23901

Change-Id: I567824c66e8316b9acd103dbecde366874a4b7ef Reviewed-on: https://go-review.googlesource.com/96895 Run-TryBot: Joe Tsai <joetsai@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Ian Lance Taylor authored
They have either already been called by preprintpanics, or they can not be called safely because of the various conditions checked at the start of gopanic. Fixes #24059 Change-Id: I4a6233d12c9f7aaaee72f343257ea108bae79241 Reviewed-on: https://go-review.googlesource.com/96755 Reviewed-by: Austin Clements <austin@google.com>
-
Yuval Pavel Zholkover authored
Instead of calling Chmod directly on perm, stat the created file/dir to extract the actual permission bits, which can differ from perm because of the umask. Fixes #23120. Change-Id: I3e70032451fc254bf48ce9627e98988f84af8d91 Reviewed-on: https://go-review.googlesource.com/84477 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Austin Clements authored
Currently, we use 64MB heap arenas on 64-bit platforms. This works well on UNIX-like OSes because they treat untouched pages as essentially free. However, on Windows, committed memory is charged against a process whether or not it has demand-faulted physical pages in. Hence, on Windows, even a process with a tiny heap will commit 64MB for one heap arena, plus another 32MB for the arena map. Things are much worse under the race detector, which increases the heap commitment by a factor of 5.5X, leading to 384MB of committed memory at runtime init.

Fix this by reducing the heap arena size to 4MB on Windows.

To counterbalance the effect of increasing the arena map size by a factor of 16, and to further reduce the impact of the commitment for the arena map, we switch from a single entry L1 arena map to a 64 entry L1 arena map.

Compared to the original arena design, this slows down the x/benchmarks garbage benchmark by 0.49% (the slow down of this commit alone is 1.59%, but the previous commit bought us a 1% speed-up):

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.28ms ± 1%  2.29ms ± 1%  +0.49%  (p=0.000 n=17+18)

(https://perf.golang.org/search?q=upload:20180223.1)

(This was measured on linux/amd64 by modifying its arena configuration as above.)

Fixes #23900.

Change-Id: I6b7fa5ecebee2947bf20cfeb78c248809469c6b1 Reviewed-on: https://go-review.googlesource.com/96780 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>
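The sizes quoted above can be reproduced with a little arithmetic. The sketch below is not runtime code: `arenaMapBytes` is a hypothetical helper, and it assumes a 48-bit address space and one 8-byte pointer per arena map entry, which is the combination that yields the 32MB figure in the message.

```go
package main

import "fmt"

// entryBytes is the assumed size of one arena map entry (one pointer on a
// 64-bit platform).
const entryBytes = 8

// arenaMapBytes returns the size of one L2 arena map array, given the number
// of address bits covered, log2 of the arena size, and the number of address
// bits consumed by the L1 index.
func arenaMapBytes(addrBits, logArenaBytes, l1Bits uint) uint64 {
	l2Bits := addrBits - logArenaBytes - l1Bits
	return uint64(entryBytes) << l2Bits
}

func main() {
	const mb = 1 << 20
	// 64MB arenas (2^26 bytes), single-level map.
	fmt.Println(arenaMapBytes(48, 26, 0) / mb) // 32
	// 4MB arenas (2^22 bytes), still single-level: 16 times larger.
	fmt.Println(arenaMapBytes(48, 22, 0) / mb) // 512
	// 4MB arenas with a 64-entry (6-bit) L1: size of each L2 array.
	fmt.Println(arenaMapBytes(48, 22, 6) / mb) // 8
}
```

Under these assumptions, shrinking the arena by 16x grows a single-level map by 16x, and splitting off 6 bits into an L1 index divides each L2 array by 64.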
-
Austin Clements authored
Currently, the heap arena map is a single, large array that covers every possible arena frame in the entire address space. This is practical up to about 48 bits of address space with 64 MB arenas.

However, there are two problems with this:

1. mips64, ppc64, and s390x support full 64-bit address spaces (though on Linux only s390x has kernel support for 64-bit address spaces). On these platforms, it would be good to support these larger address spaces.

2. On Windows, processes are charged for untouched memory, so for processes with small heaps, the mostly-untouched 32 MB arena map plus a 64 MB arena are significant overhead. Hence, it would be good to reduce both the arena map size and the arena size, but with a single-level arena, these are inversely proportional.

This CL adds support for a two-level arena map. Arena frame numbers are now divided into arenaL1Bits of L1 index and arenaL2Bits of L2 index.

At the moment, arenaL1Bits is always 0, so we effectively have a single level map. We do a few things so that this has no cost beyond the current single-level map:

1. We embed the L2 array directly in mheap, so if there's a single entry in the L2 array, the representation is identical to the current representation and there's no extra level of indirection.

2. Hot code that accesses the arena map is structured so that it optimizes to nearly the same machine code as it does currently.

3. We make some small tweaks to hot code paths and to the inliner itself to keep some important functions inlined despite their now-larger ASTs. In particular, this is necessary for heapBitsForAddr and heapBits.next.

Possibly as a result of some of the tweaks, this actually slightly improves the performance of the x/benchmarks garbage benchmark:

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.28ms ± 1%  2.26ms ± 1%  -1.07%  (p=0.000 n=17+19)

(https://perf.golang.org/search?q=upload:20180223.2)

For #23900.
Change-Id: If5164e0961754f97eb9eca58f837f36d759505ff Reviewed-on: https://go-review.googlesource.com/96779 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>
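Splitting an arena frame number into an L1 and an L2 index can be sketched as below. The type `arenaLayout` and its methods are hypothetical names for illustration, not the runtime's definitions; only the idea of dividing the frame number into arenaL1Bits and arenaL2Bits comes from the message above.

```go
package main

import "fmt"

// arenaLayout describes how an address maps to arena map indices.
type arenaLayout struct {
	logArenaBytes uint // log2 of the arena size
	l1Bits        uint // bits of frame number used for the L1 index
	l2Bits        uint // bits of frame number used for the L2 index
}

// frame returns the arena frame number that contains addr.
func (a arenaLayout) frame(addr uint64) uint64 {
	return addr >> a.logArenaBytes
}

// l1 returns the L1 index. With zero L1 bits the map is effectively
// single-level and the only L1 slot is 0.
func (a arenaLayout) l1(frame uint64) uint64 {
	if a.l1Bits == 0 {
		return 0
	}
	return frame >> a.l2Bits
}

// l2 returns the index within the selected L2 array.
func (a arenaLayout) l2(frame uint64) uint64 {
	if a.l1Bits == 0 {
		return frame
	}
	return frame & (uint64(1)<<a.l2Bits - 1)
}

func main() {
	// Single-level: 64MB arenas, 22 bits of L2 index.
	single := arenaLayout{logArenaBytes: 26, l1Bits: 0, l2Bits: 22}
	f := single.frame(3<<26 + 5)
	fmt.Println(single.l1(f), single.l2(f)) // 0 3

	// Two-level: 4MB arenas, 6 bits of L1 and 20 bits of L2.
	two := arenaLayout{logArenaBytes: 22, l1Bits: 6, l2Bits: 20}
	g := two.frame((uint64(5)<<20 | 7) << 22)
	fmt.Println(two.l1(g), two.l2(g)) // 5 7
}
```

When the L1 bit count is a compile-time constant of 0, the branches above fold away, which is in the spirit of the "no cost beyond the current single-level map" goal.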
-
Austin Clements authored
The front-end dead code elimination is very simple. Currently, it just looks for if statements with constant boolean conditions. Its main purpose is to reduce load on the compiler and shrink code before inlining computes hairiness.

This CL teaches front-end dead code elimination about short-circuiting boolean expressions && and ||, since they're essentially the same as if statements.

This also teaches the inliner that the constant 'if' form left behind by deadcode is free.

These changes will help with runtime modifications in the next CL that would otherwise inhibit inlining in some hot code paths. Currently, however, they have no significant impact on benchmarks.

Change-Id: I886203b3c4acdbfef08148fddd7f3a7af5afc7c1 Reviewed-on: https://go-review.googlesource.com/96778 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
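The idea of treating && and || like if statements can be shown on a toy expression tree. This is a sketch under stated assumptions: the types `Const`, `Var`, `And`, `Or` and the function `simplify` are hypothetical and do not reflect the compiler's actual node representation or its deadcode pass.

```go
package main

import "fmt"

// Expr is a toy boolean expression; the concrete types below are illustrative.
type Expr interface{}

type Const bool
type Var string
type And struct{ L, R Expr }
type Or struct{ L, R Expr }

// simplify removes operands that short-circuit evaluation can never reach.
// Only a constant LEFT operand is folded: a constant right operand cannot be
// used to drop the left side, because the left side is still evaluated.
func simplify(e Expr) Expr {
	switch e := e.(type) {
	case And:
		l := simplify(e.L)
		if c, ok := l.(Const); ok {
			if !c {
				return Const(false) // false && x: x is dead
			}
			return simplify(e.R) // true && x is just x
		}
		return And{l, simplify(e.R)}
	case Or:
		l := simplify(e.L)
		if c, ok := l.(Const); ok {
			if c {
				return Const(true) // true || x: x is dead
			}
			return simplify(e.R) // false || x is just x
		}
		return Or{l, simplify(e.R)}
	}
	return e
}

func main() {
	fmt.Println(simplify(And{Const(false), Var("x")})) // false
	fmt.Println(simplify(Or{Const(false), Var("x")}))  // x
	fmt.Println(simplify(And{Var("x"), Const(false)})) // {x false}
}
```

A typical source pattern that benefits is a guard such as `if debug && expensiveCheck()` where `debug` is a false constant.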
-