1. 28 Aug, 2018 2 commits
    • cmd/compile: optimize arm64 with indexed FP load/store · 3ca3e89b
      Ben Shi authored
      The FP load/store instructions on arm64 have register-indexed forms,
      and this CL makes the compiler use them.
      
      1. The total size of pkg/android_arm64 (excluding cmd/compile)
      decreases about 400 bytes.
      
      2. There is no regression in the go1 benchmark; the GobEncode test
      case even improves slightly, excluding noise.
      
      name                     old time/op    new time/op    delta
      BinaryTree17-4              19.0s ± 0%     19.0s ± 1%    ~     (p=0.817 n=29+29)
      Fannkuch11-4                9.94s ± 0%     9.95s ± 0%  +0.03%  (p=0.010 n=24+30)
      FmtFprintfEmpty-4           233ns ± 0%     233ns ± 0%    ~     (all equal)
      FmtFprintfString-4          427ns ± 0%     427ns ± 0%    ~     (p=0.649 n=30+30)
      FmtFprintfInt-4             471ns ± 0%     471ns ± 0%    ~     (all equal)
      FmtFprintfIntInt-4          730ns ± 0%     730ns ± 0%    ~     (all equal)
      FmtFprintfPrefixedInt-4     889ns ± 0%     889ns ± 0%    ~     (all equal)
      FmtFprintfFloat-4          1.21µs ± 0%    1.21µs ± 0%  +0.04%  (p=0.012 n=20+30)
      FmtManyArgs-4              2.99µs ± 0%    2.99µs ± 0%    ~     (p=0.651 n=29+29)
      GobDecode-4                42.4ms ± 1%    42.3ms ± 1%  -0.27%  (p=0.001 n=29+28)
      GobEncode-4                37.8ms ±11%    36.0ms ± 0%  -4.67%  (p=0.000 n=30+26)
      Gzip-4                      1.98s ± 1%     1.96s ± 1%  -1.26%  (p=0.000 n=30+30)
      Gunzip-4                    175ms ± 0%     175ms ± 0%    ~     (p=0.988 n=29+29)
      HTTPClientServer-4          854µs ± 5%     860µs ± 5%    ~     (p=0.236 n=28+29)
      JSONEncode-4               88.8ms ± 0%    87.9ms ± 0%  -1.00%  (p=0.000 n=24+26)
      JSONDecode-4                390ms ± 1%     392ms ± 2%  +0.48%  (p=0.025 n=30+30)
      Mandelbrot200-4            19.5ms ± 0%    19.5ms ± 0%    ~     (p=0.894 n=24+29)
      GoParse-4                  20.3ms ± 0%    20.1ms ± 1%  -0.94%  (p=0.000 n=27+26)
      RegexpMatchEasy0_32-4       451ns ± 0%     451ns ± 0%    ~     (p=0.578 n=30+30)
      RegexpMatchEasy0_1K-4      1.63µs ± 0%    1.63µs ± 0%    ~     (p=0.298 n=30+28)
      RegexpMatchEasy1_32-4       431ns ± 0%     434ns ± 0%  +0.67%  (p=0.000 n=30+29)
      RegexpMatchEasy1_1K-4      2.60µs ± 0%    2.64µs ± 0%  +1.36%  (p=0.000 n=28+26)
      RegexpMatchMedium_32-4      744ns ± 0%     744ns ± 0%    ~     (p=0.474 n=29+29)
      RegexpMatchMedium_1K-4      223µs ± 0%     223µs ± 0%  -0.08%  (p=0.038 n=26+30)
      RegexpMatchHard_32-4       12.2µs ± 0%    12.3µs ± 0%  +0.27%  (p=0.000 n=29+30)
      RegexpMatchHard_1K-4        373µs ± 0%     373µs ± 0%    ~     (p=0.219 n=29+28)
      Revcomp-4                   2.84s ± 0%     2.84s ± 0%    ~     (p=0.130 n=28+28)
      Template-4                  394ms ± 1%     392ms ± 1%  -0.52%  (p=0.001 n=30+30)
      TimeParse-4                1.93µs ± 0%    1.93µs ± 0%    ~     (p=0.587 n=29+30)
      TimeFormat-4               2.00µs ± 0%    2.00µs ± 0%  +0.07%  (p=0.001 n=28+27)
      [Geo mean]                  306µs          305µs       -0.17%
      
      name                     old speed      new speed      delta
      GobDecode-4              18.1MB/s ± 1%  18.2MB/s ± 1%  +0.27%  (p=0.001 n=29+28)
      GobEncode-4              20.3MB/s ±10%  21.3MB/s ± 0%  +4.64%  (p=0.000 n=30+26)
      Gzip-4                   9.79MB/s ± 1%  9.91MB/s ± 1%  +1.28%  (p=0.000 n=30+30)
      Gunzip-4                  111MB/s ± 0%   111MB/s ± 0%    ~     (p=0.988 n=29+29)
      JSONEncode-4             21.8MB/s ± 0%  22.1MB/s ± 0%  +1.02%  (p=0.000 n=24+26)
      JSONDecode-4             4.97MB/s ± 1%  4.95MB/s ± 2%  -0.45%  (p=0.031 n=30+30)
      GoParse-4                2.85MB/s ± 1%  2.88MB/s ± 1%  +1.03%  (p=0.000 n=30+26)
      RegexpMatchEasy0_32-4    70.9MB/s ± 0%  70.9MB/s ± 0%    ~     (p=0.904 n=29+28)
      RegexpMatchEasy0_1K-4     627MB/s ± 0%   627MB/s ± 0%    ~     (p=0.156 n=30+30)
      RegexpMatchEasy1_32-4    74.2MB/s ± 0%  73.7MB/s ± 0%  -0.67%  (p=0.000 n=30+29)
      RegexpMatchEasy1_1K-4     393MB/s ± 0%   388MB/s ± 0%  -1.34%  (p=0.000 n=28+26)
      RegexpMatchMedium_32-4   1.34MB/s ± 0%  1.34MB/s ± 0%    ~     (all equal)
      RegexpMatchMedium_1K-4   4.59MB/s ± 0%  4.59MB/s ± 0%  +0.07%  (p=0.035 n=25+30)
      RegexpMatchHard_32-4     2.61MB/s ± 0%  2.61MB/s ± 0%  -0.11%  (p=0.002 n=28+30)
      RegexpMatchHard_1K-4     2.75MB/s ± 0%  2.75MB/s ± 0%  +0.15%  (p=0.001 n=30+24)
      Revcomp-4                89.4MB/s ± 0%  89.4MB/s ± 0%    ~     (p=0.140 n=28+28)
      Template-4               4.93MB/s ± 1%  4.95MB/s ± 1%  +0.51%  (p=0.001 n=30+30)
      [Geo mean]               18.4MB/s       18.4MB/s       +0.37%
      
      Change-Id: I9a6b521a971b21cfb51064e8e9b853cef8a1d071
      Reviewed-on: https://go-review.googlesource.com/124636
      Run-TryBot: Ben Shi <powerman1st@163.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      3ca3e89b
    • os: add ExitCode method to ProcessState · be94dac4
      Guoliang Wang authored
      Fixes #26539
      
      Change-Id: I6d403c1bbb552e1f1bdcc09a7ccd60b50617e0fc
      GitHub-Last-Rev: 0b5262df5d99504523fd7a4665cb70a3cc6b0a09
      GitHub-Pull-Request: golang/go#26544
      Reviewed-on: https://go-review.googlesource.com/125443
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      be94dac4
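A minimal sketch of how the new method might be used, assuming a Go release that includes this change (Go 1.12 or later). The helper name `exitCodeOf` and the `sh -c 'exit 3'` command are illustrative assumptions, not part of the commit:

```go
package main

import (
	"fmt"
	"os/exec"
)

// exitCodeOf is a hypothetical helper: it runs the named program and reports
// its exit code via ProcessState.ExitCode. It returns -1 if the process could
// not be started.
func exitCodeOf(name string, args ...string) int {
	cmd := exec.Command(name, args...)
	// Run returns a non-nil error for a non-zero exit status; ignore it here
	// and read the status from ProcessState instead.
	_ = cmd.Run()
	if cmd.ProcessState == nil {
		return -1 // the process never started
	}
	return cmd.ProcessState.ExitCode()
}

func main() {
	// The command is only an example and assumes a POSIX shell is available.
	fmt.Println(exitCodeOf("sh", "-c", "exit 3"))
}
```

Before this method existed, callers typically had to type-assert `ProcessState.Sys()` to a platform-specific wait status to get the same number.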
  2. 27 Aug, 2018 3 commits
    • cmd/compile: optimize arm's comparison · 2b69ad0b
      Ben Shi authored
      CMP/CMN/TST/TEQ behave like SUB/ADD/AND/XOR, except that the
      result is discarded and only the NZCV flags are affected.
      
      This CL implements further optimization with them.
      
      1. A micro benchmark gets more than 9% improvement.
      TSTTEQ-4                   6.99ms ± 0%    6.35ms ± 0%  -9.15%  (p=0.000 n=33+36)
      (https://github.com/benshi001/ugo1/blob/master/tstteq2_test.go)
      
      2. The go1 benchmark shows no regression, excluding noise.
      name                     old time/op    new time/op    delta
      BinaryTree17-4              25.7s ± 1%     25.7s ± 1%    ~     (p=0.830 n=40+40)
      Fannkuch11-4                13.3s ± 0%     13.2s ± 0%  -0.65%  (p=0.000 n=40+34)
      FmtFprintfEmpty-4           394ns ± 0%     394ns ± 0%    ~     (p=0.819 n=40+40)
      FmtFprintfString-4          677ns ± 0%     677ns ± 0%  +0.06%  (p=0.039 n=39+40)
      FmtFprintfInt-4             707ns ± 0%     706ns ± 0%  -0.14%  (p=0.000 n=40+39)
      FmtFprintfIntInt-4         1.04µs ± 0%    1.04µs ± 0%  +0.10%  (p=0.000 n=29+31)
      FmtFprintfPrefixedInt-4    1.10µs ± 0%    1.11µs ± 0%  +0.65%  (p=0.000 n=39+37)
      FmtFprintfFloat-4          2.27µs ± 0%    2.26µs ± 0%  -0.53%  (p=0.000 n=39+40)
      FmtManyArgs-4              3.96µs ± 0%    3.96µs ± 0%  +0.10%  (p=0.000 n=39+40)
      GobDecode-4                53.4ms ± 1%    52.8ms ± 2%  -1.10%  (p=0.000 n=39+39)
      GobEncode-4                50.3ms ± 3%    50.4ms ± 2%    ~     (p=0.089 n=40+39)
      Gzip-4                      2.62s ± 0%     2.64s ± 0%  +0.60%  (p=0.000 n=40+39)
      Gunzip-4                    312ms ± 0%     312ms ± 0%  +0.02%  (p=0.030 n=40+39)
      HTTPClientServer-4         1.01ms ± 7%    0.98ms ± 7%  -2.37%  (p=0.000 n=40+39)
      JSONEncode-4                126ms ± 1%     126ms ± 1%  -0.38%  (p=0.004 n=39+39)
      JSONDecode-4                423ms ± 0%     426ms ± 2%  +0.72%  (p=0.001 n=39+40)
      Mandelbrot200-4            18.4ms ± 0%    18.4ms ± 0%  +0.04%  (p=0.000 n=38+40)
      GoParse-4                  22.8ms ± 0%    22.6ms ± 0%  -0.68%  (p=0.000 n=35+40)
      RegexpMatchEasy0_32-4       699ns ± 0%     704ns ± 0%  +0.73%  (p=0.000 n=27+40)
      RegexpMatchEasy0_1K-4      4.27µs ± 0%    4.26µs ± 0%  -0.09%  (p=0.000 n=35+38)
      RegexpMatchEasy1_32-4       741ns ± 0%     735ns ± 0%  -0.85%  (p=0.000 n=40+35)
      RegexpMatchEasy1_1K-4      5.53µs ± 0%    5.49µs ± 0%  -0.69%  (p=0.000 n=39+40)
      RegexpMatchMedium_32-4     1.07µs ± 0%    1.04µs ± 2%  -2.34%  (p=0.000 n=40+40)
      RegexpMatchMedium_1K-4      261µs ± 0%     261µs ± 0%  -0.16%  (p=0.000 n=40+39)
      RegexpMatchHard_32-4       14.9µs ± 0%    14.9µs ± 0%  -0.18%  (p=0.000 n=39+40)
      RegexpMatchHard_1K-4        445µs ± 0%     446µs ± 0%  +0.09%  (p=0.000 n=36+34)
      Revcomp-4                  41.8ms ± 1%    41.8ms ± 1%    ~     (p=0.595 n=39+38)
      Template-4                  530ms ± 1%     528ms ± 1%  -0.49%  (p=0.000 n=40+40)
      TimeParse-4                3.39µs ± 0%    3.42µs ± 0%  +0.98%  (p=0.000 n=36+38)
      TimeFormat-4               6.12µs ± 0%    6.07µs ± 0%  -0.81%  (p=0.000 n=34+38)
      [Geo mean]                  384µs          383µs       -0.24%
      
      name                     old speed      new speed      delta
      GobDecode-4              14.4MB/s ± 1%  14.5MB/s ± 2%  +1.11%  (p=0.000 n=39+39)
      GobEncode-4              15.3MB/s ± 3%  15.2MB/s ± 2%    ~     (p=0.104 n=40+39)
      Gzip-4                   7.40MB/s ± 1%  7.36MB/s ± 0%  -0.60%  (p=0.000 n=40+39)
      Gunzip-4                 62.2MB/s ± 0%  62.1MB/s ± 0%  -0.02%  (p=0.047 n=40+39)
      JSONEncode-4             15.4MB/s ± 1%  15.4MB/s ± 2%  +0.39%  (p=0.002 n=39+39)
      JSONDecode-4             4.59MB/s ± 0%  4.56MB/s ± 2%  -0.71%  (p=0.000 n=39+40)
      GoParse-4                2.54MB/s ± 0%  2.56MB/s ± 0%  +0.72%  (p=0.000 n=26+40)
      RegexpMatchEasy0_32-4    45.8MB/s ± 0%  45.4MB/s ± 0%  -0.75%  (p=0.000 n=38+40)
      RegexpMatchEasy0_1K-4     240MB/s ± 0%   240MB/s ± 0%  +0.09%  (p=0.000 n=35+38)
      RegexpMatchEasy1_32-4    43.1MB/s ± 0%  43.5MB/s ± 0%  +0.84%  (p=0.000 n=40+39)
      RegexpMatchEasy1_1K-4     185MB/s ± 0%   186MB/s ± 0%  +0.69%  (p=0.000 n=39+40)
      RegexpMatchMedium_32-4    936kB/s ± 1%   959kB/s ± 2%  +2.38%  (p=0.000 n=40+40)
      RegexpMatchMedium_1K-4   3.92MB/s ± 0%  3.93MB/s ± 0%  +0.18%  (p=0.000 n=39+40)
      RegexpMatchHard_32-4     2.15MB/s ± 0%  2.15MB/s ± 0%  +0.19%  (p=0.000 n=40+40)
      RegexpMatchHard_1K-4     2.30MB/s ± 0%  2.30MB/s ± 0%    ~     (all equal)
      Revcomp-4                60.8MB/s ± 1%  60.8MB/s ± 1%    ~     (p=0.600 n=39+38)
      Template-4               3.66MB/s ± 1%  3.68MB/s ± 1%  +0.46%  (p=0.000 n=40+40)
      [Geo mean]               12.8MB/s       12.8MB/s       +0.27%
      
      Change-Id: I849161169ecf0876a04b7c1d3990fa8d1435215e
      Reviewed-on: https://go-review.googlesource.com/122855
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      2b69ad0b
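To make the commit above concrete, here is a rough sketch of Go source whose conditions only need the flags of an AND, XOR, or ADD, not the result itself. The function names are hypothetical, and whether the compiler actually selects TST, TEQ, or CMN for them depends on the compiler version and the surrounding code:

```go
package main

import "fmt"

// disjoint reports whether a and b share no set bits. The a&b value is only
// compared against zero, so a flag-setting AND (TST on arm) would suffice.
func disjoint(a, b uint32) bool { return a&b == 0 }

// same reports whether a and b are equal by testing a^b against zero,
// the shape that a flag-setting XOR (TEQ on arm) can cover.
func same(a, b uint32) bool { return a^b == 0 }

// cancels reports whether a+b is zero, the shape that a flag-setting
// ADD (CMN on arm) can cover.
func cancels(a, b int32) bool { return a+b == 0 }

func main() {
	fmt.Println(disjoint(0x0a, 0x05), same(7, 7), cancels(5, -5))
}
```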
    • cmd/compile: add missing type information for some arm/arm64 rules · 096229b2
      Ben Shi authored
      Some indexed load/store rules lack type information, and this
      CL adds it.
      
      Change-Id: Icac315ccb83a2f5bf30b056d4667d5b59eb4e5e2
      Reviewed-on: https://go-review.googlesource.com/128455
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      096229b2
    • cmd/dist: do not run race detector tests on netbsd · 7334904e
      Benny Siegert authored
      The race detector is not fully functional on NetBSD yet. Without
      this change, all.bash fails in TestOutput.
      
      This unbreaks the netbsd-amd64 builder.
      
      Update #26403
      Fixes #27268
      
      Change-Id: I2c7015692d3632aa1037f40155d4fc5c7bb1d8c3
      Reviewed-on: https://go-review.googlesource.com/131555
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      7334904e
  3. 26 Aug, 2018 6 commits
  4. 25 Aug, 2018 7 commits
    • encoding/json: remove a branch in the structEncoder loop · c21ba224
      Daniel Martí authored
      Encoders like map and array can use the much cheaper "i > 0" check to
      see if we're not writing the first element. However, since struct fields
      support omitempty, we need to keep track of that separately.
      
      This is much more expensive - after calling the field encoder itself,
      and retrieving the field via reflection, this branch was the third most
      expensive piece of this field loop.
      
      Instead, hoist the branch logic outside of the loop. The code doesn't
      get much more complex, since we just delay the writing of each byte
      until the next iteration. Yet the performance improvement is noticeable,
      even when the struct types in CodeEncoder only have 2 and 7 fields,
      respectively.
      
      name           old time/op    new time/op    delta
      CodeEncoder-4    5.39ms ± 0%    5.31ms ± 0%  -1.37%  (p=0.010 n=4+6)
      
      name           old speed      new speed      delta
      CodeEncoder-4   360MB/s ± 0%   365MB/s ± 0%  +1.39%  (p=0.010 n=4+6)
      
      Updates #5683.
      
      Change-Id: I2662cf459e0dfd68e56fa52bc898a417e84266c2
      Reviewed-on: https://go-review.googlesource.com/131401
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      c21ba224
    • encoding/json: avoid some more pointer receivers · 88f4bcce
      Daniel Martí authored
      A few encoder struct types, such as map and slice, only encapsulate
      other prepared encoder funcs. Using pointer receivers has no advantage,
      and makes calling these methods slightly more expensive.
      
      Not a huge performance win, but certainly an easy one. The struct types
      used in the benchmark below contain one slice field and one pointer
      field.
      
      name           old time/op    new time/op    delta
      CodeEncoder-4    5.48ms ± 0%    5.39ms ± 0%  -1.66%  (p=0.010 n=6+4)
      
      name           old speed      new speed      delta
      CodeEncoder-4   354MB/s ± 0%   360MB/s ± 0%  +1.69%  (p=0.010 n=6+4)
      
      Updates #5683.
      
      Change-Id: I9f78dbe07fcc6fbf19a6d96c22f5d6970db9eca4
      Reviewed-on: https://go-review.googlesource.com/131400
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      88f4bcce
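A small sketch of the pattern the commit above describes: an encoder struct that only wraps another prepared encoder func, with its method on a value receiver. The types and names here (`encoderFunc`, `sliceEncoder`, `decimal`) are hypothetical simplifications, not the encoding/json definitions:

```go
package main

import (
	"fmt"
	"strconv"
)

// encoderFunc is a hypothetical stand-in for a prepared element encoder.
type encoderFunc func(v int) string

// sliceEncoder only encapsulates another encoder func, so the struct is a
// single word and copying it costs about the same as copying a pointer.
type sliceEncoder struct {
	elemEnc encoderFunc
}

// encode uses a value receiver: the call works on a copy of the struct
// instead of dereferencing a pointer to it.
func (se sliceEncoder) encode(vs []int) string {
	out := "["
	for i, v := range vs {
		if i > 0 {
			out += ","
		}
		out += se.elemEnc(v)
	}
	return out + "]"
}

// decimal is an example element encoder.
func decimal(v int) string { return strconv.Itoa(v) }

// newSliceEncoder returns the method value, bound to a copy of the struct.
func newSliceEncoder(elem encoderFunc) func([]int) string {
	enc := sliceEncoder{elemEnc: elem}
	return enc.encode
}

func main() {
	fmt.Println(newSliceEncoder(decimal)([]int{1, 2, 3}))
}
```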
    • net/http: make Transport return Writable Response.Body on protocol switch · 54162040
      Brad Fitzpatrick authored
      Updates #26937
      Updates #17227
      
      Change-Id: I79865938b05c219e1947822e60e4f52bb2604b70
      Reviewed-on: https://go-review.googlesource.com/131279
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      54162040
    • src/make.bat: add missing go.exe extension · 30b080e0
      Goo authored
      Got error:
      'go' is not an internal or external command, nor is it a runnable program
      
      Change-Id: Ie45a3a12252fa01b67ca09ef8fbb5b4bbf728fe7
      GitHub-Last-Rev: 451815cacd9bfc509fa0aab3be54303797e605a2
      GitHub-Pull-Request: golang/go#27214
      Reviewed-on: https://go-review.googlesource.com/131397
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      30b080e0
    • cmd/dist: fix compilation on windows · d145d923
      Florin Pățan authored
      Add missing extensions to binary files in order to allow execution.
      
      Change-Id: Idfe4c72c80c26b7b938023bc7bbe1ef85e1aa7b0
      GitHub-Last-Rev: ed9d8124270c30b7f25f89656432ef5089466c7e
      GitHub-Pull-Request: golang/go#26464
      Reviewed-on: https://go-review.googlesource.com/124936
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      d145d923
    • cmd/go: don't let script grep commands match $WORK · f2ed3e1d
      Ian Lance Taylor authored
      If $WORK happens to contain the string that a stdout/stderr/grep
      command is searching for, a negative grep command will fail incorrectly.
      
      Fixes #27170
      Fixes #27221
      
      Change-Id: I84454d3c42360fe3295c7235d388381525eb85b4
      Reviewed-on: https://go-review.googlesource.com/131398
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Bryan C. Mills <bcmills@google.com>
      f2ed3e1d
    • cmd/compile: optimize 386 code with FLDPI · e03220a5
      Ben Shi authored
      FLDPI pushes the constant pi to 387's register stack, which is
      more efficient than MOVSSconst/MOVSDconst.
      
      1. This optimization reduces the total size of pkg/linux_386
      (excluding cmd/compile) by 0.3KB.
      
      2. There is almost no regression in the go1 benchmark.
      name                     old time/op    new time/op    delta
      BinaryTree17-4              3.30s ± 3%     3.30s ± 2%    ~     (p=0.759 n=40+39)
      Fannkuch11-4                3.53s ± 1%     3.54s ± 1%    ~     (p=0.168 n=40+40)
      FmtFprintfEmpty-4          45.5ns ± 3%    45.6ns ± 3%    ~     (p=0.553 n=40+40)
      FmtFprintfString-4         78.4ns ± 3%    78.3ns ± 3%    ~     (p=0.593 n=40+40)
      FmtFprintfInt-4            88.8ns ± 2%    89.9ns ± 2%    ~     (p=0.083 n=40+33)
      FmtFprintfIntInt-4          140ns ± 4%     140ns ± 4%    ~     (p=0.656 n=40+40)
      FmtFprintfPrefixedInt-4     180ns ± 2%     181ns ± 3%  +0.53%  (p=0.050 n=40+40)
      FmtFprintfFloat-4           408ns ± 4%     411ns ± 3%    ~     (p=0.112 n=40+40)
      FmtManyArgs-4               599ns ± 3%     602ns ± 3%    ~     (p=0.784 n=40+40)
      GobDecode-4                7.24ms ± 6%    7.30ms ± 5%    ~     (p=0.171 n=40+40)
      GobEncode-4                6.98ms ± 5%    6.89ms ± 8%    ~     (p=0.107 n=40+40)
      Gzip-4                      396ms ± 4%     396ms ± 3%    ~     (p=0.852 n=40+40)
      Gunzip-4                   41.3ms ± 3%    41.5ms ± 4%    ~     (p=0.221 n=40+40)
      HTTPClientServer-4         63.4µs ± 3%    63.4µs ± 2%    ~     (p=0.895 n=39+40)
      JSONEncode-4               17.5ms ± 2%    17.5ms ± 3%    ~     (p=0.090 n=40+40)
      JSONDecode-4               60.6ms ± 3%    60.1ms ± 4%    ~     (p=0.184 n=40+40)
      Mandelbrot200-4            7.80ms ± 3%    7.78ms ± 2%    ~     (p=0.512 n=40+40)
      GoParse-4                  3.30ms ± 3%    3.28ms ± 2%  -0.61%  (p=0.034 n=40+40)
      RegexpMatchEasy0_32-4       104ns ± 4%     103ns ± 4%    ~     (p=0.118 n=40+40)
      RegexpMatchEasy0_1K-4       850ns ± 2%     848ns ± 2%    ~     (p=0.370 n=40+40)
      RegexpMatchEasy1_32-4       112ns ± 4%     112ns ± 4%    ~     (p=0.848 n=40+40)
      RegexpMatchEasy1_1K-4      1.04µs ± 4%    1.03µs ± 4%    ~     (p=0.333 n=40+40)
      RegexpMatchMedium_32-4      132ns ± 4%     131ns ± 3%    ~     (p=0.527 n=40+40)
      RegexpMatchMedium_1K-4     43.4µs ± 3%    43.5µs ± 3%    ~     (p=0.111 n=40+40)
      RegexpMatchHard_32-4       2.24µs ± 4%    2.24µs ± 4%    ~     (p=0.441 n=40+40)
      RegexpMatchHard_1K-4       67.9µs ± 3%    68.0µs ± 3%    ~     (p=0.095 n=40+40)
      Revcomp-4                   1.84s ± 2%     1.84s ± 2%    ~     (p=0.677 n=40+40)
      Template-4                 68.4ms ± 3%    68.6ms ± 3%    ~     (p=0.345 n=40+40)
      TimeParse-4                 433ns ± 3%     433ns ± 3%    ~     (p=0.403 n=40+40)
      TimeFormat-4                407ns ± 3%     406ns ± 3%    ~     (p=0.900 n=40+40)
      [Geo mean]                 67.1µs         67.2µs       +0.04%
      
      name                     old speed      new speed      delta
      GobDecode-4               106MB/s ± 5%   105MB/s ± 5%    ~     (p=0.173 n=40+40)
      GobEncode-4               110MB/s ± 5%   112MB/s ± 9%    ~     (p=0.104 n=40+40)
      Gzip-4                   49.0MB/s ± 4%  49.1MB/s ± 4%    ~     (p=0.836 n=40+40)
      Gunzip-4                  471MB/s ± 3%   468MB/s ± 4%    ~     (p=0.218 n=40+40)
      JSONEncode-4              111MB/s ± 2%   111MB/s ± 3%    ~     (p=0.090 n=40+40)
      JSONDecode-4             32.0MB/s ± 3%  32.3MB/s ± 4%    ~     (p=0.194 n=40+40)
      GoParse-4                17.6MB/s ± 3%  17.7MB/s ± 2%  +0.62%  (p=0.035 n=40+40)
      RegexpMatchEasy0_32-4     307MB/s ± 4%   309MB/s ± 4%  +0.70%  (p=0.041 n=40+40)
      RegexpMatchEasy0_1K-4    1.20GB/s ± 3%  1.21GB/s ± 2%    ~     (p=0.353 n=40+40)
      RegexpMatchEasy1_32-4     285MB/s ± 3%   284MB/s ± 4%    ~     (p=0.384 n=40+40)
      RegexpMatchEasy1_1K-4     988MB/s ± 4%   992MB/s ± 3%    ~     (p=0.335 n=40+40)
      RegexpMatchMedium_32-4   7.56MB/s ± 4%  7.57MB/s ± 4%    ~     (p=0.314 n=40+40)
      RegexpMatchMedium_1K-4   23.6MB/s ± 3%  23.6MB/s ± 3%    ~     (p=0.107 n=40+40)
      RegexpMatchHard_32-4     14.3MB/s ± 4%  14.3MB/s ± 4%    ~     (p=0.429 n=40+40)
      RegexpMatchHard_1K-4     15.1MB/s ± 3%  15.1MB/s ± 3%    ~     (p=0.099 n=40+40)
      Revcomp-4                 138MB/s ± 2%   138MB/s ± 2%    ~     (p=0.658 n=40+40)
      Template-4               28.4MB/s ± 3%  28.3MB/s ± 3%    ~     (p=0.331 n=40+40)
      [Geo mean]               80.8MB/s       80.8MB/s       +0.09%
      
      Change-Id: I0cb715eead68ade097a302e7fb80ccbd1d1b511e
      Reviewed-on: https://go-review.googlesource.com/130975
      Run-TryBot: Ben Shi <powerman1st@163.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      e03220a5
  5. 24 Aug, 2018 22 commits
    • cmd/compile: introduce more read-modify-write operations for amd64 · 3bc34385
      Ben Shi authored
      Add support for read-modify-write ADD/SUB/AND/OR/XOR on amd64.
      
      1. The total size of pkg/linux_amd64 decreases about 4KB, excluding
      cmd/compile.
      
      2. The go1 benchmark shows a little improvement, excluding noise.
      
      name                     old time/op    new time/op    delta
      BinaryTree17-4              2.63s ± 3%     2.65s ± 4%   +1.01%  (p=0.037 n=35+35)
      Fannkuch11-4                2.33s ± 2%     2.39s ± 2%   +2.49%  (p=0.000 n=35+35)
      FmtFprintfEmpty-4          45.4ns ± 5%    40.8ns ± 6%  -10.09%  (p=0.000 n=35+35)
      FmtFprintfString-4         73.3ns ± 4%    70.9ns ± 3%   -3.23%  (p=0.000 n=30+35)
      FmtFprintfInt-4            79.9ns ± 4%    79.5ns ± 3%     ~     (p=0.736 n=34+35)
      FmtFprintfIntInt-4          126ns ± 4%     125ns ± 4%     ~     (p=0.083 n=35+35)
      FmtFprintfPrefixedInt-4     152ns ± 6%     152ns ± 3%     ~     (p=0.855 n=34+35)
      FmtFprintfFloat-4           215ns ± 4%     213ns ± 4%     ~     (p=0.066 n=35+35)
      FmtManyArgs-4               522ns ± 3%     506ns ± 3%   -3.15%  (p=0.000 n=35+35)
      GobDecode-4                6.45ms ± 8%    6.51ms ± 7%   +0.96%  (p=0.026 n=35+35)
      GobEncode-4                6.10ms ± 6%    6.02ms ± 8%     ~     (p=0.160 n=35+35)
      Gzip-4                      228ms ± 3%     221ms ± 3%   -2.92%  (p=0.000 n=35+35)
      Gunzip-4                   37.5ms ± 4%    37.2ms ± 3%   -0.78%  (p=0.036 n=35+35)
      HTTPClientServer-4         58.7µs ± 2%    59.2µs ± 1%   +0.80%  (p=0.000 n=33+33)
      JSONEncode-4               12.0ms ± 3%    12.2ms ± 3%   +1.84%  (p=0.008 n=35+35)
      JSONDecode-4               57.0ms ± 4%    56.6ms ± 3%     ~     (p=0.320 n=35+35)
      Mandelbrot200-4            3.82ms ± 3%    3.79ms ± 3%     ~     (p=0.074 n=35+35)
      GoParse-4                  3.21ms ± 5%    3.24ms ± 4%     ~     (p=0.119 n=35+35)
      RegexpMatchEasy0_32-4      76.3ns ± 4%    75.4ns ± 4%   -1.14%  (p=0.014 n=34+33)
      RegexpMatchEasy0_1K-4       251ns ± 4%     254ns ± 3%   +1.28%  (p=0.016 n=35+35)
      RegexpMatchEasy1_32-4      69.6ns ± 3%    70.1ns ± 3%   +0.82%  (p=0.005 n=35+35)
      RegexpMatchEasy1_1K-4       367ns ± 4%     376ns ± 4%   +2.47%  (p=0.000 n=35+35)
      RegexpMatchMedium_32-4      108ns ± 5%     104ns ± 4%   -3.18%  (p=0.000 n=35+35)
      RegexpMatchMedium_1K-4     33.8µs ± 3%    32.7µs ± 3%   -3.27%  (p=0.000 n=35+35)
      RegexpMatchHard_32-4       1.55µs ± 3%    1.52µs ± 3%   -1.64%  (p=0.000 n=35+35)
      RegexpMatchHard_1K-4       46.6µs ± 3%    46.6µs ± 4%     ~     (p=0.149 n=35+35)
      Revcomp-4                   416ms ± 7%     412ms ± 6%   -0.95%  (p=0.033 n=33+35)
      Template-4                 64.3ms ± 3%    62.4ms ± 7%   -2.94%  (p=0.000 n=35+35)
      TimeParse-4                 320ns ± 2%     322ns ± 3%     ~     (p=0.589 n=35+35)
      TimeFormat-4                300ns ± 3%     300ns ± 3%     ~     (p=0.597 n=35+35)
      [Geo mean]                 47.4µs         47.0µs        -0.86%
      
      name                     old speed      new speed      delta
      GobDecode-4               119MB/s ± 7%   118MB/s ± 7%   -0.96%  (p=0.027 n=35+35)
      GobEncode-4               126MB/s ± 7%   127MB/s ± 6%     ~     (p=0.157 n=34+34)
      Gzip-4                   85.3MB/s ± 3%  87.9MB/s ± 3%   +3.02%  (p=0.000 n=35+35)
      Gunzip-4                  518MB/s ± 4%   522MB/s ± 3%   +0.79%  (p=0.037 n=35+35)
      JSONEncode-4              162MB/s ± 3%   159MB/s ± 3%   -1.81%  (p=0.009 n=35+35)
      JSONDecode-4             34.1MB/s ± 4%  34.3MB/s ± 3%     ~     (p=0.318 n=35+35)
      GoParse-4                18.0MB/s ± 5%  17.9MB/s ± 4%     ~     (p=0.117 n=35+35)
      RegexpMatchEasy0_32-4     419MB/s ± 3%   425MB/s ± 4%   +1.46%  (p=0.003 n=32+33)
      RegexpMatchEasy0_1K-4    4.07GB/s ± 4%  4.02GB/s ± 3%   -1.28%  (p=0.014 n=35+35)
      RegexpMatchEasy1_32-4     460MB/s ± 3%   456MB/s ± 4%   -0.82%  (p=0.004 n=35+35)
      RegexpMatchEasy1_1K-4    2.79GB/s ± 4%  2.72GB/s ± 4%   -2.39%  (p=0.000 n=35+35)
      RegexpMatchMedium_32-4   9.23MB/s ± 4%  9.53MB/s ± 4%   +3.16%  (p=0.000 n=35+35)
      RegexpMatchMedium_1K-4   30.3MB/s ± 3%  31.3MB/s ± 3%   +3.38%  (p=0.000 n=35+35)
      RegexpMatchHard_32-4     20.7MB/s ± 3%  21.0MB/s ± 3%   +1.67%  (p=0.000 n=35+35)
      RegexpMatchHard_1K-4     22.0MB/s ± 3%  21.9MB/s ± 4%     ~     (p=0.277 n=35+33)
      Revcomp-4                 612MB/s ± 7%   618MB/s ± 6%   +0.96%  (p=0.034 n=33+35)
      Template-4               30.2MB/s ± 3%  31.1MB/s ± 6%   +3.05%  (p=0.000 n=35+35)
      [Geo mean]                123MB/s        124MB/s        +0.64%
      
      Change-Id: Ia025da272e07d0069413824bfff3471b106d6280
      Reviewed-on: https://go-review.googlesource.com/121535
      Run-TryBot: Ben Shi <powerman1st@163.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
      Reviewed-by: Keith Randall <khr@golang.org>
      3bc34385
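As an illustration of what "read-modify-write" means at the source level, the sketch below shows Go statements that load a value from memory, combine it with an operand, and store the result back to the same address. The function names are hypothetical, and whether each one compiles to a single memory-destination instruction on amd64 depends on the compiler version and context:

```go
package main

import "fmt"

// Each function loads *p, applies one operation, and stores the result back
// to the same address: the read-modify-write shape.
func addInto(p *uint64, n uint64) { *p += n }
func subInto(p *uint64, n uint64) { *p -= n }
func andInto(p *uint64, m uint64) { *p &= m }
func orInto(p *uint64, m uint64)  { *p |= m }
func xorInto(p *uint64, m uint64) { *p ^= m }

func main() {
	x := uint64(0xff)
	andInto(&x, 0x0f)
	addInto(&x, 1)
	subInto(&x, 0x10)
	orInto(&x, 0xa0)
	xorInto(&x, 0xff)
	fmt.Printf("%#x\n", x)
}
```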
    • doc/go1.11: fix typo · aacc891d
      Shenghou Ma authored
      Change-Id: I097bd90f62add7838f8c7baf3b777ad167635354
      Reviewed-on: https://go-review.googlesource.com/131357
      Reviewed-by: Keith Randall <khr@golang.org>
      aacc891d
    • cmd/compile: remove vet-blocking hack · 4a4e3b0b
      Keith Randall authored
      ...and add the vet failures to the vet whitelist.
      
      Change-Id: Idcf4289f39dda561c85f3b0afe396e5299e6495f
      Reviewed-on: https://go-review.googlesource.com/127995
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
      4a4e3b0b
    • cmd/compile: enable two orphaned tests · 707fd452
      Keith Randall authored
      These tests weren't being run.  Re-enable them.
      
      R=go1.12
      
      Change-Id: I8d3cd09b7f07e4c39f855ddb9be000718ec86494
      Reviewed-on: https://go-review.googlesource.com/127117
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
      707fd452
    • cmd/compile: move last compile tests to new test infrastructure · dca709da
      Keith Randall authored
      R=go1.12
      
      Fixes #26469
      
      Change-Id: Idbba88ef60f15a0ec9a83c78541a4d4fb63e534a
      Reviewed-on: https://go-review.googlesource.com/127116
      Reviewed-by: David Chase <drchase@google.com>
      dca709da
    • cmd/compile: move more compiler tests to new test infrastructure · 25ea4e57
      Keith Randall authored
      Update #26469
      
      Change-Id: I1188e49cde1bda11506afef6b6e3f34c6ff45ea5
      Reviewed-on: https://go-review.googlesource.com/127115
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
      25ea4e57
    • reflect: use a bigger object when we need a finalizer to run · 78ce3a03
      Keith Randall authored
      If an object is allocated as part of a tinyalloc, then other live
      objects in the same tinyalloc chunk keep the finalizer from being run,
      even if the object that has the finalizer is dead.
      
      Make sure the object we're setting the finalizer on is big enough
      to not trigger tinyalloc allocation.
      
      Fixes #26857
      Update #21717
      
      Change-Id: I56ad8679426283237ebff20a0da6c9cf64eb1c27
      Reviewed-on: https://go-review.googlesource.com/128475
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      78ce3a03
    • cmd/compile: move autogenerated tests to new infrastructure · 776298ab
      Keith Randall authored
      Update #26469
      
      R=go1.12
      
      Change-Id: Ib9a00ee5e98371769669bb9cad58320b66127374
      Reviewed-on: https://go-review.googlesource.com/127095
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
      776298ab
    • cmd/compile: move over more compiler tests to new test infrastructure · ed21535a
      Keith Randall authored
      R=go1.12
      
      Update #26469
      
      Change-Id: Iad75edfc194f8391a8ead09bfa68d446155e84ac
      Reviewed-on: https://go-review.googlesource.com/127055
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
      ed21535a
    • cmd/compile: unify compilation of compiler tests · 45e7e668
      Keith Randall authored
      Before this CL we would build and run each test file individually.
      Building the test takes most of the time, a significant fraction of a
      second. Running the tests is really fast.
      
      After this CL, we build all the tests at once, then run each
      individually. We only have to run the compiler&linker once (or twice,
      for softfloat architectures) instead of once per test.
      
      While we're here, organize these tests to fit a bit more into the
      standard testing framework.
      
      This is just the organizational CL that changes the testing framework
      and migrates 2 tests.  Future tests will follow.
      
      R=go1.12
      
      Update #26469
      
      Change-Id: I1a1e7338c054b51f0c1c4c539d48d3d046b08b7d
      Reviewed-on: https://go-review.googlesource.com/126995
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
      45e7e668
    • Andrew Bonventre's avatar
      doc: document Go 1.10.4 · 97cc4b51
      Andrew Bonventre authored
      Change-Id: I7383e7d37a71defcad79fc662c4b4d1ca02189d1
      Reviewed-on: https://go-review.googlesource.com/131336
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      97cc4b51
    • Yury Smolsky's avatar
      cmd/compile: display AST IR in ssa.html · 4cc027fb
      Yury Smolsky authored
      This change adds a new column, AST IR. That column contains
      nodes for a function specified in $GOSSAFUNC.
      
      Also this CL enables horizontal scrolling of sources and AST columns.
      
      Fixes #26662
      
      Change-Id: I3fba39fd998bb05e9c93038e8ec2384c69613b24
      Reviewed-on: https://go-review.googlesource.com/126858
      Run-TryBot: Yury Smolsky <yury@smolsky.by>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      4cc027fb
    • Ian Lance Taylor's avatar
      runtime: mark sigInitIgnored nosplit · 97f15352
      Ian Lance Taylor authored
      The sigInitIgnored function can be called by initsig before a shared
      library is initialized, before the runtime is initialized.
      
      Fixes #27183
      
      Change-Id: I7073767938fc011879d47ea951d63a14d1cce878
      Reviewed-on: https://go-review.googlesource.com/131277
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      97f15352
    • Martin Möhrmann's avatar
      all: align cpu feature variable offset naming · 05c02444
      Martin Möhrmann authored
      Add an "offset_" prefix to all cpu feature variable offset constants to
      signify that they are not boolean cpu feature variables.
      
      Remove _ from offset constant names.
      
      Change-Id: I6e22a79ebcbe6e2ae54c4ac8764f9260bb3223ff
      Reviewed-on: https://go-review.googlesource.com/131215
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      05c02444
    • Martin Möhrmann's avatar
      runtime: replace sys.CacheLineSize by corresponding internal/cpu const and vars · 961eb13b
      Martin Möhrmann authored
      sys here is runtime/internal/sys.
      
      Replace uses of sys.CacheLineSize for padding by
      cpu.CacheLinePad or cpu.CacheLinePadSize.
      Replace other uses of sys.CacheLineSize by cpu.CacheLineSize.
      Remove now unused sys.CacheLineSize.
      
      Updates #25203
      
      Change-Id: I1daf410fe8f6c0493471c2ceccb9ca0a5a75ed8f
      Reviewed-on: https://go-review.googlesource.com/126601
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      961eb13b
    • Daniel Martí's avatar
      cmd/compile: cleanup walking OCONV/OCONVNOP · 2200b182
      Daniel Martí authored
      Use a separate func, which needs less indentation and can use returns
      instead of labelled breaks. We can also give the types better names, and
      we don't have to repeat the calls to conv and mkcall.
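The shape of this cleanup can be sketched outside the compiler. The example below is only an illustration: the function names and the returned strings are hypothetical and do not come from walk.go. It shows nested switches that need a labelled break, and the same logic moved into a helper that simply returns.

```go
package main

import "fmt"

// convInline keeps everything in one function. Leaving the outer switch from
// inside the nested ones requires a labelled break.
func convInline(op, from, to string) string {
	result := "generic " + op
opswitch:
	switch op {
	case "conv":
		switch from {
		case "float64":
			switch to {
			case "int64":
				result = "call float64toint64"
				break opswitch
			case "uint64":
				result = "call float64touint64"
				break opswitch
			}
		}
		result = "plain conversion"
	}
	return result
}

// convHelper is the extracted function: one level less of indentation, and
// plain returns instead of labelled breaks.
func convHelper(from, to string) string {
	if from == "float64" {
		switch to {
		case "int64":
			return "call float64toint64"
		case "uint64":
			return "call float64touint64"
		}
	}
	return "plain conversion"
}

// convRefactored dispatches to the helper for the conversion case.
func convRefactored(op, from, to string) string {
	if op == "conv" {
		return convHelper(from, to)
	}
	return "generic " + op
}

func main() {
	fmt.Println(convInline("conv", "float64", "int64"))
	fmt.Println(convRefactored("conv", "float64", "int64"))
}
```

Both versions are meant to behave identically; only the control flow is easier to follow in the second.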
      
      Passes toolstash -cmp on std cmd.
      
      Change-Id: I1071c170fa729562d70093a09b7dea003c5fe26e
      Reviewed-on: https://go-review.googlesource.com/130075
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
      2200b182
    • Martin Möhrmann's avatar
      internal/cpu: add a CacheLinePadSize constant · 60f83621
      Martin Möhrmann authored
      The new constant CacheLinePadSize can be used to compute best effort
      alignment of structs to cache lines.
      
      e.g. the runtime can use this in the locktab definition:
      var locktab [57]struct {
              l   spinlock
              pad [cpu.CacheLinePadSize - unsafe.Sizeof(spinlock{})]byte
      }
      
      Change-Id: I86f6fbfc5ee7436f742776a7d4a99a1d54ffccc8
      Reviewed-on: https://go-review.googlesource.com/131237
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      60f83621
    • Alberto Donizetti's avatar
      time: allow +00 as numeric timezone name and GMT offset · 38143bad
      Alberto Donizetti authored
      A timezone with a zero offset from UTC and without a three-letter
      abbreviation will have a numeric name in timestamps: "+00".
      
      There are currently two of them:
      
        $ zdump Atlantic/Azores America/Scoresbysund
        Atlantic/Azores       Wed Aug 22 09:01:05 2018 +00
        America/Scoresbysund  Wed Aug 22 09:01:05 2018 +00
      
      These two timestamps are rejected by Parse, since it doesn't allow for
      zero offsets:
      
        parsing time "Wed Aug 22 09:01:05 2018 +00": extra text: +00
      
      This change modifies Parse to accept a +00 offset in numeric timezone
      names.
      
      As a side effect of this change, Parse also now accepts "GMT+00". It was
      explicitly disallowed (with a unit test ensuring it got rejected),
      but the restriction seems incorrect.
      
      DATE(1), for example, allows it:
      
        $ date --debug --date="2009-01-02 03:04:05 GMT+00"
      
        date: parsed date part: (Y-M-D) 2009-01-02
        date: parsed time part: 03:04:05
        date: parsed zone part: UTC+00
        date: input timezone: parsed date/time string (+00)
        date: using specified time as starting value: '03:04:05'
        date: starting date/time: '(Y-M-D) 2009-01-02 03:04:05 TZ=+00'
        date: '(Y-M-D) 2009-01-02 03:04:05 TZ=+00' = 1230865445 epoch-seconds
        date: timezone: system default
        date: final: 1230865445.000000000 (epoch-seconds)
        date: final: (Y-M-D) 2009-01-02 03:04:05 (UTC)
        date: final: (Y-M-D) 2009-01-02 04:04:05 (UTC+01)
        Fri  2 Jan 04:04:05 CET 2009
      
      This fixes 2 of 17 time.Parse() failures listed in Issue #26032.
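A minimal sketch of the behaviour described above, assuming a Go release that includes this change. The layouts are chosen here for illustration (both end in the zone abbreviation element MST); the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// parseNumericZone parses the zdump-style timestamp quoted above, whose zone
// name is the numeric "+00".
func parseNumericZone() (time.Time, error) {
	return time.Parse("Mon Jan _2 15:04:05 2006 MST", "Wed Aug 22 09:01:05 2018 +00")
}

// parseGMTZero parses the same instant that the date(1) example uses,
// written with a "GMT+00" zone.
func parseGMTZero() (time.Time, error) {
	return time.Parse("2006-01-02 15:04:05 MST", "2009-01-02 03:04:05 GMT+00")
}

func main() {
	t1, err := parseNumericZone()
	fmt.Println(t1, err)

	t2, err := parseGMTZero()
	fmt.Println(t2.Unix(), err)
}
```

On a toolchain that predates this change, both calls would be expected to fail with the "extra text" error shown earlier.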
      
      Updates #26032
      
      Change-Id: I01cd067044371322b7bb1dae452fb3c758ed3cc2
      Reviewed-on: https://go-review.googlesource.com/130696
      Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      38143bad
    • Tobias Klauser's avatar
      syscall, os: use pipe2 syscall on DragonflyBSD instead of pipe · e6c15945
      Tobias Klauser authored
      Follow the implementation used by the other BSDs, with os.Pipe and
      syscall.forkExecPipe consisting of a single syscall instead of three.
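From the caller's side nothing changes: os.Pipe keeps the same signature whichever syscalls back it. The sketch below only exercises that public API; the helper name roundTrip is hypothetical, and the claim that the three-syscall variant was pipe followed by marking each descriptor close-on-exec is an assumption based on how the generic BSD fallback is usually written.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// roundTrip sends msg through an os.Pipe and returns what the read end saw.
// The message must be small enough to fit in the pipe buffer, because the
// write happens before anything reads.
func roundTrip(msg string) (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()

	if _, err := w.WriteString(msg); err != nil {
		w.Close()
		return "", err
	}
	// Close the write end so that ReadAll sees EOF.
	if err := w.Close(); err != nil {
		return "", err
	}
	b, err := io.ReadAll(r)
	return string(b), err
}

func main() {
	got, err := roundTrip("hello")
	fmt.Println(got, err)
}
```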
      
      Change-Id: I602187672f244cbd8faaa3397904d71d15452d9f
      Reviewed-on: https://go-review.googlesource.com/130996
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      e6c15945
    • Martin Möhrmann's avatar
      runtime: move arm hardware division support detection to internal/cpu · 2e8c31b3
      Martin Möhrmann authored
      Assumes mandatory VFP and VFPv3 support to be present by default
      but not IDIVA if AT_HWCAP is not available.
      
      Adds GODEBUGCPU options to disable the use of code paths in the runtime
      that use hardware support for division.
      
      Change-Id: Ida02311bd9b9701de3fc120697e69445bf6c0853
      Reviewed-on: https://go-review.googlesource.com/114826
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      2e8c31b3
    • Martin Möhrmann's avatar
      runtime: do not execute write barrier on newly allocated slice in growslice · 4363c98f
      Martin Möhrmann authored
      The new slice created in growslice is cleared during malloc for
      element types containing pointers and therefore can only contain
      nil pointers. This change avoids executing write barriers for these
      nil pointers by adding and using a special bulkBarrierPreWriteSrcOnly
      function that does not enqueue pointers to slots in dst to the write
      barrier buffer.
      
      Change-Id: If9b18248bfeeb6a874b0132d19520adea593bfc4
      Reviewed-on: https://go-review.googlesource.com/115996
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      4363c98f
    • Martin Möhrmann's avatar
      runtime: replace typedmemmove with bulkBarrierPreWrite and memmove in growslice · 96dcc445
      Martin Möhrmann authored
      A bulkBarrierPreWrite together with a memmove as used in typedslicecopy
      is faster than a typedmemmove for each element of the old slice that
      needs to be copied to the new slice.
      
      typedslicecopy is not used here as runtime functions should not call
      other instrumented runtime functions and some conditions like dst == src
      or the destination slice not being large enough that are checked for
      in typedslicecopy cannot happen in growslice.
      
      name                           old time/op  new time/op  delta
      Append                         13.5ns ± 6%  13.3ns ± 3%     ~     (p=0.304 n=10+10)
      AppendGrowByte                 1.18ms ± 2%  1.19ms ± 1%     ~     (p=0.113 n=10+9)
      AppendGrowString                123ms ± 1%    73ms ± 1%  -40.39%  (p=0.000 n=9+8)
      AppendSlice/1Bytes             3.81ns ± 1%  3.78ns ± 1%     ~     (p=0.116 n=10+10)
      AppendSlice/4Bytes             3.71ns ± 1%  3.70ns ± 0%     ~     (p=0.095 n=10+9)
      AppendSlice/7Bytes             3.73ns ± 0%  3.75ns ± 1%     ~     (p=0.442 n=10+10)
      AppendSlice/8Bytes             4.00ns ± 1%  4.01ns ± 1%     ~     (p=0.330 n=10+10)
      AppendSlice/15Bytes            4.29ns ± 1%  4.28ns ± 1%     ~     (p=0.536 n=10+10)
      AppendSlice/16Bytes            4.28ns ± 1%  4.31ns ± 1%   +0.75%  (p=0.019 n=10+10)
      AppendSlice/32Bytes            4.57ns ± 2%  4.58ns ± 2%     ~     (p=0.236 n=10+10)
      AppendSliceLarge/1024Bytes      305ns ± 2%   306ns ± 1%     ~     (p=0.236 n=10+10)
      AppendSliceLarge/4096Bytes     1.06µs ± 1%  1.06µs ± 0%     ~     (p=1.000 n=9+10)
      AppendSliceLarge/16384Bytes    3.12µs ± 2%  3.11µs ± 1%     ~     (p=0.493 n=10+10)
      AppendSliceLarge/65536Bytes    5.61µs ± 5%  5.36µs ± 2%   -4.58%  (p=0.003 n=10+8)
      AppendSliceLarge/262144Bytes   21.0µs ± 1%  19.5µs ± 1%   -7.09%  (p=0.000 n=8+10)
      AppendSliceLarge/1048576Bytes  78.4µs ± 1%  78.7µs ± 2%     ~     (p=0.315 n=8+10)
      AppendStr/1Bytes               3.96ns ± 6%  3.99ns ± 9%     ~     (p=0.591 n=10+10)
      AppendStr/4Bytes               3.98ns ± 1%  3.99ns ± 1%     ~     (p=0.515 n=9+9)
      AppendStr/8Bytes               4.27ns ± 1%  4.27ns ± 1%     ~     (p=0.633 n=10+10)
      AppendStr/16Bytes              4.56ns ± 2%  4.55ns ± 1%     ~     (p=0.869 n=10+10)
      AppendStr/32Bytes              4.85ns ± 1%  4.89ns ± 1%   +0.71%  (p=0.003 n=10+8)
      AppendSpecialCase              18.7ns ± 1%  18.7ns ± 1%     ~     (p=0.144 n=10+10)
      AppendInPlace/NoGrow/Byte       438ns ± 1%   439ns ± 1%     ~     (p=0.135 n=10+8)
      AppendInPlace/NoGrow/1Ptr      1.05µs ± 2%  1.05µs ± 1%     ~     (p=0.469 n=10+10)
      AppendInPlace/NoGrow/2Ptr      1.77µs ± 1%  1.78µs ± 2%     ~     (p=0.469 n=10+10)
      AppendInPlace/NoGrow/3Ptr      1.94µs ± 1%  1.93µs ± 2%     ~     (p=0.517 n=10+10)
      AppendInPlace/NoGrow/4Ptr      3.18µs ± 1%  3.17µs ± 0%     ~     (p=0.483 n=10+9)
      AppendInPlace/Grow/Byte         382ns ± 2%   383ns ± 2%     ~     (p=0.705 n=9+10)
      AppendInPlace/Grow/1Ptr         383ns ± 1%   384ns ± 1%     ~     (p=0.844 n=10+10)
      AppendInPlace/Grow/2Ptr         459ns ± 2%   467ns ± 2%   +1.74%  (p=0.001 n=10+10)
      AppendInPlace/Grow/3Ptr         593ns ± 1%   597ns ± 2%     ~     (p=0.195 n=10+10)
      AppendInPlace/Grow/4Ptr         583ns ± 2%   589ns ± 2%     ~     (p=0.084 n=10+10)
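The largest change in the table is AppendGrowString, where a slice whose elements contain pointers is grown repeatedly. The sketch below is not the runtime's benchmark; it is a small assumed workload of the same shape that counts how often append has to move to a larger backing array, which is when the old elements are copied into the new one.

```go
package main

import "fmt"

// countGrowths appends n copies of s to an empty slice. It returns the final
// length and the number of appends that changed the slice's capacity, i.e.
// the appends that had to allocate a new backing array and copy into it.
func countGrowths(n int, s string) (length, growths int) {
	var x []string
	for i := 0; i < n; i++ {
		before := cap(x)
		x = append(x, s)
		if cap(x) != before {
			growths++
		}
	}
	return len(x), growths
}

func main() {
	length, growths := countGrowths(1<<20, "")
	fmt.Println("length: ", length)
	fmt.Println("growths:", growths)
}
```

The exact number of growths depends on the growth policy of the Go release in use, so no particular value should be relied on.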
      
      Change-Id: I629872f065a22b29267c1adbfc578aaedd36d365
      Reviewed-on: https://go-review.googlesource.com/115755
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      96dcc445