1. 06 Apr, 2018 16 commits
  2. 05 Apr, 2018 19 commits
    • cmd/compile: rewrite a & 1 != 1 into a & 1 == 0 on amd64 · 3f483e65
      Josh Bleecher Snyder authored
      These rules trigger 190 times during make.bash.
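
As a quick sanity check of why this rewrite is safe: a & 1 can only be 0 or 1, so "not equal to 1" and "equal to 0" are the same predicate. The sketch below uses hypothetical helper names (lowBitClearOld, lowBitClearNew) in plain Go; it does not reproduce the compiler's actual SSA rewrite rules, which are not shown here.

```go
package main

import "fmt"

// lowBitClearOld is the form before the rewrite: a&1 != 1.
// (Hypothetical helper, for illustration only.)
func lowBitClearOld(a uint64) bool { return a&1 != 1 }

// lowBitClearNew is the rewritten form: a&1 == 0.
// (Hypothetical helper, for illustration only.)
func lowBitClearNew(a uint64) bool { return a&1 == 0 }

func main() {
	// Both forms should agree on every input, since a&1 is either 0 or 1.
	for _, a := range []uint64{0, 1, 2, 3, 1 << 63, ^uint64(0)} {
		fmt.Println(a, lowBitClearOld(a), lowBitClearNew(a))
	}
}
```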
      
      Change-Id: I20d1688db5d8c904a7237c08635c6c9d8bd58b1c
      Reviewed-on: https://go-review.googlesource.com/105037
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Giovanni Bajo <rasky@develer.com>
    • math/big: clean up z.div(z, x, y) calls · 7818b82f
      Brian Kessler authored
      Updates #22830
      
      Because divLarge did not check whether its output slices alias,
      calls of the form z.div(z, x, y) caused the slice z to be used
      to store both the quotient and the remainder of the
      division.  CL 78995 applies an alias
      check to correct that error.  This CL cleans up the
      additional div calls that attempt to supply the same slice
      to hold both the quotient and remainder.
      
      Note that the call in expNN was responsible for the reported
      error in r.Exp(x, 1, m) when r was initialized to a non-zero value.
      
      The second instance in expNNMontgomery did not result in an error
      due to the size of the arguments.
      
      	// RR = 2**(2*_W*len(m)) mod m
      	RR := nat(nil).setWord(1)
      	zz := nat(nil).shl(RR, uint(2*numWords*_W))
      	_, RR = RR.div(RR, zz, m)
      
      Specifically,
      
      cap(RR) == 5 after setWord(1) due to const e = 4 in z.make(1)
      len(zz) == 2*len(m) + 1 after shifting left, numWords = len(m)
      
      Reusing the backing array for z and z2 in div was only triggered if
      cap(RR) >= len(zz) + 1 and len(m) > 1 so that divLarge was called.
      
      But, 5 < 2*len(m) + 2 if len(m) > 1, so new arrays were allocated
      and the error was never triggered in this case.
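
The inequality above can be checked mechanically. This is only a sketch of the arithmetic in this message, with hypothetical names (capAfterSetWord, lenZZ, wouldAlias); it does not call into math/big internals.

```go
package main

import "fmt"

// capAfterSetWord is cap(RR) after setWord(1), as stated above
// (length 1 plus the extra capacity e = 4).
const capAfterSetWord = 5

// lenZZ returns len(zz) after the left shift, where numWords = len(m).
func lenZZ(numWords int) int { return 2*numWords + 1 }

// wouldAlias reports whether the conditions stated above for reusing the
// backing array hold: cap(RR) >= len(zz)+1 and len(m) > 1.
func wouldAlias(numWords int) bool {
	return capAfterSetWord >= lenZZ(numWords)+1 && numWords > 1
}

func main() {
	for n := 1; n <= 4; n++ {
		fmt.Printf("len(m)=%d len(zz)+1=%d alias=%v\n", n, lenZZ(n)+1, wouldAlias(n))
	}
}
```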
      
      Change-Id: Iedac80dbbde13216c94659e84d28f6f4be3aaf24
      Reviewed-on: https://go-review.googlesource.com/81055
      Run-TryBot: Robert Griesemer <gri@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • cmd/compile: cleanup method symbol creation · 638f112d
      Matthew Dempsky authored
      There were multiple ad hoc ways to create method symbols, with subtle
      and confusing differences between them. This CL unifies them into a
      single well-documented encoding and implementation.
      
      This introduces some inconsequential changes to symbol format for the
      sake of simplicity and consistency. Two notable changes:
      
      1) Symbol construction is now insensitive to the package currently
      being compiled. Previously, non-exported methods on anonymous types
      received different method symbols depending on whether the method was
      local or imported.
      
      2) Symbols for method values parenthesized non-pointer receiver types
      and non-exported method names, and also always package-qualified
      non-exported method names. Now they use the same rules as normal
      method symbols.
      
      The methodSym function is also now stricter about rejecting
      nonsensical method/receiver combinations. Notably, this means that
      typecheckfunc needs to call addmethod to validate the method before
      calling declare, which also means we no longer emit errors about
      redeclaring bogus methods.
      
      Change-Id: I9501c7a53dd70ef60e5c74603974e5ecc06e2003
      Reviewed-on: https://go-review.googlesource.com/104876
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • runtime: avoid calling adjustpointers unnecessarily · 2e7e5777
      Josh Bleecher Snyder authored
      adjustpointers loops over a bitmap.
      If the length of that bitmap is zero,
      we can skip making the call entirely.
      This speeds up stack copying when there are
      no pointers present in either args or locals.
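
The shape of this optimization can be sketched outside the runtime. Everything below (the bitvector layout, adjustPointers, adjustFrame, the calls counter) is a hypothetical stand-in rather than the runtime's own code; it only illustrates guarding a bitmap loop with a length check at the call site.

```go
package main

import "fmt"

// bitvector is a hypothetical stand-in for a pointer bitmap:
// n is the number of bits, data holds them packed eight per byte.
type bitvector struct {
	n    int
	data []byte
}

// calls counts how often adjustPointers is entered, for illustration.
var calls int

// adjustPointers loops over the bitmap and visits each set bit.
func adjustPointers(bv bitvector, visit func(i int)) {
	calls++
	for i := 0; i < bv.n; i++ {
		if bv.data[i/8]>>(uint(i)%8)&1 != 0 {
			visit(i)
		}
	}
}

// adjustFrame skips the call entirely when the bitmap is empty.
func adjustFrame(bv bitvector, visit func(i int)) {
	if bv.n > 0 {
		adjustPointers(bv, visit)
	}
}

func main() {
	adjustFrame(bitvector{}, func(int) {})
	adjustFrame(bitvector{n: 3, data: []byte{0x5}}, func(i int) {
		fmt.Println("pointer slot", i)
	})
	fmt.Println("calls:", calls)
}
```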
      
      name                old time/op  new time/op  delta
      StackCopyPtr-8       101ms ± 4%    90ms ± 4%  -10.95%  (p=0.000 n=87+93)
      StackCopy-8         80.1ms ± 4%  72.6ms ± 4%   -9.41%  (p=0.000 n=98+100)
      StackCopyNoCache-8   121ms ± 3%   113ms ± 3%   -6.57%  (p=0.000 n=98+97)
      
      Change-Id: I7a272e19bc9a14fa3e3318771ebd082dc6247d25
      Reviewed-on: https://go-review.googlesource.com/104737
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
    • cmd/compile/internal/gc: factor out beginning of SSAGenState.Call · 533fdfd0
      Richard Musiol authored
      This commit does not change the semantics of the Call method. Its
      purpose is to avoid duplication of code by making PrepareCall available
      for separate use by the wasm backend.
      
      Updates #18892
      
      Change-Id: I04a3098f56ebf0d995791c5375dd4c03b6a202a3
      Reviewed-on: https://go-review.googlesource.com/103275
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
    • cmd/trace: include taskless spans in /usertasks. · 8e351ae3
      Hana Kim authored
      Change-Id: Id4e3407ba497a018d5ace92813ba8e9653d0ac7d
      Reviewed-on: https://go-review.googlesource.com/104976
      Reviewed-by: Heschi Kreinick <heschi@google.com>
      Run-TryBot: Heschi Kreinick <heschi@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • math/big: improve performance on ppc64x by unrolling loops · fc8967e3
      Carlos Eduardo Seo authored
      This change improves performance of addVV, subVV and mulAddVWW
      by unrolling the loops, with improvements up to 1.45x.
      
      benchmark                    old ns/op     new ns/op     delta
      BenchmarkAddVV/1-16          5.79          5.85          +1.04%
      BenchmarkAddVV/2-16          6.41          6.62          +3.28%
      BenchmarkAddVV/3-16          6.89          7.35          +6.68%
      BenchmarkAddVV/4-16          7.47          8.26          +10.58%
      BenchmarkAddVV/5-16          8.04          8.18          +1.74%
      BenchmarkAddVV/10-16         10.9          11.2          +2.75%
      BenchmarkAddVV/100-16        81.7          57.0          -30.23%
      BenchmarkAddVV/1000-16       714           500           -29.97%
      BenchmarkAddVV/10000-16      7088          4946          -30.22%
      BenchmarkAddVV/100000-16     71514         49364         -30.97%
      BenchmarkSubVV/1-16          5.94          5.89          -0.84%
      BenchmarkSubVV/2-16          12.9          6.82          -47.13%
      BenchmarkSubVV/3-16          7.03          7.34          +4.41%
      BenchmarkSubVV/4-16          7.58          8.23          +8.58%
      BenchmarkSubVV/5-16          8.15          8.19          +0.49%
      BenchmarkSubVV/10-16         11.2          11.4          +1.79%
      BenchmarkSubVV/100-16        82.4          57.0          -30.83%
      BenchmarkSubVV/1000-16       715           499           -30.21%
      BenchmarkSubVV/10000-16      7089          4947          -30.22%
      BenchmarkSubVV/100000-16     71568         49378         -31.01%
      
      benchmark                    old MB/s     new MB/s      speedup
      BenchmarkAddVV/1-16          11048.49     10939.92      0.99x
      BenchmarkAddVV/2-16          19973.41     19323.60      0.97x
      BenchmarkAddVV/3-16          27847.09     26123.06      0.94x
      BenchmarkAddVV/4-16          34276.46     30976.54      0.90x
      BenchmarkAddVV/5-16          39781.92     39140.68      0.98x
      BenchmarkAddVV/10-16         58559.29     56894.68      0.97x
      BenchmarkAddVV/100-16        78354.88     112243.69     1.43x
      BenchmarkAddVV/1000-16       89592.74     127889.04     1.43x
      BenchmarkAddVV/10000-16      90292.39     129387.06     1.43x
      BenchmarkAddVV/100000-16     89492.92     129647.78     1.45x
      BenchmarkSubVV/1-16          10781.03     10861.22      1.01x
      BenchmarkSubVV/2-16          9949.27      18760.21      1.89x
      BenchmarkSubVV/3-16          27319.40     26166.01      0.96x
      BenchmarkSubVV/4-16          33764.35     31123.02      0.92x
      BenchmarkSubVV/5-16          39272.40     39050.31      0.99x
      BenchmarkSubVV/10-16         57262.87     56206.33      0.98x
      BenchmarkSubVV/100-16        77641.78     112280.86     1.45x
      BenchmarkSubVV/1000-16       89486.27     128064.08     1.43x
      BenchmarkSubVV/10000-16      90274.37     129356.59     1.43x
      BenchmarkSubVV/100000-16     89424.42     129610.50     1.45x
      
      Change-Id: I2795a82134d1e3b75e2634c76b8ca165a723ec7b
      Reviewed-on: https://go-review.googlesource.com/103495
      Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
    • runtime: fix/improve exitThread on openbsd · a7bb8d3e
      Joel Sing authored
      OpenBSD's __threxit syscall takes a pointer to a 32-bit value that will be
      zeroed immediately before the thread exits. Make use of this instead of
      zeroing freeWait from the exitThread assembly and using hacks like switching
      to a static stack, so this works on 386.
      
      Change-Id: I3ec5ead82b6496404834d148f713794d5d9da723
      Reviewed-on: https://go-review.googlesource.com/105055
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile/internal/ssa: fix GO386=387 build · d3026dd3
      Ilya Tocar authored
      Don't generate FP ops with 1 operand in memory for 387.
      
      Change-Id: I23b49dfa2a1e60c8778c920230e64785a3ddfbd1
      Reviewed-on: https://go-review.googlesource.com/105035
      Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: drop legacy code for generating iface wrappers · 6703adde
      Matthew Dempsky authored
      Originally, scalar values were directly stored within interface values
      as long as they fit into a pointer-sized slot of memory. And since
      interface method calls always pass the full pointer-sized value as the
      receiver argument, value-narrowing wrappers were necessary to adapt to
      the calling convention for methods with smaller receiver types.
      
      However, for precise garbage collection, we now only store actual
      pointers within interface values, so these wrappers are no longer
      necessary.
      
      Passes toolstash-check.
      
      Change-Id: I5303bfeb8d0f11db619b5a5d06b37ac898588670
      Reviewed-on: https://go-review.googlesource.com/104875
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
    • cmd/internal/obj: various code cleanups · e8aa9a53
      Daniel Martí authored
      Mostly replacing C-Style loops with range expressions, but also other
      simplifications like the introduction of writeBool and unindenting some
      code.
      
      Passes toolstash -cmp on std cmd.
      
      Change-Id: I799bccd4e5d411428dcf122b8588a564a9217e7c
      Reviewed-on: https://go-review.googlesource.com/104936
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
      Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
    • cmd/internal/obj/x86: cleanup comments and consts · 47427d67
      isharipo authored
      - Unexport MaxLoopPad and LoopAlign; associated comments updated
      - Remove commented-out C code
      - Replace C-style /**/ code comments with single-line comments
      
      Change-Id: I51bd92a05b4d3823757b12efd798951c9f252bd4
      Reviewed-on: https://go-review.googlesource.com/104795
      Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/internal/obj/x86: remove unused VEX constants · cefbf302
      isharipo authored
      VEX constants were used when instructions were added by hand.
      Now all VEX-encoded instructions are auto-generated by x86avxgen,
      so there is no need for those anymore.
      
      Change-Id: Ida63e5e23a8b819b15f61ac98980dec45a21617c
      Reviewed-on: https://go-review.googlesource.com/104775
      Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
    • cmd/compile: optimize 386 binary operations with a memory operand · 20cf5c49
      Ben Shi authored
      Some integer/float binary operations of 386 can take a direct memory
      operand, which is more efficient than loading to a register.
      
      This CL implements the optimization by copying the similar solution
      from amd64. The go1 benchmark shows some improvements (excluding
      noise), especially in the Template test case.
      
      name                     old time/op    new time/op    delta
      BinaryTree17-4              3.42s ± 2%     3.40s ± 2%    ~     (p=0.069 n=38+39)
      Fannkuch11-4                3.48s ± 1%     3.53s ± 1%  +1.59%  (p=0.000 n=40+40)
      FmtFprintfEmpty-4          46.7ns ± 4%    46.3ns ± 3%  -1.03%  (p=0.001 n=40+40)
      FmtFprintfString-4         80.1ns ± 3%    80.6ns ± 3%  +0.56%  (p=0.029 n=40+40)
      FmtFprintfInt-4            92.4ns ± 2%    92.3ns ± 3%    ~     (p=0.847 n=40+40)
      FmtFprintfIntInt-4          147ns ± 3%     144ns ± 3%  -1.87%  (p=0.000 n=40+40)
      FmtFprintfPrefixedInt-4     182ns ± 2%     184ns ± 3%  +0.99%  (p=0.002 n=40+40)
      FmtFprintfFloat-4           387ns ± 3%     384ns ± 3%    ~     (p=0.069 n=40+40)
      FmtManyArgs-4               619ns ± 3%     616ns ± 3%    ~     (p=0.320 n=40+40)
      GobDecode-4                7.28ms ± 6%    7.27ms ± 5%    ~     (p=0.897 n=40+40)
      GobEncode-4                7.33ms ± 6%    7.21ms ± 6%  -1.56%  (p=0.022 n=38+40)
      Gzip-4                      357ms ± 4%     357ms ± 4%    ~     (p=0.071 n=40+40)
      Gunzip-4                   45.3ms ± 3%    45.4ms ± 3%    ~     (p=0.452 n=40+40)
      HTTPClientServer-4         63.0µs ± 2%    62.9µs ± 3%    ~     (p=0.760 n=38+39)
      JSONEncode-4               22.0ms ± 4%    21.7ms ± 4%  -1.49%  (p=0.000 n=40+40)
      JSONDecode-4               67.7ms ± 4%    68.3ms ± 3%  +0.86%  (p=0.039 n=40+40)
      Mandelbrot200-4            5.16ms ± 3%    5.17ms ± 3%    ~     (p=0.418 n=40+40)
      GoParse-4                  3.30ms ± 2%    3.32ms ± 3%  +0.55%  (p=0.017 n=40+40)
      RegexpMatchEasy0_32-4       104ns ± 3%     104ns ± 4%    ~     (p=0.992 n=40+40)
      RegexpMatchEasy0_1K-4       852ns ± 3%     851ns ± 2%    ~     (p=0.344 n=40+40)
      RegexpMatchEasy1_32-4       113ns ± 4%     113ns ± 5%    ~     (p=0.937 n=40+40)
      RegexpMatchEasy1_1K-4      1.03µs ± 5%    1.04µs ± 4%    ~     (p=0.430 n=40+40)
      RegexpMatchMedium_32-4      132ns ± 4%     131ns ± 3%  -1.06%  (p=0.027 n=40+40)
      RegexpMatchMedium_1K-4     43.0µs ± 3%    43.2µs ± 3%    ~     (p=0.122 n=40+40)
      RegexpMatchHard_32-4       2.21µs ± 4%    2.20µs ± 4%    ~     (p=0.146 n=40+40)
      RegexpMatchHard_1K-4       67.1µs ± 4%    67.2µs ± 3%    ~     (p=0.859 n=40+40)
      Revcomp-4                   1.85s ± 2%     1.85s ± 3%    ~     (p=0.184 n=40+40)
      Template-4                 70.1ms ± 4%    67.5ms ± 3%  -3.65%  (p=0.000 n=40+40)
      TimeParse-4                 457ns ±16%     439ns ± 4%    ~     (p=0.683 n=40+34)
      TimeFormat-4                413ns ± 3%     414ns ± 3%    ~     (p=0.850 n=40+40)
      [Geo mean]                 67.5µs         67.3µs       -0.38%
      
      name                     old speed      new speed      delta
      GobDecode-4               105MB/s ± 6%   106MB/s ± 5%    ~     (p=0.893 n=40+40)
      GobEncode-4               105MB/s ± 6%   107MB/s ± 7%  +1.60%  (p=0.023 n=38+40)
      Gzip-4                   54.4MB/s ± 4%  54.5MB/s ± 4%    ~     (p=0.073 n=40+40)
      Gunzip-4                  429MB/s ± 3%   428MB/s ± 3%    ~     (p=0.453 n=40+40)
      JSONEncode-4             88.3MB/s ± 5%  89.6MB/s ± 4%  +1.51%  (p=0.000 n=40+40)
      JSONDecode-4             28.7MB/s ± 4%  28.4MB/s ± 3%  -0.87%  (p=0.039 n=40+40)
      GoParse-4                17.6MB/s ± 3%  17.5MB/s ± 3%  -0.55%  (p=0.020 n=40+40)
      RegexpMatchEasy0_32-4     308MB/s ± 4%   308MB/s ± 5%    ~     (p=0.988 n=40+40)
      RegexpMatchEasy0_1K-4    1.20GB/s ± 3%  1.20GB/s ± 2%    ~     (p=0.329 n=40+40)
      RegexpMatchEasy1_32-4     283MB/s ± 4%   283MB/s ± 4%    ~     (p=0.507 n=40+40)
      RegexpMatchEasy1_1K-4     991MB/s ± 5%   987MB/s ± 4%    ~     (p=0.446 n=40+40)
      RegexpMatchMedium_32-4   7.54MB/s ± 4%  7.63MB/s ± 3%  +1.26%  (p=0.004 n=40+40)
      RegexpMatchMedium_1K-4   23.8MB/s ± 3%  23.7MB/s ± 4%    ~     (p=0.121 n=40+40)
      RegexpMatchHard_32-4     14.5MB/s ± 4%  14.6MB/s ± 4%    ~     (p=0.145 n=40+40)
      RegexpMatchHard_1K-4     15.3MB/s ± 4%  15.2MB/s ± 3%    ~     (p=0.874 n=40+40)
      Revcomp-4                 137MB/s ± 2%   137MB/s ± 3%    ~     (p=0.179 n=40+40)
      Template-4               27.7MB/s ± 4%  28.7MB/s ± 3%  +3.78%  (p=0.000 n=40+40)
      [Geo mean]               78.9MB/s       79.2MB/s       +0.38%
      
      Change-Id: I3ba688c253b665485c1ebdf5a75f4ce82cc3def3
      Reviewed-on: https://go-review.googlesource.com/102036
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/internal/obj/x86: make AsmBuf receiver name consistent · eef42f92
      isharipo authored
      Fixes golint receiver name complaints.
      
      We can't use the name "a" because it is sometimes used for obj.Addr args.
      
      Change-Id: I66556f4e3dc42cfaaa4db3ed7772fa6756ea9a9b
      Reviewed-on: https://go-review.googlesource.com/104796
      Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
    • cmd/compile: early return/continue to unindent some code · fcfea247
      Daniel Martí authored
      While at it, also simplify a couple of switches.
      
      Doesn't pass toolstash -cmp on std cmd, because orderBlock(&n2.Nbody) is
      moved further down to the n3 loop.
      
      Change-Id: I20a2a6c21eb9a183a59572e0fca401a5041fc40a
      Reviewed-on: https://go-review.googlesource.com/104416
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • test: skip locklinear's lockmany test for now · 97677273
      Daniel Martí authored
      Since it's been reliably failing on one of the linux-arm builders
      (arm5spacemonkey) for a long time.
      
      Updates #24221.
      
      Change-Id: I8fccc7e16631de497ccc2c285e510a110a93ad95
      Reviewed-on: https://go-review.googlesource.com/104535
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: stack-allocate worklist in ReachableBlocks · 9357bb9e
      Alberto Donizetti authored
      Stack-allocate a local worklist in the deadcode pass. A size of 64 for
      the pre-allocation is enough for >99% of the ReachableBlocks calls in
      a typical package.
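
The pattern can be sketched with a toy graph. The Block type and the reachable function below are hypothetical simplifications, not the compiler's code; the point is that a worklist created with a small constant capacity, and never stored anywhere that outlives the function, can typically have its initial backing array kept on the stack by escape analysis.

```go
package main

import "fmt"

// Block is a hypothetical, minimal stand-in for an SSA basic block.
type Block struct {
	ID    int
	Succs []*Block
}

// reachable marks every block reachable from entry. numBlocks must be
// larger than any block ID in the graph.
func reachable(entry *Block, numBlocks int) []bool {
	seen := make([]bool, numBlocks)
	seen[entry.ID] = true

	// Constant capacity and no escaping references: the initial backing
	// array can usually live on the stack. Growth beyond 64 entries
	// falls back to a heap allocation via append.
	worklist := make([]*Block, 0, 64)
	worklist = append(worklist, entry)

	for len(worklist) > 0 {
		b := worklist[len(worklist)-1]
		worklist = worklist[:len(worklist)-1]
		for _, s := range b.Succs {
			if !seen[s.ID] {
				seen[s.ID] = true
				worklist = append(worklist, s)
			}
		}
	}
	return seen
}

func main() {
	b2 := &Block{ID: 2}
	b1 := &Block{ID: 1, Succs: []*Block{b2}}
	b0 := &Block{ID: 0, Succs: []*Block{b1, b2}}
	// Block 3 exists but is never referenced from the entry block.
	fmt.Println(reachable(b0, 4))
}
```

Whether the array really stays on the stack depends on the compiler version; running go build -gcflags=-m on such a function is one way to see the escape analysis decision.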
      
      name      old time/op       new time/op       delta
      Template        281ms ± 3%        278ms ± 2%  -1.03%  (p=0.049 n=20+20)
      Unicode         135ms ± 6%        134ms ± 6%    ~     (p=0.273 n=18+17)
      GoTypes         882ms ± 3%        880ms ± 2%    ~     (p=0.925 n=20+20)
      Compiler        4.01s ± 1%        4.02s ± 2%    ~     (p=0.640 n=20+20)
      SSA             9.61s ± 1%        9.75s ± 1%  +1.39%  (p=0.000 n=20+19)
      Flate           186ms ± 5%        185ms ± 7%    ~     (p=0.758 n=20+20)
      GoParser        219ms ± 5%        218ms ± 4%    ~     (p=0.149 n=20+20)
      Reflect         568ms ± 4%        562ms ± 1%    ~     (p=0.154 n=19+19)
      Tar             258ms ± 2%        257ms ± 3%    ~     (p=0.428 n=19+20)
      XML             316ms ± 2%        317ms ± 3%    ~     (p=0.901 n=20+19)
      
      name      old user-time/op  new user-time/op  delta
      Template        398ms ± 6%        388ms ± 6%  -2.55%  (p=0.007 n=20+20)
      Unicode         217ms ± 5%        213ms ± 6%  -1.90%  (p=0.036 n=17+20)
      GoTypes         1.21s ± 3%        1.20s ± 3%  -0.89%  (p=0.022 n=19+20)
      Compiler        5.56s ± 3%        5.53s ± 5%    ~     (p=0.779 n=20+20)
      SSA             13.9s ± 5%        14.0s ± 4%    ~     (p=0.529 n=20+20)
      Flate           248ms ±10%        252ms ± 4%    ~     (p=0.409 n=20+18)
      GoParser        305ms ± 4%        299ms ± 5%  -1.87%  (p=0.007 n=19+20)
      Reflect         754ms ± 2%        747ms ± 3%    ~     (p=0.107 n=20+19)
      Tar             360ms ± 5%        362ms ± 3%    ~     (p=0.534 n=20+18)
      XML             425ms ± 6%        429ms ± 4%    ~     (p=0.496 n=20+19)
      
      name      old alloc/op      new alloc/op      delta
      Template       38.8MB ± 0%       38.7MB ± 0%  -0.15%  (p=0.000 n=20+20)
      Unicode        29.1MB ± 0%       29.1MB ± 0%  -0.03%  (p=0.000 n=20+20)
      GoTypes         115MB ± 0%        115MB ± 0%  -0.13%  (p=0.000 n=20+20)
      Compiler        491MB ± 0%        490MB ± 0%  -0.15%  (p=0.000 n=18+19)
      SSA            1.40GB ± 0%       1.40GB ± 0%  -0.16%  (p=0.000 n=20+20)
      Flate          24.9MB ± 0%       24.8MB ± 0%  -0.17%  (p=0.000 n=20+20)
      GoParser       30.7MB ± 0%       30.6MB ± 0%  -0.16%  (p=0.000 n=20+20)
      Reflect        77.1MB ± 0%       77.0MB ± 0%  -0.11%  (p=0.000 n=19+20)
      Tar            39.0MB ± 0%       39.0MB ± 0%  -0.14%  (p=0.000 n=20+20)
      XML            44.6MB ± 0%       44.5MB ± 0%  -0.13%  (p=0.000 n=17+19)
      
      name      old allocs/op     new allocs/op     delta
      Template         379k ± 0%         378k ± 0%  -0.45%  (p=0.000 n=20+17)
      Unicode          336k ± 0%         336k ± 0%  -0.08%  (p=0.000 n=20+20)
      GoTypes         1.18M ± 0%        1.17M ± 0%  -0.37%  (p=0.000 n=20+20)
      Compiler        4.58M ± 0%        4.56M ± 0%  -0.38%  (p=0.000 n=20+20)
      SSA             11.4M ± 0%        11.4M ± 0%  -0.39%  (p=0.000 n=20+20)
      Flate            233k ± 0%         232k ± 0%  -0.51%  (p=0.000 n=20+20)
      GoParser         313k ± 0%         312k ± 0%  -0.48%  (p=0.000 n=19+20)
      Reflect          946k ± 0%         943k ± 0%  -0.31%  (p=0.000 n=20+20)
      Tar              388k ± 0%         387k ± 0%  -0.40%  (p=0.000 n=20+20)
      XML              411k ± 0%         409k ± 0%  -0.35%  (p=0.000 n=17+20)
      
      Change-Id: Iaec0b9471ded61be5eb3c9d1074e804672307644
      Reviewed-on: https://go-review.googlesource.com/104675
      Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
    • cmd/compile: extract inline related fields into separate Inline type · 562a1999
      Matthew Dempsky authored
      Inl, Inldcl, and InlCost are only applicable to functions with bodies
      that can be inlined, so pull them out into a separate Inline type to
      make understanding them easier.
      
      A side benefit is that we can check if a function can be inlined by
      just checking if n.Func.Inl is non-nil, which simplifies handling of
      empty function bodies.
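
A rough illustration of the nil-check idea follows. The Inline and Func types here are heavily simplified, hypothetical stand-ins (the field names Cost and Body are assumptions), not the compiler's definitions.

```go
package main

import "fmt"

// Inline is a hypothetical, simplified stand-in for a type that groups
// inlining data. A non-nil *Inline means "can be inlined".
type Inline struct {
	Cost int32
	Body []string // statements of the inlinable body; may be empty
}

// Func is a hypothetical, simplified function node.
type Func struct {
	Name string
	Inl  *Inline
}

// canInline needs only a nil check. It never inspects the body, so an
// empty but inlinable body needs no special case.
func canInline(f *Func) bool { return f.Inl != nil }

func main() {
	empty := &Func{Name: "noop", Inl: &Inline{Cost: 0}}
	opaque := &Func{Name: "external"}
	fmt.Println(canInline(empty), canInline(opaque))
}
```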
      
      While here, remove some unnecessary Curfn twiddling, and make imported
      functions use Inl.Dcl instead of Func.Dcl for consistency with local
      functions.
      
      Passes toolstash-check.
      
      Change-Id: Ifd4a80349d85d9e8e4484952b38ec4a63182e81f
      Reviewed-on: https://go-review.googlesource.com/104756
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
  3. 04 Apr, 2018 5 commits
    • text/tabwriter: remove internal use of bytes.Buffer (cleanup) · f2b5f750
      Robert Griesemer authored
      Noticed that we can simply use a []byte slice while investigating
      a separate issue. Did the obvious simplification.
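
For readers unfamiliar with the simplification, this is roughly what swapping a bytes.Buffer for a plain []byte looks like. The cellBuffer type and its method names are hypothetical and are not taken from text/tabwriter.

```go
package main

import "fmt"

// cellBuffer is a hypothetical type that collects text in a plain []byte
// instead of a bytes.Buffer.
type cellBuffer struct {
	buf []byte
}

// write appends raw bytes, like bytes.Buffer.Write.
func (b *cellBuffer) write(p []byte) { b.buf = append(b.buf, p...) }

// writeString appends a string, like bytes.Buffer.WriteString.
func (b *cellBuffer) writeString(s string) { b.buf = append(b.buf, s...) }

// reset empties the buffer but keeps the allocated capacity for reuse.
func (b *cellBuffer) reset() { b.buf = b.buf[:0] }

func main() {
	var b cellBuffer
	b.writeString("name\t")
	b.write([]byte("value"))
	fmt.Printf("%q len=%d\n", b.buf, len(b.buf))
	b.reset()
	fmt.Println("after reset:", len(b.buf))
}
```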
      
      Change-Id: I921ebbb42135b5f1a10109236ceb9ae6e94ae7e2
      Reviewed-on: https://go-review.googlesource.com/104757
      Run-TryBot: Robert Griesemer <gri@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: adjust is-statement on Pos's to improve debugging · b9a36568
      David Chase authored
      Stores to auto tmp variables can be hoisted to places
      where the line numbers make debugging look "jumpy".
      Turning those instructions into ones with is_stmt = 0 in
      the DWARF (accomplished by marking ssa nodes with NotStmt)
      makes debugging look better while still attributing the
      instructions with the correct line number.
      
      The same is true for certain register allocator spills and
      reloads.
      
      Change-Id: I97a394eb522d4911cc40b4bf5bf76d3d7221f6c0
      Reviewed-on: https://go-review.googlesource.com/98415
      Run-TryBot: David Chase <drchase@google.com>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/link: process is_stmt data into dwarf line tables · dead03b7
      David Chase authored
      To improve debugging, instructions should be annotated with
      DWARF is_stmt.  The DWARF default before was is_stmt=1, and
      to remove "jumpy" stepping the optimizer was tagging
      instructions with a no-position position, which interferes
      with the accuracy of profiling information.  This allows
      that to be corrected, and also allows more "jumpy" positions
      to be annotated with is_stmt=0 (these changes were not made
      for 1.10 because of worries about further messing up
      profiling).
      
      The is_stmt values are placed in a pc-encoded table and
      passed through a symbol derived from the name of the
      function and processed in the linker alongside its
      processing of each function's pc/line tables.
      
      The only change in binary size is in the .debug_line tables
      measured with "objdump -h --section=.debug_line go1.test"
      For go1.test, these are 2614 bytes larger,
      or 0.72% of the size of .debug_line,
      or 0.025% of the file size.
      
      This will increase in proportion to how much the is_stmt
      flag is used (toggled).
      
      Change-Id: Ic1f1aeccff44591ad0494d29e1a0202a3c506a7a
      Reviewed-on: https://go-review.googlesource.com/93664
      Run-TryBot: David Chase <drchase@google.com>
      Reviewed-by: Heschi Kreinick <heschi@google.com>
    • cmd/compile: add IsStmt breakpoint info to src.lico · 619679a3
      David Chase authored
      Add IsStmt information to src.lico so that suitable lines
      for breakpoints (or not) can be noted, eventually for
      communication to the debugger via the linker and DWARF.
      
      The expectation is that the front end will apply statement
      boundary marks because it has best information about the
      input, and the optimizer will attempt to preserve these.
      The exact method for placing these marks is still TBD;
      ideally stopping "at" line N in unoptimized code will occur
      at a point where none of the side effects of N have occurred
      and all of the inputs for line N can still be observed.
      The optimizer will work with the same markings supplied
      for unoptimized code.
      
      It is a goal that non-optimizing compilation should conserve
      statement marks.
      
      The optimizer will also use the not-a-statement annotation
      to indicate instructions that have a line number (for
      profiling purposes) but should not be the target of
      debugger step, next, or breakpoints.  Because instructions
      marked as statements are sometimes removed, a third value
      indicates that a position (instruction) can serve as a
      statement if the optimizer removes the current instruction
      marked as a statement for the same line.  The optimizer
      should attempt to conserve statement marks, but it is not
      a bug if some are lost.
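
One way to picture the three-valued mark is the sketch below. The names (stmtMark, defaultStmt, isStmt, notStmt, instr, remove) and the promotion policy are assumptions made for illustration; the message above leaves the exact placement method open, and this is not the compiler's implementation.

```go
package main

import "fmt"

// stmtMark is a hypothetical three-valued statement annotation.
type stmtMark uint8

const (
	defaultStmt stmtMark = iota // may take over if the marked statement is removed
	isStmt                      // a suitable place to stop for this line
	notStmt                     // has a line number, but never a stop target
)

// instr is a hypothetical instruction with a source line and a mark.
type instr struct {
	line int
	mark stmtMark
}

// remove returns a copy of instrs without element i. If the removed
// instruction carried the statement mark for its line, the first remaining
// defaultStmt instruction on the same line is promoted to isStmt.
// If no such instruction exists, the mark is simply lost.
func remove(instrs []instr, i int) []instr {
	removed := instrs[i]
	out := append(append([]instr{}, instrs[:i]...), instrs[i+1:]...)
	if removed.mark != isStmt {
		return out
	}
	for j := range out {
		if out[j].line == removed.line && out[j].mark == defaultStmt {
			out[j].mark = isStmt
			break
		}
	}
	return out
}

func main() {
	code := []instr{{10, isStmt}, {10, notStmt}, {10, defaultStmt}, {11, isStmt}}
	fmt.Println(remove(code, 0))
}
```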
      
      Includes changes to html output for GOSSAFUNC to indicate
      not-default is-a-statement with bold and not-a-statement
      with strikethrough.
      
      Change-Id: Ia22c9a682f276e2ca2a4ef7a85d4b6ebf9c62b7f
      Reviewed-on: https://go-review.googlesource.com/93663
      Run-TryBot: David Chase <drchase@google.com>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • go/printer, gofmt: tuned table alignment for better results · 542ea5ad
      Robert Griesemer authored
      The go/printer (and thus gofmt) uses a heuristic to determine
      whether to break alignment between elements of an expression
      list which is spread across multiple lines. The heuristic only
      kicked in if the entry size (character length) was above a
      certain threshold (20) and the ratio between the previous and
      current entry size was above a certain value (4).
      
      This heuristic worked reasonably most of the time, but also
      led to unfortunate breaks in many cases where a single entry
      was suddenly much smaller (or larger) than the previous one.
      
      The behavior of gofmt was sufficiently mysterious in some of
      these situations that many issues were filed against it.
      
      The simplest solution to address this problem is to remove
      the heuristic altogether and have a programmer introduce
      empty lines to force different alignments if it improves
      readability. The problem with that approach is that the
      places where it really matters, very long tables with many
      (hundreds, or more) entries, may be machine-generated and
      not "post-processed" by a human (e.g., unicode/utf8/tables.go).
      
      If a single one of those entries is overlong, the result
      would be that the alignment would force all comments or
      values in key:value pairs to be adjusted to that overlong
      value, making the table hard to read (e.g., that entry may
      not even be visible on screen and all other entries seem
      spaced out too wide).
      
      Instead, we opted for a slightly improved heuristic that
      behaves much better for "normal", human-written code.
      
      1) The threshold is increased from 20 to 40. This disables
      the heuristic for many common cases yet even if the alignment
      is not "ideal", 40 is not that many characters per line with
      today's screens, making it very likely that the entire line
      remains "visible" in an editor.
      
      2) Changed the heuristic to not simply look at the size ratio
      between current and previous line, but instead to consider the
      geometric mean of the sizes of the previous (aligned) lines.
      This emphasizes the "overall picture" of the previous lines,
      rather than a single one (which might be an outlier).
      
      3) Changed the ratio from 4 to 2.5. Now that we ignore sizes
      below 40, a ratio of 4 would mean that a new entry would have
      to be 4 times bigger (160) or smaller (10) before alignment
      would be broken. A ratio of 2.5 seems more sensible.
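
The three points above can be combined into a small decision function. This is a simplified sketch, not go/printer's actual code: the function name breakAlignment, the constant names, and the exact handling of small entries are assumptions based only on the description in this message.

```go
package main

import (
	"fmt"
	"math"
)

const (
	smallSize = 40  // sizes at or below this do not trigger a break (point 1)
	maxRatio  = 2.5 // allowed ratio between an entry and the mean (point 3)
)

// breakAlignment reports whether an entry of the given size should start a
// new alignment section, given the sizes of the previously aligned entries.
// All sizes are assumed to be positive.
func breakAlignment(prev []int, size int) bool {
	if len(prev) == 0 {
		return false // nothing to compare against
	}
	if prev[len(prev)-1] <= smallSize && size <= smallSize {
		return false // both entries are small: keep the alignment
	}
	// Geometric mean of the previous sizes (point 2), computed via logs.
	lnsum := 0.0
	for _, s := range prev {
		lnsum += math.Log(float64(s))
	}
	geomean := math.Exp(lnsum / float64(len(prev)))
	ratio := float64(size) / geomean
	// Break if the entry is much smaller or much larger than the mean.
	return ratio*maxRatio <= 1 || ratio >= maxRatio
}

func main() {
	fmt.Println(breakAlignment([]int{50, 50, 50}, 50))  // similar size
	fmt.Println(breakAlignment([]int{50, 50, 50}, 200)) // four times larger
	fmt.Println(breakAlignment([]int{10, 12}, 20))      // everything is small
}
```

Using the geometric mean rather than the last entry alone means a single outlier among the previous lines shifts the reference size only slightly.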
      
      Applied updated gofmt to all of src and misc. Also tested
      against several former issues that complained about this
      and verified that the output for the given examples is
      satisfactory (added respective test cases).
      
      Some of the files changed because they were not gofmt-ed
      in the first place.
      
      For #644.
      For #7335.
      For #10392.
      (and probably more related issues)
      
      Fixes #22852.
      
      Change-Id: I5e48b3d3b157a5cf2d649833b7297b33f43a6f6e