1. 16 Oct, 2018 8 commits
    • Filippo Valsorda's avatar
      Revert "fmt: fix incorrect format of whole-number floats when using %#v" · a52289ef
      Filippo Valsorda authored
      Numbers without decimals are valid Go representations of whole-number
      floats. That is, "var x float64 = 5" is valid Go. Avoid breakage in
      tests that expect a certain output from %#v by reverting to it.
      
      To guarantee the right type is generated by a print use %T(%#v) instead.
      
      Added a test to lock in this behavior.
      
      This reverts commit 7c7cecc1.
      
      Fixes #27634
      Updates #26363
      
      Change-Id: I544c400a0903777dd216452a7e86dfe60b0b0283
      Reviewed-on: https://go-review.googlesource.com/c/142597
      Run-TryBot: Filippo Valsorda <filippo@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarRob Pike <r@golang.org>
      Reviewed-by: 's avatarBryan C. Mills <bcmills@google.com>
      a52289ef
    • Matthew Dempsky's avatar
      cmd/compile: remove -dolinkobj flag · 965fa3b1
      Matthew Dempsky authored
      This used to be used by cmd/vet and some assembly generation tests, but
      those were removed in CL 37691 and CL 107336. No point in keeping an
      unneeded flag around.
      
      Fixes #28220.
      
      Change-Id: I59f8546954ab36ea61ceba81c10d6e16d74b966a
      Reviewed-on: https://go-review.googlesource.com/c/142677
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      965fa3b1
    • Matthew Dempsky's avatar
      cmd/compile/internal/gc: simplify typechecking definitions · 62c52a5e
      Matthew Dempsky authored
      There are only a handful of nodes that we need to pass to
      typecheckdef (OLITERAL, ONAME, OTYPE, and ONONAME), but typecheck1
      takes the awkward approach of calling typecheckdef on every node with
      Sym != nil, and then excluding a long list of uninteresting Ops that
      have a non-nil Sym.
      
      Passes toolstash-check.
      
      Change-Id: I0271d2faff0208ad57ddc1f1a540a5fbed870234
      Reviewed-on: https://go-review.googlesource.com/c/142657
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      62c52a5e
    • Lynn Boger's avatar
      test/codegen: enable more tests for ppc64/ppc64le · 39fa301b
      Lynn Boger authored
      Adding cases for ppc64,ppc64le to the codegen tests
      where appropriate.
      
      Change-Id: Idf8cbe88a4ab4406a4ef1ea777bd15a58b68f3ed
      Reviewed-on: https://go-review.googlesource.com/c/142557
      Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      39fa301b
    • Daniel Martí's avatar
      encoding/json: always verify we can get a field's value · 4b36e129
      Daniel Martí authored
      Calling .Interface on a struct field's reflect.Value isn't always safe.
      For example, if that field is an unexported anonymous struct.
      
      We only descended into this branch if the struct type had any methods,
      so this bug had gone unnoticed for a few release cycles.
      
      Add the check, and add a simple test case.
      
      Fixes #28145.
      
      Change-Id: I02f7e0ab9a4a0c18a5e2164211922fe9c3d30f64
      Reviewed-on: https://go-review.googlesource.com/c/141537
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarIan Lance Taylor <iant@golang.org>
      4b36e129
    • Daniel Martí's avatar
      encoding/json: fix "data changed underfoot?" panic · 5eff6bfd
      Daniel Martí authored
      Given a program as follows:
      
      	data := []byte(`{"F": {
      		"a": 2,
      		"3": 4
      	}}`)
      	json.Unmarshal(data, &map[string]map[int]int{})
      
      The JSON package should error, as "a" is not a valid integer. However,
      we'd encounter a panic:
      
      	panic: JSON decoder out of sync - data changing underfoot?
      
      The reason was that decodeState.object would return a nil error on
      encountering the invalid map key string, while saving the key type error
      for later. This broke if we were inside another object, as we would
      abruptly end parsing the nested object, leaving the decoder in an
      unexpected state.
      
      To fix this, simply avoid storing the map element and continue decoding
      the object, to leave the decoder state exactly as if we hadn't seen an
      invalid key type.
      
      This affected both signed and unsigned integer keys, so fix both and add
      two test cases.
      
      Updates #28189.
      
      Change-Id: I8a6204cc3ff9fb04ed769df7a20a824c8b94faff
      Reviewed-on: https://go-review.googlesource.com/c/142518Reviewed-by: 's avatarIan Lance Taylor <iant@golang.org>
      5eff6bfd
    • Ben Shi's avatar
      cmd/compile: optimize 386's load/store combination · 4b78fe57
      Ben Shi authored
      This CL adds more combinations of two consequtive MOVBload/MOVBstore
      to a unique MOVWload/MOVWstore.
      
      1. The size of the go executable decreases about 4KB, and the total
      size of pkg/linux_386 (excluding cmd/compile) decreases about 1.5KB.
      
      2. There is no regression in the go1 benchmark result, excluding noise.
      name                     old time/op    new time/op    delta
      BinaryTree17-4              3.28s ± 2%     3.29s ± 2%    ~     (p=0.151 n=40+40)
      Fannkuch11-4                3.52s ± 1%     3.51s ± 1%  -0.28%  (p=0.002 n=40+40)
      FmtFprintfEmpty-4          45.4ns ± 4%    45.0ns ± 4%  -0.89%  (p=0.019 n=40+40)
      FmtFprintfString-4         81.9ns ± 7%    81.3ns ± 1%    ~     (p=0.660 n=40+25)
      FmtFprintfInt-4            91.9ns ± 9%    91.4ns ± 9%    ~     (p=0.249 n=40+40)
      FmtFprintfIntInt-4          143ns ± 4%     143ns ± 4%    ~     (p=0.760 n=40+40)
      FmtFprintfPrefixedInt-4     184ns ± 3%     183ns ± 4%    ~     (p=0.485 n=40+40)
      FmtFprintfFloat-4           408ns ± 3%     409ns ± 3%    ~     (p=0.961 n=40+40)
      FmtManyArgs-4               597ns ± 4%     602ns ± 3%    ~     (p=0.413 n=40+40)
      GobDecode-4                7.13ms ± 6%    7.14ms ± 6%    ~     (p=0.859 n=40+40)
      GobEncode-4                6.86ms ± 9%    6.94ms ± 7%    ~     (p=0.162 n=40+40)
      Gzip-4                      395ms ± 4%     396ms ± 3%    ~     (p=0.099 n=40+40)
      Gunzip-4                   40.9ms ± 4%    41.1ms ± 3%    ~     (p=0.064 n=40+40)
      HTTPClientServer-4         63.6µs ± 2%    63.6µs ± 3%    ~     (p=0.832 n=36+39)
      JSONEncode-4               16.1ms ± 3%    15.8ms ± 3%  -1.60%  (p=0.001 n=40+40)
      JSONDecode-4               61.0ms ± 3%    61.5ms ± 4%    ~     (p=0.065 n=40+40)
      Mandelbrot200-4            5.16ms ± 3%    5.18ms ± 3%    ~     (p=0.056 n=40+40)
      GoParse-4                  3.25ms ± 2%    3.23ms ± 3%    ~     (p=0.727 n=40+40)
      RegexpMatchEasy0_32-4      90.2ns ± 3%    89.3ns ± 6%  -0.98%  (p=0.002 n=40+40)
      RegexpMatchEasy0_1K-4       812ns ± 3%     815ns ± 3%    ~     (p=0.309 n=40+40)
      RegexpMatchEasy1_32-4       103ns ± 6%     103ns ± 5%    ~     (p=0.680 n=40+40)
      RegexpMatchEasy1_1K-4      1.01µs ± 4%    1.02µs ± 3%    ~     (p=0.326 n=40+33)
      RegexpMatchMedium_32-4      120ns ± 4%     120ns ± 5%    ~     (p=0.834 n=40+40)
      RegexpMatchMedium_1K-4     40.1µs ± 3%    39.5µs ± 4%  -1.35%  (p=0.000 n=40+40)
      RegexpMatchHard_32-4       2.27µs ± 6%    2.23µs ± 4%  -1.67%  (p=0.011 n=40+40)
      RegexpMatchHard_1K-4       67.2µs ± 3%    67.2µs ± 3%    ~     (p=0.149 n=40+40)
      Revcomp-4                   1.84s ± 2%     1.86s ± 3%  +0.70%  (p=0.020 n=40+40)
      Template-4                 69.0ms ± 4%    69.8ms ± 3%  +1.20%  (p=0.003 n=40+40)
      TimeParse-4                 438ns ± 3%     439ns ± 4%    ~     (p=0.650 n=40+40)
      TimeFormat-4                412ns ± 3%     412ns ± 3%    ~     (p=0.888 n=40+40)
      [Geo mean]                 65.2µs         65.2µs       -0.04%
      
      name                     old speed      new speed      delta
      GobDecode-4               108MB/s ± 6%   108MB/s ± 6%    ~     (p=0.855 n=40+40)
      GobEncode-4               112MB/s ± 9%   111MB/s ± 8%    ~     (p=0.159 n=40+40)
      Gzip-4                   49.2MB/s ± 4%  49.1MB/s ± 3%    ~     (p=0.102 n=40+40)
      Gunzip-4                  474MB/s ± 3%   472MB/s ± 3%    ~     (p=0.063 n=40+40)
      JSONEncode-4              121MB/s ± 3%   123MB/s ± 3%  +1.62%  (p=0.001 n=40+40)
      JSONDecode-4             31.9MB/s ± 3%  31.6MB/s ± 4%    ~     (p=0.070 n=40+40)
      GoParse-4                17.9MB/s ± 2%  17.9MB/s ± 3%    ~     (p=0.696 n=40+40)
      RegexpMatchEasy0_32-4     355MB/s ± 3%   358MB/s ± 5%  +0.99%  (p=0.002 n=40+40)
      RegexpMatchEasy0_1K-4    1.26GB/s ± 3%  1.26GB/s ± 3%    ~     (p=0.381 n=40+40)
      RegexpMatchEasy1_32-4     310MB/s ± 5%   310MB/s ± 4%    ~     (p=0.655 n=40+40)
      RegexpMatchEasy1_1K-4    1.01GB/s ± 4%  1.01GB/s ± 3%    ~     (p=0.351 n=40+33)
      RegexpMatchMedium_32-4   8.32MB/s ± 4%  8.34MB/s ± 5%    ~     (p=0.696 n=40+40)
      RegexpMatchMedium_1K-4   25.6MB/s ± 3%  25.9MB/s ± 4%  +1.36%  (p=0.000 n=40+40)
      RegexpMatchHard_32-4     14.1MB/s ± 6%  14.3MB/s ± 4%  +1.64%  (p=0.011 n=40+40)
      RegexpMatchHard_1K-4     15.2MB/s ± 3%  15.2MB/s ± 3%    ~     (p=0.147 n=40+40)
      Revcomp-4                 138MB/s ± 2%   137MB/s ± 3%  -0.70%  (p=0.021 n=40+40)
      Template-4               28.1MB/s ± 4%  27.8MB/s ± 3%  -1.19%  (p=0.003 n=40+40)
      [Geo mean]               83.7MB/s       83.7MB/s       +0.03%
      
      Change-Id: I2a2b3a942b5c45467491515d201179fd192e65c9
      Reviewed-on: https://go-review.googlesource.com/c/141650
      Run-TryBot: Ben Shi <powerman1st@163.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarKeith Randall <khr@golang.org>
      4b78fe57
    • Ben Shi's avatar
      test/codegen: fix confusing test cases · 3785be30
      Ben Shi authored
      ARMv7's MULAF/MULSF/MULAD/MULSD are not fused,
      this CL fixes the confusing test cases.
      
      Change-Id: I35022e207e2f0d24a23a7f6f188e41ba8eee9886
      Reviewed-on: https://go-review.googlesource.com/c/142439
      Run-TryBot: Ben Shi <powerman1st@163.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarAkhil Indurti <aindurti@gmail.com>
      Reviewed-by: 's avatarGiovanni Bajo <rasky@develer.com>
      3785be30
  2. 15 Oct, 2018 19 commits
    • Daniel Martí's avatar
      cmd/compile: don't panic on invalid map key declarations · 7f331313
      Daniel Martí authored
      In golang.org/cl/75310, the compiler's typechecker was changed so that
      map key types were validated at a later stage, to make sure that all the
      necessary type information was present.
      
      This still worked for map type declarations, but caused a regression for
      top-level map variable declarations. These now caused a fatal panic
      instead of a typechecking error.
      
      The cause was that checkMapKeys was run too early, before all
      typechecking was done. In particular, top-level map variable
      declarations are typechecked as external declarations, much later than
      where checkMapKeys was run.
      
      Add a test case for both exported and unexported top-level map
      declarations, and add a second call to checkMapKeys at the actual end of
      typechecking. Simply moving the one call isn't a good solution either;
      the comments expand on that.
      
      Fixes #28058.
      
      Change-Id: Ia5febb01a1d877447cf66ba44fb49a7e0f4f18a5
      Reviewed-on: https://go-review.googlesource.com/c/140417
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarRobert Griesemer <gri@golang.org>
      7f331313
    • Martin Möhrmann's avatar
      internal/cpu: add invalid option warnings and support to enable cpu features · 3e0227f6
      Martin Möhrmann authored
      This CL adds the ability to enable the cpu feature FEATURE by specifying
      FEATURE=on in GODEBUGCPU. Syntax support to enable cpu features is useful
      in combination with a preceeding all=off to disable all but some specific
      cpu features. Example:
      
      GODEBUGCPU=all=off,sse3=on
      
      This CL implements printing of warnings for invalid GODEBUGCPU settings:
      - requests enabling features that are not supported with the current CPU
      - specifying values different than 'on' or 'off' for a feature
      - settings for unkown cpu feature names
      
      Updates #27218
      
      Change-Id: Ic13e5c4c35426a390c50eaa4bd2a408ef2ee21be
      Reviewed-on: https://go-review.googlesource.com/c/141800
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarKeith Randall <khr@golang.org>
      3e0227f6
    • Martin Möhrmann's avatar
      strconv: add comment explaining bounded shift in formatBits · f81d73e8
      Martin Möhrmann authored
      The compiler can generate better code for shifts bounded to be less than 32
      and thereby known to be less than any register width.
      See https://golang.org/cl/109776.
      
      Change-Id: I0c4c9f0faafa065fce3c10fd328830deb92f9e38
      Reviewed-on: https://go-review.googlesource.com/c/111735
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarRobert Griesemer <gri@golang.org>
      Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      f81d73e8
    • Martin Möhrmann's avatar
      cmd/compile: avoid string allocations when map key is struct or array literal · a0f57c3f
      Martin Möhrmann authored
      x = map[string(byteslice)] is already optimized by the compiler to avoid a
      string allocation. This CL generalizes this optimization to:
      
      x = map[T1{ ... Tn{..., string(byteslice), ...} ... }]
      where T1 to Tn is a nesting of struct and array literals.
      
      Found in a hot code path that used a struct of strings made from []byte
      slices to make a map lookup.
      
      There are no uses of the more generalized optimization in the standard library.
      Passes toolstash -cmp.
      
      MapStringConversion/32/simple    21.9ns ± 2%    21.9ns ± 3%      ~     (p=0.995 n=17+20)
      MapStringConversion/32/struct    28.8ns ± 3%    22.0ns ± 2%   -23.80%  (p=0.000 n=20+20)
      MapStringConversion/32/array     28.5ns ± 2%    21.9ns ± 2%   -23.14%  (p=0.000 n=19+16)
      MapStringConversion/64/simple    21.0ns ± 2%    21.1ns ± 3%      ~     (p=0.072 n=19+18)
      MapStringConversion/64/struct    72.4ns ± 3%    21.3ns ± 2%   -70.53%  (p=0.000 n=20+20)
      MapStringConversion/64/array     72.8ns ± 1%    21.0ns ± 2%   -71.13%  (p=0.000 n=17+19)
      
      name                           old allocs/op  new allocs/op  delta
      MapStringConversion/32/simple      0.00           0.00           ~     (all equal)
      MapStringConversion/32/struct      0.00           0.00           ~     (all equal)
      MapStringConversion/32/array       0.00           0.00           ~     (all equal)
      MapStringConversion/64/simple      0.00           0.00           ~     (all equal)
      MapStringConversion/64/struct      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
      MapStringConversion/64/array       1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
      
      Change-Id: I483b4d84d8d74b1025b62c954da9a365e79b7a3a
      Reviewed-on: https://go-review.googlesource.com/c/116275Reviewed-by: 's avatarMatthew Dempsky <mdempsky@google.com>
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      a0f57c3f
    • Martin Möhrmann's avatar
      cmd/compile: add intrinsics for runtime/internal/math on 386 and amd64 · a1ca4893
      Martin Möhrmann authored
      Add generic, 386 and amd64 specific ops and SSA rules for multiplication
      with overflow and branching based on overflow flags. Use these to intrinsify
      runtime/internal/math.MulUinptr.
      
      On amd64
        mul, overflow := math.MulUintptr(a, b)
        if overflow {
      is lowered to two instructions:
        MULQ SI
        JO 0x10ee35c
      
      No codegen tests as codegen can not currently test unexported internal runtime
      functions.
      
      amd64:
      name              old time/op  new time/op  delta
      MulUintptr/small  1.16ns ± 5%  0.88ns ± 6%  -24.36%  (p=0.000 n=19+20)
      MulUintptr/large  10.7ns ± 1%   1.1ns ± 1%  -89.28%  (p=0.000 n=17+19)
      
      Change-Id: If60739a86f820e5044d677276c21df90d3c7a86a
      Reviewed-on: https://go-review.googlesource.com/c/141820
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarKeith Randall <khr@golang.org>
      a1ca4893
    • Martin Möhrmann's avatar
      cmd/compile: avoid implicit bounds checks after explicit checks for append · 9f66b41b
      Martin Möhrmann authored
      The generated code for the append builtin already checks if the appended
      to slice is large enough and calls growslice if that is not the case.
      Trust that this ensures the slice is large enough and avoid the
      implicit bounds check when slicing the slice to its new size.
      
      Removes 365 panicslice calls (-14%) from the go binary which
      reduces the binary size by ~12kbyte.
      
      Change-Id: I1b88418675ff409bc0b956853c9e95241274d5a6
      Reviewed-on: https://go-review.googlesource.com/c/119315
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarKeith Randall <khr@golang.org>
      9f66b41b
    • Martin Möhrmann's avatar
      runtime/internal/math: add multiplication with overflow check · c9130cae
      Martin Möhrmann authored
      This CL adds a new internal math package for use by the runtime.
      The new package exports a MulUintptr function with uintptr arguments
      a and b and returns uintptr(a*b) and whether the full-width product
      x*y does overflow the uintptr value range (uintptr(x*y) != x*y).
      
      Uses of MulUinptr in the runtime and intrinsics for performance
      will be added in followup CLs.
      
      Updates #21588
      
      Change-Id: Ia5a02eeabc955249118e4edf68c67d9fc0858058
      Reviewed-on: https://go-review.googlesource.com/c/91755
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarKeith Randall <khr@golang.org>
      c9130cae
    • Keith Randall's avatar
      cmd/compile: check order temp has correct type · 240a30da
      Keith Randall authored
      Followon from CL 140306
      
      Change-Id: Ic71033d2301105b15b60645d895a076107f44a2e
      Reviewed-on: https://go-review.googlesource.com/c/142178
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarCherry Zhang <cherryyz@google.com>
      240a30da
    • Alberto Donizetti's avatar
      test/codegen: test ppc64 TrailingZeros, OnesCount codegen · 7c96d87e
      Alberto Donizetti authored
      This change adds codegen tests for the intrinsification on ppc64 of
      the OnesCount{64,32,16,8}, and TrailingZeros{64,32,16,8} math/bits
      functions.
      
      Change-Id: Id3364921fbd18316850e15c8c71330c906187fdb
      Reviewed-on: https://go-review.googlesource.com/c/141897Reviewed-by: 's avatarLynn Boger <laboger@linux.vnet.ibm.com>
      7c96d87e
    • Josh Bleecher Snyder's avatar
      cmd/compile: fuse before branchelim · a55f3ee4
      Josh Bleecher Snyder authored
      The branchelim pass works better after fuse.
      Running fuse before branchelim also increases
      the stability of generated code amidst other compiler changes,
      which was the original motivation behind this change.
      
      The fuse pass is not cheap enough to run in its entirety
      before branchelim, but the most important half of it is.
      This change makes it possible to run "plain fuse" independently
      and does so before branchelim.
      
      During make.bash, elimIf occurrences increase from 4244 to 4288 (1%),
      and elimIfElse occurrences increase from 989 to 1079 (9%).
      
      Toolspeed impact is marginal; plain fuse pays for itself.
      
      name        old time/op       new time/op       delta
      Template          189ms ± 2%        189ms ± 2%    ~     (p=0.890 n=45+46)
      Unicode          93.2ms ± 5%       93.4ms ± 7%    ~     (p=0.790 n=48+48)
      GoTypes           662ms ± 4%        660ms ± 4%    ~     (p=0.186 n=48+49)
      Compiler          2.89s ± 4%        2.91s ± 3%  +0.89%  (p=0.050 n=49+44)
      SSA               8.23s ± 2%        8.21s ± 1%    ~     (p=0.165 n=46+44)
      Flate             123ms ± 4%        123ms ± 3%  +0.58%  (p=0.031 n=47+49)
      GoParser          154ms ± 4%        154ms ± 4%    ~     (p=0.492 n=49+48)
      Reflect           430ms ± 4%        429ms ± 4%    ~     (p=1.000 n=48+48)
      Tar               171ms ± 3%        170ms ± 4%    ~     (p=0.122 n=48+48)
      XML               232ms ± 3%        232ms ± 2%    ~     (p=0.850 n=46+49)
      [Geo mean]        394ms             394ms       +0.02%
      
      name        old user-time/op  new user-time/op  delta
      Template          236ms ± 5%        236ms ± 4%    ~     (p=0.934 n=50+50)
      Unicode           132ms ± 7%        130ms ± 9%    ~     (p=0.087 n=50+50)
      GoTypes           861ms ± 3%        867ms ± 4%    ~     (p=0.124 n=48+50)
      Compiler          3.93s ± 4%        3.94s ± 3%    ~     (p=0.584 n=49+44)
      SSA               12.2s ± 2%        12.3s ± 1%    ~     (p=0.610 n=46+45)
      Flate             149ms ± 4%        150ms ± 4%    ~     (p=0.194 n=48+49)
      GoParser          193ms ± 5%        191ms ± 6%    ~     (p=0.239 n=49+50)
      Reflect           553ms ± 5%        556ms ± 5%    ~     (p=0.091 n=49+49)
      Tar               218ms ± 5%        218ms ± 5%    ~     (p=0.359 n=49+50)
      XML               299ms ± 5%        298ms ± 4%    ~     (p=0.482 n=50+49)
      [Geo mean]        516ms             516ms       -0.01%
      
      name        old alloc/op      new alloc/op      delta
      Template         36.3MB ± 0%       36.3MB ± 0%  -0.02%  (p=0.000 n=49+49)
      Unicode          29.7MB ± 0%       29.7MB ± 0%    ~     (p=0.270 n=50+50)
      GoTypes           126MB ± 0%        126MB ± 0%  -0.34%  (p=0.000 n=50+49)
      Compiler          534MB ± 0%        531MB ± 0%  -0.50%  (p=0.000 n=50+50)
      SSA              1.98GB ± 0%       1.98GB ± 0%  -0.06%  (p=0.000 n=49+49)
      Flate            24.6MB ± 0%       24.6MB ± 0%  -0.29%  (p=0.000 n=50+50)
      GoParser         29.5MB ± 0%       29.4MB ± 0%  -0.15%  (p=0.000 n=49+50)
      Reflect          87.3MB ± 0%       87.2MB ± 0%  -0.13%  (p=0.000 n=49+50)
      Tar              35.6MB ± 0%       35.5MB ± 0%  -0.17%  (p=0.000 n=50+50)
      XML              48.2MB ± 0%       48.0MB ± 0%  -0.30%  (p=0.000 n=48+50)
      [Geo mean]       83.1MB            82.9MB       -0.20%
      
      name        old allocs/op     new allocs/op     delta
      Template           352k ± 0%         352k ± 0%  -0.01%  (p=0.004 n=49+49)
      Unicode            341k ± 0%         341k ± 0%    ~     (p=0.341 n=48+50)
      GoTypes           1.28M ± 0%        1.28M ± 0%  -0.03%  (p=0.000 n=50+49)
      Compiler          4.96M ± 0%        4.96M ± 0%  -0.05%  (p=0.000 n=50+49)
      SSA               15.5M ± 0%        15.5M ± 0%  -0.01%  (p=0.000 n=50+49)
      Flate              233k ± 0%         233k ± 0%  +0.01%  (p=0.032 n=49+49)
      GoParser           294k ± 0%         294k ± 0%    ~     (p=0.052 n=46+48)
      Reflect           1.04M ± 0%        1.04M ± 0%    ~     (p=0.171 n=50+47)
      Tar                343k ± 0%         343k ± 0%  -0.03%  (p=0.000 n=50+50)
      XML                429k ± 0%         429k ± 0%  -0.04%  (p=0.000 n=50+50)
      [Geo mean]         812k              812k       -0.02%
      
      Object files grow slightly; branchelim often increases binary size, at least on amd64.
      
      name        old object-bytes  new object-bytes  delta
      Template          509kB ± 0%        509kB ± 0%  -0.01%  (p=0.008 n=5+5)
      Unicode           224kB ± 0%        224kB ± 0%    ~     (all equal)
      GoTypes          1.84MB ± 0%       1.84MB ± 0%  +0.00%  (p=0.008 n=5+5)
      Compiler         6.71MB ± 0%       6.71MB ± 0%  +0.01%  (p=0.008 n=5+5)
      SSA              21.2MB ± 0%       21.2MB ± 0%  +0.01%  (p=0.008 n=5+5)
      Flate             324kB ± 0%        324kB ± 0%  -0.00%  (p=0.008 n=5+5)
      GoParser          404kB ± 0%        404kB ± 0%  -0.02%  (p=0.008 n=5+5)
      Reflect          1.40MB ± 0%       1.40MB ± 0%  +0.09%  (p=0.008 n=5+5)
      Tar               452kB ± 0%        452kB ± 0%  +0.06%  (p=0.008 n=5+5)
      XML               596kB ± 0%        596kB ± 0%  +0.00%  (p=0.008 n=5+5)
      [Geo mean]       1.04MB            1.04MB       +0.01%
      
      Change-Id: I535c711b85380ff657fc0f022bebd9cb14ddd07f
      Reviewed-on: https://go-review.googlesource.com/c/129378
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarKeith Randall <khr@golang.org>
      a55f3ee4
    • Keith Randall's avatar
      cmd/compile: provide types for all order-allocated temporaries · 63e964e1
      Keith Randall authored
      Ensure that we correctly type the stack temps for regular closures,
      method function closures, and slice literals.
      
      Then we don't need to override the dummy types later.
      Furthermore, this allows order to reuse temporaries of these types.
      
      OARRAYLIT doesn't need a temporary as far as I can tell, so I
      removed that case from order.
      
      Change-Id: Ic58520fa50c90639393ff78f33d3c831d5c4acb9
      Reviewed-on: https://go-review.googlesource.com/c/140306Reviewed-by: 's avatarCherry Zhang <cherryyz@google.com>
      63e964e1
    • Keith Randall's avatar
      cmd/compile: fix gdb stepping test · 296b7aea
      Keith Randall authored
      Not sure why this changed behavior, but seems mostly harmless.
      
      Fixes #28198
      
      Change-Id: Ie25c6e1fcb64912a582c7ae7bf92c4c1642e83cb
      Reviewed-on: https://go-review.googlesource.com/c/141649Reviewed-by: 's avatarDavid Chase <drchase@google.com>
      296b7aea
    • Ben Shi's avatar
      test/codegen: add tests of FMA for arm/arm64 · 93e27e01
      Ben Shi authored
      This CL adds tests of fused multiplication-accumulation
      on arm/arm64.
      
      Change-Id: Ic85d5277c0d6acb7e1e723653372dfaf96824a39
      Reviewed-on: https://go-review.googlesource.com/c/141652
      Run-TryBot: Ben Shi <powerman1st@163.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarCherry Zhang <cherryyz@google.com>
      93e27e01
    • Akhil Indurti's avatar
      internal/cpu: expose ARM feature flags for FMA · bb3bf5bb
      Akhil Indurti authored
      This change exposes feature flags needed to implement an FMA intrinsic
      on ARM CPUs via auxv's HWCAP bits. Specifically, it exposes HasVFPv4 to
      detect if an ARM processor has the fourth version of the vector floating
      point unit. The relevant instruction for this CL is VFMA, emitted in Go
      as FMULAD.
      
      Updates #26630.
      
      Change-Id: Ibbc04fb24c2b4d994f93762360f1a37bc6d83ff7
      Reviewed-on: https://go-review.googlesource.com/c/126315
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarMartin Möhrmann <moehrmann@google.com>
      bb3bf5bb
    • Martin Möhrmann's avatar
      cmd/compile: simplify as2 method of *Order · d6e80069
      Martin Möhrmann authored
      Merge the two for loops that set up the node lists for
      temporaries into one for loop.
      
      Passes toolstash -cmp
      
      Change-Id: Ibc739115f38c8869b0dcfbf9819fdc2fc96962e0
      Reviewed-on: https://go-review.googlesource.com/c/141819Reviewed-by: 's avatarKeith Randall <khr@golang.org>
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      d6e80069
    • avsharapov's avatar
      cmd/cgo: simplify switch statement to if statement · 9322b533
      avsharapov authored
      Change-Id: Ie7dce45d554fde69d682680f55abba6a7fc55036
      Reviewed-on: https://go-review.googlesource.com/c/142017Reviewed-by: 's avatarIan Lance Taylor <iant@golang.org>
      9322b533
    • Ivan Sharavuev's avatar
      pprof: replace bits = bits + "..." to bits += "..." where bits is a string. · e47c11d8
      Ivan Sharavuev authored
      Change-Id: Ic77ebbdf2670b7fdf2c381cd1ba768624b07e57c
      Reviewed-on: https://go-review.googlesource.com/c/141998
      Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarIan Lance Taylor <iant@golang.org>
      e47c11d8
    • OlgaVlPetrova's avatar
      src/cmd/compile/internal/ssa: replace `s = s + x' => 's += x'. · 85066acc
      OlgaVlPetrova authored
      Change-Id: I1f399a8a0aa200bfda01f97f920b1345e59956ba
      Reviewed-on: https://go-review.googlesource.com/c/142057
      Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarIan Lance Taylor <iant@golang.org>
      85066acc
    • Ben Shi's avatar
      test/codegen: add tests for multiplication-subtraction · c3208842
      Ben Shi authored
      This CL adds tests for armv7's MULS and arm64's MSUBW.
      
      Change-Id: Id0fd5d26fd477e4ed14389b0d33cad930423eb5b
      Reviewed-on: https://go-review.googlesource.com/c/141651
      Run-TryBot: Ben Shi <powerman1st@163.com>
      Reviewed-by: 's avatarCherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      c3208842
  3. 14 Oct, 2018 3 commits
    • Keith Randall's avatar
      cmd/compile: reuse temporaries in order pass · 389e9427
      Keith Randall authored
      Instead of allocating a new temporary each time one
      is needed, keep a list of temporaries which are free
      (have already been VARKILLed on every path) and use
      one of them.
      
      Should save a lot of stack space. In a function like this:
      
      func main() {
           fmt.Printf("%d %d\n", 2, 3)
           fmt.Printf("%d %d\n", 4, 5)
           fmt.Printf("%d %d\n", 6, 7)
      }
      
      The three [2]interface{} arrays used to hold the ... args
      all use the same autotmp, instead of 3 different autotmps
      as happened previous to this CL.
      
      Change-Id: I2d728e226f81e05ae68ca8247af62014a1b032d3
      Reviewed-on: https://go-review.googlesource.com/c/140301
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarCherry Zhang <cherryyz@google.com>
      389e9427
    • Keith Randall's avatar
      runtime,cmd/compile: pass strings and slices to convT2{E,I} by value · 0e9f8a21
      Keith Randall authored
      When we pass these types by reference, we usually have to allocate
      temporaries on the stack, initialize them, then pass their address
      to the conversion functions. It's simpler to pass these types
      directly by value.
      
      This particularly applies to conversions needed for fmt.Printf
      (to interface{} for constructing a [...]interface{}).
      
      func f(a, b, c string) {
           fmt.Printf("%s %s\n", a, b)
           fmt.Printf("%s %s\n", b, c)
      }
      
      This function's stack frame shrinks from 200 to 136 bytes, and
      its code shrinks from 535 to 453 bytes.
      
      The go binary shrinks 0.3%.
      
      Update #24286
      
      Aside: for this function f, we don't really need to allocate
      temporaries for the convT2E function. We could use the address
      of a, b, and c directly. That might get similar (or maybe better?)
      improvements. I investigated a bit, but it seemed complicated
      to do it safely. This change was much easier.
      
      Change-Id: I78cbe51b501fb41e1e324ce4203f0de56a1db82d
      Reviewed-on: https://go-review.googlesource.com/c/135377
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarJosh Bleecher Snyder <josharian@gmail.com>
      0e9f8a21
    • Keith Randall's avatar
      cmd/compile: optimize loads from readonly globals into constants · 653a4bd8
      Keith Randall authored
      Instead of
         MOVB go.string."foo"(SB), AX
      do
         MOVB $102, AX
      
      When we know the global we're loading from is readonly, we can
      do that read at compile time.
      
      I've made this arch-dependent mostly because the cases where this
      happens often are memory->memory moves, and those don't get
      decomposed until lowering.
      
      Did amd64/386/arm/arm64. Other architectures could follow.
      
      Update #26498
      
      Change-Id: I41b1dc831b2cd0a52dac9b97f4f4457888a46389
      Reviewed-on: https://go-review.googlesource.com/c/141118
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarJosh Bleecher Snyder <josharian@gmail.com>
      653a4bd8
  4. 13 Oct, 2018 4 commits
  5. 12 Oct, 2018 6 commits
    • Matthew Dempsky's avatar
      cmd/compile: remove ineffectual -i flag · b4150f76
      Matthew Dempsky authored
      This flag lost its usefulness in CL 34273.
      
      Change-Id: I033c29f105937139b4e359a340906be439f1ed07
      Reviewed-on: https://go-review.googlesource.com/c/141646
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      Reviewed-by: 's avatarRobert Griesemer <gri@golang.org>
      b4150f76
    • Mak Kolybabi's avatar
      doc: fix spelling of `comp[]hensive` to `comp[r]ehensive` · 57456527
      Mak Kolybabi authored
      Change-Id: Idd93e45fab30e7496105b84fc2fce1884711b580
      GitHub-Last-Rev: 43aa04e876655e31fc1c4b2b5ae0702472e49102
      GitHub-Pull-Request: golang/go#27983
      Reviewed-on: https://go-review.googlesource.com/c/141645Reviewed-by: 's avatarIan Lance Taylor <iant@golang.org>
      57456527
    • Russ Cox's avatar
      regexp: add partial Deprecation comment to Copy · bf68744a
      Russ Cox authored
      Change-Id: I21b7817e604a48330f1ee250f7b1b2adc1f16067
      Reviewed-on: https://go-review.googlesource.com/c/139784
      Run-TryBot: Russ Cox <rsc@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      bf68744a
    • Russ Cox's avatar
      regexp: add DeepEqual test · 5160e0d1
      Russ Cox authored
      This locks in behavior we accidentally broke
      and then restored during the Go 1.11 cycle.
      See #26219.
      
      It also locks in new behavior that DeepEqual
      always works, instead of only usually working.
      
      This CL is the final piece of a series of CLs to make
      DeepEqual always work, by eliminating the machine
      cache and making other related optimizations.
      Overall, this whole sequence of CLs achieves:
      
      name                             old time/op    new time/op    delta
      Find-12                             264ns ± 3%     260ns ± 0%   -1.59%  (p=0.000 n=10+9)
      FindAllNoMatches-12                 140ns ± 2%     133ns ± 0%   -5.34%  (p=0.000 n=10+7)
      FindString-12                       256ns ± 0%     249ns ± 0%   -2.73%  (p=0.000 n=8+8)
      FindSubmatch-12                     339ns ± 1%     333ns ± 1%   -1.73%  (p=0.000 n=9+10)
      FindStringSubmatch-12               322ns ± 0%     322ns ± 1%     ~     (p=0.450 n=8+10)
      Literal-12                          100ns ± 2%      92ns ± 0%   -8.13%  (p=0.000 n=10+10)
      NotLiteral-12                      1.50µs ± 0%    1.47µs ± 0%   -1.65%  (p=0.000 n=8+8)
      MatchClass-12                      2.18µs ± 0%    2.15µs ± 0%   -1.05%  (p=0.000 n=10+9)
      MatchClass_InRange-12              2.12µs ± 0%    2.11µs ± 0%   -0.65%  (p=0.000 n=10+9)
      ReplaceAll-12                      1.41µs ± 0%    1.41µs ± 0%     ~     (p=0.254 n=7+10)
      AnchoredLiteralShortNonMatch-12    89.8ns ± 0%    81.5ns ± 0%   -9.22%  (p=0.000 n=8+9)
      AnchoredLiteralLongNonMatch-12      105ns ± 3%      97ns ± 0%   -7.21%  (p=0.000 n=10+10)
      AnchoredShortMatch-12               141ns ± 0%     128ns ± 0%   -9.22%  (p=0.000 n=9+9)
      AnchoredLongMatch-12                276ns ± 4%     253ns ± 2%   -8.23%  (p=0.000 n=10+10)
      OnePassShortA-12                    620ns ± 0%     587ns ± 0%   -5.26%  (p=0.000 n=10+6)
      NotOnePassShortA-12                 575ns ± 3%     547ns ± 1%   -4.77%  (p=0.000 n=10+10)
      OnePassShortB-12                    493ns ± 0%     455ns ± 0%   -7.62%  (p=0.000 n=8+9)
      NotOnePassShortB-12                 423ns ± 0%     406ns ± 1%   -3.95%  (p=0.000 n=8+10)
      OnePassLongPrefix-12                112ns ± 0%     109ns ± 1%   -2.77%  (p=0.000 n=9+10)
      OnePassLongNotPrefix-12             405ns ± 0%     349ns ± 0%  -13.74%  (p=0.000 n=8+9)
      MatchParallelShared-12              501ns ± 1%      38ns ± 2%  -92.42%  (p=0.000 n=10+10)
      MatchParallelCopied-12             39.1ns ± 0%    38.6ns ± 1%   -1.38%  (p=0.002 n=6+10)
      QuoteMetaAll-12                    94.6ns ± 0%    94.8ns ± 0%   +0.26%  (p=0.001 n=10+9)
      QuoteMetaNone-12                   52.7ns ± 0%    52.7ns ± 0%     ~     (all equal)
      Match/Easy0/32-12                  79.1ns ± 0%    72.0ns ± 0%   -8.95%  (p=0.000 n=9+9)
      Match/Easy0/1K-12                   307ns ± 1%     297ns ± 0%   -3.32%  (p=0.000 n=10+7)
      Match/Easy0/32K-12                 4.65µs ± 2%    4.67µs ± 1%     ~     (p=0.633 n=10+8)
      Match/Easy0/1M-12                   234µs ± 0%     234µs ± 0%     ~     (p=0.684 n=10+10)
      Match/Easy0/32M-12                 7.98ms ± 1%    7.96ms ± 0%   -0.31%  (p=0.014 n=9+9)
      Match/Easy0i/32-12                 1.13µs ± 1%    1.10µs ± 0%   -3.18%  (p=0.000 n=9+10)
      Match/Easy0i/1K-12                 32.5µs ± 0%    31.7µs ± 0%   -2.61%  (p=0.000 n=9+9)
      Match/Easy0i/32K-12                1.59ms ± 0%    1.26ms ± 0%  -20.71%  (p=0.000 n=9+7)
      Match/Easy0i/1M-12                 51.0ms ± 0%    40.4ms ± 0%  -20.68%  (p=0.000 n=10+7)
      Match/Easy0i/32M-12                 1.63s ± 0%     1.30s ± 0%  -20.62%  (p=0.001 n=7+7)
      Match/Easy1/32-12                  75.1ns ± 1%    67.4ns ± 0%  -10.24%  (p=0.000 n=8+10)
      Match/Easy1/1K-12                   861ns ± 0%     879ns ± 0%   +2.18%  (p=0.000 n=8+8)
      Match/Easy1/32K-12                 39.2µs ± 1%    34.1µs ± 0%  -13.01%  (p=0.000 n=10+8)
      Match/Easy1/1M-12                  1.38ms ± 0%    1.17ms ± 0%  -15.06%  (p=0.000 n=10+8)
      Match/Easy1/32M-12                 44.2ms ± 1%    37.5ms ± 0%  -15.15%  (p=0.000 n=10+9)
      Match/Medium/32-12                 1.04µs ± 1%    1.03µs ± 0%   -0.64%  (p=0.002 n=9+8)
      Match/Medium/1K-12                 31.3µs ± 0%    31.2µs ± 0%   -0.36%  (p=0.000 n=9+9)
      Match/Medium/32K-12                1.44ms ± 0%    1.20ms ± 0%  -17.02%  (p=0.000 n=8+7)
      Match/Medium/1M-12                 46.1ms ± 0%    38.2ms ± 0%  -17.14%  (p=0.001 n=6+8)
      Match/Medium/32M-12                 1.48s ± 0%     1.23s ± 0%  -17.10%  (p=0.000 n=9+7)
      Match/Hard/32-12                   1.54µs ± 1%    1.47µs ± 0%   -4.64%  (p=0.000 n=9+10)
      Match/Hard/1K-12                   46.4µs ± 1%    44.4µs ± 0%   -4.35%  (p=0.000 n=9+8)
      Match/Hard/32K-12                  2.19ms ± 0%    1.78ms ± 7%  -18.74%  (p=0.000 n=8+10)
      Match/Hard/1M-12                   70.1ms ± 0%    57.7ms ± 7%  -17.62%  (p=0.000 n=8+10)
      Match/Hard/32M-12                   2.24s ± 0%     1.84s ± 8%  -17.92%  (p=0.000 n=8+10)
      Match/Hard1/32-12                  8.17µs ± 1%    7.95µs ± 0%   -2.72%  (p=0.000 n=8+10)
      Match/Hard1/1K-12                   254µs ± 2%     245µs ± 0%   -3.62%  (p=0.000 n=9+10)
      Match/Hard1/32K-12                 9.58ms ± 1%    8.54ms ± 7%  -10.87%  (p=0.000 n=10+10)
      Match/Hard1/1M-12                   306ms ± 1%     271ms ± 8%  -11.42%  (p=0.000 n=9+10)
      Match/Hard1/32M-12                  9.79s ± 1%     8.58s ± 9%  -12.37%  (p=0.000 n=9+10)
      Match_onepass_regex/32-12           808ns ± 0%     716ns ± 1%  -11.39%  (p=0.000 n=8+9)
      Match_onepass_regex/1K-12          27.8µs ± 0%    19.9µs ± 2%  -28.51%  (p=0.000 n=8+9)
      Match_onepass_regex/32K-12          925µs ± 0%     631µs ± 2%  -31.71%  (p=0.000 n=9+9)
      Match_onepass_regex/1M-12          29.5ms ± 0%    20.2ms ± 2%  -31.53%  (p=0.000 n=10+9)
      Match_onepass_regex/32M-12          945ms ± 0%     648ms ± 2%  -31.39%  (p=0.000 n=9+9)
      CompileOnepass-12                  4.67µs ± 0%    4.60µs ± 0%   -1.48%  (p=0.000 n=10+10)
      [Geo mean]                         24.5µs         21.4µs       -12.94%
      
      https://perf.golang.org/search?q=upload:20181004.5
      
      Change-Id: Icb17b306830dc5489efbb55900937b94ce0eb047
      Reviewed-on: https://go-review.googlesource.com/c/139783
      Run-TryBot: Russ Cox <rsc@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      5160e0d1
    • Russ Cox's avatar
      regexp: evaluate context flags lazily · 3ca1f28e
      Russ Cox authored
      There's no point in computing whether we're at the
      beginning of the line if the NFA isn't going to ask.
      Wait to compute that until asked.
      
      Whatever minor slowdowns were introduced by
      the conversion to pools that were not repaid by
      other optimizations are taken care of by this one.
      
      name                             old time/op    new time/op    delta
      Find-12                             252ns ± 0%     260ns ± 0%   +3.34%  (p=0.000 n=10+8)
      FindAllNoMatches-12                 136ns ± 4%     134ns ± 4%   -0.96%  (p=0.033 n=10+10)
      FindString-12                       246ns ± 0%     250ns ± 0%   +1.46%  (p=0.000 n=8+10)
      FindSubmatch-12                     332ns ± 1%     332ns ± 0%     ~     (p=0.101 n=9+10)
      FindStringSubmatch-12               321ns ± 1%     322ns ± 1%     ~     (p=0.717 n=9+10)
      Literal-12                         91.6ns ± 0%    92.3ns ± 0%   +0.74%  (p=0.000 n=9+9)
      NotLiteral-12                      1.47µs ± 0%    1.47µs ± 0%   +0.38%  (p=0.000 n=9+8)
      MatchClass-12                      2.15µs ± 0%    2.15µs ± 0%   +0.39%  (p=0.000 n=10+10)
      MatchClass_InRange-12              2.09µs ± 0%    2.11µs ± 0%   +0.75%  (p=0.000 n=9+9)
      ReplaceAll-12                      1.40µs ± 0%    1.40µs ± 0%     ~     (p=0.525 n=10+10)
      AnchoredLiteralShortNonMatch-12    83.5ns ± 0%    81.6ns ± 0%   -2.28%  (p=0.000 n=9+10)
      AnchoredLiteralLongNonMatch-12      101ns ± 0%      97ns ± 1%   -3.54%  (p=0.000 n=10+10)
      AnchoredShortMatch-12               131ns ± 0%     128ns ± 0%   -2.29%  (p=0.000 n=10+9)
      AnchoredLongMatch-12                268ns ± 1%     252ns ± 1%   -6.04%  (p=0.000 n=10+10)
      OnePassShortA-12                    614ns ± 0%     587ns ± 1%   -4.33%  (p=0.000 n=6+10)
      NotOnePassShortA-12                 552ns ± 0%     547ns ± 1%   -0.89%  (p=0.000 n=10+10)
      OnePassShortB-12                    494ns ± 0%     455ns ± 0%   -7.96%  (p=0.000 n=9+9)
      NotOnePassShortB-12                 411ns ± 0%     406ns ± 0%   -1.30%  (p=0.000 n=9+9)
      OnePassLongPrefix-12                109ns ± 0%     108ns ± 1%     ~     (p=0.064 n=8+9)
      OnePassLongNotPrefix-12             403ns ± 0%     349ns ± 0%  -13.30%  (p=0.000 n=9+8)
      MatchParallelShared-12             38.9ns ± 1%    37.9ns ± 1%   -2.65%  (p=0.000 n=10+8)
      MatchParallelCopied-12             39.2ns ± 1%    38.3ns ± 2%   -2.20%  (p=0.001 n=10+10)
      QuoteMetaAll-12                    94.5ns ± 0%    94.7ns ± 0%   +0.18%  (p=0.043 n=10+9)
      QuoteMetaNone-12                   52.7ns ± 0%    52.7ns ± 0%     ~     (all equal)
      Match/Easy0/32-12                  72.2ns ± 0%    71.9ns ± 0%   -0.38%  (p=0.009 n=8+10)
      Match/Easy0/1K-12                   296ns ± 1%     297ns ± 0%   +0.51%  (p=0.001 n=10+9)
      Match/Easy0/32K-12                 4.57µs ± 3%    4.61µs ± 2%     ~     (p=0.280 n=10+10)
      Match/Easy0/1M-12                   234µs ± 0%     234µs ± 0%     ~     (p=0.986 n=10+10)
      Match/Easy0/32M-12                 7.96ms ± 0%    7.98ms ± 0%   +0.22%  (p=0.010 n=10+9)
      Match/Easy0i/32-12                 1.09µs ± 0%    1.10µs ± 0%   +0.23%  (p=0.000 n=8+9)
      Match/Easy0i/1K-12                 31.7µs ± 0%    31.7µs ± 0%   +0.09%  (p=0.003 n=9+8)
      Match/Easy0i/32K-12                1.61ms ± 0%    1.27ms ± 1%  -21.03%  (p=0.000 n=8+10)
      Match/Easy0i/1M-12                 51.4ms ± 0%    40.4ms ± 0%  -21.29%  (p=0.000 n=8+8)
      Match/Easy0i/32M-12                 1.65s ± 0%     1.30s ± 1%  -21.22%  (p=0.000 n=9+9)
      Match/Easy1/32-12                  67.6ns ± 1%    67.2ns ± 0%     ~     (p=0.085 n=10+9)
      Match/Easy1/1K-12                   873ns ± 2%     880ns ± 0%   +0.78%  (p=0.006 n=9+7)
      Match/Easy1/32K-12                 39.7µs ± 1%    34.3µs ± 3%  -13.53%  (p=0.000 n=10+10)
      Match/Easy1/1M-12                  1.41ms ± 1%    1.19ms ± 3%  -15.48%  (p=0.000 n=10+10)
      Match/Easy1/32M-12                 44.9ms ± 1%    38.0ms ± 2%  -15.21%  (p=0.000 n=10+10)
      Match/Medium/32-12                 1.04µs ± 0%    1.03µs ± 0%   -0.57%  (p=0.000 n=9+9)
      Match/Medium/1K-12                 31.2µs ± 0%    31.4µs ± 1%   +0.61%  (p=0.000 n=8+10)
      Match/Medium/32K-12                1.45ms ± 1%    1.20ms ± 0%  -17.70%  (p=0.000 n=10+8)
      Match/Medium/1M-12                 46.4ms ± 0%    38.4ms ± 2%  -17.32%  (p=0.000 n=6+9)
      Match/Medium/32M-12                 1.49s ± 1%     1.24s ± 1%  -16.81%  (p=0.000 n=10+10)
      Match/Hard/32-12                   1.47µs ± 0%    1.47µs ± 0%   -0.31%  (p=0.000 n=9+10)
      Match/Hard/1K-12                   44.5µs ± 1%    44.4µs ± 0%     ~     (p=0.075 n=10+10)
      Match/Hard/32K-12                  2.09ms ± 0%    1.78ms ± 7%  -14.88%  (p=0.000 n=8+10)
      Match/Hard/1M-12                   67.8ms ± 5%    56.9ms ± 7%  -16.05%  (p=0.000 n=10+10)
      Match/Hard/32M-12                   2.17s ± 5%     1.84s ± 6%  -15.21%  (p=0.000 n=10+10)
      Match/Hard1/32-12                  7.89µs ± 0%    7.94µs ± 0%   +0.61%  (p=0.000 n=9+9)
      Match/Hard1/1K-12                   246µs ± 0%     245µs ± 0%   -0.30%  (p=0.010 n=9+10)
      Match/Hard1/32K-12                 8.93ms ± 0%    8.17ms ± 0%   -8.44%  (p=0.000 n=9+8)
      Match/Hard1/1M-12                   286ms ± 0%     269ms ± 9%   -5.66%  (p=0.028 n=9+10)
      Match/Hard1/32M-12                  9.16s ± 0%     8.61s ± 8%   -5.98%  (p=0.028 n=9+10)
      Match_onepass_regex/32-12           825ns ± 0%     712ns ± 0%  -13.75%  (p=0.000 n=8+8)
      Match_onepass_regex/1K-12          28.7µs ± 1%    19.8µs ± 0%  -30.99%  (p=0.000 n=9+8)
      Match_onepass_regex/32K-12          950µs ± 1%     628µs ± 0%  -33.83%  (p=0.000 n=9+8)
      Match_onepass_regex/1M-12          30.4ms ± 0%    20.1ms ± 0%  -33.74%  (p=0.000 n=9+8)
      Match_onepass_regex/32M-12          974ms ± 1%     646ms ± 0%  -33.73%  (p=0.000 n=9+8)
      CompileOnepass-12                  4.60µs ± 0%    4.59µs ± 0%     ~     (p=0.063 n=8+9)
      [Geo mean]                         23.1µs         21.3µs        -7.44%
      
      https://perf.golang.org/search?q=upload:20181004.4
      
      Change-Id: I47cdd09f6dcde1d7c317080e9b4df42c7d0a8d24
      Reviewed-on: https://go-review.googlesource.com/c/139782
      Run-TryBot: Russ Cox <rsc@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      3ca1f28e
    • Russ Cox's avatar
      regexp: use pools for NFA machines · a376435a
      Russ Cox authored
      Now the machine struct is only used for NFA execution.
      Use global pools to cache machines instead of per-Regexp lists.
      
      Also eliminate some tail calls in NFA execution, to pay for
      the added overhead of sync.Pool.
      
      name                             old time/op    new time/op    delta
      Find-12                             252ns ± 0%     252ns ± 0%     ~     (p=1.000 n=10+10)
      FindAllNoMatches-12                 134ns ± 1%     136ns ± 4%     ~     (p=0.443 n=9+10)
      FindString-12                       246ns ± 0%     246ns ± 0%   -0.16%  (p=0.046 n=10+8)
      FindSubmatch-12                     333ns ± 2%     332ns ± 1%     ~     (p=0.489 n=10+9)
      FindStringSubmatch-12               320ns ± 0%     321ns ± 1%   +0.55%  (p=0.005 n=10+9)
      Literal-12                         91.1ns ± 0%    91.6ns ± 0%   +0.55%  (p=0.000 n=10+9)
      NotLiteral-12                      1.45µs ± 0%    1.47µs ± 0%   +0.82%  (p=0.000 n=10+9)
      MatchClass-12                      2.19µs ± 0%    2.15µs ± 0%   -2.01%  (p=0.000 n=9+10)
      MatchClass_InRange-12              2.09µs ± 0%    2.09µs ± 0%     ~     (p=0.082 n=10+9)
      ReplaceAll-12                      1.39µs ± 0%    1.40µs ± 0%   +0.50%  (p=0.000 n=10+10)
      AnchoredLiteralShortNonMatch-12    82.4ns ± 0%    83.5ns ± 0%   +1.36%  (p=0.000 n=8+9)
      AnchoredLiteralLongNonMatch-12      106ns ± 1%     101ns ± 0%   -4.36%  (p=0.000 n=10+10)
      AnchoredShortMatch-12               130ns ± 0%     131ns ± 0%   +0.77%  (p=0.000 n=9+10)
      AnchoredLongMatch-12                272ns ± 0%     268ns ± 1%   -1.46%  (p=0.000 n=8+10)
      OnePassShortA-12                    615ns ± 0%     614ns ± 0%     ~     (p=0.094 n=10+6)
      NotOnePassShortA-12                 549ns ± 0%     552ns ± 0%   +0.52%  (p=0.000 n=9+10)
      OnePassShortB-12                    494ns ± 0%     494ns ± 0%     ~     (p=0.247 n=8+9)
      NotOnePassShortB-12                 412ns ± 1%     411ns ± 0%     ~     (p=0.625 n=10+9)
      OnePassLongPrefix-12                108ns ± 0%     109ns ± 0%   +0.93%  (p=0.000 n=10+8)
      OnePassLongNotPrefix-12             402ns ± 0%     403ns ± 0%   +0.14%  (p=0.041 n=8+9)
      MatchParallelShared-12             38.6ns ± 2%    38.9ns ± 1%     ~     (p=0.172 n=9+10)
      MatchParallelCopied-12             39.4ns ± 7%    39.2ns ± 1%     ~     (p=0.423 n=10+10)
      QuoteMetaAll-12                    94.9ns ± 0%    94.5ns ± 0%   -0.42%  (p=0.000 n=9+10)
      QuoteMetaNone-12                   52.7ns ± 0%    52.7ns ± 0%     ~     (all equal)
      Match/Easy0/32-12                  72.1ns ± 0%    72.2ns ± 0%     ~     (p=0.435 n=9+8)
      Match/Easy0/1K-12                   298ns ± 0%     296ns ± 1%   -1.01%  (p=0.000 n=8+10)
      Match/Easy0/32K-12                 4.64µs ± 1%    4.57µs ± 3%   -1.39%  (p=0.030 n=10+10)
      Match/Easy0/1M-12                   234µs ± 0%     234µs ± 0%     ~     (p=0.971 n=10+10)
      Match/Easy0/32M-12                 7.95ms ± 0%    7.96ms ± 0%     ~     (p=0.278 n=9+10)
      Match/Easy0i/32-12                 1.10µs ± 0%    1.09µs ± 0%   -0.29%  (p=0.000 n=9+8)
      Match/Easy0i/1K-12                 31.8µs ± 1%    31.7µs ± 0%     ~     (p=0.704 n=10+9)
      Match/Easy0i/32K-12                1.62ms ± 1%    1.61ms ± 0%   -1.12%  (p=0.000 n=10+8)
      Match/Easy0i/1M-12                 51.8ms ± 0%    51.4ms ± 0%   -0.84%  (p=0.000 n=8+8)
      Match/Easy0i/32M-12                 1.65s ± 0%     1.65s ± 0%   -0.46%  (p=0.000 n=9+9)
      Match/Easy1/32-12                  67.7ns ± 1%    67.6ns ± 1%     ~     (p=0.723 n=10+10)
      Match/Easy1/1K-12                   873ns ± 0%     873ns ± 2%     ~     (p=0.345 n=10+9)
      Match/Easy1/32K-12                 39.4µs ± 0%    39.7µs ± 1%   +0.66%  (p=0.000 n=10+10)
      Match/Easy1/1M-12                  1.39ms ± 0%    1.41ms ± 1%   +1.10%  (p=0.000 n=10+10)
      Match/Easy1/32M-12                 44.3ms ± 0%    44.9ms ± 1%   +1.18%  (p=0.000 n=10+10)
      Match/Medium/32-12                 1.04µs ± 0%    1.04µs ± 0%   -0.58%  (p=0.000 n=9+9)
      Match/Medium/1K-12                 31.4µs ± 0%    31.2µs ± 0%   -0.62%  (p=0.000 n=8+8)
      Match/Medium/32K-12                1.45ms ± 0%    1.45ms ± 1%     ~     (p=0.356 n=9+10)
      Match/Medium/1M-12                 46.4ms ± 0%    46.4ms ± 0%     ~     (p=0.142 n=8+6)
      Match/Medium/32M-12                 1.49s ± 1%     1.49s ± 1%     ~     (p=0.739 n=10+10)
      Match/Hard/32-12                   1.48µs ± 0%    1.47µs ± 0%   -0.53%  (p=0.000 n=9+9)
      Match/Hard/1K-12                   45.0µs ± 1%    44.5µs ± 1%   -1.06%  (p=0.000 n=10+10)
      Match/Hard/32K-12                  2.24ms ± 0%    2.09ms ± 0%   -6.56%  (p=0.000 n=8+8)
      Match/Hard/1M-12                   71.6ms ± 0%    67.8ms ± 5%   -5.36%  (p=0.000 n=7+10)
      Match/Hard/32M-12                   2.29s ± 0%     2.17s ± 5%   -5.40%  (p=0.000 n=9+10)
      Match/Hard1/32-12                  7.89µs ± 0%    7.89µs ± 0%     ~     (p=0.053 n=9+9)
      Match/Hard1/1K-12                   244µs ± 0%     246µs ± 0%   +0.71%  (p=0.000 n=10+9)
      Match/Hard1/32K-12                 10.3ms ± 0%     8.9ms ± 0%  -13.76%  (p=0.000 n=10+9)
      Match/Hard1/1M-12                   331ms ± 0%     286ms ± 0%  -13.72%  (p=0.000 n=9+9)
      Match/Hard1/32M-12                  10.6s ± 0%      9.2s ± 0%  -13.72%  (p=0.000 n=10+9)
      Match_onepass_regex/32-12           830ns ± 0%     825ns ± 0%   -0.57%  (p=0.000 n=9+8)
      Match_onepass_regex/1K-12          28.7µs ± 1%    28.7µs ± 1%   -0.22%  (p=0.040 n=9+9)
      Match_onepass_regex/32K-12          949µs ± 0%     950µs ± 1%     ~     (p=0.236 n=8+9)
      Match_onepass_regex/1M-12          30.4ms ± 0%    30.4ms ± 0%     ~     (p=0.059 n=8+9)
      Match_onepass_regex/32M-12          973ms ± 0%     974ms ± 1%     ~     (p=0.258 n=9+9)
      CompileOnepass-12                  4.64µs ± 0%    4.60µs ± 0%   -0.90%  (p=0.000 n=10+8)
      [Geo mean]                         23.3µs         23.1µs        -1.16%
      
      https://perf.golang.org/search?q=upload:20181004.3
      
      Change-Id: I46f3d52ce89c8cd992cf554473c27af81fd81bfd
      Reviewed-on: https://go-review.googlesource.com/c/139781
      Run-TryBot: Russ Cox <rsc@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      a376435a