1. 28 Apr, 2018 1 commit
  2. 27 Apr, 2018 12 commits
    • misc/wasm: wasm_exec: non-zero exit code on compile error · adb52cff
      Richard Musiol authored
      Return a non-zero exit code if the WebAssembly host fails to compile
      the WebAssembly bytecode to machine code.
      
      Change-Id: I774309db2872b6a2de77a1b0392608058414160d
      Reviewed-on: https://go-review.googlesource.com/110097
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: optimize ARM64 with shifted register indexed load/store · aaf73c6d
      Ben Shi authored
      ARM64 supports efficient instructions that combine a shift, an addition, and a
      load/store, such as "MOVD (R0)(R1<<3), R2" and "MOVWU R6, (R4)(R1<<2)".
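
As a rough illustration (a sketch, not code from the CL), indexing a slice of fixed-size elements is the kind of Go source whose element address has the form base + index<<shift. The function names `load64` and `store32` are hypothetical, and whether a particular build emits the combined instruction depends on the compiler version and the surrounding code.

```go
package main

import "fmt"

// load64 reads element i of a slice of 8-byte values. The element address is
// base + i<<3, the shape that "MOVD (R0)(R1<<3), R2" computes in one instruction.
func load64(s []uint64, i int) uint64 {
	return s[i]
}

// store32 writes element i of a slice of 4-byte values (address base + i<<2),
// the shape of "MOVWU R6, (R4)(R1<<2)".
func store32(s []uint32, i int, v uint32) {
	s[i] = v
}

func main() {
	a := []uint64{10, 20, 30}
	b := make([]uint32, 3)
	store32(b, 2, 7)
	fmt.Println(load64(a, 1), b[2])
}
```

Compiling a file like this with `GOARCH=arm64 go tool compile -S` is one way to inspect which instructions are actually generated.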
      
      This CL optimizes the compiler to emit these instructions. Test data
      follows.
      
      1. binary size before/after
      binary                 size change
      pkg/linux_arm64        +80.1KB
      pkg/tool/linux_arm64   +121.9KB
      go                     -4.3KB
      gofmt                  -64KB
      
      2. go1 benchmark
      There is a big improvement for the test case Fannkuch11, and a slight
      improvement for some others, excluding noise.
      
      name                     old time/op    new time/op    delta
      BinaryTree17-4              43.9s ± 2%     44.0s ± 2%     ~     (p=0.820 n=30+30)
      Fannkuch11-4                30.6s ± 2%     24.5s ± 3%  -19.93%  (p=0.000 n=25+30)
      FmtFprintfEmpty-4           500ns ± 0%     499ns ± 0%   -0.11%  (p=0.000 n=23+25)
      FmtFprintfString-4         1.03µs ± 0%    1.04µs ± 3%     ~     (p=0.065 n=29+30)
      FmtFprintfInt-4            1.15µs ± 3%    1.15µs ± 4%   -0.56%  (p=0.000 n=30+30)
      FmtFprintfIntInt-4         1.80µs ± 5%    1.82µs ± 0%     ~     (p=0.094 n=30+24)
      FmtFprintfPrefixedInt-4    2.17µs ± 5%    2.20µs ± 0%     ~     (p=0.100 n=30+23)
      FmtFprintfFloat-4          3.08µs ± 3%    3.09µs ± 4%     ~     (p=0.123 n=30+30)
      FmtManyArgs-4              7.41µs ± 4%    7.17µs ± 1%   -3.26%  (p=0.000 n=30+23)
      GobDecode-4                93.7ms ± 0%    94.7ms ± 4%     ~     (p=0.685 n=24+30)
      GobEncode-4                78.7ms ± 7%    77.1ms ± 0%     ~     (p=0.729 n=30+23)
      Gzip-4                      4.01s ± 0%     3.97s ± 5%   -1.11%  (p=0.037 n=24+30)
      Gunzip-4                    389ms ± 4%     384ms ± 0%     ~     (p=0.155 n=30+23)
      HTTPClientServer-4          536µs ± 1%     537µs ± 1%     ~     (p=0.236 n=30+30)
      JSONEncode-4                179ms ± 1%     182ms ± 6%     ~     (p=0.763 n=24+30)
      JSONDecode-4                843ms ± 0%     839ms ± 6%   -0.42%  (p=0.003 n=25+30)
      Mandelbrot200-4            46.5ms ± 0%    46.5ms ± 0%   +0.02%  (p=0.000 n=26+26)
      GoParse-4                  44.3ms ± 6%    43.3ms ± 0%     ~     (p=0.067 n=30+27)
      RegexpMatchEasy0_32-4      1.07µs ± 7%    1.07µs ± 4%     ~     (p=0.835 n=30+30)
      RegexpMatchEasy0_1K-4      5.51µs ± 0%    5.49µs ± 0%   -0.35%  (p=0.000 n=23+26)
      RegexpMatchEasy1_32-4      1.01µs ± 0%    1.02µs ± 4%   +0.96%  (p=0.014 n=24+30)
      RegexpMatchEasy1_1K-4      7.43µs ± 0%    7.18µs ± 0%   -3.41%  (p=0.000 n=23+24)
      RegexpMatchMedium_32-4     1.78µs ± 0%    1.81µs ± 4%   +1.47%  (p=0.012 n=23+30)
      RegexpMatchMedium_1K-4      547µs ± 1%     542µs ± 3%   -0.90%  (p=0.003 n=24+30)
      RegexpMatchHard_32-4       30.4µs ± 0%    29.7µs ± 0%   -2.15%  (p=0.000 n=19+23)
      RegexpMatchHard_1K-4        913µs ± 0%     915µs ± 6%   +0.25%  (p=0.012 n=24+30)
      Revcomp-4                   6.32s ± 1%     6.42s ± 4%     ~     (p=0.342 n=25+30)
      Template-4                  868ms ± 6%     878ms ± 6%   +1.15%  (p=0.000 n=30+30)
      TimeParse-4                4.57µs ± 4%    4.59µs ± 3%   +0.65%  (p=0.010 n=29+30)
      TimeFormat-4               4.51µs ± 0%    4.50µs ± 0%   -0.27%  (p=0.000 n=27+24)
      [Geo mean]                  695µs          689µs        -0.92%
      
      name                     old speed      new speed      delta
      GobDecode-4              8.19MB/s ± 0%  8.12MB/s ± 4%     ~     (p=0.680 n=24+30)
      GobEncode-4              9.76MB/s ± 7%  9.96MB/s ± 0%     ~     (p=0.616 n=30+23)
      Gzip-4                   4.84MB/s ± 0%  4.89MB/s ± 4%   +1.16%  (p=0.030 n=24+30)
      Gunzip-4                 49.9MB/s ± 4%  50.6MB/s ± 0%     ~     (p=0.162 n=30+23)
      JSONEncode-4             10.9MB/s ± 1%  10.7MB/s ± 6%     ~     (p=0.575 n=24+30)
      JSONDecode-4             2.30MB/s ± 0%  2.32MB/s ± 5%   +0.72%  (p=0.003 n=22+30)
      GoParse-4                1.31MB/s ± 6%  1.34MB/s ± 0%   +2.26%  (p=0.002 n=30+27)
      RegexpMatchEasy0_32-4    30.0MB/s ± 6%  30.0MB/s ± 4%     ~     (p=1.000 n=30+30)
      RegexpMatchEasy0_1K-4     186MB/s ± 0%   187MB/s ± 0%   +0.35%  (p=0.000 n=23+26)
      RegexpMatchEasy1_32-4    31.8MB/s ± 0%  31.5MB/s ± 4%   -0.92%  (p=0.012 n=25+30)
      RegexpMatchEasy1_1K-4     138MB/s ± 0%   143MB/s ± 0%   +3.53%  (p=0.000 n=23+24)
      RegexpMatchMedium_32-4    560kB/s ± 0%   553kB/s ± 4%   -1.19%  (p=0.005 n=23+30)
      RegexpMatchMedium_1K-4   1.87MB/s ± 0%  1.89MB/s ± 3%   +1.04%  (p=0.002 n=24+30)
      RegexpMatchHard_32-4     1.05MB/s ± 0%  1.08MB/s ± 0%   +2.40%  (p=0.000 n=19+23)
      RegexpMatchHard_1K-4     1.12MB/s ± 0%  1.12MB/s ± 5%   +0.12%  (p=0.006 n=25+30)
      Revcomp-4                40.2MB/s ± 1%  39.6MB/s ± 4%     ~     (p=0.242 n=25+30)
      Template-4               2.24MB/s ± 6%  2.21MB/s ± 6%   -1.15%  (p=0.000 n=30+30)
      [Geo mean]               7.87MB/s       7.91MB/s        +0.44%
      
      Change-Id: If374cb7abf83537aa0a176f73c0f736f7800db03
      Reviewed-on: https://go-review.googlesource.com/108735
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/link: fix plugin on linux/arm64 · ceda47d0
      Cherry Zhang authored
      The init function and runtime.addmoduledata were not added when
      building a plugin, so the runtime could not find the module.
      
      Testplugin is still not enabled on linux/arm64
      (https://go.googlesource.com/go/+/master/src/cmd/dist/test.go#948)
      because the gold linker on the builder is too old, which fails
      with an internal error (see issue #17138). I tested locally and
      it passes.
      
      Fixes #24940.
      Updates #17138.
      
      Change-Id: I26aebca6c38a3443af0949471fa12b6d550e8c6c
      Reviewed-on: https://go-review.googlesource.com/109917
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • cmd/compile: add softfloat support to mips64{,le} · 2959128d
      Milan Knezevic authored
      mips64 softfloat support is based on the mips implementation and introduces
      a new environment variable, GOMIPS64.
      
      GOMIPS64 is a GOARCH=mips64{,le} specific option, for a choice between
      hard-float and soft-float. Valid values are 'hardfloat' (default) and
      'softfloat'. It is passed to the assembler as
      'GOMIPS64_{hardfloat,softfloat}'.
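
The rule described above can be sketched in a few lines of Go. This is only an illustration under stated assumptions: `parseGOMIPS64` and `assemblerDefine` are hypothetical helpers, not the toolchain's own functions, and the error text is invented.

```go
package main

import (
	"fmt"
	"os"
)

// parseGOMIPS64 is a hypothetical helper, not the toolchain's own code.
// It applies the rule from the commit message: an empty value means the
// default "hardfloat", and only "hardfloat" and "softfloat" are valid.
func parseGOMIPS64(v string) (string, error) {
	switch v {
	case "":
		return "hardfloat", nil
	case "hardfloat", "softfloat":
		return v, nil
	}
	return "", fmt.Errorf("invalid GOMIPS64 %q: must be hardfloat or softfloat", v)
}

// assemblerDefine returns the name passed to the assembler for a mode,
// following the 'GOMIPS64_{hardfloat,softfloat}' pattern.
func assemblerDefine(mode string) string {
	return "GOMIPS64_" + mode
}

func main() {
	mode, err := parseGOMIPS64(os.Getenv("GOMIPS64"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(assemblerDefine(mode))
}
```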
      
      Change-Id: I7f73078627f7cb37c588a38fb5c997fe09c56134
      Reviewed-on: https://go-review.googlesource.com/108475
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/internal/obj: convert unicode C to ASCII C · 62adf6fc
      Josh Bleecher Snyder authored
      Hex before: d0 a1
      Hex after: 43
      
      Not sure where that came from.
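
The two hex values can be reproduced with a short Go program. The bytes d0 a1 are the UTF-8 encoding of U+0421 (CYRILLIC CAPITAL LETTER ES), which looks like a Latin "C" in most fonts; `hexBytes` is a helper name introduced here for illustration.

```go
package main

import "fmt"

// hexBytes formats the UTF-8 encoding of s as space-separated hex bytes.
func hexBytes(s string) string {
	return fmt.Sprintf("% x", s)
}

func main() {
	cyrillic := "\u0421" // CYRILLIC CAPITAL LETTER ES
	ascii := "C"         // LATIN CAPITAL LETTER C
	fmt.Println(hexBytes(cyrillic)) // d0 a1
	fmt.Println(hexBytes(ascii))    // 43
	fmt.Println(cyrillic == ascii)  // false
}
```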
      
      Change-Id: I189e7e21f8faf480ba72846b956a149976f720f8
      Reviewed-on: https://go-review.googlesource.com/109777
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • testing: fix typo mistake · 1f56499d
      Zhou Peng authored
      Change-Id: I561640768c43491288e7f5bd1a34247787793dab
      Reviewed-on: https://go-review.googlesource.com/109935
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • os: make Stat("*.txt") fail on windows · e656aebb
      Yasuhiro Matsumoto authored
      Fixes #24999
      
      Change-Id: Ie0bb6a6e0fa3992cdd272d42347af65ae7c95463
      Reviewed-on: https://go-review.googlesource.com/108755
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
    • cmd/compile: add initial README · a835c739
      Daniel Martí authored
      This is a follow-up to the first README, which covered cmd/compile/internal/ssa.
      
      Since this is the parent package for all the compiler packages, this
      README serves as an overview of the compiler and its packages. As more
      READMEs are added for specific parts with more detail, such as ssa's,
      they can be linked from this one.
      
      Thanks to Iskander Sharipov, Josh Bleecher Snyder, Matthew Dempsky,
      Alberto Donizetti, and Robert Griesemer for helping with all the details
      in this document.
      
      Change-Id: I820a535e25dce86ccc667ce1c6e92b75fc32f3af
      Reviewed-on: https://go-review.googlesource.com/103935
      Reviewed-by: Martin Möhrmann <moehrmann@google.com>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • cmd/compile: increase initial allocation of LSym.R · a76249c3
      Josh Bleecher Snyder authored
      Not a big win, but cheap.
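
The general technique, giving a slice some capacity on its first append so the next few appends do not reallocate, might look like the sketch below. `Reloc`, `Sym`, `AddRel`, and the capacity of 4 are simplified, hypothetical stand-ins chosen for illustration; the commit message does not state the actual types or value.

```go
package main

import "fmt"

// Reloc and Sym are simplified, hypothetical stand-ins for the real types.
type Reloc struct {
	Off  int32
	Size uint8
}

type Sym struct {
	R []Reloc
}

// initialRelocCap is an assumed value, chosen for illustration only.
const initialRelocCap = 4

// AddRel appends an empty relocation and returns a pointer to it. The first
// call reserves room for several entries, so the next few appends reuse the
// same backing array instead of growing it.
func (s *Sym) AddRel() *Reloc {
	if s.R == nil {
		s.R = make([]Reloc, 0, initialRelocCap)
	}
	s.R = append(s.R, Reloc{})
	return &s.R[len(s.R)-1]
}

func main() {
	var s Sym
	s.AddRel()
	fmt.Println(len(s.R), cap(s.R))
}
```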
      
      name        old alloc/op      new alloc/op      delta
      Template         34.4MB ± 0%       34.4MB ± 0%  -0.20%  (p=0.000 n=15+15)
      Unicode          29.2MB ± 0%       29.3MB ± 0%  +0.17%  (p=0.000 n=15+15)
      GoTypes           113MB ± 0%        113MB ± 0%  -0.22%  (p=0.000 n=15+15)
      Compiler          509MB ± 0%        508MB ± 0%  -0.11%  (p=0.000 n=15+14)
      SSA              1.46GB ± 0%       1.46GB ± 0%  -0.08%  (p=0.000 n=14+15)
      Flate            23.8MB ± 0%       23.7MB ± 0%  -0.22%  (p=0.000 n=15+15)
      GoParser         27.9MB ± 0%       27.8MB ± 0%  -0.21%  (p=0.000 n=14+15)
      Reflect          77.2MB ± 0%       77.0MB ± 0%  -0.27%  (p=0.000 n=14+15)
      Tar              34.0MB ± 0%       33.9MB ± 0%  -0.21%  (p=0.000 n=13+15)
      XML              42.6MB ± 0%       42.5MB ± 0%  -0.15%  (p=0.000 n=15+15)
      [Geo mean]       75.8MB            75.7MB       -0.15%
      
      name        old allocs/op     new allocs/op     delta
      Template           322k ± 0%         320k ± 0%  -0.60%  (p=0.000 n=15+15)
      Unicode            337k ± 0%         336k ± 0%  -0.23%  (p=0.000 n=12+15)
      GoTypes           1.13M ± 0%        1.12M ± 0%  -0.58%  (p=0.000 n=15+14)
      Compiler          4.67M ± 0%        4.65M ± 0%  -0.38%  (p=0.000 n=14+15)
      SSA               11.7M ± 0%        11.6M ± 0%  -0.25%  (p=0.000 n=15+15)
      Flate              216k ± 0%         214k ± 0%  -0.67%  (p=0.000 n=15+15)
      GoParser           271k ± 0%         270k ± 0%  -0.57%  (p=0.000 n=15+15)
      Reflect            927k ± 0%         920k ± 0%  -0.72%  (p=0.000 n=13+14)
      Tar                318k ± 0%         316k ± 0%  -0.57%  (p=0.000 n=15+15)
      XML                376k ± 0%         375k ± 0%  -0.46%  (p=0.000 n=14+14)
      [Geo mean]         731k              727k       -0.50%
      
      Change-Id: I1417c5881e866fb3efe62a3d0fbe1134275da31a
      Reviewed-on: https://go-review.googlesource.com/109755
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: log Ctz non-zero proofs · b9785fc8
      Josh Bleecher Snyder authored
      I forgot this in CL 109358.
      
      Change-Id: Ia5e8bd9cf43393f098b101a0d6a0c526e3e4f101
      Reviewed-on: https://go-review.googlesource.com/109775
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/vet: remove "only" from error message · 6b55407d
      Kevin Burke authored
      If the vetted function supplies zero arguments, previously you would
      get an error message like this:
      
          Printf format %v reads arg #1, but call has only 0 args
      
      "has only 0 args" is an odd construction, and "has 0 args" sounds
      better. Getting rid of "only" in all cases simplifies the code and
      reads just as well.
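
A minimal sketch of building the reworded message follows. The helpers `count` and `missingArgMessage` are hypothetical names used for illustration and are not claimed to match vet's source; only the message wording comes from the commit message above.

```go
package main

import "fmt"

// count pluralizes a noun: count(0, "arg") is "0 args", count(1, "arg") is "1 arg".
func count(n int, what string) string {
	if n == 1 {
		return "1 " + what
	}
	return fmt.Sprintf("%d %ss", n, what)
}

// missingArgMessage builds a report in the reworded style, with no "only"
// regardless of how many arguments the call supplies.
func missingArgMessage(name, verb string, argNum, nargs int) string {
	return fmt.Sprintf("%s format %s reads arg #%d, but call has %s",
		name, verb, argNum, count(nargs, "arg"))
}

func main() {
	fmt.Println(missingArgMessage("Printf", "%v", 1, 0))
	fmt.Println(missingArgMessage("Printf", "%d", 2, 1))
}
```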
      
      Change-Id: I4706dfe4a75f13bf4db9c0650e459ca676710752
      Reviewed-on: https://go-review.googlesource.com/109457
      Run-TryBot: Kevin Burke <kev@inburke.com>
      Run-TryBot: David Symonds <dsymonds@golang.org>
      Reviewed-by: David Symonds <dsymonds@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/link/internal/ld: simple cleanups · a6b183fa
      Daniel Martí authored
      Simplify some C-style loops with range statements, and move some
      declarations closer to their uses.
      
      While at it, ensure that all the SymbolType consts are typed.
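
The two kinds of cleanup can be illustrated with a small, self-contained example. `SymKind`, its constants, and the two sum functions are hypothetical and are not taken from cmd/link; they only show the loop rewrite and the typed-constant pattern.

```go
package main

import "fmt"

// SymKind is a hypothetical stand-in for a symbol-type enumeration.
type SymKind uint8

// Giving the first constant an explicit type together with iota makes every
// constant in the group typed, so none of them is an untyped integer.
const (
	KindText SymKind = iota
	KindData
	KindBSS
)

// sumIndexed uses a C-style loop with an explicit index.
func sumIndexed(sizes []int64) int64 {
	var total int64
	for i := 0; i < len(sizes); i++ {
		total += sizes[i]
	}
	return total
}

// sumRange is the equivalent range form, with the accumulator declared
// right before its use.
func sumRange(sizes []int64) int64 {
	var total int64
	for _, sz := range sizes {
		total += sz
	}
	return total
}

func main() {
	sizes := []int64{8, 16, 32}
	fmt.Println(sumIndexed(sizes), sumRange(sizes))
	fmt.Printf("%T %d\n", KindBSS, KindBSS)
}
```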
      
      Change-Id: I04b06afb2c1fb249ef8093a0c5cca0a597d1e05c
      Reviewed-on: https://go-review.googlesource.com/105217
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
  3. 26 Apr, 2018 24 commits
  4. 25 Apr, 2018 3 commits
    • cmd/compile: use intrinsic for LeadingZeros8 on amd64 · c5f0104d
      Josh Bleecher Snyder authored
      The previous change sped up the pure computation form of LeadingZeros8.
      This places it somewhat close to the table lookup form.
      Depending on something that varies from toolchain to toolchain
      (alignment, perhaps?), the slowdown from ditching the table lookup
      is either 20% or 5%.
      
      This benchmark is the best-case scenario for the table lookup: the table
      is already in the L1 cache.
      
      I think we're close enough that we can switch to the computational version,
      and trust that the memory effects and binary size savings will be worth it.
      
      Code:
      
      func f8(x uint8)   { z = bits.LeadingZeros8(x) }
      
      Before:
      
      "".f8 STEXT nosplit size=34 args=0x8 locals=0x0
      	0x0000 00000 (x.go:7)	TEXT	"".f8(SB), NOSPLIT, $0-8
      	0x0000 00000 (x.go:7)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
      	0x0000 00000 (x.go:7)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
      	0x0000 00000 (x.go:7)	MOVBLZX	"".x+8(SP), AX
      	0x0005 00005 (x.go:7)	MOVBLZX	AL, AX
      	0x0008 00008 (x.go:7)	LEAQ	math/bits.len8tab(SB), CX
      	0x000f 00015 (x.go:7)	MOVBLZX	(CX)(AX*1), AX
      	0x0013 00019 (x.go:7)	ADDQ	$-8, AX
      	0x0017 00023 (x.go:7)	NEGQ	AX
      	0x001a 00026 (x.go:7)	MOVQ	AX, "".z(SB)
      	0x0021 00033 (x.go:7)	RET
      
      After:
      
      "".f8 STEXT nosplit size=30 args=0x8 locals=0x0
      	0x0000 00000 (x.go:7)	TEXT	"".f8(SB), NOSPLIT, $0-8
      	0x0000 00000 (x.go:7)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
      	0x0000 00000 (x.go:7)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
      	0x0000 00000 (x.go:7)	MOVBLZX	"".x+8(SP), AX
      	0x0005 00005 (x.go:7)	MOVBLZX	AL, AX
      	0x0008 00008 (x.go:7)	LEAL	1(AX)(AX*1), AX
      	0x000c 00012 (x.go:7)	BSRL	AX, AX
      	0x000f 00015 (x.go:7)	ADDQ	$-8, AX
      	0x0013 00019 (x.go:7)	NEGQ	AX
      	0x0016 00022 (x.go:7)	MOVQ	AX, "".z(SB)
      	0x001d 00029 (x.go:7)	RET
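
The arithmetic behind the "After" listing can be checked in Go. This is a sketch of the identity the instructions appear to implement, written with math/bits rather than assembly; `leadingZeros8` and `checkAll8` are names introduced here.

```go
package main

import (
	"fmt"
	"math/bits"
)

// leadingZeros8 mirrors the "After" listing: compute 2*x+1 (the LEAL), take
// the index of the highest set bit (the BSRL), and subtract it from 8 (the
// ADDQ $-8 and NEGQ). Because 2*x+1 is never zero, x == 0 needs no special case.
func leadingZeros8(x uint8) int {
	y := 2*uint32(x) + 1
	bsr := bits.Len32(y) - 1 // index of the highest set bit
	return 8 - bsr
}

// checkAll8 reports whether the computed form agrees with math/bits for
// every 8-bit input.
func checkAll8() bool {
	for i := 0; i < 256; i++ {
		if leadingZeros8(uint8(i)) != bits.LeadingZeros8(uint8(i)) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(leadingZeros8(0), leadingZeros8(1), leadingZeros8(255))
	fmt.Println(checkAll8())
}
```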
      
      Change-Id: Icc7db50a7820fb9a3da8a816d6b6940d7f8e193e
      Reviewed-on: https://go-review.googlesource.com/108942
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile: optimize LeadingZeros(16|32) on amd64 · 1d321ada
      Josh Bleecher Snyder authored
      Introduce Len8 and Len16 ops and provide optimized lowerings for them.
      amd64 only for this CL, although it wouldn't surprise me
      if other architectures also admit of optimized lowerings.
      
      Also use and optimize the Len32 lowering, along the same lines.
      
      Leave Len8 unused for the moment; a subsequent CL will enable it.
      
      For 16 and 32 bits, this leads to a speed-up.
      
      name              old time/op  new time/op  delta
      LeadingZeros16-8  1.42ns ± 5%  1.23ns ± 5%  -13.42%  (p=0.000 n=20+20)
      LeadingZeros32-8  1.25ns ± 5%  1.03ns ± 5%  -17.63%  (p=0.000 n=20+16)
      
      Code:
      
      func f16(x uint16) { z = bits.LeadingZeros16(x) }
      func f32(x uint32) { z = bits.LeadingZeros32(x) }
      
      Before:
      
      "".f16 STEXT nosplit size=38 args=0x8 locals=0x0
      	0x0000 00000 (x.go:8)	TEXT	"".f16(SB), NOSPLIT, $0-8
      	0x0000 00000 (x.go:8)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
      	0x0000 00000 (x.go:8)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
      	0x0000 00000 (x.go:8)	MOVWLZX	"".x+8(SP), AX
      	0x0005 00005 (x.go:8)	MOVWLZX	AX, AX
      	0x0008 00008 (x.go:8)	BSRQ	AX, AX
      	0x000c 00012 (x.go:8)	MOVQ	$-1, CX
      	0x0013 00019 (x.go:8)	CMOVQEQ	CX, AX
      	0x0017 00023 (x.go:8)	ADDQ	$-15, AX
      	0x001b 00027 (x.go:8)	NEGQ	AX
      	0x001e 00030 (x.go:8)	MOVQ	AX, "".z(SB)
      	0x0025 00037 (x.go:8)	RET
      
      "".f32 STEXT nosplit size=34 args=0x8 locals=0x0
      	0x0000 00000 (x.go:9)	TEXT	"".f32(SB), NOSPLIT, $0-8
      	0x0000 00000 (x.go:9)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
      	0x0000 00000 (x.go:9)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
      	0x0000 00000 (x.go:9)	MOVL	"".x+8(SP), AX
      	0x0004 00004 (x.go:9)	BSRQ	AX, AX
      	0x0008 00008 (x.go:9)	MOVQ	$-1, CX
      	0x000f 00015 (x.go:9)	CMOVQEQ	CX, AX
      	0x0013 00019 (x.go:9)	ADDQ	$-31, AX
      	0x0017 00023 (x.go:9)	NEGQ	AX
      	0x001a 00026 (x.go:9)	MOVQ	AX, "".z(SB)
      	0x0021 00033 (x.go:9)	RET
      
      After:
      
      "".f16 STEXT nosplit size=30 args=0x8 locals=0x0
      	0x0000 00000 (x.go:8)	TEXT	"".f16(SB), NOSPLIT, $0-8
      	0x0000 00000 (x.go:8)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
      	0x0000 00000 (x.go:8)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
      	0x0000 00000 (x.go:8)	MOVWLZX	"".x+8(SP), AX
      	0x0005 00005 (x.go:8)	MOVWLZX	AX, AX
      	0x0008 00008 (x.go:8)	LEAL	1(AX)(AX*1), AX
      	0x000c 00012 (x.go:8)	BSRL	AX, AX
      	0x000f 00015 (x.go:8)	ADDQ	$-16, AX
      	0x0013 00019 (x.go:8)	NEGQ	AX
      	0x0016 00022 (x.go:8)	MOVQ	AX, "".z(SB)
      	0x001d 00029 (x.go:8)	RET
      
      "".f32 STEXT nosplit size=28 args=0x8 locals=0x0
      	0x0000 00000 (x.go:9)	TEXT	"".f32(SB), NOSPLIT, $0-8
      	0x0000 00000 (x.go:9)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
      	0x0000 00000 (x.go:9)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
      	0x0000 00000 (x.go:9)	MOVL	"".x+8(SP), AX
      	0x0004 00004 (x.go:9)	LEAQ	1(AX)(AX*1), AX
      	0x0009 00009 (x.go:9)	BSRQ	AX, AX
      	0x000d 00013 (x.go:9)	ADDQ	$-32, AX
      	0x0011 00017 (x.go:9)	NEGQ	AX
      	0x0014 00020 (x.go:9)	MOVQ	AX, "".z(SB)
      	0x001b 00027 (x.go:9)	RET
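
Reading the two listings side by side, the old lowering substitutes -1 for a zero input with a conditional move, while the new one avoids the zero case by operating on 2*x+1. The Go sketch below models both with math/bits; the function names are hypothetical and the code is an interpretation of the listings, not the compiler's rewrite rules.

```go
package main

import (
	"fmt"
	"math/bits"
)

// oldLZ16 follows the "Before" listing: BSRQ, then a conditional move that
// substitutes -1 when the input is zero, then 15 minus the result.
func oldLZ16(x uint16) int {
	bsr := -1 // the value CMOVQEQ selects for x == 0
	if x != 0 {
		bsr = bits.Len64(uint64(x)) - 1
	}
	return 15 - bsr
}

// newLZ16 follows the "After" listing: 16 minus the highest set bit of
// 2*x+1, with no conditional move.
func newLZ16(x uint16) int {
	return 16 - (bits.Len32(2*uint32(x)+1) - 1)
}

// newLZ32 is the 32-bit analogue. 2*x+1 can need 33 bits, which matches the
// 64-bit LEAQ and BSRQ in the listing.
func newLZ32(x uint32) int {
	return 32 - (bits.Len64(2*uint64(x)+1) - 1)
}

// agree16 checks both forms against math/bits for all 16-bit inputs.
func agree16() bool {
	for i := 0; i <= 0xFFFF; i++ {
		x := uint16(i)
		want := bits.LeadingZeros16(x)
		if oldLZ16(x) != want || newLZ16(x) != want {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(agree16())
	for _, x := range []uint32{0, 1, 0x8000, 0xFFFFFFFF} {
		fmt.Println(newLZ32(x) == bits.LeadingZeros32(x))
	}
}
```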
      
      Change-Id: I6c93c173752a7bfdeab8be30777ae05a736e1f4b
      Reviewed-on: https://go-review.googlesource.com/108941
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Giovanni Bajo <rasky@develer.com>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile: optimize TrailingZeros(8|16) on amd64 · 54dbab52
      Josh Bleecher Snyder authored
      Introduce Ctz8 and Ctz16 ops and provide optimized lowerings for them.
      amd64 only for this CL, although it wouldn't surprise me
      if other architectures also admit of optimized lowerings.
      
      name               old time/op  new time/op  delta
      TrailingZeros8-8   1.33ns ± 6%  0.84ns ± 3%  -36.90%  (p=0.000 n=20+20)
      TrailingZeros16-8  1.26ns ± 5%  0.84ns ± 5%  -33.50%  (p=0.000 n=20+18)
      
      Code:
      
      func f8(x uint8)   { z = bits.TrailingZeros8(x) }
      func f16(x uint16) { z = bits.TrailingZeros16(x) }
      
      Before:
      
      "".f8 STEXT nosplit size=34 args=0x8 locals=0x0
      	0x0000 00000 (x.go:7)	TEXT	"".f8(SB), NOSPLIT, $0-8
      	0x0000 00000 (x.go:7)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
      	0x0000 00000 (x.go:7)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
      	0x0000 00000 (x.go:7)	MOVBLZX	"".x+8(SP), AX
      	0x0005 00005 (x.go:7)	MOVBLZX	AL, AX
      	0x0008 00008 (x.go:7)	BTSQ	$8, AX
      	0x000d 00013 (x.go:7)	BSFQ	AX, AX
      	0x0011 00017 (x.go:7)	MOVL	$64, CX
      	0x0016 00022 (x.go:7)	CMOVQEQ	CX, AX
      	0x001a 00026 (x.go:7)	MOVQ	AX, "".z(SB)
      	0x0021 00033 (x.go:7)	RET
      
      "".f16 STEXT nosplit size=34 args=0x8 locals=0x0
      	0x0000 00000 (x.go:8)	TEXT	"".f16(SB), NOSPLIT, $0-8
      	0x0000 00000 (x.go:8)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
      	0x0000 00000 (x.go:8)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
      	0x0000 00000 (x.go:8)	MOVWLZX	"".x+8(SP), AX
      	0x0005 00005 (x.go:8)	MOVWLZX	AX, AX
      	0x0008 00008 (x.go:8)	BTSQ	$16, AX
      	0x000d 00013 (x.go:8)	BSFQ	AX, AX
      	0x0011 00017 (x.go:8)	MOVL	$64, CX
      	0x0016 00022 (x.go:8)	CMOVQEQ	CX, AX
      	0x001a 00026 (x.go:8)	MOVQ	AX, "".z(SB)
      	0x0021 00033 (x.go:8)	RET
      
      After:
      
      "".f8 STEXT nosplit size=20 args=0x8 locals=0x0
      	0x0000 00000 (x.go:7)	TEXT	"".f8(SB), NOSPLIT, $0-8
      	0x0000 00000 (x.go:7)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
      	0x0000 00000 (x.go:7)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
      	0x0000 00000 (x.go:7)	MOVBLZX	"".x+8(SP), AX
      	0x0005 00005 (x.go:7)	BTSL	$8, AX
      	0x0009 00009 (x.go:7)	BSFL	AX, AX
      	0x000c 00012 (x.go:7)	MOVQ	AX, "".z(SB)
      	0x0013 00019 (x.go:7)	RET
      
      "".f16 STEXT nosplit size=20 args=0x8 locals=0x0
      	0x0000 00000 (x.go:8)	TEXT	"".f16(SB), NOSPLIT, $0-8
      	0x0000 00000 (x.go:8)	FUNCDATA	$0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
      	0x0000 00000 (x.go:8)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
      	0x0000 00000 (x.go:8)	MOVWLZX	"".x+8(SP), AX
      	0x0005 00005 (x.go:8)	BTSL	$16, AX
      	0x0009 00009 (x.go:8)	BSFL	AX, AX
      	0x000c 00012 (x.go:8)	MOVQ	AX, "".z(SB)
      	0x0013 00019 (x.go:8)	RET
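
The "After" listings set one bit just above the operand width and then scan for the lowest set bit, so a zero input yields the width without a conditional move. A Go model of that idea, using math/bits and hypothetical function names, is sketched below.

```go
package main

import (
	"fmt"
	"math/bits"
)

// trailingZeros8 mirrors the "After" listing: set bit 8 (the BTSL) so the
// value is never zero, then find the lowest set bit (the BSFL).
func trailingZeros8(x uint8) int {
	return bits.TrailingZeros32(uint32(x) | 1<<8)
}

// trailingZeros16 does the same with bit 16 as the sentinel.
func trailingZeros16(x uint16) int {
	return bits.TrailingZeros32(uint32(x) | 1<<16)
}

// agreeTZ checks both forms against math/bits for every input.
func agreeTZ() bool {
	for i := 0; i <= 0xFFFF; i++ {
		if trailingZeros16(uint16(i)) != bits.TrailingZeros16(uint16(i)) {
			return false
		}
		if i < 256 && trailingZeros8(uint8(i)) != bits.TrailingZeros8(uint8(i)) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(trailingZeros8(0), trailingZeros16(0))
	fmt.Println(agreeTZ())
}
```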
      
      Change-Id: I0551e357348de2b724737d569afd6ac9f5c3aa11
      Reviewed-on: https://go-review.googlesource.com/108940
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Giovanni Bajo <rasky@develer.com>
      Reviewed-by: Keith Randall <khr@golang.org>