1. 23 Nov, 2017 8 commits
  2. 22 Nov, 2017 21 commits
  3. 21 Nov, 2017 11 commits
    • runtime: fix build on non-Linux platforms · 1e3f563b
      Brad Fitzpatrick authored
      CL 78538 was updated after running TryBots to depend on
      syscall.Nanosleep, which isn't available on all non-Linux platforms.
      
      Change-Id: I1fa615232b3920453431861310c108b208628441
      Reviewed-on: https://go-review.googlesource.com/79175
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
    • time: rename TestLoadLocationFromTzinfo to TestLoadLocationFromTZData · 597213c8
      Dmitri Shuralyov authored
      Tzinfo was replaced with TZData during the review of CL 68890, but this
      instance was forgotten. Update it for consistency.
      
      Follows CL 68890.
      Updates #20629.
      
      Change-Id: Id6d3c4f5f7572b01065f2db556db605452d1b570
      Reviewed-on: https://go-review.googlesource.com/79176
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/internal/obj/x86: fix /is4 encoding for VBLEND · 49322ca9
      isharipo authored
      Fixes VBLENDVP{D/S}, VPBLENDVB encoding for /is4 imm8[7:4]
      encoded register operand.
      
      Explanation:
      `reg[r]+regrex[r]+1` yields correct values for register indexes 8..15,
      but for 0..7 it gives `index+1`.
      No test used the lower 8 registers with /is4 encoding,
      so the bug passed the tests.
      The proper solution is to get the 4th bit from regrex with a proper shift:
      `reg[r]|(regrex[r]<<1)`.
      
      Instead of inlining the `reg[r]|(regrex[r]<<1)` expression,
      use a new `regIndex(r)` function.
      
      A test that reproduces this issue is added to the
      amd64enc_extra.s test suite.
      
      Bug came from https://golang.org/cl/70650.
      
      Change-Id: I846a25e88d5e6df88df9d9c3f5fe94ec55416a33
      Reviewed-on: https://go-review.googlesource.com/78815
      Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
    • doc: fix some typos in diagnostics.html · 78615844
      Reilly Watson authored
      The section about custom pprof paths referenced the wrong path.
      
      This also fixes a couple of minor grammatical issues elsewhere in the doc.
      
      Fixes #22832
      
      Change-Id: I890cceb53a13c1958d9cf958c658ccfcbb6863d5
      Reviewed-on: https://go-review.googlesource.com/79035
      Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
    • time: fix build on Android · 40d8b4b2
      Brad Fitzpatrick authored
      Some type renames from CL 79017 were missing in the Android file.
      
      Change-Id: I419215575ca7975241afb8d2069560c8b1d142c6
      Reviewed-on: https://go-review.googlesource.com/79136
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • runtime: skip netpoll check if there are no waiters · b75b4d0e
      Michael Pratt authored
      If there are no netpoll waiters then calling netpoll will never find any
      goroutines. The later blocking netpoll in findrunnable already has this
      optimization.
      
      With golang.org/cl/78538 also applied, this change has a small impact on
      latency:
      
      name                             old time/op  new time/op  delta
      WakeupParallelSpinning/0s-12     13.6µs ± 1%  13.7µs ± 1%    ~     (p=0.873 n=19+20)
      WakeupParallelSpinning/1µs-12    17.7µs ± 0%  17.6µs ± 0%  -0.31%  (p=0.000 n=20+20)
      WakeupParallelSpinning/2µs-12    20.2µs ± 2%  19.9µs ± 1%  -1.59%  (p=0.000 n=20+19)
      WakeupParallelSpinning/5µs-12    32.0µs ± 1%  32.1µs ± 1%    ~     (p=0.201 n=20+19)
      WakeupParallelSpinning/10µs-12   51.7µs ± 0%  51.4µs ± 1%  -0.60%  (p=0.000 n=20+18)
      WakeupParallelSpinning/20µs-12   92.2µs ± 0%  92.2µs ± 0%    ~     (p=0.474 n=19+19)
      WakeupParallelSpinning/50µs-12    215µs ± 0%   215µs ± 0%    ~     (p=0.319 n=20+19)
      WakeupParallelSpinning/100µs-12   330µs ± 2%   331µs ± 2%    ~     (p=0.296 n=20+19)
      WakeupParallelSyscall/0s-12       127µs ± 0%   126µs ± 0%  -0.57%  (p=0.000 n=18+18)
      WakeupParallelSyscall/1µs-12      129µs ± 0%   128µs ± 1%  -0.43%  (p=0.000 n=18+19)
      WakeupParallelSyscall/2µs-12      131µs ± 1%   130µs ± 1%  -0.78%  (p=0.000 n=20+19)
      WakeupParallelSyscall/5µs-12      137µs ± 1%   136µs ± 0%  -0.54%  (p=0.000 n=18+19)
      WakeupParallelSyscall/10µs-12     147µs ± 1%   146µs ± 0%  -0.58%  (p=0.000 n=18+19)
      WakeupParallelSyscall/20µs-12     168µs ± 0%   167µs ± 0%  -0.52%  (p=0.000 n=19+19)
      WakeupParallelSyscall/50µs-12     228µs ± 0%   227µs ± 0%  -0.37%  (p=0.000 n=19+18)
      WakeupParallelSyscall/100µs-12    329µs ± 0%   328µs ± 0%  -0.28%  (p=0.000 n=20+18)
      
      There is a bigger improvement in CPU utilization. Before this CL, these
      benchmarks spent 12% of cycles in netpoll, which are gone after this CL.
      
      This also fixes the sched.lastpoll load, which should be atomic.
      
      Change-Id: I600961460608bd5ba3eeddc599493d2be62064c6
      Reviewed-on: https://go-review.googlesource.com/78915
      Run-TryBot: Michael Pratt <mpratt@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Austin Clements <austin@google.com>
    • runtime: only sleep before stealing work from a running P · 868c8b37
      Jamie Liu authored
      The sleep in question does not make sense if the stolen-from P cannot
      run the stolen G. The usleep(3) has been observed delaying execution of
      woken G's by ~60us; skipping it reduces the wakeup-to-execution latency
      to ~7us in these cases, improving CPU utilization.
      
      Benchmarks added by this change:
      
      name                             old time/op  new time/op  delta
      WakeupParallelSpinning/0s-12     14.4µs ± 1%  14.3µs ± 1%     ~     (p=0.227 n=19+20)
      WakeupParallelSpinning/1µs-12    18.3µs ± 0%  18.3µs ± 1%     ~     (p=0.950 n=20+19)
      WakeupParallelSpinning/2µs-12    22.3µs ± 1%  22.3µs ± 1%     ~     (p=0.670 n=20+18)
      WakeupParallelSpinning/5µs-12    31.7µs ± 0%  31.7µs ± 0%     ~     (p=0.460 n=20+17)
      WakeupParallelSpinning/10µs-12   51.8µs ± 0%  51.8µs ± 0%     ~     (p=0.883 n=20+20)
      WakeupParallelSpinning/20µs-12   91.9µs ± 0%  91.9µs ± 0%     ~     (p=0.245 n=20+20)
      WakeupParallelSpinning/50µs-12    214µs ± 0%   214µs ± 0%     ~     (p=0.509 n=19+20)
      WakeupParallelSpinning/100µs-12   335µs ± 0%   335µs ± 0%   -0.05%  (p=0.006 n=17+15)
      WakeupParallelSyscall/0s-12       228µs ± 2%   129µs ± 1%  -43.32%  (p=0.000 n=20+19)
      WakeupParallelSyscall/1µs-12      232µs ± 1%   131µs ± 1%  -43.60%  (p=0.000 n=19+20)
      WakeupParallelSyscall/2µs-12      236µs ± 1%   133µs ± 1%  -43.44%  (p=0.000 n=18+19)
      WakeupParallelSyscall/5µs-12      248µs ± 2%   139µs ± 1%  -43.68%  (p=0.000 n=18+19)
      WakeupParallelSyscall/10µs-12     263µs ± 3%   150µs ± 2%  -42.97%  (p=0.000 n=18+20)
      WakeupParallelSyscall/20µs-12     281µs ± 2%   170µs ± 1%  -39.43%  (p=0.000 n=19+19)
      WakeupParallelSyscall/50µs-12     345µs ± 4%   246µs ± 7%  -28.85%  (p=0.000 n=20+20)
      WakeupParallelSyscall/100µs-12    460µs ± 5%   350µs ± 4%  -23.85%  (p=0.000 n=20+20)
      
      Benchmarks associated with the change that originally added this sleep
      (see https://golang.org/s/go15gomaxprocs):
      
      name        old time/op  new time/op  delta
      Chain       19.4µs ± 2%  19.3µs ± 1%    ~     (p=0.101 n=19+20)
      ChainBuf    19.5µs ± 2%  19.4µs ± 2%    ~     (p=0.840 n=19+19)
      Chain-2     19.9µs ± 1%  19.9µs ± 2%    ~     (p=0.734 n=19+19)
      ChainBuf-2  20.0µs ± 2%  20.0µs ± 2%    ~     (p=0.175 n=19+17)
      Chain-4     20.3µs ± 1%  20.1µs ± 1%  -0.62%  (p=0.010 n=19+18)
      ChainBuf-4  20.3µs ± 1%  20.2µs ± 1%  -0.52%  (p=0.023 n=19+19)
      Powser       2.09s ± 1%   2.10s ± 3%    ~     (p=0.908 n=19+19)
      Powser-2     2.21s ± 1%   2.20s ± 1%  -0.35%  (p=0.010 n=19+18)
      Powser-4     2.31s ± 2%   2.31s ± 2%    ~     (p=0.578 n=18+19)
      Sieve        13.6s ± 1%   13.6s ± 1%    ~     (p=0.909 n=17+18)
      Sieve-2      8.02s ±52%   7.28s ±15%    ~     (p=0.336 n=20+16)
      Sieve-4      4.00s ±35%   3.98s ±26%    ~     (p=0.654 n=20+18)
      
      Change-Id: I58edd8ce01075859d871e2348fc0833e9c01f70f
      Reviewed-on: https://go-review.googlesource.com/78538
      Reviewed-by: Austin Clements <austin@google.com>
    • time: enable Location loading from user provided timezone data · 2951f909
      Florian Uekermann authored
      The return values of LoadLocation are inherently dependent
      on the runtime environment. Add LoadLocationFromTZData, whose
      results depend only on the timezone data provided as arguments.
      
      Fixes #20629
      
      Change-Id: I43b181f4c05c219be3ec57327540263b7cb3b2aa
      Reviewed-on: https://go-review.googlesource.com/68890
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • bytes: add optimized countByte for arm64 · 9a14cd9e
      Wei Xiao authored
      Use SIMD instructions when counting a single byte.
      Inspired by the runtime IndexByte implementation.
      
      Benchmark results for bytes, where 1 byte in every 8 is the one we are looking for:
      
      name               old time/op   new time/op    delta
      CountSingle/10-8    96.1ns ± 1%    38.8ns ± 0%    -59.64%  (p=0.000 n=9+7)
      CountSingle/32-8     172ns ± 2%      36ns ± 1%    -79.27%  (p=0.000 n=10+10)
      CountSingle/4K-8    18.2µs ± 1%     0.9µs ± 0%    -95.17%  (p=0.000 n=9+10)
      CountSingle/4M-8    18.4ms ± 0%     0.9ms ± 0%    -95.00%  (p=0.000 n=10+9)
      CountSingle/64M-8    284ms ± 4%      19ms ± 0%    -93.40%  (p=0.000 n=10+10)
      
      name               old speed     new speed      delta
      CountSingle/10-8   104MB/s ± 1%   258MB/s ± 0%   +147.99%  (p=0.000 n=9+10)
      CountSingle/32-8   185MB/s ± 1%   897MB/s ± 1%   +385.33%  (p=0.000 n=9+10)
      CountSingle/4K-8   225MB/s ± 1%  4658MB/s ± 0%  +1967.40%  (p=0.000 n=9+10)
      CountSingle/4M-8   228MB/s ± 0%  4555MB/s ± 0%  +1901.71%  (p=0.000 n=10+9)
      CountSingle/64M-8  236MB/s ± 4%  3575MB/s ± 0%  +1414.69%  (p=0.000 n=10+10)
      
      Change-Id: Ifccb51b3c8658c49773fe05147c3cf3aead361e5
      Reviewed-on: https://go-review.googlesource.com/71111
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile: ignore RegKill ops for non-phi after phi check · 63ef3cde
      Than McIntosh authored
      Relax the 'phi after non-phi' SSA sanity check to allow
      RegKill ops interspersed with phi ops in a block. This fixes
      a sanity check failure when -dwarflocationlists is enabled.
      
      Updates #22694.
      
      Change-Id: Iaae604ab6f1a8b150664dd120003727a6fb2f698
      Reviewed-on: https://go-review.googlesource.com/77610
      Run-TryBot: Than McIntosh <thanm@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
    • cmd/compile: fix comment that -N does not disable escape analysis · 4fbf54fa
      Cherry Zhang authored
      -N does not disable escape analysis. Remove the outdated comment.
      
      Change-Id: I96978b3afd51324b7b4f8035cf4417fb2eac4ebc
      Reviewed-on: https://go-review.googlesource.com/79015
      Reviewed-by: David Chase <drchase@google.com>