1. 06 Apr, 2016 26 commits
  2. 05 Apr, 2016 14 commits
    • Matthew Dempsky's avatar
      cmd/compile: move a lot of declarations outside of go.go · 5ba797bd
      Matthew Dempsky authored
      go.go is currently a grab bag of various unrelated type and variable
      declarations. Move a bunch of them into other more relevant source
      files.
      
      There are still more that can be moved, but these were the low hanging
      fruit with obvious homes.
      
      No code/comment changes. Just shuffling stuff around.
      
      Change-Id: I43dbe1a5b8b707709c1a3a034c693d38b8465063
      Reviewed-on: https://go-review.googlesource.com/21561
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      5ba797bd
    • Matthew Dempsky's avatar
      cmd/compile: add comments explaining how declarations/scopes work · cca4ddb4
      Matthew Dempsky authored
      Change-Id: I301760b015eb69ff12eee53473fdbf5e9f168413
      Reviewed-on: https://go-review.googlesource.com/21542Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: 's avatarRobert Griesemer <gri@golang.org>
      cca4ddb4
    • Brad Fitzpatrick's avatar
      crypto/rsa, crypto/ecdsa: fail earlier on zero parameters · d7c699d9
      Brad Fitzpatrick authored
      Change-Id: Ia6ed49d5ef3a256a55e6d4eaa1b4d9f0fc447013
      Reviewed-on: https://go-review.googlesource.com/21560Reviewed-by: 's avatarRobert Griesemer <gri@golang.org>
      d7c699d9
    • Marcel van Lohuizen's avatar
      testing: improve output · 7e0d6602
      Marcel van Lohuizen authored
      This introduces a few changes
      - Skipped benchmarks now print a SKIP line, also if there was
      no output
      - The benchmark name is only printed if there the benchmark
      was not skipped or did not fail in the probe phase.
      
      It also fixes a bug of doubling a skip message in chatty mode in
      absense of a failure.
      
      The chatty flag is now passed in the common struct to allow
      for testing of the printed messages.
      
      Fixes #14799
      
      Change-Id: Ia8eb140c2e5bb467e66b8ef20a2f98f5d95415d5
      Reviewed-on: https://go-review.googlesource.com/21504Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Marcel van Lohuizen <mpvl@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      7e0d6602
    • Josh Bleecher Snyder's avatar
      cmd/compile: pull ssa OAPPEND expression handing into its own function · 5e1b7bde
      Josh Bleecher Snyder authored
      Pure code movement.
      
      Change-Id: Ia07ee0b0041c931b08adf090f262a6f74a6fdb01
      Reviewed-on: https://go-review.googlesource.com/21546
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      5e1b7bde
    • Josh Bleecher Snyder's avatar
      cmd/compile: give TLS relocations a name when dumping assembly · 7735dfb6
      Josh Bleecher Snyder authored
      Before:
      
      	...
      	0x00d0 ff ff ff e8 00 00 00 00 e9 23 ff ff ff cc cc cc  .........#......
      	rel 5+4 t=14 +0
      	rel 82+4 t=13 runtime.writeBarrier+0
      	...
      
      After:
      
      	...
      	0x00d0 ff ff ff e8 00 00 00 00 e9 23 ff ff ff cc cc cc  .........#......
      	rel 5+4 t=14 TLS+0
      	rel 82+4 t=13 runtime.writeBarrier+0
      	...
      
      Change-Id: Ibdaf694581b5fd5fb87fa8ce6a792f3eb4493622
      Reviewed-on: https://go-review.googlesource.com/21545Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      7735dfb6
    • Joe Tsai's avatar
      os: deprecate os.SEEK_SET, os.SEEK_CUR, and os.SEEK_END · 260ea689
      Joe Tsai authored
      CL/19862 introduced the same set of constants to the io package.
      We should steer users away from the os.SEEK* versions and towards
      the io.Seek* versions.
      
      Updates #6885
      
      Change-Id: I96ec5be3ec3439e1295c937159dadaf1ebfb2737
      Reviewed-on: https://go-review.googlesource.com/21540Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      260ea689
    • Robert Griesemer's avatar
      go/importer: match predeclared type list with gc's list in binary exporter · f79b50b8
      Robert Griesemer authored
      I think we had this code before but it may have gone lost somehow.
      
      Change-Id: Ifde490e686de0d2bfe907cbe19c9197f24f5fa8e
      Reviewed-on: https://go-review.googlesource.com/21537Reviewed-by: 's avatarMatthew Dempsky <mdempsky@google.com>
      f79b50b8
    • David Chase's avatar
      cmd/compile: note escape of parts of closured-capture vars · d8c815d8
      David Chase authored
      Missed a case for closure calls (OCALLFUNC && indirect) in
      esc.go:esccall.
      
      Cleanup to runtime code for windows to more thoroughly hide
      a technical escape.  Also made code pickier about failing
      to late non-optional kernel32.dll.
      
      Fixes #14409.
      
      Change-Id: Ie75486a2c8626c4583224e02e4872c2875f7bca5
      Reviewed-on: https://go-review.googlesource.com/20102
      Run-TryBot: David Chase <drchase@google.com>
      Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      d8c815d8
    • Robert Griesemer's avatar
      crypto/dsa: eliminate invalid PublicKey early · eb876dd8
      Robert Griesemer authored
      For PublicKey.P == 0, Verify will fail. Don't even try.
      
      Change-Id: I1009f2b3dead8d0041626c946633acb10086d8c8
      Reviewed-on: https://go-review.googlesource.com/21533Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      eb876dd8
    • Brad Fitzpatrick's avatar
      doc: add httptest.ResponseRecorder note to go1.7.txt notes · b9531d31
      Brad Fitzpatrick authored
      Fixes #14928
      
      Change-Id: Id772eb623815cb2bb3e49de68a916762345a9dc1
      Reviewed-on: https://go-review.googlesource.com/21531Reviewed-by: 's avatarBrad Fitzpatrick <bradfitz@golang.org>
      b9531d31
    • Dmitry Vyukov's avatar
      runtime: don't burn CPU unnecessarily · 475d113b
      Dmitry Vyukov authored
      Two GC-related functions, scang and casgstatus, wait in an active spin loop.
      Active spinning is never a good idea in user-space. Once we wait several
      times more than the expected wait time, something unexpected is happenning
      (e.g. the thread we are waiting for is descheduled or handling a page fault)
      and we need to yield to OS scheduler. Moreover, the expected wait time is
      very high for these functions: scang wait time can be tens of milliseconds,
      casgstatus can be hundreds of microseconds. It does not make sense to spin
      even for that time.
      
      go install -a std profile on a 4-core machine shows that 11% of time is spent
      in the active spin in scang:
      
        6.12%    compile  compile                [.] runtime.scang
        3.27%    compile  compile                [.] runtime.readgstatus
        1.72%    compile  compile                [.] runtime/internal/atomic.Load
      
      The active spin also increases tail latency in the case of the slightest
      oversubscription: GC goroutines spend whole quantum in the loop instead of
      executing user code.
      
      Here is scang wait time histogram during go install -a std:
      
      13707.0000 - 1815442.7667 [   118]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎...
      1815442.7667 - 3617178.5333 [     9]: ∎∎∎∎∎∎∎∎∎
      3617178.5333 - 5418914.3000 [    11]: ∎∎∎∎∎∎∎∎∎∎∎
      5418914.3000 - 7220650.0667 [     5]: ∎∎∎∎∎
      7220650.0667 - 9022385.8333 [    12]: ∎∎∎∎∎∎∎∎∎∎∎∎
      9022385.8333 - 10824121.6000 [    13]: ∎∎∎∎∎∎∎∎∎∎∎∎∎
      10824121.6000 - 12625857.3667 [    15]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
      12625857.3667 - 14427593.1333 [    18]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
      14427593.1333 - 16229328.9000 [    18]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
      16229328.9000 - 18031064.6667 [    32]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
      18031064.6667 - 19832800.4333 [    28]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
      19832800.4333 - 21634536.2000 [     6]: ∎∎∎∎∎∎
      21634536.2000 - 23436271.9667 [    15]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
      23436271.9667 - 25238007.7333 [    11]: ∎∎∎∎∎∎∎∎∎∎∎
      25238007.7333 - 27039743.5000 [    27]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
      27039743.5000 - 28841479.2667 [    20]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
      28841479.2667 - 30643215.0333 [    10]: ∎∎∎∎∎∎∎∎∎∎
      30643215.0333 - 32444950.8000 [     7]: ∎∎∎∎∎∎∎
      32444950.8000 - 34246686.5667 [     4]: ∎∎∎∎
      34246686.5667 - 36048422.3333 [     4]: ∎∎∎∎
      36048422.3333 - 37850158.1000 [     1]: ∎
      37850158.1000 - 39651893.8667 [     5]: ∎∎∎∎∎
      39651893.8667 - 41453629.6333 [     2]: ∎∎
      41453629.6333 - 43255365.4000 [     2]: ∎∎
      43255365.4000 - 45057101.1667 [     2]: ∎∎
      45057101.1667 - 46858836.9333 [     1]: ∎
      46858836.9333 - 48660572.7000 [     2]: ∎∎
      48660572.7000 - 50462308.4667 [     3]: ∎∎∎
      50462308.4667 - 52264044.2333 [     2]: ∎∎
      52264044.2333 - 54065780.0000 [     2]: ∎∎
      
      and the zoomed-in first part:
      
      13707.0000 - 19916.7667 [     2]: ∎∎
      19916.7667 - 26126.5333 [     2]: ∎∎
      26126.5333 - 32336.3000 [     9]: ∎∎∎∎∎∎∎∎∎
      32336.3000 - 38546.0667 [     8]: ∎∎∎∎∎∎∎∎
      38546.0667 - 44755.8333 [    12]: ∎∎∎∎∎∎∎∎∎∎∎∎
      44755.8333 - 50965.6000 [    10]: ∎∎∎∎∎∎∎∎∎∎
      50965.6000 - 57175.3667 [     5]: ∎∎∎∎∎
      57175.3667 - 63385.1333 [     6]: ∎∎∎∎∎∎
      63385.1333 - 69594.9000 [     5]: ∎∎∎∎∎
      69594.9000 - 75804.6667 [     6]: ∎∎∎∎∎∎
      75804.6667 - 82014.4333 [     6]: ∎∎∎∎∎∎
      82014.4333 - 88224.2000 [     4]: ∎∎∎∎
      88224.2000 - 94433.9667 [     1]: ∎
      94433.9667 - 100643.7333 [     1]: ∎
      100643.7333 - 106853.5000 [     2]: ∎∎
      106853.5000 - 113063.2667 [     0]:
      113063.2667 - 119273.0333 [     2]: ∎∎
      119273.0333 - 125482.8000 [     2]: ∎∎
      125482.8000 - 131692.5667 [     1]: ∎
      131692.5667 - 137902.3333 [     1]: ∎
      137902.3333 - 144112.1000 [     0]:
      144112.1000 - 150321.8667 [     2]: ∎∎
      150321.8667 - 156531.6333 [     1]: ∎
      156531.6333 - 162741.4000 [     1]: ∎
      162741.4000 - 168951.1667 [     0]:
      168951.1667 - 175160.9333 [     0]:
      175160.9333 - 181370.7000 [     1]: ∎
      181370.7000 - 187580.4667 [     1]: ∎
      187580.4667 - 193790.2333 [     2]: ∎∎
      193790.2333 - 200000.0000 [     0]:
      
      Here is casgstatus wait time histogram:
      
        631.0000 -  5276.6333 [     3]: ∎∎∎
       5276.6333 -  9922.2667 [     5]: ∎∎∎∎∎
       9922.2667 - 14567.9000 [     2]: ∎∎
      14567.9000 - 19213.5333 [     6]: ∎∎∎∎∎∎
      19213.5333 - 23859.1667 [     5]: ∎∎∎∎∎
      23859.1667 - 28504.8000 [     6]: ∎∎∎∎∎∎
      28504.8000 - 33150.4333 [     6]: ∎∎∎∎∎∎
      33150.4333 - 37796.0667 [     2]: ∎∎
      37796.0667 - 42441.7000 [     1]: ∎
      42441.7000 - 47087.3333 [     3]: ∎∎∎
      47087.3333 - 51732.9667 [     0]:
      51732.9667 - 56378.6000 [     1]: ∎
      56378.6000 - 61024.2333 [     0]:
      61024.2333 - 65669.8667 [     0]:
      65669.8667 - 70315.5000 [     0]:
      70315.5000 - 74961.1333 [     1]: ∎
      74961.1333 - 79606.7667 [     0]:
      79606.7667 - 84252.4000 [     0]:
      84252.4000 - 88898.0333 [     0]:
      88898.0333 - 93543.6667 [     0]:
      93543.6667 - 98189.3000 [     0]:
      98189.3000 - 102834.9333 [     0]:
      102834.9333 - 107480.5667 [     1]: ∎
      107480.5667 - 112126.2000 [     0]:
      112126.2000 - 116771.8333 [     0]:
      116771.8333 - 121417.4667 [     0]:
      121417.4667 - 126063.1000 [     0]:
      126063.1000 - 130708.7333 [     0]:
      130708.7333 - 135354.3667 [     0]:
      135354.3667 - 140000.0000 [     1]: ∎
      
      Ideally we eliminate the waiting by switching to async
      state machine for GC, but for now just yield to OS scheduler
      after a reasonable wait time.
      
      To choose yielding parameters I've measured
      golang.org/x/benchmarks/http tail latencies with different yield
      delays and oversubscription levels.
      
      With no oversubscription (to the degree possible):
      
      scang yield delay = 1, casgstatus yield delay = 1
      Latency-50   1.41ms ±15%  1.41ms ± 5%    ~     (p=0.611 n=13+12)
      Latency-95   5.21ms ± 2%  5.15ms ± 2%  -1.15%  (p=0.012 n=13+13)
      Latency-99   7.16ms ± 2%  7.05ms ± 2%  -1.54%  (p=0.002 n=13+13)
      Latency-999  10.7ms ± 9%  10.2ms ±10%  -5.46%  (p=0.004 n=12+13)
      
      scang yield delay = 5000, casgstatus yield delay = 3000
      Latency-50   1.41ms ±15%  1.41ms ± 8%    ~     (p=0.511 n=13+13)
      Latency-95   5.21ms ± 2%  5.14ms ± 2%  -1.23%  (p=0.006 n=13+13)
      Latency-99   7.16ms ± 2%  7.02ms ± 2%  -1.94%  (p=0.000 n=13+13)
      Latency-999  10.7ms ± 9%  10.1ms ± 8%  -6.14%  (p=0.000 n=12+13)
      
      scang yield delay = 10000, casgstatus yield delay = 5000
      Latency-50   1.41ms ±15%  1.45ms ± 6%    ~     (p=0.724 n=13+13)
      Latency-95   5.21ms ± 2%  5.18ms ± 1%    ~     (p=0.287 n=13+13)
      Latency-99   7.16ms ± 2%  7.05ms ± 2%  -1.64%  (p=0.002 n=13+13)
      Latency-999  10.7ms ± 9%  10.0ms ± 5%  -6.72%  (p=0.000 n=12+13)
      
      scang yield delay = 30000, casgstatus yield delay = 10000
      Latency-50   1.41ms ±15%  1.51ms ± 7%  +6.57%  (p=0.002 n=13+13)
      Latency-95   5.21ms ± 2%  5.21ms ± 2%    ~     (p=0.960 n=13+13)
      Latency-99   7.16ms ± 2%  7.06ms ± 2%  -1.50%  (p=0.012 n=13+13)
      Latency-999  10.7ms ± 9%  10.0ms ± 6%  -6.49%  (p=0.000 n=12+13)
      
      scang yield delay = 100000, casgstatus yield delay = 50000
      Latency-50   1.41ms ±15%  1.53ms ± 6%  +8.48%  (p=0.000 n=13+12)
      Latency-95   5.21ms ± 2%  5.23ms ± 2%    ~     (p=0.287 n=13+13)
      Latency-99   7.16ms ± 2%  7.08ms ± 2%  -1.21%  (p=0.004 n=13+13)
      Latency-999  10.7ms ± 9%   9.9ms ± 3%  -7.99%  (p=0.000 n=12+12)
      
      scang yield delay = 200000, casgstatus yield delay = 100000
      Latency-50   1.41ms ±15%  1.47ms ± 5%    ~     (p=0.072 n=13+13)
      Latency-95   5.21ms ± 2%  5.17ms ± 2%    ~     (p=0.091 n=13+13)
      Latency-99   7.16ms ± 2%  7.02ms ± 2%  -1.99%  (p=0.000 n=13+13)
      Latency-999  10.7ms ± 9%   9.9ms ± 5%  -7.86%  (p=0.000 n=12+13)
      
      With slight oversubscription (another instance of http benchmark
      was running in background with reduced GOMAXPROCS):
      
      scang yield delay = 1, casgstatus yield delay = 1
      Latency-50    840µs ± 3%   804µs ± 3%  -4.37%  (p=0.000 n=15+18)
      Latency-95   6.52ms ± 4%  6.03ms ± 4%  -7.51%  (p=0.000 n=18+18)
      Latency-99   10.8ms ± 7%  10.0ms ± 4%  -7.33%  (p=0.000 n=18+14)
      Latency-999  18.0ms ± 9%  16.8ms ± 7%  -6.84%  (p=0.000 n=18+18)
      
      scang yield delay = 5000, casgstatus yield delay = 3000
      Latency-50    840µs ± 3%   809µs ± 3%  -3.71%  (p=0.000 n=15+17)
      Latency-95   6.52ms ± 4%  6.11ms ± 4%  -6.29%  (p=0.000 n=18+18)
      Latency-99   10.8ms ± 7%   9.9ms ± 6%  -7.55%  (p=0.000 n=18+18)
      Latency-999  18.0ms ± 9%  16.5ms ±11%  -8.49%  (p=0.000 n=18+18)
      
      scang yield delay = 10000, casgstatus yield delay = 5000
      Latency-50    840µs ± 3%   823µs ± 5%  -2.06%  (p=0.002 n=15+18)
      Latency-95   6.52ms ± 4%  6.32ms ± 3%  -3.05%  (p=0.000 n=18+18)
      Latency-99   10.8ms ± 7%  10.2ms ± 4%  -5.22%  (p=0.000 n=18+18)
      Latency-999  18.0ms ± 9%  16.7ms ±10%  -7.09%  (p=0.000 n=18+18)
      
      scang yield delay = 30000, casgstatus yield delay = 10000
      Latency-50    840µs ± 3%   836µs ± 5%    ~     (p=0.442 n=15+18)
      Latency-95   6.52ms ± 4%  6.39ms ± 3%  -2.00%  (p=0.000 n=18+18)
      Latency-99   10.8ms ± 7%  10.2ms ± 6%  -5.15%  (p=0.000 n=18+17)
      Latency-999  18.0ms ± 9%  16.6ms ± 8%  -7.48%  (p=0.000 n=18+18)
      
      scang yield delay = 100000, casgstatus yield delay = 50000
      Latency-50    840µs ± 3%   836µs ± 6%    ~     (p=0.401 n=15+18)
      Latency-95   6.52ms ± 4%  6.40ms ± 4%  -1.79%  (p=0.010 n=18+18)
      Latency-99   10.8ms ± 7%  10.2ms ± 5%  -4.95%  (p=0.000 n=18+18)
      Latency-999  18.0ms ± 9%  16.5ms ±14%  -8.17%  (p=0.000 n=18+18)
      
      scang yield delay = 200000, casgstatus yield delay = 100000
      Latency-50    840µs ± 3%   828µs ± 2%  -1.49%  (p=0.001 n=15+17)
      Latency-95   6.52ms ± 4%  6.38ms ± 4%  -2.04%  (p=0.001 n=18+18)
      Latency-99   10.8ms ± 7%  10.2ms ± 4%  -4.77%  (p=0.000 n=18+18)
      Latency-999  18.0ms ± 9%  16.9ms ± 9%  -6.23%  (p=0.000 n=18+18)
      
      With significant oversubscription (background http benchmark
      was running with full GOMAXPROCS):
      
      scang yield delay = 1, casgstatus yield delay = 1
      Latency-50   1.32ms ±12%  1.30ms ±13%    ~     (p=0.454 n=14+14)
      Latency-95   16.3ms ±10%  15.3ms ± 7%  -6.29%  (p=0.001 n=14+14)
      Latency-99   29.4ms ±10%  27.9ms ± 5%  -5.04%  (p=0.001 n=14+12)
      Latency-999  49.9ms ±19%  45.9ms ± 5%  -8.00%  (p=0.008 n=14+13)
      
      scang yield delay = 5000, casgstatus yield delay = 3000
      Latency-50   1.32ms ±12%  1.29ms ± 9%    ~     (p=0.227 n=14+14)
      Latency-95   16.3ms ±10%  15.4ms ± 5%  -5.27%  (p=0.002 n=14+14)
      Latency-99   29.4ms ±10%  27.9ms ± 6%  -5.16%  (p=0.001 n=14+14)
      Latency-999  49.9ms ±19%  46.8ms ± 8%  -6.21%  (p=0.050 n=14+14)
      
      scang yield delay = 10000, casgstatus yield delay = 5000
      Latency-50   1.32ms ±12%  1.35ms ± 9%     ~     (p=0.401 n=14+14)
      Latency-95   16.3ms ±10%  15.0ms ± 4%   -7.67%  (p=0.000 n=14+14)
      Latency-99   29.4ms ±10%  27.4ms ± 5%   -6.98%  (p=0.000 n=14+14)
      Latency-999  49.9ms ±19%  44.7ms ± 5%  -10.56%  (p=0.000 n=14+11)
      
      scang yield delay = 30000, casgstatus yield delay = 10000
      Latency-50   1.32ms ±12%  1.36ms ±10%     ~     (p=0.246 n=14+14)
      Latency-95   16.3ms ±10%  14.9ms ± 5%   -8.31%  (p=0.000 n=14+14)
      Latency-99   29.4ms ±10%  27.4ms ± 7%   -6.70%  (p=0.000 n=14+14)
      Latency-999  49.9ms ±19%  44.9ms ±15%  -10.13%  (p=0.003 n=14+14)
      
      scang yield delay = 100000, casgstatus yield delay = 50000
      Latency-50   1.32ms ±12%  1.41ms ± 9%  +6.37%  (p=0.008 n=14+13)
      Latency-95   16.3ms ±10%  15.1ms ± 8%  -7.45%  (p=0.000 n=14+14)
      Latency-99   29.4ms ±10%  27.5ms ±12%  -6.67%  (p=0.002 n=14+14)
      Latency-999  49.9ms ±19%  45.9ms ±16%  -8.06%  (p=0.019 n=14+14)
      
      scang yield delay = 200000, casgstatus yield delay = 100000
      Latency-50   1.32ms ±12%  1.42ms ±10%   +7.21%  (p=0.003 n=14+14)
      Latency-95   16.3ms ±10%  15.0ms ± 7%   -7.59%  (p=0.000 n=14+14)
      Latency-99   29.4ms ±10%  27.3ms ± 8%   -7.20%  (p=0.000 n=14+14)
      Latency-999  49.9ms ±19%  44.8ms ± 8%  -10.21%  (p=0.001 n=14+13)
      
      All numbers are on 8 cores and with GOGC=10 (http benchmark has
      tiny heap, few goroutines and low allocation rate, so by default
      GC barely affects tail latency).
      
      10us/5us yield delays seem to provide a reasonable compromise
      and give 5-10% tail latency reduction. That's what used in this change.
      
      go install -a std results on 4 core machine:
      
      name      old time/op  new time/op  delta
      Time       8.39s ± 2%   7.94s ± 2%  -5.34%  (p=0.000 n=47+49)
      UserTime   24.6s ± 2%   22.9s ± 2%  -6.76%  (p=0.000 n=49+49)
      SysTime    1.77s ± 9%   1.89s ±11%  +7.00%  (p=0.000 n=49+49)
      CpuLoad    315ns ± 2%   313ns ± 1%  -0.59%  (p=0.000 n=49+48) # %CPU
      MaxRSS    97.1ms ± 4%  97.5ms ± 9%    ~     (p=0.838 n=46+49) # bytes
      
      Update #14396
      Update #14189
      
      Change-Id: I3f4109bf8f7fd79b39c466576690a778232055a2
      Reviewed-on: https://go-review.googlesource.com/21503
      Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: 's avatarRick Hudson <rlh@golang.org>
      Reviewed-by: 's avatarAustin Clements <austin@google.com>
      475d113b
    • Dmitry Vyukov's avatar
      runtime: sleep less when we can do work · 3b246fa8
      Dmitry Vyukov authored
      Usleep(100) in runqgrab negatively affects latency and throughput
      of parallel application. We are sleeping instead of doing useful work.
      This is effect is particularly visible on windows where minimal
      sleep duration is 1-15ms.
      
      Reduce sleep from 100us to 3us and use osyield on windows.
      Sync chan send/recv takes ~50ns, so 3us gives us ~50x overshoot.
      
      benchmark                    old ns/op     new ns/op     delta
      BenchmarkChanSync-12         216           217           +0.46%
      BenchmarkChanSyncWork-12     27213         25816         -5.13%
      
      CPU consumption goes up from 106% to 108% in the first case,
      and from 107% to 125% in the second case.
      
      Test case from #14790 on windows:
      
      BenchmarkDefaultResolution-8  4583372   29720    -99.35%
      Benchmark1ms-8                992056    30701    -96.91%
      
      99-th latency percentile for HTTP request serving is improved by up to 15%
      (see http://golang.org/cl/20835 for details).
      
      The following benchmarks are from the change that originally added this sleep
      (see https://golang.org/s/go15gomaxprocs):
      
      name        old time/op  new time/op  delta
      Chain       22.6µs ± 2%  22.7µs ± 6%    ~      (p=0.905 n=9+10)
      ChainBuf    22.4µs ± 3%  22.5µs ± 4%    ~      (p=0.780 n=9+10)
      Chain-2     23.5µs ± 4%  24.9µs ± 1%  +5.66%   (p=0.000 n=10+9)
      ChainBuf-2  23.7µs ± 1%  24.4µs ± 1%  +3.31%   (p=0.000 n=9+10)
      Chain-4     24.2µs ± 2%  25.1µs ± 3%  +3.70%   (p=0.000 n=9+10)
      ChainBuf-4  24.4µs ± 5%  25.0µs ± 2%  +2.37%  (p=0.023 n=10+10)
      Powser       2.37s ± 1%   2.37s ± 1%    ~       (p=0.423 n=8+9)
      Powser-2     2.48s ± 2%   2.57s ± 2%  +3.74%   (p=0.000 n=10+9)
      Powser-4     2.66s ± 1%   2.75s ± 1%  +3.40%  (p=0.000 n=10+10)
      Sieve        13.3s ± 2%   13.3s ± 2%    ~      (p=1.000 n=10+9)
      Sieve-2      7.00s ± 2%   7.44s ±16%    ~      (p=0.408 n=8+10)
      Sieve-4      4.13s ±21%   3.85s ±22%    ~       (p=0.113 n=9+9)
      
      Fixes #14790
      
      Change-Id: Ie7c6a1c4f9c8eb2f5d65ab127a3845386d6f8b5d
      Reviewed-on: https://go-review.googlesource.com/20835Reviewed-by: 's avatarAustin Clements <austin@google.com>
      3b246fa8
    • Ilya Tocar's avatar
      cmd/compile/internal/amd64: Use 32-bit operands for byte operations · 036d09d5
      Ilya Tocar authored
      We already generate ADDL for byte operations, reflect this in code.
      This also allows inc/dec for +-1 operation, which are 1-byte shorter,
      and enables lea for 3-operand addition/subtraction.
      
      Change-Id: Ibfdfee50667ca4cd3c28f72e3dece0c6d114d3ae
      Reviewed-on: https://go-review.googlesource.com/21251Reviewed-by: 's avatarKeith Randall <khr@golang.org>
      Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      036d09d5