1. 26 Mar, 2018 7 commits
    • runtime: fix comment typo · 3412baaa
      Zhou Peng authored
      This was a typo, judging by the if condition and runtime/mheap.go:323.
      
      Change-Id: Id046d4afbfe0ea43cb29e1a9f400e1f130de221d
      Reviewed-on: https://go-review.googlesource.com/102575
      Reviewed-by: Austin Clements <austin@google.com>
      3412baaa
    • path/filepath: change example to print the correct path on failure · 683e2fd5
      Erwin Oegema authored
      This change makes errors in the example code a bit better, as it's no use to show the root dir when an error occurs walking a subdirectory or file.
      
      Change-Id: I546276e9b151fabba5357258f03bfbd47a508201
      GitHub-Last-Rev: 398c1eeb6164a7edc6fdee8cb8c17c3bd0b649ef
      GitHub-Pull-Request: golang/go#24536
      Reviewed-on: https://go-review.googlesource.com/102535
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      683e2fd5
    • io: document that ReadAtLeast and ReadFull can drop errors · 665af046
      Agniva De Sarker authored
      Add a note that if an error is returned after having read
      at least the minimum no. of bytes, the error is set to nil.
      
      Fixes #20477
      
      Change-Id: I75ba5ee967be3ff80249e40d459da4afeeb53463
      Reviewed-on: https://go-review.googlesource.com/102459
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      665af046
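The behavior documented by the io change above can be seen with a small sketch. The reader type `dataThenErr` and the helper `readFullFrom` are hypothetical names introduced for illustration; the reader returns its data together with a non-nil error in the same call, which io.Reader permits.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// dataThenErr is a hypothetical reader that returns its remaining data and a
// non-nil error from the same Read call.
type dataThenErr struct {
	data []byte
	err  error
}

func (r *dataThenErr) Read(p []byte) (int, error) {
	n := copy(p, r.data)
	r.data = r.data[n:]
	return n, r.err
}

// readFullFrom runs io.ReadFull against a dataThenErr holding payload,
// using a buffer of bufSize bytes.
func readFullFrom(payload string, bufSize int) (int, error) {
	r := &dataThenErr{data: []byte(payload), err: errors.New("hypothetical disk failure")}
	return io.ReadFull(r, make([]byte, bufSize))
}

func main() {
	// The buffer is filled completely, so the reader's error is dropped.
	n, err := readFullFrom("hello", 5)
	fmt.Println(n, err)

	// The buffer cannot be filled, so the reader's error is reported.
	n, err = readFullFrom("hello", 8)
	fmt.Println(n, err)
}
```

Callers that need to observe the underlying error even after a successful read have to call Read themselves rather than rely on ReadFull or ReadAtLeast.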
    • internal/trace: compute span stats as computing goroutine stats · 68a1c9c4
      Hana Kim authored
      Move part of UserSpan event processing from cmd/trace.analyzeAnnotations
      to internal/trace.GoroutineStats that returns analyzed per-goroutine
      execution information. The execution information now includes a list of
      spans and their execution information.
      
      cmd/trace.analyzeAnnotations utilizes the span execution information
      from internal/trace.GoroutineStats and connects them with task
      information.
      
      Change-Id: Ib7f79a3ba652a4ae55cd81ea17565bcc7e241c5c
      Reviewed-on: https://go-review.googlesource.com/101917
      Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Heschi Kreinick <heschi@google.com>
      Reviewed-by: Peter Weinberger <pjw@google.com>
      68a1c9c4
    • cmd/compile/internal/ssa: optimize away double NEG on amd64 · 24cd1120
      Ilya Tocar authored
      When lowering some ops on amd64 we generate additional NEGQ.
      This may result in code like this:
      
      NEGQ R12
      NEGQ R12
      
      Optimize it away. The gain is small: about 0.5% in geomean
      in compress/flate and a 200-byte code size reduction in go tool.
      
      Full results below:
      
      name                             old time/op    new time/op    delta
      Encode/Digits/Huffman/1e4-6        65.8µs ± 0%    65.7µs ± 0%  -0.21%  (p=0.010 n=10+9)
      Encode/Digits/Huffman/1e5-6         633µs ± 0%     632µs ± 0%    ~     (p=0.370 n=8+9)
      Encode/Digits/Huffman/1e6-6        6.30ms ± 1%    6.29ms ± 1%    ~     (p=0.796 n=10+10)
      Encode/Digits/Speed/1e4-6           281µs ± 0%     280µs ± 1%  -0.34%  (p=0.043 n=8+10)
      Encode/Digits/Speed/1e5-6          2.66ms ± 0%    2.66ms ± 0%  -0.09%  (p=0.043 n=10+10)
      Encode/Digits/Speed/1e6-6          26.3ms ± 0%    26.3ms ± 0%    ~     (p=0.190 n=10+10)
      Encode/Digits/Default/1e4-6         554µs ± 0%     557µs ± 0%  +0.46%  (p=0.001 n=9+10)
      Encode/Digits/Default/1e5-6        8.63ms ± 1%    8.62ms ± 1%    ~     (p=0.912 n=10+10)
      Encode/Digits/Default/1e6-6        92.7ms ± 1%    92.2ms ± 1%    ~     (p=0.052 n=10+10)
      Encode/Digits/Compression/1e4-6     558µs ± 1%     557µs ± 1%    ~     (p=0.481 n=10+10)
      Encode/Digits/Compression/1e5-6    8.58ms ± 0%    8.61ms ± 1%    ~     (p=0.315 n=8+10)
      Encode/Digits/Compression/1e6-6    92.3ms ± 1%    92.4ms ± 1%    ~     (p=0.971 n=10+10)
      Encode/Twain/Huffman/1e4-6         89.5µs ± 0%    89.0µs ± 1%  -0.48%  (p=0.001 n=9+9)
      Encode/Twain/Huffman/1e5-6          727µs ± 1%     728µs ± 0%    ~     (p=0.604 n=10+9)
      Encode/Twain/Huffman/1e6-6         7.21ms ± 0%    7.19ms ± 1%    ~     (p=0.696 n=8+10)
      Encode/Twain/Speed/1e4-6            320µs ± 1%     321µs ± 1%    ~     (p=0.353 n=10+10)
      Encode/Twain/Speed/1e5-6           2.63ms ± 0%    2.62ms ± 1%  -0.33%  (p=0.016 n=8+10)
      Encode/Twain/Speed/1e6-6           25.8ms ± 0%    25.8ms ± 0%    ~     (p=0.360 n=10+8)
      Encode/Twain/Default/1e4-6          677µs ± 1%     671µs ± 1%  -0.88%  (p=0.000 n=10+10)
      Encode/Twain/Default/1e5-6         10.5ms ± 1%    10.3ms ± 0%  -2.06%  (p=0.000 n=10+10)
      Encode/Twain/Default/1e6-6          113ms ± 1%     111ms ± 1%  -1.96%  (p=0.000 n=10+9)
      Encode/Twain/Compression/1e4-6      688µs ± 0%     679µs ± 1%  -1.30%  (p=0.000 n=7+10)
      Encode/Twain/Compression/1e5-6     11.6ms ± 1%    11.3ms ± 1%  -2.10%  (p=0.000 n=10+10)
      Encode/Twain/Compression/1e6-6      126ms ± 1%     124ms ± 0%  -1.57%  (p=0.000 n=10+10)
      [Geo mean]                         3.45ms         3.44ms       -0.46%
      
      name                             old speed      new speed      delta
      Encode/Digits/Huffman/1e4-6       152MB/s ± 0%   152MB/s ± 0%  +0.21%  (p=0.009 n=10+9)
      Encode/Digits/Huffman/1e5-6       158MB/s ± 0%   158MB/s ± 0%    ~     (p=0.336 n=8+9)
      Encode/Digits/Huffman/1e6-6       159MB/s ± 1%   159MB/s ± 1%    ~     (p=0.781 n=10+10)
      Encode/Digits/Speed/1e4-6        35.6MB/s ± 0%  35.7MB/s ± 1%  +0.34%  (p=0.020 n=8+10)
      Encode/Digits/Speed/1e5-6        37.6MB/s ± 0%  37.7MB/s ± 0%  +0.09%  (p=0.049 n=10+10)
      Encode/Digits/Speed/1e6-6        38.0MB/s ± 0%  38.0MB/s ± 0%    ~     (p=0.146 n=10+10)
      Encode/Digits/Default/1e4-6      18.0MB/s ± 0%  18.0MB/s ± 0%  -0.45%  (p=0.002 n=9+10)
      Encode/Digits/Default/1e5-6      11.6MB/s ± 1%  11.6MB/s ± 1%    ~     (p=0.644 n=10+10)
      Encode/Digits/Default/1e6-6      10.8MB/s ± 1%  10.8MB/s ± 1%  +0.51%  (p=0.044 n=10+10)
      Encode/Digits/Compression/1e4-6  17.9MB/s ± 1%  17.9MB/s ± 1%    ~     (p=0.468 n=10+10)
      Encode/Digits/Compression/1e5-6  11.7MB/s ± 0%  11.6MB/s ± 1%    ~     (p=0.322 n=8+10)
      Encode/Digits/Compression/1e6-6  10.8MB/s ± 1%  10.8MB/s ± 1%    ~     (p=0.983 n=10+10)
      Encode/Twain/Huffman/1e4-6        112MB/s ± 0%   112MB/s ± 1%  +0.42%  (p=0.002 n=8+9)
      Encode/Twain/Huffman/1e5-6        138MB/s ± 1%   137MB/s ± 0%    ~     (p=0.616 n=10+9)
      Encode/Twain/Huffman/1e6-6        139MB/s ± 0%   139MB/s ± 1%    ~     (p=0.652 n=8+10)
      Encode/Twain/Speed/1e4-6         31.3MB/s ± 1%  31.2MB/s ± 1%    ~     (p=0.342 n=10+10)
      Encode/Twain/Speed/1e5-6         38.0MB/s ± 0%  38.1MB/s ± 1%  +0.33%  (p=0.011 n=8+10)
      Encode/Twain/Speed/1e6-6         38.8MB/s ± 0%  38.7MB/s ± 0%    ~     (p=0.325 n=10+8)
      Encode/Twain/Default/1e4-6       14.8MB/s ± 1%  14.9MB/s ± 1%  +0.88%  (p=0.000 n=10+10)
      Encode/Twain/Default/1e5-6       9.48MB/s ± 1%  9.68MB/s ± 0%  +2.11%  (p=0.000 n=10+10)
      Encode/Twain/Default/1e6-6       8.86MB/s ± 1%  9.03MB/s ± 1%  +1.97%  (p=0.000 n=10+9)
      Encode/Twain/Compression/1e4-6   14.5MB/s ± 0%  14.7MB/s ± 1%  +1.31%  (p=0.000 n=7+10)
      Encode/Twain/Compression/1e5-6   8.63MB/s ± 1%  8.82MB/s ± 1%  +2.17%  (p=0.000 n=10+10)
      Encode/Twain/Compression/1e6-6   7.92MB/s ± 1%  8.05MB/s ± 1%  +1.59%  (p=0.000 n=10+10)
      [Geo mean]                       29.0MB/s       29.1MB/s       +0.47%
      
      // symSizeComp `which go` go_old:
      
      section differences:
      global text (code) = 203 bytes (0.005131%)
      read-only data = 1 bytes (0.000057%)
      Total difference 204 bytes (0.003297%)
      
      Change-Id: Ie2cdfa1216472d78694fff44d215b3b8e71cf7bf
      Reviewed-on: https://go-review.googlesource.com/102277
      Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      24cd1120
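The idea behind the double-NEG change above can be sketched on a toy expression tree. This is an illustration only, not the compiler's rewrite rule or its SSA data structures; the `expr` type and the helper names are hypothetical. It collapses every pair of nested negations, which is valid for two's-complement integers because negating twice yields the original value.

```go
package main

import "fmt"

// expr is a toy expression: either a named value or the negation of arg.
type expr struct {
	neg  bool
	arg  *expr
	name string
}

func value(name string) *expr { return &expr{name: name} }
func negate(e *expr) *expr    { return &expr{neg: true, arg: e} }

// simplify removes pairs of nested negations, so -(-x) becomes x.
func simplify(e *expr) *expr {
	if !e.neg {
		return e
	}
	arg := simplify(e.arg)
	if arg.neg {
		// -(-y) == y: drop both negations.
		return arg.arg
	}
	return negate(arg)
}

func (e *expr) String() string {
	if e.neg {
		return "-(" + e.arg.String() + ")"
	}
	return e.name
}

func main() {
	double := negate(negate(value("x")))
	triple := negate(double)
	fmt.Println(double.String(), "=>", simplify(double).String())
	fmt.Println(triple.String(), "=>", simplify(triple).String())
}
```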
    • cmd/trace: beautify goroutine page · ea1f4832
      Hana (Hyang-Ah) Kim authored
      - Summary: also includes links to pprof data.
      - Sortable table: sorting is done on the server side. The intention is
        that later, I want to add a pagination feature and limit the page
        size the browser has to handle.
      - Stacked horizontal bar graph to present total time breakdown.
      - Human-friendly time representation.
      - No dependency on external fancy javascript libraries to allow
        it to function without an internet connection.
      
      Change-Id: I91e5c26746e59ad0329dfb61e096e11f768c7b73
      Reviewed-on: https://go-review.googlesource.com/102156
      Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Andrew Bonventre <andybons@golang.org>
      Reviewed-by: Heschi Kreinick <heschi@google.com>
      ea1f4832
    • os: do not test Lstat in TestDevNullFile · d2dd2e15
      Alex Brainman authored
      CL 102456 added Lstat check to TestDevNullFile.
      But some systems have /dev/null as a symlink,
      so Lstat test is wrong. Remove the test.
      
      Fixes #24521
      
      Change-Id: I149110b08dd05db6495ec4eccbcf943e444332f9
      Reviewed-on: https://go-review.googlesource.com/102461
      Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
      Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      d2dd2e15
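The reason an Lstat check is fragile when a path may be a symlink can be shown with a temporary symlink. This sketch does not touch /dev/null itself; the function name `statVsLstat` is hypothetical, and it assumes a platform where the process is allowed to create symlinks.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// statVsLstat creates a file and a symlink to it in a temporary directory and
// reports whether Lstat and Stat each describe the link as a symlink.
func statVsLstat() (lstatIsLink, statIsLink bool, err error) {
	dir, err := os.MkdirTemp("", "lstat-example")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "target")
	if err := os.WriteFile(target, nil, 0o600); err != nil {
		return false, false, err
	}
	link := filepath.Join(dir, "link")
	if err := os.Symlink(target, link); err != nil {
		return false, false, err
	}

	lfi, err := os.Lstat(link) // describes the link itself
	if err != nil {
		return false, false, err
	}
	sfi, err := os.Stat(link) // follows the link to the target
	if err != nil {
		return false, false, err
	}
	return lfi.Mode()&os.ModeSymlink != 0, sfi.Mode()&os.ModeSymlink != 0, nil
}

func main() {
	l, s, err := statVsLstat()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("Lstat sees symlink:", l, "Stat sees symlink:", s)
}
```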
  2. 25 Mar, 2018 6 commits
    • cmd/compile: avoid some allocations in regalloc · 2ba98f1a
      Alberto Donizetti authored
      Compilebench:
      name      old time/op       new time/op       delta
      Template        283ms ± 3%        281ms ± 4%    ~     (p=0.242 n=20+20)
      Unicode         137ms ± 6%        135ms ± 6%    ~     (p=0.194 n=20+19)
      GoTypes         890ms ± 2%        883ms ± 1%  -0.74%  (p=0.001 n=19+19)
      Compiler        4.21s ± 2%        4.20s ± 2%  -0.40%  (p=0.033 n=20+19)
      SSA             9.86s ± 2%        9.68s ± 1%  -1.80%  (p=0.000 n=20+19)
      Flate           185ms ± 5%        185ms ± 7%    ~     (p=0.429 n=20+20)
      GoParser        222ms ± 3%        222ms ± 4%    ~     (p=0.588 n=19+20)
      Reflect         572ms ± 2%        570ms ± 3%    ~     (p=0.113 n=19+20)
      Tar             263ms ± 4%        259ms ± 2%  -1.41%  (p=0.013 n=20+20)
      XML             321ms ± 2%        321ms ± 4%    ~     (p=0.835 n=20+19)
      
      name      old user-time/op  new user-time/op  delta
      Template        400ms ± 5%        405ms ± 5%    ~     (p=0.096 n=20+20)
      Unicode         217ms ± 8%        213ms ± 8%    ~     (p=0.242 n=20+20)
      GoTypes         1.23s ± 3%        1.22s ± 3%    ~     (p=0.923 n=19+20)
      Compiler        5.76s ± 6%        5.81s ± 2%    ~     (p=0.687 n=20+19)
      SSA             14.2s ± 4%        14.0s ± 4%    ~     (p=0.121 n=20+20)
      Flate           248ms ± 7%        251ms ±10%    ~     (p=0.369 n=20+20)
      GoParser        308ms ± 5%        305ms ± 6%    ~     (p=0.336 n=19+20)
      Reflect         771ms ± 2%        766ms ± 2%    ~     (p=0.113 n=20+19)
      Tar             370ms ± 5%        362ms ± 7%  -2.06%  (p=0.036 n=19+20)
      XML             435ms ± 4%        432ms ± 5%    ~     (p=0.369 n=20+20)
      
      name      old alloc/op      new alloc/op      delta
      Template       39.5MB ± 0%       39.4MB ± 0%  -0.20%  (p=0.000 n=20+20)
      Unicode        29.1MB ± 0%       29.1MB ± 0%    ~     (p=0.064 n=20+20)
      GoTypes         117MB ± 0%        117MB ± 0%  -0.17%  (p=0.000 n=20+20)
      Compiler        503MB ± 0%        502MB ± 0%  -0.15%  (p=0.000 n=19+19)
      SSA            1.42GB ± 0%       1.42GB ± 0%  -0.16%  (p=0.000 n=20+20)
      Flate          25.3MB ± 0%       25.3MB ± 0%  -0.19%  (p=0.000 n=20+20)
      GoParser       31.4MB ± 0%       31.3MB ± 0%  -0.14%  (p=0.000 n=20+18)
      Reflect        78.1MB ± 0%       77.9MB ± 0%  -0.34%  (p=0.000 n=20+19)
      Tar            40.1MB ± 0%       40.0MB ± 0%  -0.17%  (p=0.000 n=20+20)
      XML            45.3MB ± 0%       45.2MB ± 0%  -0.13%  (p=0.000 n=20+20)
      
      name      old allocs/op     new allocs/op     delta
      Template         393k ± 0%         392k ± 0%  -0.21%  (p=0.000 n=20+19)
      Unicode          337k ± 0%         337k ± 0%  -0.02%  (p=0.000 n=20+20)
      GoTypes         1.22M ± 0%        1.22M ± 0%  -0.21%  (p=0.000 n=20+20)
      Compiler        4.77M ± 0%        4.76M ± 0%  -0.16%  (p=0.000 n=20+20)
      SSA             11.8M ± 0%        11.8M ± 0%  -0.12%  (p=0.000 n=20+20)
      Flate            242k ± 0%         241k ± 0%  -0.20%  (p=0.000 n=20+20)
      GoParser         324k ± 0%         324k ± 0%  -0.14%  (p=0.000 n=20+20)
      Reflect          985k ± 0%         981k ± 0%  -0.38%  (p=0.000 n=20+20)
      Tar              403k ± 0%         402k ± 0%  -0.19%  (p=0.000 n=20+20)
      XML              424k ± 0%         424k ± 0%  -0.16%  (p=0.000 n=19+20)
      
      Change-Id: I131e382b64cd6db11a9263a477d45d80c180c499
      Reviewed-on: https://go-review.googlesource.com/102421
      Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      2ba98f1a
    • net/http: use top-level font media type · c0ce2925
      Agniva De Sarker authored
      RFC 8081 declares a top level font media type for all types of fonts.
      Updating the mime types in sniffer to reflect the new changes.
      
      Fixes #24524
      
      Change-Id: Iba6cef4c5974e9930e14705720d42550ee87ba56
      Reviewed-on: https://go-review.googlesource.com/102458
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      c0ce2925
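The effect of the sniffer change above can be observed through http.DetectContentType. The sketch below assumes a Go release that includes this change; the helper name `sniff` is hypothetical, and the inputs are only the leading signature bytes, not complete font files.

```go
package main

import (
	"fmt"
	"net/http"
)

// sniff returns the content type net/http detects for data starting with prefix.
func sniff(prefix string) string {
	return http.DetectContentType([]byte(prefix))
}

func main() {
	// Signature bytes only; real font files continue after these.
	for _, sig := range []string{"wOFF", "OTTO"} {
		fmt.Printf("%q => %s\n", sig, sniff(sig))
	}
}
```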
    • os: treat "nul" as DevNull file on windows · 48c4eeee
      Alex Brainman authored
      Also add more tests to test both nul and NUL on windows.
      
      Fixes #24482
      
      Change-Id: I3dfe68ec8de7f90ca869c1096dde0054df3c5cf6
      Reviewed-on: https://go-review.googlesource.com/102457
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      48c4eeee
    • net: deflake lookup tests · 5ce92d03
      Josh Bleecher Snyder authored
      The build dashboard is dotted with net test failures.
      We cannot declare all builders to have flaky networks,
      although all fundamentally do.
      
      Instead, add a simple retry/backoff loop to the ones that
      show up most commonly on the dashboard at this moment.
      
      If this approach works well in practice, we can
      incrementally apply it to other flaky net tests.
      
      Change-Id: I69c1ca6ce5b347ad549c7eb18d0438373f6e2489
      Reviewed-on: https://go-review.googlesource.com/102397
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      5ce92d03
    • database/sql: add more examples · 6f08b9fa
      Kevin Burke authored
      This aims to expand the coverage of examples showing how the sql
      package works, as well as to address a number of issues I've observed
      while explaining how the database package works:
      
      - The best way to issue UPDATE or INSERT queries, that don't need
      to scan anything in return. (Previously, we had no examples for any
      Execute statement).
      
      - How to use prepared statements and transactions.
      
      - How to aggregate arguments from a Query/QueryContext query into
      a slice.
      
      Furthermore, just having examples in more places should help, as users
      click on e.g. the "Rows" return parameter and are met with the
      lack of any example about how Rows is used.
      
      Switch package examples to use QueryContext/QueryRowContext; I think
      it is a good practice to prepare users to issue queries with a timeout
      attached, even if they are not using it immediately.
      
      Change-Id: I4e63af91c7e4fff88b25f820906104ecefde4cc3
      Reviewed-on: https://go-review.googlesource.com/91015
      Reviewed-by: Daniel Theophanes <kardianos@gmail.com>
      Run-TryBot: Daniel Theophanes <kardianos@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      6f08b9fa
    • os: document DevNull on windows · 782f9ce5
      Alex Brainman authored
      DevNull is documented on darwin, dragonfly, freebsd, linux,
      nacl, netbsd, openbsd, solaris and plan9, but not on windows.
      Add missing documentation.
      
      Change-Id: Icdbded0dd5e322ed4360cbce6bee4cdca5cfbe72
      Reviewed-on: https://go-review.googlesource.com/102456
      Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      782f9ce5
  3. 24 Mar, 2018 12 commits
    • all: remove some unused return parameters · 8da180f6
      Daniel Martí authored
      As found by unparam. Picked the low-hanging fruit, consisting only of
      errors that were always nil and results that were never used. Left out
      those that were useful for consistency with other func signatures.
      
      Change-Id: I06b52bbd3541f8a5d66659c909bd93cb3e172018
      Reviewed-on: https://go-review.googlesource.com/102418
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      8da180f6
    • cmd/compile/internal/gc: various cleanups · b1892d74
      Daniel Martí authored
      Remove a couple of unnecessary var declarations, an unused sort.Sort
      type, and simplify a range by using the two-name variant.
      
      Change-Id: Ia251f634db0bfbe8b1d553b8659272ddbd13b2c3
      Reviewed-on: https://go-review.googlesource.com/102336
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      b1892d74
    • net/http: add sniffing support for woff2 · bf8eef2a
      Agniva De Sarker authored
      Sniffing woff2 is now added to the spec -
      https://github.com/whatwg/mimesniff/commit/e29b9f4a22843bf6c7f0177223b0147bc03e37f7
      
      Change-Id: Ie63744454d0ee54ed0f985c2873d7eb20a14015a
      Reviewed-on: https://go-review.googlesource.com/102455
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      bf8eef2a
    • cmd/test2json: document missing "skip" action · 5526ef1c
      Daniel Nephin authored
      Change-Id: I906e61170279f0647598e2fd4fa931aac1b69288
      GitHub-Last-Rev: f6df43e8e10e3b032a67490611c0ba5ad8e948df
      GitHub-Pull-Request: golang/go#24517
      Reviewed-on: https://go-review.googlesource.com/102396
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      5526ef1c
    • test/codegen: port tbz/tbnz arm64 tests · a27cd4fd
      Alberto Donizetti authored
      And delete them from asm_test.
      
      Change-Id: I34fcf85ae8ce09cd146fe4ce6a0ae7616bd97e2d
      Reviewed-on: https://go-review.googlesource.com/102296
      Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Giovanni Bajo <rasky@develer.com>
      a27cd4fd
    • runtime: adjust GOARM floating point compatibility error message · 786899a7
      Tobias Klauser authored
      As pointed out by Josh Bleecher Snyder in CL 99780.
      
      The check is for GOARM > 6, so suggest recompiling with either GOARM=5
      or GOARM=6.
      
      Change-Id: I6a97e87bdc17aa3932f5c8cb598bba85c3cf4be9
      Reviewed-on: https://go-review.googlesource.com/101936
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
      786899a7
    • cmd/compile: in prove, shortcircuit self-facts · d54902ec
      Giovanni Bajo authored
      Sometimes, we can end up calling update with a self-relation
      about a variable (x REL x). In this case, there is no need
      to record anything: the relation is unsatisfiable if and only
      if it doesn't contain eq.
      
      This also helps avoid an infinite loop in the next CL, which will
      introduce transitive closure of relations.
      
      Passes toolstash -cmp.
      
      Change-Id: Ic408452ec1c13653f22ada35466ec98bc14aaa8e
      Reviewed-on: https://go-review.googlesource.com/100276
      Reviewed-by: Austin Clements <austin@google.com>
      d54902ec
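The self-fact rule stated above can be written down in a few lines. The sketch assumes a relation is encoded as a bit set over "less than", "equal" and "greater than"; the type and constant names are hypothetical and are not taken from the prove pass.

```go
package main

import "fmt"

// relation is a set of possible orderings between two values.
type relation uint

const (
	lt relation = 1 << iota
	eq
	gt
)

// selfFactSatisfiable reports whether "x r x" can hold. A value always equals
// itself, so the fact is satisfiable exactly when r allows equality.
func selfFactSatisfiable(r relation) bool {
	return r&eq != 0
}

func main() {
	fmt.Println("x <= x:", selfFactSatisfiable(lt|eq))
	fmt.Println("x <  x:", selfFactSatisfiable(lt))
	fmt.Println("x != x:", selfFactSatisfiable(lt|gt))
}
```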
    • cmd/compile: in prove, fail fast when unsat is found · 385d936f
      Giovanni Bajo authored
      When an unsatisfiable relation is recorded in the facts table,
      there is no need to compute further relations or update
      additional data structures.
      
      Since we're about to transitively propagate relations, make
      sure to fail as fast as possible to avoid doing useless work
      in dead branches.
      
      Passes toolstash -cmp.
      
      Change-Id: I23eed376d62776824c33088163c7ac9620abce85
      Reviewed-on: https://go-review.googlesource.com/100275
      Reviewed-by: Austin Clements <austin@google.com>
      385d936f
    • cmd/compile: add patterns for bit set/clear/complement on amd64 · 79112707
      Giovanni Bajo authored
      This patch completes implementation of BT(Q|L), and adds support
      for BT(S|R|C)(Q|L).
      
      Example of code changes from time.(*Time).addSec:
      
              if t.wall&hasMonotonic != 0 {
        0x1073465               488b08                  MOVQ 0(AX), CX
        0x1073468               4889ca                  MOVQ CX, DX
        0x107346b               48c1e93f                SHRQ $0x3f, CX
        0x107346f               48c1e13f                SHLQ $0x3f, CX
        0x1073473               48f7c1ffffffff          TESTQ $-0x1, CX
        0x107347a               746b                    JE 0x10734e7
      
              if t.wall&hasMonotonic != 0 {
        0x1073435               488b08                  MOVQ 0(AX), CX
        0x1073438               480fbae13f              BTQ $0x3f, CX
        0x107343d               7363                    JAE 0x10734a2
      
      Another example:
      
                              t.wall = t.wall&nsecMask | uint64(dsec)<<nsecShift | hasMonotonic
        0x10734c8               4881e1ffffff3f          ANDQ $0x3fffffff, CX
        0x10734cf               48c1e61e                SHLQ $0x1e, SI
        0x10734d3               4809ce                  ORQ CX, SI
        0x10734d6               48b90000000000000080    MOVQ $0x8000000000000000, CX
        0x10734e0               4809f1                  ORQ SI, CX
        0x10734e3               488908                  MOVQ CX, 0(AX)
      
                              t.wall = t.wall&nsecMask | uint64(dsec)<<nsecShift | hasMonotonic
        0x107348b               4881e2ffffff3f          ANDQ $0x3fffffff, DX
        0x1073492               48c1e61e                SHLQ $0x1e, SI
        0x1073496               4809f2                  ORQ SI, DX
        0x1073499               480fbaea3f              BTSQ $0x3f, DX
        0x107349e               488910                  MOVQ DX, 0(AX)
      
      Go1 benchmarks seem unaffected, and I would be surprised
      otherwise:
      
      name                     old time/op    new time/op     delta
      BinaryTree17-4              2.64s ± 4%      2.56s ± 9%  -2.92%  (p=0.008 n=9+9)
      Fannkuch11-4                2.90s ± 1%      2.95s ± 3%  +1.76%  (p=0.010 n=10+9)
      FmtFprintfEmpty-4          35.3ns ± 1%     34.5ns ± 2%  -2.34%  (p=0.004 n=9+8)
      FmtFprintfString-4         57.0ns ± 1%     58.4ns ± 5%  +2.52%  (p=0.029 n=9+10)
      FmtFprintfInt-4            59.8ns ± 3%     59.8ns ± 6%    ~     (p=0.565 n=10+10)
      FmtFprintfIntInt-4         93.9ns ± 3%     91.2ns ± 5%  -2.94%  (p=0.014 n=10+9)
      FmtFprintfPrefixedInt-4     107ns ± 6%      104ns ± 6%    ~     (p=0.099 n=10+10)
      FmtFprintfFloat-4           187ns ± 3%      188ns ± 3%    ~     (p=0.505 n=10+9)
      FmtManyArgs-4               410ns ± 1%      415ns ± 6%    ~     (p=0.649 n=8+10)
      GobDecode-4                5.30ms ± 3%     5.27ms ± 3%    ~     (p=0.436 n=10+10)
      GobEncode-4                4.62ms ± 5%     4.47ms ± 2%  -3.24%  (p=0.001 n=9+10)
      Gzip-4                      197ms ± 4%      193ms ± 3%    ~     (p=0.123 n=10+10)
      Gunzip-4                   30.4ms ± 3%     30.1ms ± 3%    ~     (p=0.481 n=10+10)
      HTTPClientServer-4         76.3µs ± 1%     76.0µs ± 1%    ~     (p=0.236 n=8+9)
      JSONEncode-4               10.5ms ± 9%     10.3ms ± 3%    ~     (p=0.280 n=10+10)
      JSONDecode-4               42.3ms ±10%     41.3ms ± 2%    ~     (p=0.053 n=9+10)
      Mandelbrot200-4            3.80ms ± 2%     3.72ms ± 2%  -2.15%  (p=0.001 n=9+10)
      GoParse-4                  2.88ms ±10%     2.81ms ± 2%    ~     (p=0.247 n=10+10)
      RegexpMatchEasy0_32-4      69.5ns ± 4%     68.6ns ± 2%    ~     (p=0.171 n=10+10)
      RegexpMatchEasy0_1K-4       165ns ± 3%      162ns ± 3%    ~     (p=0.137 n=10+10)
      RegexpMatchEasy1_32-4      65.7ns ± 6%     64.4ns ± 2%  -2.02%  (p=0.037 n=10+10)
      RegexpMatchEasy1_1K-4       278ns ± 2%      279ns ± 3%    ~     (p=0.991 n=8+9)
      RegexpMatchMedium_32-4     99.3ns ± 3%     98.5ns ± 4%    ~     (p=0.457 n=10+9)
      RegexpMatchMedium_1K-4     30.1µs ± 1%     30.4µs ± 2%    ~     (p=0.173 n=8+10)
      RegexpMatchHard_32-4       1.40µs ± 2%     1.41µs ± 4%    ~     (p=0.565 n=10+10)
      RegexpMatchHard_1K-4       42.5µs ± 1%     41.5µs ± 3%  -2.13%  (p=0.002 n=8+9)
      Revcomp-4                   332ms ± 4%      328ms ± 5%    ~     (p=0.720 n=9+10)
      Template-4                 48.3ms ± 2%     49.6ms ± 3%  +2.56%  (p=0.002 n=8+10)
      TimeParse-4                 252ns ± 2%      249ns ± 3%    ~     (p=0.116 n=9+10)
      TimeFormat-4                262ns ± 4%      252ns ± 3%  -4.01%  (p=0.000 n=9+10)
      
      name                     old speed      new speed       delta
      GobDecode-4               145MB/s ± 3%    146MB/s ± 3%    ~     (p=0.436 n=10+10)
      GobEncode-4               166MB/s ± 5%    172MB/s ± 2%  +3.28%  (p=0.001 n=9+10)
      Gzip-4                   98.6MB/s ± 4%  100.4MB/s ± 3%    ~     (p=0.123 n=10+10)
      Gunzip-4                  639MB/s ± 3%    645MB/s ± 3%    ~     (p=0.481 n=10+10)
      JSONEncode-4              185MB/s ± 8%    189MB/s ± 3%    ~     (p=0.280 n=10+10)
      JSONDecode-4             46.0MB/s ± 9%   47.0MB/s ± 2%  +2.21%  (p=0.046 n=9+10)
      GoParse-4                20.1MB/s ± 9%   20.6MB/s ± 2%    ~     (p=0.239 n=10+10)
      RegexpMatchEasy0_32-4     460MB/s ± 4%    467MB/s ± 2%    ~     (p=0.165 n=10+10)
      RegexpMatchEasy0_1K-4    6.19GB/s ± 3%   6.28GB/s ± 3%    ~     (p=0.165 n=10+10)
      RegexpMatchEasy1_32-4     487MB/s ± 5%    497MB/s ± 2%  +2.00%  (p=0.043 n=10+10)
      RegexpMatchEasy1_1K-4    3.67GB/s ± 2%   3.67GB/s ± 3%    ~     (p=0.963 n=8+9)
      RegexpMatchMedium_32-4   10.1MB/s ± 3%   10.1MB/s ± 4%    ~     (p=0.435 n=10+9)
      RegexpMatchMedium_1K-4   34.0MB/s ± 1%   33.7MB/s ± 2%    ~     (p=0.173 n=8+10)
      RegexpMatchHard_32-4     22.9MB/s ± 2%   22.7MB/s ± 4%    ~     (p=0.565 n=10+10)
      RegexpMatchHard_1K-4     24.0MB/s ± 3%   24.7MB/s ± 3%  +2.64%  (p=0.001 n=9+9)
      Revcomp-4                 766MB/s ± 4%    775MB/s ± 5%    ~     (p=0.720 n=9+10)
      Template-4               40.2MB/s ± 2%   39.2MB/s ± 3%  -2.47%  (p=0.002 n=8+10)
      
      The rules match ~1800 times during all.bash.
      
      Fixes #18943
      
      Change-Id: I64be1ada34e89c486dfd935bf429b35652117ed4
      Reviewed-on: https://go-review.googlesource.com/94766
      Run-TryBot: Giovanni Bajo <rasky@develer.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      79112707
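The source-level shapes that bit test, set, clear and complement correspond to can be sketched as follows. The function names are hypothetical, and whether a given compiler version actually emits BT, BTS, BTR or BTC for them is an assumption that would need to be checked in the generated assembly.

```go
package main

import "fmt"

// testBit reports whether bit n of x is set (the shape a BT instruction covers).
func testBit(x uint64, n uint) bool { return x&(1<<(n&63)) != 0 }

// setBit returns x with bit n set (BTS).
func setBit(x uint64, n uint) uint64 { return x | 1<<(n&63) }

// clearBit returns x with bit n cleared (BTR).
func clearBit(x uint64, n uint) uint64 { return x &^ (1 << (n & 63)) }

// toggleBit returns x with bit n complemented (BTC).
func toggleBit(x uint64, n uint) uint64 { return x ^ 1<<(n&63) }

func main() {
	x := setBit(0, 63)
	fmt.Printf("%#x %v\n", x, testBit(x, 63))
	fmt.Printf("%#x\n", clearBit(x, 63))
	fmt.Printf("%#x\n", toggleBit(1, 0))
}
```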
    • cmd/compile/internal/gc: properly initialize ssa.Func Type field · 3afd2d7f
      isharipo authored
      ssa.Func has a Type field that is described as
      the function signature type.
      
      It never gets any value and remains nil.
      This leads to a "<T>" printed representation of the signature.
      
      Given this function declaration:
      	func foo(x int, f func() string) (int, error)
      
      GOSSAFUNC printed it as below:
      	compiling foo
      	foo <T>
      
      After this change:
      	compiling foo
      	foo func(int, func() string) (int, error)
      
      Change-Id: Iec5eec8aac5c76ff184659e30f41b2f5fe86d329
      Reviewed-on: https://go-review.googlesource.com/102375
      Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
      3afd2d7f
    • cmd/compile: always write pack files · ea668e18
      Matthew Dempsky authored
      By always writing out pack files, the object file format can be
      simplified somewhat. In particular, the export data format will no
      longer require escaping, because the pack file provides appropriate
      framing.
      
      This CL does not affect build systems that use -pack, which includes
      all major Go build systems (cmd/go, gb, bazel).
      
      Also, existing package import logic already distinguishes pack/object
      files based on file contents rather than file extension.
      
      The only exception is cmd/pack, which specially handled object files
      created by cmd/compile when used with the 'c' mode. This mode is
      extended to now recognize the pack files produced by cmd/compile and
      handle them as before.
      
      Passes toolstash-check.
      
      Updates #21705.
      Updates #24512.
      
      Change-Id: Idf131013bfebd73a5cde7e087eb19964503a9422
      Reviewed-on: https://go-review.googlesource.com/102236
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      ea668e18
    • cmd/link: skip __.PKGDEF in archives · 699b0d4e
      Matthew Dempsky authored
      The __.PKGDEF file is a compiler object file only intended for other
      compilers. Also, for build systems that use -linkobj, all of the
      information it contains is present within the linker object files
      already, so look for it there instead.
      
      This requires a little bit of code reorganization. Significantly,
      previously when loading an archive file, the __.PKGDEF file was
      authoritative on whether the package was "main" and/or "safe". Now
      that we're using the Go object files instead, there's the issue that
      there can be multiple Go object files in an archive (because when
      using assembly, each assembly file becomes its own additional object
      file).
      
      The solution taken here is to check if any object file within the
      package declares itself as "main" and/or "safe".
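
      The "any object file" rule can be sketched as below. The objInfo type,
      its fields, and the file names are hypothetical stand-ins for whatever
      the linker records per object file; this is an illustration of the
      rule, not the cmd/link code.

```go
package main

import "fmt"

// objInfo is a hypothetical summary of one Go object file in an archive.
type objInfo struct {
	name   string
	isMain bool // the object declares package "main"
	isSafe bool // the object declares itself "safe"
}

// archiveFlags reports whether any object file declares "main" and whether
// any declares "safe", following the rule described in the text.
func archiveFlags(objs []objInfo) (isMain, isSafe bool) {
	for _, o := range objs {
		isMain = isMain || o.isMain
		isSafe = isSafe || o.isSafe
	}
	return isMain, isSafe
}

func main() {
	objs := []objInfo{
		{name: "compiled.o", isMain: true},
		{name: "assembly.o"}, // carries neither flag in this sketch
	}
	fmt.Println(archiveFlags(objs))
}
```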
      
      Updates #24512.
      
      Change-Id: I70243a293bdf34b8555c0bf1833f8933b2809449
      Reviewed-on: https://go-review.googlesource.com/102281
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      699b0d4e
  4. 23 Mar, 2018 5 commits
  5. 22 Mar, 2018 9 commits
    • Matthew Dempsky's avatar
      cmd/compile: change unsafeUintptrTag from var to const · 50921bfa
      Matthew Dempsky authored
      Change-Id: Ie30878199e24cce5b75428e6b602c017ebd16642
      Reviewed-on: https://go-review.googlesource.com/102175
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
      50921bfa
    • Adam Langley's avatar
      crypto/x509: follow OpenSSL and emit Extension structures directly in CSRs. · 0b37f05d
      Adam Langley authored
      I don't know if I got lost in the old PKCS documents, or whether this is
      a case where reality diverges from the spec, but OpenSSL clearly stuffs
      PKIX Extension objects in CSR attributes directly[1].
      
      In either case, doing what OpenSSL does seems valid here and allows the
      critical flag in extensions to be serialised.
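
      A small round trip illustrates the effect on the critical flag. This is
      a sketch under stated assumptions: the OID 1.2.3.4 and the extension
      value are made up for the example, and the behaviour shown is that of a
      Go release that includes this change.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/asn1"
	"fmt"
)

// exampleOID is a made-up extension identifier used only for this sketch.
var exampleOID = asn1.ObjectIdentifier{1, 2, 3, 4}

// criticalSurvives builds a CSR carrying one critical extension, parses it
// back, and reports whether the critical flag is still set.
func criticalSurvives() (bool, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return false, err
	}
	tmpl := &x509.CertificateRequest{
		Subject: pkix.Name{CommonName: "example.test"},
		ExtraExtensions: []pkix.Extension{
			{Id: exampleOID, Critical: true, Value: []byte{0x05, 0x00}},
		},
	}
	der, err := x509.CreateCertificateRequest(rand.Reader, tmpl, key)
	if err != nil {
		return false, err
	}
	csr, err := x509.ParseCertificateRequest(der)
	if err != nil {
		return false, err
	}
	for _, ext := range csr.Extensions {
		if ext.Id.Equal(exampleOID) {
			return ext.Critical, nil
		}
	}
	return false, fmt.Errorf("extension %v not found in parsed CSR", exampleOID)
}

func main() {
	fmt.Println(criticalSurvives())
}
```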
      
      Fixes #13739.
      
      [1] https://github.com/openssl/openssl/blob/e3713c365c2657236439fea00822a43aa396d112/crypto/x509/x509_req.c#L173
      
      Change-Id: Ic1e73ba9bd383a357a2aa8fc4f6bd76811bbefcc
      Reviewed-on: https://go-review.googlesource.com/70851
      Run-TryBot: Adam Langley <agl@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Filippo Valsorda <filippo@golang.org>
      0b37f05d
    • Mike Danese's avatar
      crypto/tls: support keying material export · c529141d
      Mike Danese authored
      This change implements keying material export as described in:
      
      https://tools.ietf.org/html/rfc5705
      
      I verified the implementation against openssl s_client and openssl
      s_server.
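
      For illustration, both ends of a connection should derive the same
      exported bytes for the same label. The sketch below assumes the API
      shape found in released Go versions,
      ConnectionState.ExportKeyingMaterial(label, context, length) returning
      ([]byte, error). The label, the throwaway certificate, and the helper
      names are made up, and TLS 1.2 is pinned as an assumption so that the
      handshake runs cleanly over an unbuffered in-memory pipe.

```go
package main

import (
	"bytes"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// selfSigned returns a throwaway certificate for the in-memory server.
func selfSigned() (tls.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "example.test"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return tls.Certificate{}, err
	}
	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: key}, nil
}

// exportBoth handshakes a client and a server over an in-memory pipe and
// returns the keying material each side exports for the same label.
func exportBoth(label string, length int) (client, server []byte, err error) {
	cert, err := selfSigned()
	if err != nil {
		return nil, nil, err
	}
	cp, sp := net.Pipe()
	defer cp.Close()
	defer sp.Close()
	srv := tls.Server(sp, &tls.Config{
		Certificates: []tls.Certificate{cert},
		MaxVersion:   tls.VersionTLS12,
	})
	// InsecureSkipVerify is acceptable only because this is a local demo.
	cli := tls.Client(cp, &tls.Config{
		InsecureSkipVerify: true,
		MaxVersion:         tls.VersionTLS12,
	})

	errc := make(chan error, 1)
	go func() { errc <- srv.Handshake() }()
	if err := cli.Handshake(); err != nil {
		return nil, nil, err
	}
	if err := <-errc; err != nil {
		return nil, nil, err
	}
	cs, ss := cli.ConnectionState(), srv.ConnectionState()
	if client, err = cs.ExportKeyingMaterial(label, nil, length); err != nil {
		return nil, nil, err
	}
	if server, err = ss.ExportKeyingMaterial(label, nil, length); err != nil {
		return nil, nil, err
	}
	return client, server, nil
}

func main() {
	c, s, err := exportBoth("EXPERIMENTAL example", 32)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(c), bytes.Equal(c, s))
}
```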
      
      Change-Id: I4dcdd2fb929c63ab4e92054616beab6dae7b1c55
      Signed-off-by: Mike Danese <mikedanese@google.com>
      Reviewed-on: https://go-review.googlesource.com/85115
      Run-TryBot: Adam Langley <agl@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Adam Langley <agl@golang.org>
      c529141d
    • Daniel Martí's avatar
      cmd/compile: use more range fors in gc · 02798ed9
      Daniel Martí authored
      Slightly simplifies the code. Made sure to exclude the cases that would
      change behavior, such as when the iterated value is a string, when the
      index is modified within the body, or when the slice is modified.
      
      Also checked that all the elements are of pointer type, to avoid the
      corner case where non-pointer types could be copied by mistake.
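
      The excluded cases can be demonstrated with a short program. This is an
      illustrative sketch, not code from the compiler; the item type and all
      function names are made up for the example.

```go
package main

import "fmt"

type item struct{ n int }

// indexedSteps counts iterations of a byte-indexed loop over a string.
func indexedSteps(s string) int {
	steps := 0
	for i := 0; i < len(s); i++ {
		steps++
	}
	return steps
}

// rangedSteps counts iterations of a range loop, which walks runes.
func rangedSteps(s string) int {
	steps := 0
	for range s {
		steps++
	}
	return steps
}

// growIndexed appends during an indexed loop; len(s) is re-read each time.
func growIndexed() int {
	s := []int{1, 2, 3}
	steps := 0
	for i := 0; i < len(s); i++ {
		if i == 0 {
			s = append(s, 4)
		}
		steps++
	}
	return steps
}

// growRanged appends during a range loop; the range expression is
// evaluated once, so the appended element is never visited.
func growRanged() int {
	s := []int{1, 2, 3}
	steps := 0
	for i := range s {
		if i == 0 {
			s = append(s, 4)
		}
		steps++
	}
	return steps
}

// bumpCopies increments the loop variable, which is a copy of each element.
func bumpCopies(items []item) {
	for _, it := range items {
		it.n++ // changes only the per-iteration copy
	}
}

// bumpPointers increments through pointers, so the elements change.
func bumpPointers(items []*item) {
	for _, it := range items {
		it.n++
	}
}

func main() {
	fmt.Println(indexedSteps("h\u00e9llo"), rangedSteps("h\u00e9llo")) // 6 5
	fmt.Println(growIndexed(), growRanged())                           // 4 3
	vals := []item{{1}}
	ptrs := []*item{{1}}
	bumpCopies(vals)
	bumpPointers(ptrs)
	fmt.Println(vals[0].n, ptrs[0].n) // 1 2
}
```

      Each pair shows why a mechanical rewrite from an indexed loop to a
      range loop is only safe when none of these situations applies.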
      
      Change-Id: Iea64feb2a9a6a4c94ada9ff3ace40ee173505849
      Reviewed-on: https://go-review.googlesource.com/100557
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
      02798ed9
    • Austin Clements's avatar
      cmd/compile: fix GOEXPERIMENT=preemptibleloops type-checking · 48f990b4
      Austin Clements authored
      This experiment has gone stale. It causes a type-checking failure
      because the condition of the OIF produced by range loop lowering has
      type "untyped bool". Fix this by typechecking the whole OIF statement,
      not just its condition.
      
      This doesn't quite fix the whole experiment, but it gets further.
      Something about preemption point insertion is causing failures like
      "internal compiler error: likeliness prediction 1 for block b10 with 1
      successors" in cmd/compile/internal/gc.
      
      Change-Id: I7d80d618d7c91c338bf5f2a8dc174d582a479df3
      Reviewed-on: https://go-review.googlesource.com/102157
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
      48f990b4
    • Travis Bischel's avatar
      cmd/compile: specialize Move up to 79B on amd64 · 4f7b7748
      Travis Bischel authored
      Move currently uses mov instructions directly for up to 31 bytes and then
      switches to duffcopy. Moving 31 bytes takes 4 instructions, corresponding
      to two loads and two stores (or 6 if !useSSE). Depending on the usage,
      duffcopy takes five (one or two mov, two or three lea, one call).
      
      This adds direct mov instructions for Moves of size 32, 48, and 64 with
      SSE and for only size 32 without.
      With useSSE:
      - 32 is 4 instructions (byte +/- comparison below)
      - 33 thru 48 is 6
      - 49 thru 64 is 8
      
      Without:
      - 32 is 8
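
      The SSE counts listed above are consistent with one 16-byte load plus
      one 16-byte store per 16-byte chunk. The sketch below is a rough model
      of that arithmetic only, not the compiler's rewrite rules;
      sseMoveInstrs is a hypothetical name, and the model is claimed only for
      the SSE sizes quoted in this message (31 through 64 bytes).

```go
package main

import "fmt"

// sseMoveInstrs models the instruction count of an inlined copy as one
// 16-byte load and one 16-byte store per (possibly partial) 16-byte chunk.
func sseMoveInstrs(size int) int {
	chunks := (size + 15) / 16
	return 2 * chunks
}

func main() {
	for _, size := range []int{31, 32, 33, 48, 49, 64} {
		fmt.Printf("%dB: %d instructions\n", size, sseMoveInstrs(size))
	}
}
```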
      
      Note that the only platform with useSSE set to false is Plan 9. I have
      built three projects against tip and against tip with this patch, and
      each project's binary size is equal to or smaller than before.
      
      The basis of this change is that copying data with instructions directly
      is nearly free, whereas calling into duffcopy adds a bit of overhead.
      This is most noticeable in range statements where elements are 32+
      bytes. For code with the following pattern:
      
      func Benchmark32Range(b *testing.B) {
              var f s32
              for _, count := range []int{10, 100, 1000, 10000} {
                      name := strconv.Itoa(count)
                      b.Run(name, func(b *testing.B) {
                              base := make([]s32, count)
                              for i := 0; i < b.N; i++ {
                                      for _, v := range base {
                                              f = v
                                      }
                              }
                      })
              }
              _ = f
      }
      
      These are the resulting benchmarks:
      Benchmark16Range/10-4        19.1          19.1          +0.00%
      Benchmark16Range/100-4       169           170           +0.59%
      Benchmark16Range/1000-4      1684          1691          +0.42%
      Benchmark16Range/10000-4     18147         18124         -0.13%
      Benchmark31Range/10-4        141           142           +0.71%
      Benchmark31Range/100-4       1407          1410          +0.21%
      Benchmark31Range/1000-4      14070         14074         +0.03%
      Benchmark31Range/10000-4     141781        141759        -0.02%
      Benchmark32Range/10-4        71.4          32.2          -54.90%
      Benchmark32Range/100-4       695           326           -53.09%
      Benchmark32Range/1000-4      7166          3313          -53.77%
      Benchmark32Range/10000-4     72571         35425         -51.19%
      Benchmark64Range/10-4        87.8          64.9          -26.08%
      Benchmark64Range/100-4       868           629           -27.53%
      Benchmark64Range/1000-4      9355          6907          -26.17%
      Benchmark64Range/10000-4     94463         70385         -25.49%
      Benchmark79Range/10-4        177           152           -14.12%
      Benchmark79Range/100-4       1769          1531          -13.45%
      Benchmark79Range/1000-4      17893         15532         -13.20%
      Benchmark79Range/10000-4     178947        155551        -13.07%
      Benchmark80Range/10-4        99.6          99.7          +0.10%
      Benchmark80Range/100-4       987           985           -0.20%
      Benchmark80Range/1000-4      10573         10560         -0.12%
      Benchmark80Range/10000-4     106792        106639        -0.14%
      
      For runtime's BenchCopyFat* benchmarks:
      CopyFat8-4     0.40ns ± 0%  0.40ns ± 0%      ~     (all equal)
      CopyFat12-4    0.40ns ± 0%  0.80ns ± 0%  +100.00%  (p=0.000 n=9+9)
      CopyFat16-4    0.40ns ± 0%  0.80ns ± 0%  +100.00%  (p=0.000 n=10+8)
      CopyFat24-4    0.80ns ± 0%  0.40ns ± 0%   -50.00%  (p=0.001 n=8+9)
      CopyFat32-4    2.01ns ± 0%  0.40ns ± 0%   -80.10%  (p=0.000 n=8+8)
      CopyFat64-4    2.87ns ± 0%  0.40ns ± 0%   -86.07%  (p=0.000 n=8+10)
      CopyFat128-4   4.82ns ± 0%  4.82ns ± 0%      ~     (p=1.000 n=8+8)
      CopyFat256-4   8.83ns ± 0%  8.83ns ± 0%      ~     (p=1.000 n=8+8)
      CopyFat512-4   16.9ns ± 0%  16.9ns ± 0%      ~     (all equal)
      CopyFat520-4   14.6ns ± 0%  14.6ns ± 1%      ~     (p=0.529 n=8+9)
      CopyFat1024-4  32.9ns ± 0%  33.0ns ± 0%    +0.20%  (p=0.041 n=8+9)
      
      Function calls do not benefit as much due to how they are compiled, but
      other benchmarks I ran show that calling a function with 64 byte elements
      is marginally improved.
      
      The main downside with this change is that it may increase binary sizes
      depending on the size of the copy, but this change also decreases
      binaries for moves of 48 bytes or less.
      
      For the following code:
      package main
      
      type size [32]byte
      
      //go:noinline
      func use(t size) {
      }
      
      //go:noinline
      func get() size {
      	var z size
      	return z
      }
      
      func main() {
      	var a size
      	use(a)
      }
      
      Changing size around gives the following assembly leading up to the call
      (the initialization and actual call are removed):
      
      tip func call with 32B arg: 27B
          48 89 e7                 mov    %rsp,%rdi
          48 8d 74 24 20           lea    0x20(%rsp),%rsi
          48 89 6c 24 f0           mov    %rbp,-0x10(%rsp)
          48 8d 6c 24 f0           lea    -0x10(%rsp),%rbp
          e8 53 ab ff ff           callq  448964 <runtime.duffcopy+0x364>
          48 8b 6d 00              mov    0x0(%rbp),%rbp
      
      modified: 19B (-8B)
          0f 10 44 24 20           movups 0x20(%rsp),%xmm0
          0f 11 04 24              movups %xmm0,(%rsp)
          0f 10 44 24 30           movups 0x30(%rsp),%xmm0
          0f 11 44 24 10           movups %xmm0,0x10(%rsp)
      -
      tip with 47B arg: 29B
          48 8d 7c 24 0f           lea    0xf(%rsp),%rdi
          48 8d 74 24 40           lea    0x40(%rsp),%rsi
          48 89 6c 24 f0           mov    %rbp,-0x10(%rsp)
          48 8d 6c 24 f0           lea    -0x10(%rsp),%rbp
          e8 43 ab ff ff           callq  448964 <runtime.duffcopy+0x364>
          48 8b 6d 00              mov    0x0(%rbp),%rbp
      
      modified: 20B (-9B)
          0f 10 44 24 40           movups 0x40(%rsp),%xmm0
          0f 11 44 24 0f           movups %xmm0,0xf(%rsp)
          0f 10 44 24 50           movups 0x50(%rsp),%xmm0
          0f 11 44 24 1f           movups %xmm0,0x1f(%rsp)
      -
      tip with 64B arg: 27B
          48 89 e7                 mov    %rsp,%rdi
          48 8d 74 24 40           lea    0x40(%rsp),%rsi
          48 89 6c 24 f0           mov    %rbp,-0x10(%rsp)
          48 8d 6c 24 f0           lea    -0x10(%rsp),%rbp
          e8 1f ab ff ff           callq  448948 <runtime.duffcopy+0x348>
          48 8b 6d 00              mov    0x0(%rbp),%rbp
      
      modified: 39B [+12B]
          0f 10 44 24 40           movups 0x40(%rsp),%xmm0
          0f 11 04 24              movups %xmm0,(%rsp)
          0f 10 44 24 50           movups 0x50(%rsp),%xmm0
          0f 11 44 24 10           movups %xmm0,0x10(%rsp)
          0f 10 44 24 60           movups 0x60(%rsp),%xmm0
          0f 11 44 24 20           movups %xmm0,0x20(%rsp)
          0f 10 44 24 70           movups 0x70(%rsp),%xmm0
          0f 11 44 24 30           movups %xmm0,0x30(%rsp)
      -
      tip with 79B arg: 29B
          48 8d 7c 24 0f           lea    0xf(%rsp),%rdi
          48 8d 74 24 60           lea    0x60(%rsp),%rsi
          48 89 6c 24 f0           mov    %rbp,-0x10(%rsp)
          48 8d 6c 24 f0           lea    -0x10(%rsp),%rbp
          e8 09 ab ff ff           callq  448948 <runtime.duffcopy+0x348>
          48 8b 6d 00              mov    0x0(%rbp),%rbp
      
      modified: 46B [+17B]
          0f 10 44 24 60           movups 0x60(%rsp),%xmm0
          0f 11 44 24 0f           movups %xmm0,0xf(%rsp)
          0f 10 44 24 70           movups 0x70(%rsp),%xmm0
          0f 11 44 24 1f           movups %xmm0,0x1f(%rsp)
          0f 10 84 24 80 00 00     movups 0x80(%rsp),%xmm0
          00
          0f 11 44 24 2f           movups %xmm0,0x2f(%rsp)
          0f 10 84 24 90 00 00     movups 0x90(%rsp),%xmm0
          00
          0f 11 44 24 3f           movups %xmm0,0x3f(%rsp)
      
      So, at best we save 9B, at worst we gain 17. I do not think that copying
      around 65+B sized types is common enough to bloat program sizes. Using
      bincmp on the go binary itself shows a zero byte difference; there are
      gains and losses all over. One of the largest gains in binary size comes
      from cmd/go/internal/cache.(*Cache).Get, which passes around a 64 byte
      sized type -- this is one of the cases I would expect to be benefitted
      by this change.
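
      The byte totals and deltas quoted above can be re-derived by summing the
      encoded length of each instruction in the listings. The lengths below
      are read off the hex dumps in this message; the type and function names
      are made up for the example.

```go
package main

import "fmt"

// listing holds the per-instruction encoded lengths, in bytes, read off the
// disassembly in this commit message for one argument size.
type listing struct {
	name     string
	tip, mod []int
}

var listings = []listing{
	{"32B", []int{3, 5, 5, 5, 5, 4}, []int{5, 4, 5, 5}},
	{"47B", []int{5, 5, 5, 5, 5, 4}, []int{5, 5, 5, 5}},
	{"64B", []int{3, 5, 5, 5, 5, 4}, []int{5, 4, 5, 5, 5, 5, 5, 5}},
	{"79B", []int{5, 5, 5, 5, 5, 4}, []int{5, 5, 5, 5, 8, 5, 8, 5}},
}

// total sums the encoded lengths of one listing.
func total(lengths []int) int {
	sum := 0
	for _, n := range lengths {
		sum += n
	}
	return sum
}

// delta returns modified size minus tip size for the named listing,
// or 0 if the name is unknown.
func delta(name string) int {
	for _, l := range listings {
		if l.name == name {
			return total(l.mod) - total(l.tip)
		}
	}
	return 0
}

func main() {
	for _, l := range listings {
		fmt.Printf("%s: tip %dB, modified %dB, delta %+dB\n",
			l.name, total(l.tip), total(l.mod), delta(l.name))
	}
}
```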
      
      I think that this marginal improvement in struct copying for 64 byte
      structs is worth it: most data structs / work items I use in my programs
      are small, but few are smaller than 32 bytes: with one slice, the budget
      is up. The 32 rule alone would allow another 16 bytes, the 48 and 64
      rules allow another 32 and 48.
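
      One reading of the "with one slice, the budget is up" remark is that a
      slice header alone takes three machine words, so a struct holding a
      slice and one more word-sized field already reaches 32 bytes on a
      64-bit platform. The sketch below checks that arithmetic; workItem is a
      hypothetical type, and the 24 and 32 byte figures assume 8-byte words.

```go
package main

import (
	"fmt"
	"unsafe"
)

// workItem is a hypothetical small work item: one slice plus one counter.
type workItem struct {
	payload []byte
	retries int
}

// wordSize returns the size of a machine word on the current platform.
func wordSize() uintptr { return unsafe.Sizeof(uintptr(0)) }

// sliceHeaderSize returns the size of a slice value (pointer, len, cap).
func sliceHeaderSize() uintptr { return unsafe.Sizeof([]byte(nil)) }

// workItemSize returns the size of the example struct.
func workItemSize() uintptr { return unsafe.Sizeof(workItem{}) }

func main() {
	fmt.Println(wordSize(), sliceHeaderSize(), workItemSize())
}
```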
      
      Change-Id: I19a8f9190d5d41825091f17f268f4763bfc12a62
      Reviewed-on: https://go-review.googlesource.com/100718
      Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
      Reviewed-by: Keith Randall <khr@golang.org>
      4f7b7748
    • Alberto Donizetti's avatar
      test/codegen: port direct comparisons with memory tests · fc6280d4
      Alberto Donizetti authored
      And remove them from asm_test.
      
      Change-Id: I1ca29b40546d6de06f20bfd550ed8ff87f495454
      Reviewed-on: https://go-review.googlesource.com/102115
      Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      fc6280d4
    • Carlos Eduardo Seo's avatar
      cmd/compile/internal/ppc64, runtime internal/atomic, sync/atomic: implement faster atomics for ppc64x · 6633bb2a
      Carlos Eduardo Seo authored
      
      This change implements faster atomics for ppc64x based on the ISA 2.07B,
      Appendix B.2 recommendations, replacing SYNC/ISYNC by LWSYNC in some
      cases.
      
      Updates #21348
      
      name                                           old time/op new time/op    delta
      Cond1-16                                           955ns     856ns      -10.33%
      Cond2-16                                          2.38µs    2.03µs      -14.59%
      Cond4-16                                          5.90µs    5.44µs       -7.88%
      Cond8-16                                          12.1µs    11.1µs       -8.42%
      Cond16-16                                         27.0µs    25.1µs       -7.04%
      Cond32-16                                         59.1µs    55.5µs       -6.14%
      LoadMostlyHits/*sync_test.DeepCopyMap-16          22.1ns    24.1ns       +9.02%
      LoadMostlyHits/*sync_test.RWMutexMap-16            252ns     249ns       -1.20%
      LoadMostlyHits/*sync.Map-16                       16.2ns    16.3ns         ~
      LoadMostlyMisses/*sync_test.DeepCopyMap-16        22.3ns    22.6ns         ~
      LoadMostlyMisses/*sync_test.RWMutexMap-16          249ns     247ns       -0.51%
      LoadMostlyMisses/*sync.Map-16                     12.7ns    12.7ns         ~
      LoadOrStoreBalanced/*sync_test.RWMutexMap-16      1.27µs    1.17µs       -7.54%
      LoadOrStoreBalanced/*sync.Map-16                  1.12µs    1.10µs       -2.35%
      LoadOrStoreUnique/*sync_test.RWMutexMap-16        1.75µs    1.68µs       -3.84%
      LoadOrStoreUnique/*sync.Map-16                    2.07µs    1.97µs       -5.13%
      LoadOrStoreCollision/*sync_test.DeepCopyMap-16    15.8ns    15.9ns         ~
      LoadOrStoreCollision/*sync_test.RWMutexMap-16      496ns     424ns      -14.48%
      LoadOrStoreCollision/*sync.Map-16                 6.07ns    6.07ns         ~
      Range/*sync_test.DeepCopyMap-16                   1.65µs    1.64µs         ~
      Range/*sync_test.RWMutexMap-16                     278µs     288µs       +3.75%
      Range/*sync.Map-16                                2.00µs    2.01µs         ~
      AdversarialAlloc/*sync_test.DeepCopyMap-16        3.45µs    3.44µs         ~
      AdversarialAlloc/*sync_test.RWMutexMap-16          226ns     227ns         ~
      AdversarialAlloc/*sync.Map-16                     1.09µs    1.07µs       -2.36%
      AdversarialDelete/*sync_test.DeepCopyMap-16        553ns     550ns       -0.57%
      AdversarialDelete/*sync_test.RWMutexMap-16         273ns     274ns         ~
      AdversarialDelete/*sync.Map-16                     247ns     249ns         ~
      UncontendedSemaphore-16                           79.0ns    65.5ns      -17.11%
      ContendedSemaphore-16                              112ns      97ns      -13.77%
      MutexUncontended-16                               3.34ns    2.51ns      -24.69%
      Mutex-16                                           266ns     191ns      -28.26%
      MutexSlack-16                                      226ns     159ns      -29.55%
      MutexWork-16                                       377ns     338ns      -10.14%
      MutexWorkSlack-16                                  335ns     308ns       -8.20%
      MutexNoSpin-16                                     196ns     184ns       -5.91%
      MutexSpin-16                                       710ns     666ns       -6.21%
      Once-16                                           1.29ns    1.29ns         ~
      Pool-16                                           8.64ns    8.71ns         ~
      PoolOverflow-16                                   1.60µs    1.44µs      -10.25%
      SemaUncontended-16                                5.39ns    4.42ns      -17.96%
      SemaSyntNonblock-16                                539ns     483ns      -10.42%
      SemaSyntBlock-16                                   413ns     354ns      -14.20%
      SemaWorkNonblock-16                                305ns     258ns      -15.36%
      SemaWorkBlock-16                                   266ns     229ns      -14.06%
      RWMutexUncontended-16                             12.9ns     9.7ns      -24.80%
      RWMutexWrite100-16                                 203ns     147ns      -27.47%
      RWMutexWrite10-16                                  177ns     119ns      -32.74%
      RWMutexWorkWrite100-16                             435ns     403ns       -7.39%
      RWMutexWorkWrite10-16                              642ns     611ns       -4.79%
      WaitGroupUncontended-16                           4.67ns    3.70ns      -20.92%
      WaitGroupAddDone-16                                402ns     355ns      -11.54%
      WaitGroupAddDoneWork-16                            208ns     250ns      +20.09%
      WaitGroupWait-16                                  1.21ns    1.21ns         ~
      WaitGroupWaitWork-16                              5.91ns    5.87ns       -0.81%
      WaitGroupActuallyWait-16                          92.2ns    85.8ns       -6.91%
      
      Updates #21348
      
      Change-Id: Ibb9b271d11b308264103829e176c6d9fe8f867d3
      Reviewed-on: https://go-review.googlesource.com/95175
      Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
      6633bb2a
    • Giovanni Bajo's avatar
      doc: first version of new contribute guide · a3d83269
      Giovanni Bajo authored
      I've reorganized the guide and rewritten large sections.
      
      The structure is now clearer and more logical, and can
      be understood and navigated using the summary displayed at
      the top of the page (before, the summary was confusing because
      the guide contained H1s that were being ignored by the summary).
      
      Both the initial onboarding process and the Gerrit
      change submission process have been reworked to
      include a concise checklist of steps that can be
      read and understood in a few seconds, for people
      that don't want or need to bother with details.
      More in-depth descriptions have been moved into
      separate sections, one per checklist step.
      This is by far the biggest improvement, as the previous
      approach of having to read several pages just to understand
      the required steps was very intimidating for beginners, in
      addition to being harder to navigate.
      
      GitHub pull requests have been integrated as a different
      way to submit a change, suggested for first time contributors.
      
      The review process has been described in more detail,
      documenting the workflow and the conventions used.
      
      Most miscellanea have been moved into an "advanced
      topics" chapter.
      
      Paragraphs describing how to use git have been removed
      to simplify reading. This guide should focus on Go contribution
      rather than on helping users get familiar with git, for which many
      guides are available.
      
      Change-Id: I6f4b76583c9878b230ba1d0225745a1708fad2e8
      Reviewed-on: https://go-review.googlesource.com/93495
      Reviewed-by: Rob Pike <r@golang.org>
      a3d83269
  6. 21 Mar, 2018 1 commit
    • Ilya Tocar's avatar
      compress/bzip2: remove bit-tricks · 9eb21948
      Ilya Tocar authored
      Since the compiler is now able to generate conditional moves, we can replace
      bit-tricks with simple if/else. This even results in slightly better performance:
      
      name            old time/op    new time/op    delta
      DecodeDigits-6    13.4ms ± 4%    13.0ms ± 2%  -2.63%  (p=0.003 n=10+10)
      DecodeTwain-6     37.5ms ± 1%    36.3ms ± 1%  -3.03%  (p=0.000 n=10+9)
      DecodeRand-6      4.23ms ± 1%    4.07ms ± 1%  -3.67%  (p=0.000 n=10+9)
      
      name            old speed      new speed      delta
      DecodeDigits-6  7.47MB/s ± 4%  7.67MB/s ± 2%  +2.69%  (p=0.002 n=10+10)
      DecodeTwain-6   10.4MB/s ± 1%  10.7MB/s ± 1%  +3.25%  (p=0.000 n=10+8)
      DecodeRand-6    3.87MB/s ± 1%  4.03MB/s ± 2%  +4.08%  (p=0.000 n=10+10)
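
      The kind of rewrite described above can be sketched generically as
      follows. This is not the code from huffman.go: the function names and
      the mask-based formulation are assumptions chosen to illustrate a
      branchless select next to its if/else equivalent.

```go
package main

import "fmt"

// selectMasked picks l when bit is 1 and r when bit is 0 without branching:
// -bit is all ones for bit == 1 and zero for bit == 0.
func selectMasked(bit, l, r uint16) uint16 {
	mask := -bit
	return (l & mask) | (r &^ mask)
}

// selectBranch expresses the same choice as a plain if/else, which a
// compiler that emits conditional moves can turn into branch-free code.
func selectBranch(bit, l, r uint16) uint16 {
	if bit == 1 {
		return l
	}
	return r
}

func main() {
	for _, bit := range []uint16{0, 1} {
		fmt.Println(selectMasked(bit, 7, 9), selectBranch(bit, 7, 9))
	}
}
```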
      diff --git a/src/compress/bzip2/huffman.go b/src/compress/bzip2/huffman.go
      
      Change-Id: Ie96ef1a9e07013b07e78f22cdccd531f3341caca
      Reviewed-on: https://go-review.googlesource.com/102015
      Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
      Reviewed-by: Joe Tsai <joetsai@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      9eb21948