1. 17 Feb, 2018 6 commits
  2. 16 Feb, 2018 7 commits
  3. 15 Feb, 2018 27 commits
    • Balaram Makam's avatar
      cmd/compile: arm64 intrinsics for math/bits.OnesCount · fcba0514
      Balaram Makam authored
      This adds math/bits intrinsics for OnesCount on arm64.
      
      name         old time/op  new time/op  delta
      OnesCount    3.81ns ± 0%  1.60ns ± 0%  -57.96%  (p=0.000 n=7+8)
      OnesCount8   1.60ns ± 0%  1.60ns ± 0%     ~     (all equal)
      OnesCount16  2.41ns ± 0%  1.60ns ± 0%  -33.61%  (p=0.000 n=8+8)
      OnesCount32  4.17ns ± 0%  1.60ns ± 0%  -61.58%  (p=0.000 n=8+8)
      OnesCount64  3.80ns ± 0%  1.60ns ± 0%  -57.84%  (p=0.000 n=8+8)
      
      Update #18616
      
      Conflicts:
      	src/cmd/compile/internal/gc/asm_test.go
      
      Change-Id: I63ac2f63acafdb1f60656ab8a56be0b326eec5cb
      Reviewed-on: https://go-review.googlesource.com/90835
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      fcba0514
    • Matthew Dempsky's avatar
      cmd/compile/internal/gc: use functype instead of OTFUNC · c26fac88
      Matthew Dempsky authored
      Slightly simpler.
      
      Change-Id: Ic3a96675c56cc8c2e336b932536c2247f8cbb96d
      Reviewed-on: https://go-review.googlesource.com/39996
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      Reviewed-by: Robert Griesemer <gri@golang.org>
      c26fac88
    • Austin Clements's avatar
      runtime: replace _MaxMem with maxAlloc · d7691d05
      Austin Clements authored
      Now that we have memLimit, also having _MaxMem is a bit confusing.
      
      Replace it with maxAlloc, which better conveys what it limits. We also
      define maxAlloc slightly differently: since it's now clear that it
      limits allocation size, we can account for a subtle difference between
      32-bit and 64-bit.
      
      Change-Id: Iac39048018cc0dae7f0919e25185fee4b3eed529
      Reviewed-on: https://go-review.googlesource.com/85890
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      d7691d05
    • Austin Clements's avatar
      runtime: move comment about address space sizes to malloc.go · 90666b8a
      Austin Clements authored
      Currently there's a detailed comment in lfstack_64bit.go about address
      space limitations on various architectures. Since that's now relevant
      to malloc, move it to a more prominent place in the documentation for
      memLimitBits.
      
      Updates #10460.
      
      Change-Id: If9708291cf3a288057b8b3ba0ba6a59e3602bbd6
      Reviewed-on: https://go-review.googlesource.com/85889
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      90666b8a
    • Austin Clements's avatar
      runtime: remove non-reserved heap logic · 51ae88ee
      Austin Clements authored
      Currently large sysReserve calls on some OSes don't actually reserve
      the memory, but just check that it can be reserved. This was important
      when we called sysReserve to "reserve" many gigabytes for the heap up
      front, but now that we map memory in small increments as we need it,
      this complication is no longer necessary.
      
      This has one curious side benefit: currently, on Linux, allocations
      that are large enough to be rejected by mmap wind up freezing the
      application for a long time before it panics. This happens because
      sysReserve doesn't reserve the memory, so sysMap calls mmap_fixed,
      which calls mmap, which fails because the mapping is too large.
      However, mmap_fixed doesn't inspect *why* mmap fails, so it falls back
      to probing every page in the desired region individually with mincore
      before performing an (otherwise dangerous) MAP_FIXED mapping, which
      will also fail. This takes a long time for a large region. Now this
      logic is gone, so the mmap failure leads to an immediate panic.
      
      Updates #10460.
      
      Change-Id: I8efe88c611871cdb14f99fadd09db83e0161ca2e
      Reviewed-on: https://go-review.googlesource.com/85888
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      51ae88ee
    • Austin Clements's avatar
      runtime: use sparse mappings for the heap · 2b415549
      Austin Clements authored
      This replaces the contiguous heap arena mapping with a potentially
      sparse mapping that can support heap mappings anywhere in the address
      space.
      
      This has several advantages over the current approach:
      
      * There is no longer any limit on the size of the Go heap. (Currently
        it's limited to 512GB.) Hence, this fixes #10460.
      
      * It eliminates many failure modes of heap initialization and
        growing. In particular it eliminates any possibility of panicking
        with an address space conflict. This can happen for many reasons and
        even causes a low but steady rate of TSAN test failures because of
        conflicts with the TSAN runtime. See #16936 and #11993.
      
      * It eliminates the notion of "non-reserved" heap, which was added
        because creating huge address space reservations (particularly on
        64-bit) led to huge process VSIZE. This was at best confusing and at
        worst conflicted badly with ulimit -v. However, the non-reserved
        heap logic is complicated, can race with other mappings in non-pure
        Go binaries (e.g., #18976), and requires that the entire heap be
        either reserved or non-reserved. We currently maintain the latter
        property, but it's quite difficult to convince yourself of that, and
        hence difficult to keep correct. This logic is still present, but
        will be removed in the next CL.
      
      * It fixes problems on 32-bit where skipping over parts of the address
        space leads to mapping huge (and never-to-be-used) metadata
        structures. See #19831.
      
      This also completely rewrites and significantly simplifies
      mheap.sysAlloc, which has been a source of many bugs. E.g., #21044,
       #20259, #18651, and #13143 (and maybe #23222).
      
      This change also makes it possible to allocate individual objects
      larger than 512GB. As a result, a few tests that expected huge
      allocations to fail needed to be changed to make even larger
      allocations. However, at the moment attempting to allocate a humongous
      object may cause the program to freeze for several minutes on Linux as
      we fall back to probing every page with addrspace_free. That logic
      (and this failure mode) will be removed in the next CL.
      
      Fixes #10460.
      Fixes #22204 (since it rewrites the code involved).
      
      This slightly slows down compilebench and the x/benchmarks garbage
      benchmark.
      
      name       old time/op     new time/op     delta
      Template       184ms ± 1%      185ms ± 1%    ~     (p=0.065 n=10+9)
      Unicode       86.9ms ± 3%     86.3ms ± 1%    ~     (p=0.631 n=10+10)
      GoTypes        599ms ± 0%      602ms ± 0%  +0.56%  (p=0.000 n=10+9)
      Compiler       2.87s ± 1%      2.89s ± 1%  +0.51%  (p=0.002 n=9+10)
      SSA            7.29s ± 1%      7.25s ± 1%    ~     (p=0.182 n=10+9)
      Flate          118ms ± 2%      118ms ± 1%    ~     (p=0.113 n=9+9)
      GoParser       147ms ± 1%      148ms ± 1%  +1.07%  (p=0.003 n=9+10)
      Reflect        401ms ± 1%      404ms ± 1%  +0.71%  (p=0.003 n=10+9)
      Tar            175ms ± 1%      175ms ± 1%    ~     (p=0.604 n=9+10)
      XML            209ms ± 1%      210ms ± 1%    ~     (p=0.052 n=10+10)
      
      (https://perf.golang.org/search?q=upload:20171231.4)
      
      name                       old time/op  new time/op  delta
      Garbage/benchmem-MB=64-12  2.23ms ± 1%  2.25ms ± 1%  +0.84%  (p=0.000 n=19+19)
      
      (https://perf.golang.org/search?q=upload:20171231.3)
      
      Relative to the start of the sparse heap changes (starting at and
      including "runtime: fix various contiguous bitmap assumptions"),
      overall slowdown is roughly 1% on GC-intensive benchmarks:
      
      name        old time/op     new time/op     delta
      Template        183ms ± 1%      185ms ± 1%  +1.32%  (p=0.000 n=9+9)
      Unicode        84.9ms ± 2%     86.3ms ± 1%  +1.65%  (p=0.000 n=9+10)
      GoTypes         595ms ± 1%      602ms ± 0%  +1.19%  (p=0.000 n=9+9)
      Compiler        2.86s ± 0%      2.89s ± 1%  +0.91%  (p=0.000 n=9+10)
      SSA             7.19s ± 0%      7.25s ± 1%  +0.75%  (p=0.000 n=8+9)
      Flate           117ms ± 1%      118ms ± 1%  +1.10%  (p=0.000 n=10+9)
      GoParser        146ms ± 2%      148ms ± 1%  +1.48%  (p=0.002 n=10+10)
      Reflect         398ms ± 1%      404ms ± 1%  +1.51%  (p=0.000 n=10+9)
      Tar             173ms ± 1%      175ms ± 1%  +1.17%  (p=0.000 n=10+10)
      XML             208ms ± 1%      210ms ± 1%  +0.62%  (p=0.011 n=10+10)
      [Geo mean]      369ms           373ms       +1.17%
      
      (https://perf.golang.org/search?q=upload:20180101.2)
      
      name                       old time/op  new time/op  delta
      Garbage/benchmem-MB=64-12  2.22ms ± 1%  2.25ms ± 1%  +1.51%  (p=0.000 n=20+19)
      
      (https://perf.golang.org/search?q=upload:20180101.3)
      
      Change-Id: I5daf4cfec24b252e5a57001f0a6c03f22479d0f0
      Reviewed-on: https://go-review.googlesource.com/85887
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      2b415549
    • Austin Clements's avatar
      runtime: eliminate most uses of mheap_.arena_* · 45ffeab5
      Austin Clements authored
      This replaces all uses of the mheap_.arena_* fields outside of
      mallocinit and sysAlloc. These fields fundamentally assume a
      contiguous heap between two bounds, so eliminating these is necessary
      for a sparse heap.
      
      Many of these are replaced with checks for non-nil spans at the test
      address (which in turn checks for a non-nil entry in the heap arena
      array). Some of them are just for debugging and somewhat meaningless
      with a sparse heap, so those we just delete.
      
      Updates #10460.
      
      Change-Id: I8345b95ffc610aed694f08f74633b3c63506a41f
      Reviewed-on: https://go-review.googlesource.com/85886
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      45ffeab5
    • Austin Clements's avatar
      runtime: make span map sparse · d6e82185
      Austin Clements authored
      This splits the span map into separate chunks for every 64MB of the
      heap. The span map chunks now live in the same indirect structure as
      the bitmap.
      
      Updates #10460.
      
      This causes a slight improvement in compilebench and the x/benchmarks
      garbage benchmark. I'm not sure why it improves performance.
      
      name       old time/op     new time/op     delta
      Template       185ms ± 1%      184ms ± 1%    ~            (p=0.315 n=9+10)
      Unicode       86.9ms ± 1%     86.9ms ± 3%    ~            (p=0.356 n=9+10)
      GoTypes        602ms ± 1%      599ms ± 0%  -0.59%         (p=0.002 n=9+10)
      Compiler       2.89s ± 0%      2.87s ± 1%  -0.50%          (p=0.003 n=9+9)
      SSA            7.25s ± 0%      7.29s ± 1%    ~            (p=0.400 n=9+10)
      Flate          118ms ± 1%      118ms ± 2%    ~            (p=0.065 n=10+9)
      GoParser       147ms ± 2%      147ms ± 1%    ~            (p=0.549 n=10+9)
      Reflect        403ms ± 1%      401ms ± 1%  -0.47%         (p=0.035 n=9+10)
      Tar            176ms ± 1%      175ms ± 1%  -0.59%         (p=0.013 n=10+9)
      XML            211ms ± 1%      209ms ± 1%  -0.83%        (p=0.011 n=10+10)
      
      (https://perf.golang.org/search?q=upload:20171231.1)
      
      name                       old time/op  new time/op  delta
      Garbage/benchmem-MB=64-12  2.24ms ± 1%  2.23ms ± 1%  -0.36%  (p=0.001 n=20+19)
      
      (https://perf.golang.org/search?q=upload:20171231.2)
      
      Change-Id: I2563f8704ab9812434947faf293c5327f9b0d07a
      Reviewed-on: https://go-review.googlesource.com/85885
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      d6e82185
    • Austin Clements's avatar
      runtime: abstract remaining mheap.spans access · 0de5324d
      Austin Clements authored
      This abstracts the remaining direct accesses to mheap.spans into new
      mheap.setSpan and mheap.setSpans methods.
      
      For #10460.
      
      Change-Id: Id1db8bc5e34a77a9221032aa2e62d05322707364
      Reviewed-on: https://go-review.googlesource.com/85884
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      0de5324d
    • Austin Clements's avatar
      runtime: make the heap bitmap sparse · c0392d2e
      Austin Clements authored
      This splits the heap bitmap into separate chunks for every 64MB of the
      heap and introduces an index mapping from virtual address to metadata.
      It modifies the heapBits abstraction to use this two-level structure.
      Finally, it modifies heapBitsSetType to unroll the bitmap into the
      object itself and then copy it out if the bitmap would span
      discontiguous bitmap chunks.
      
      This is a step toward supporting general sparse heaps, which will
      eliminate address space conflict failures as well as the limit on the
      heap size.
      
      It's also advantageous for 32-bit. 32-bit already supports
      discontiguous heaps by always starting the arena at address 0.
      However, as a result, with a contiguous bitmap, if the kernel chooses
      a high address (near 2GB) for a heap mapping, the runtime is forced to
      map up to 128MB of heap bitmap. Now the runtime can map sections of
      the bitmap for just the parts of the address space used by the heap.
      
      Updates #10460.
      
      This slightly slows down the x/garbage and compilebench benchmarks.
      However, I think the slowdown is acceptably small.
      
      name        old time/op     new time/op     delta
      Template        178ms ± 1%      180ms ± 1%  +0.78%    (p=0.029 n=10+10)
      Unicode        85.7ms ± 2%     86.5ms ± 2%    ~       (p=0.089 n=10+10)
      GoTypes         594ms ± 0%      599ms ± 1%  +0.70%    (p=0.000 n=9+9)
      Compiler        2.86s ± 0%      2.87s ± 0%  +0.40%    (p=0.001 n=9+9)
      SSA             7.23s ± 2%      7.29s ± 2%  +0.94%    (p=0.029 n=10+10)
      Flate           116ms ± 1%      117ms ± 1%  +0.99%    (p=0.000 n=9+9)
      GoParser        146ms ± 1%      146ms ± 0%    ~       (p=0.193 n=10+7)
      Reflect         399ms ± 0%      403ms ± 1%  +0.89%    (p=0.001 n=10+10)
      Tar             173ms ± 1%      174ms ± 1%  +0.91%    (p=0.013 n=10+9)
      XML             208ms ± 1%      210ms ± 1%  +0.93%    (p=0.000 n=10+10)
      [Geo mean]      368ms           371ms       +0.79%
      
      name                       old time/op  new time/op  delta
      Garbage/benchmem-MB=64-12  2.17ms ± 1%  2.21ms ± 1%  +2.15%  (p=0.000 n=20+20)
      
      Change-Id: I037fd283221976f4f61249119d6b97b100bcbc66
      Reviewed-on: https://go-review.googlesource.com/85883
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      c0392d2e
    • Austin Clements's avatar
      runtime: fix various contiguous bitmap assumptions · f61057c4
      Austin Clements authored
      There are various places that assume the heap bitmap is contiguous and
      scan it sequentially. We're about to split up the heap bitmap. This
      commit modifies all of these except heapBitsSetType to use the
      heapBits abstractions so they can transparently switch to a
      discontiguous bitmap.
      
      Updates #10460. This is a step toward supporting sparse heaps.
      
      Change-Id: I2f3994a5785e4dccb66602fb3950bbd290d9392c
      Reviewed-on: https://go-review.googlesource.com/85882
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      f61057c4
    • Austin Clements's avatar
      runtime: lay out heap bitmap forward in memory · 29e9c4d4
      Austin Clements authored
      Currently the heap bitmap is laid out in reverse order in memory relative
      to the heap itself. This was originally done out of "excessive
      cleverness" so that computing a bitmap pointer could load only the
      arena_start field and so that heaps could be more contiguous by
      growing the arena and the bitmap out from a common center point.
      
      However, this appears to have no actual performance benefit; it
      complicates nearly every use of the bitmap, and it makes already
      confusing code more confusing. Furthermore, it's still possible to use
      a single field (the new bitmap_delta) for the bitmap pointer
      computation by employing slightly different excessive cleverness.
      
      Hence, this CL puts the bitmap into forward order.
      
      This is a (very) updated version of CL 9404.
      
      Change-Id: I743587cc626c4ecd81e660658bad85b54584108c
      Reviewed-on: https://go-review.googlesource.com/85881
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      29e9c4d4
    • Austin Clements's avatar
      runtime: use spanOf* more widely · 4de46862
      Austin Clements authored
      The logic in the spanOf* functions is open-coded in a lot of places
      right now. Replace these with calls to the spanOf* functions.
      
      Change-Id: I3cc996aceb9a529b60fea7ec6fef22008c012978
      Reviewed-on: https://go-review.googlesource.com/85880
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      4de46862
    • Austin Clements's avatar
      runtime: consolidate mheap.lookup* and spanOf* · a90f9a00
      Austin Clements authored
      I think we'd forgotten about the mheap.lookup APIs when we introduced
      spanOf*, but, at any rate, the spanOf* functions are used far more
      widely at this point, so this CL eliminates the mheap.lookup*
      functions in favor of spanOf*.
      
      Change-Id: I15facd0856e238bb75d990e838a092b5bef5bdfc
      Reviewed-on: https://go-review.googlesource.com/85879
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      a90f9a00
    • Austin Clements's avatar
      runtime: split object finding out of heapBitsForObject · 058bb7ea
      Austin Clements authored
      heapBitsForObject does two things: it finds the base of the object and
      it creates the heapBits for the base of the object. There are several
      places where we just care about the base of the object. Furthermore,
      greyobject only needs the heapBits in the checkmark path and can
      easily compute them only when needed. Once we eliminate passing the
      heap bits to grayobject, almost all uses of heapBitsForObject don't
      need the heap bits.
      
      Hence, this splits heapBitsForObject into findObject and
      heapBitsForAddr (the latter already exists), removes the hbits
      argument to grayobject, and replaces all heapBitsForObject calls with
      calls to findObject.
      
      In addition to making things cleaner overall, heapBitsForAddr is going
      to get more expensive shortly, so it's important that we don't do it
      needlessly.
      
      Note that there's an interesting performance pitfall here. I had
      originally moved findObject to mheap.go, since it made more sense
      there. However, that leads to a ~2% slow down and a whopping 11%
      increase in L1 icache misses on both the x/garbage and compilebench
      benchmarks. This suggests we may want to be more principled about
      this, but, for now, let's just leave findObject in mbitmap.go.
      
      (I tried to make findObject small enough to inline by splitting out
      the error case, but, sadly, wasn't quite able to get it under the
      inlining budget.)
      
      Change-Id: I7bcb92f383ade565d22a9f2494e4c66fd513fb10
      Reviewed-on: https://go-review.googlesource.com/85878
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      058bb7ea
    • Austin Clements's avatar
      runtime: replace mlookup and findObject with heapBitsForObject · 41e6abdc
      Austin Clements authored
      These functions all serve essentially the same purpose. mlookup is
      used in only one place and findObject in only three. Use
      heapBitsForObject instead, which is the most optimized implementation.
      
      (This may seem slightly silly because none of these uses care about
      the heap bits, but we're about to split up the functionality of
      heapBitsForObject anyway. At that point, findObject will rise from the
      ashes.)
      
      Change-Id: I906468c972be095dd23cf2404a7d4434e802f250
      Reviewed-on: https://go-review.googlesource.com/85877
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      41e6abdc
    • Austin Clements's avatar
      runtime: validate lfnode addresses · b1d94c11
      Austin Clements authored
      Change-Id: Ic8c506289caaf6218494e5150d10002e0232feaa
      Reviewed-on: https://go-review.googlesource.com/85876
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      b1d94c11
    • Austin Clements's avatar
      runtime: expand/update lfstack address space assumptions · 981d0495
      Austin Clements authored
      I was spelunking Linux's address space code and found that some of the
      information about maximum virtual addresses in lfstack's comments was
      out of date. This expands and updates the comment.
      
      Change-Id: I9f54b23e6b266b3c5cc20259a849231fb751f6e7
      Reviewed-on: https://go-review.googlesource.com/85875
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      981d0495
    • Chad Rosier's avatar
      cmd/compile: improve absorb shifts optimization for arm64 · 51932c32
      Chad Rosier authored
      The current absorb-shifts optimization can generate dead Value nodes, which
      increase the use count of other live nodes. This impacts other optimizations
      (such as combined loads) that are enabled based on specific use counts. This
      patch fixes the issue by decreasing the use count of nodes referenced by the
      dead Value nodes that the absorb-shifts optimization generates.
      
      Performance impacts on go1 benchmarks (data collected on A57@2GHzx8):
      
      name                     old time/op    new time/op    delta
      BinaryTree17-8              6.28s ± 2%     6.24s ± 1%     ~     (p=0.065 n=10+9)
      Fannkuch11-8                6.32s ± 0%     6.33s ± 0%   +0.17%  (p=0.000 n=10+10)
      FmtFprintfEmpty-8          98.9ns ± 0%    99.2ns ± 0%   +0.34%  (p=0.000 n=9+7)
      FmtFprintfString-8          183ns ± 1%     182ns ± 1%   -1.01%  (p=0.005 n=9+10)
      FmtFprintfInt-8             199ns ± 1%     202ns ± 1%   +1.41%  (p=0.000 n=10+9)
      FmtFprintfIntInt-8          272ns ± 1%     276ns ± 3%   +1.36%  (p=0.015 n=10+10)
      FmtFprintfPrefixedInt-8     367ns ± 1%     369ns ± 1%   +0.68%  (p=0.042 n=10+10)
      FmtFprintfFloat-8           491ns ± 1%     493ns ± 1%     ~     (p=0.064 n=10+10)
      FmtManyArgs-8              1.31µs ± 1%    1.32µs ± 1%   +0.39%  (p=0.042 n=8+9)
      GobDecode-8                17.0ms ± 2%    16.2ms ± 2%   -4.74%  (p=0.000 n=10+10)
      GobEncode-8                13.7ms ± 2%    13.4ms ± 1%   -2.40%  (p=0.000 n=10+9)
      Gzip-8                      844ms ± 0%     737ms ± 0%  -12.70%  (p=0.000 n=10+10)
      Gunzip-8                   84.4ms ± 1%    83.9ms ± 0%   -0.55%  (p=0.000 n=10+8)
      HTTPClientServer-8          122µs ± 1%     124µs ± 1%   +1.75%  (p=0.000 n=10+9)
      JSONEncode-8               34.9ms ± 1%    32.4ms ± 0%   -7.11%  (p=0.000 n=10+9)
      JSONDecode-8                150ms ± 0%     146ms ± 1%   -2.84%  (p=0.000 n=7+10)
      Mandelbrot200-8            10.0ms ± 0%    10.0ms ± 0%     ~     (p=0.529 n=10+10)
      GoParse-8                  8.18ms ± 1%    8.03ms ± 0%   -1.93%  (p=0.000 n=10+10)
      RegexpMatchEasy0_32-8       209ns ± 0%     209ns ± 0%     ~     (p=0.248 n=10+9)
      RegexpMatchEasy0_1K-8       789ns ± 1%     790ns ± 0%     ~     (p=0.361 n=10+10)
      RegexpMatchEasy1_32-8       202ns ± 0%     202ns ± 1%     ~     (p=0.137 n=8+10)
      RegexpMatchEasy1_1K-8      1.12µs ± 2%    1.12µs ± 1%     ~     (p=0.810 n=10+10)
      RegexpMatchMedium_32-8      298ns ± 0%     298ns ± 0%     ~     (p=0.443 n=10+9)
      RegexpMatchMedium_1K-8     83.0µs ± 5%    78.6µs ± 0%   -5.37%  (p=0.000 n=10+10)
      RegexpMatchHard_32-8       4.32µs ± 0%    4.26µs ± 0%   -1.47%  (p=0.000 n=10+10)
      RegexpMatchHard_1K-8        132µs ± 4%     126µs ± 0%   -4.41%  (p=0.000 n=10+9)
      Revcomp-8                   1.11s ± 0%     1.11s ± 0%   +0.14%  (p=0.017 n=10+9)
      Template-8                  155ms ± 1%     155ms ± 1%     ~     (p=0.796 n=10+10)
      TimeParse-8                 774ns ± 1%     785ns ± 1%   +1.41%  (p=0.001 n=10+10)
      TimeFormat-8                788ns ± 1%     806ns ± 1%   +2.24%  (p=0.000 n=10+9)
      
      name                     old speed      new speed      delta
      GobDecode-8              45.2MB/s ± 2%  47.5MB/s ± 2%   +4.96%  (p=0.000 n=10+10)
      GobEncode-8              56.0MB/s ± 2%  57.4MB/s ± 1%   +2.44%  (p=0.000 n=10+9)
      Gzip-8                   23.0MB/s ± 0%  26.3MB/s ± 0%  +14.55%  (p=0.000 n=10+10)
      Gunzip-8                  230MB/s ± 1%   231MB/s ± 0%   +0.55%  (p=0.000 n=10+8)
      JSONEncode-8             55.6MB/s ± 1%  59.9MB/s ± 0%   +7.65%  (p=0.000 n=10+9)
      JSONDecode-8             12.9MB/s ± 0%  13.3MB/s ± 1%   +2.94%  (p=0.000 n=7+10)
      GoParse-8                7.08MB/s ± 1%  7.22MB/s ± 0%   +1.95%  (p=0.000 n=10+10)
      RegexpMatchEasy0_32-8     153MB/s ± 0%   153MB/s ± 0%   -0.16%  (p=0.023 n=10+10)
      RegexpMatchEasy0_1K-8    1.30GB/s ± 1%  1.30GB/s ± 0%     ~     (p=0.393 n=10+10)
      RegexpMatchEasy1_32-8     158MB/s ± 0%   158MB/s ± 0%     ~     (p=0.684 n=10+10)
      RegexpMatchEasy1_1K-8     915MB/s ± 2%   918MB/s ± 1%     ~     (p=0.796 n=10+10)
      RegexpMatchMedium_32-8   3.35MB/s ± 0%  3.35MB/s ± 0%     ~     (p=1.000 n=10+9)
      RegexpMatchMedium_1K-8   12.3MB/s ± 5%  13.0MB/s ± 0%   +5.56%  (p=0.000 n=10+10)
      RegexpMatchHard_32-8     7.40MB/s ± 0%  7.51MB/s ± 0%   +1.50%  (p=0.000 n=10+10)
      RegexpMatchHard_1K-8     7.75MB/s ± 4%  8.10MB/s ± 0%   +4.52%  (p=0.000 n=10+8)
      Revcomp-8                 229MB/s ± 0%   228MB/s ± 0%   -0.14%  (p=0.017 n=10+9)
      Template-8               12.5MB/s ± 1%  12.5MB/s ± 1%     ~     (p=0.780 n=10+10)
      
      Change-Id: I103389f168eac79f6af44e8fef93acc2a7a4ac96
      Reviewed-on: https://go-review.googlesource.com/88415
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      51932c32
    • Than McIntosh's avatar
      compiler: honor //line directives in DWARF variable file/line attrs · b3cb740b
      Than McIntosh authored
      During DWARF debug generation, the DW_AT_decl_line / DW_AT_decl_file
      attributes for variable DIEs were being computed without taking into
      account the possibility of "//line" directives. Fix things up to use
      the correct src.Pos methods to pick up this info.
      
      Fixes #23704.
      
      Change-Id: I88c21a0e0a9602392be229252d856a6d665868e2
      Reviewed-on: https://go-review.googlesource.com/92255
      Run-TryBot: Than McIntosh <thanm@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Heschi Kreinick <heschi@google.com>
      b3cb740b
    • Hana Kim's avatar
      internal/trace: link user span start and end events · 1ae22d8c
      Hana Kim authored
      Also add testdata for version 1.11 including UserTaskSpan test trace.
      
      Change-Id: I673fb29bb3aee96a14fadc0ab860d4f5832143f5
      Reviewed-on: https://go-review.googlesource.com/93795
      Reviewed-by: Heschi Kreinick <heschi@google.com>
      1ae22d8c
    • zaq1tomo's avatar
      cmd/cgo: delete double space in comment · 864ac315
      zaq1tomo authored
      delete double space from comment
      
      Change-Id: I71af5c1149941575016f79a91269f128b1fc16af
      GitHub-Last-Rev: aba8874bd362d05d6c29c8647049369dfcd796f5
      GitHub-Pull-Request: golang/go#23851
      Reviewed-on: https://go-review.googlesource.com/94415
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      864ac315
    • Ian Lance Taylor's avatar
      debug/dwarf: formStrp uses a 64-bit value for 64-bit DWARF · ff3885dc
      Ian Lance Taylor authored
      No test as the only system I know that uses 64-bit DWARF is AIX.
      
      Change-Id: I24e225253075be188845656b6778993c2d24ebf5
      Reviewed-on: https://go-review.googlesource.com/84379
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
      ff3885dc
    • Hana Kim's avatar
      runtime/trace: implement annotation API · 6977a3b2
      Hana Kim authored
      This implements the annotation API proposed in golang.org/cl/63274.
      
      traceString is updated to protect the string map with trace.stringsLock
      because the assumption that traceString is called by a single goroutine
      (either at the beginning of tracing or at the end of tracing, when
      dumping all the symbols and function names) is no longer true.
      
      traceString is used by the annotation APIs (NewContext, StartSpan, Log)
      to register frequently appearing strings (task and span names, and log
      keys) after this change.
      
      NewContext -> one or two records (EvString, EvUserTaskCreate)
      end function -> one record (EvUserTaskEnd)
      StartSpan -> one or two records (EvString, EvUserSpan)
      span end function -> one or two records (EvString, EvUserSpan)
      Log -> one or two records (EvString, EvUserLog)
      
      EvUserLog record is of the typical record format written by traceEvent
      except that it is followed by bytes that represent the value string.
      
      In addition to runtime/trace change, this change includes
      corresponding changes in internal/trace to parse the new record types.
      
      Future work to improve efficiency:
        More efficient unique task ID generation (e.g., a per-P counter)
        instead of an atomic.
        Instead of a centralized trace.stringsLock, consider using per-P
        string cache or something more efficient.
      
      R=go1.11
      
      Change-Id: Iec9276c6c51e5be441ccd52dec270f1e3b153970
      Reviewed-on: https://go-review.googlesource.com/71690
      Reviewed-by: Austin Clements <austin@google.com>
      6977a3b2
    • Hana Kim's avatar
      runtime/trace: user annotation API · 32d1cd33
      Hana Kim authored
      This CL presents the proposed user annotation API skeleton.
      This CL bumps up the trace version to 1.11.
      
      Design doc https://goo.gl/iqJfJ3
      
      Implementation CLs are followed.
      
      The API introduces three basic building blocks. Log, Span, and Task.
      
      Log is for basic logging. When called, the message will be recorded
      to the trace along with timestamp, goroutine id, and stack info.
      
         trace.Log(ctx, messageType message)
      
      Span can be thought of as an extension of Log that records an
      interesting time interval during a goroutine's execution. A span is
      local to a goroutine by definition.
      
         trace.WithSpan(ctx, "doVeryExpensiveOp", func(ctx context) {
            /* do something very expensive */
         })
      
      Task is a higher-level concept that aids tracing of complex operations
      that encompass multiple goroutines or are asynchronous.
      For example, an RPC request, an HTTP request, a file write, or a
      batch job can be traced with a Task.
      
      Note we chose to design the API around context.Context so it allows
      easier integration with other tracing tools, often designed around
      context.Context as well. Log and WithSpan APIs recognize the task
      information embedded in the context and record it in the trace as
      well. That allows the Go execution tracer to associate and group
      the spans and log messages based on the task information.
      
      In order to create a Task,
      
         ctx, end := trace.NewContext(ctx, "myTask")
         defer end()
      
      The Go execution tracer measures the time between task creation and
      task end and reports it as the task latency.
      
      More discussion history in golang.org/cl/59572.
      
      Update #16619
      
      R=go1.11
      
      Change-Id: I59a937048294dafd23a75cf1723c6db461b193cd
      Reviewed-on: https://go-review.googlesource.com/63274
      Reviewed-by: Austin Clements <austin@google.com>
      32d1cd33
    • Michael Fraenkel's avatar
      cmd/compile: convert untyped bool for OIF and OFOR · e0576805
      Michael Fraenkel authored
      Updates #23834.
      
      Change-Id: I92aca9108590a0c7de774f4fad7ded97105e3cb8
      Reviewed-on: https://go-review.googlesource.com/94475
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
      e0576805
    • Carlos Eduardo Seo's avatar
      cmd/asm, cmd/internal/obj/ppc64: add Immediate Shifted opcodes for ppc64x · 9a9a8c01
      Carlos Eduardo Seo authored
      This change adds ADD/AND/OR/XOR Immediate Shifted instructions for
      ppc64x so they are usable in Go asm code. These instructions were
      originally present in asm9.go, but they were only usable in that
      file (as -AADD, -AANDCC, -AOR, -AXOR). These old mnemonics are now
      removed.
      
      Updates #23845
      
      Change-Id: Ifa2fac685e8bc628cb241dd446adfc3068181826
      Reviewed-on: https://go-review.googlesource.com/94115
      Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
      9a9a8c01