1. 10 Nov, 2015 16 commits
  2. 09 Nov, 2015 6 commits
  3. 08 Nov, 2015 10 commits
  4. 07 Nov, 2015 5 commits
  5. 06 Nov, 2015 3 commits
    • Ilya Tocar's avatar
      runtime: optimize indexbytebody on amd64 · 321a4072
      Ilya Tocar authored
      Use avx2 to compare 32 bytes per iteration.
      Results (haswell):
      
      name                    old time/op    new time/op     delta
      IndexByte32-6             15.5ns ± 0%     14.7ns ± 5%   -4.87%        (p=0.000 n=16+20)
      IndexByte4K-6              360ns ± 0%      183ns ± 0%  -49.17%        (p=0.000 n=19+20)
      IndexByte4M-6              384µs ± 0%      256µs ± 1%  -33.41%        (p=0.000 n=20+20)
      IndexByte64M-6            6.20ms ± 0%     4.18ms ± 1%  -32.52%        (p=0.000 n=19+20)
      IndexBytePortable32-6     73.4ns ± 5%     75.8ns ± 3%   +3.35%        (p=0.000 n=20+19)
      IndexBytePortable4K-6     5.15µs ± 0%     5.15µs ± 0%     ~     (all samples are equal)
      IndexBytePortable4M-6     5.26ms ± 0%     5.25ms ± 0%   -0.12%        (p=0.000 n=20+18)
      IndexBytePortable64M-6    84.1ms ± 0%     84.1ms ± 0%   -0.08%        (p=0.012 n=18+20)
      Index32-6                  352ns ± 0%      352ns ± 0%     ~     (all samples are equal)
      Index4K-6                 53.8µs ± 0%     53.8µs ± 0%   -0.03%        (p=0.000 n=16+18)
      Index4M-6                 55.4ms ± 0%     55.4ms ± 0%     ~           (p=0.149 n=20+19)
      Index64M-6                 886ms ± 0%      886ms ± 0%     ~           (p=0.108 n=20+20)
      IndexEasy32-6             80.3ns ± 0%     80.1ns ± 0%   -0.21%        (p=0.000 n=20+20)
      IndexEasy4K-6              426ns ± 0%      215ns ± 0%  -49.53%        (p=0.000 n=20+20)
      IndexEasy4M-6              388µs ± 0%      262µs ± 1%  -32.42%        (p=0.000 n=18+20)
      IndexEasy64M-6            6.20ms ± 0%     4.19ms ± 1%  -32.47%        (p=0.000 n=18+20)
      
      name                    old speed      new speed       delta
      IndexByte32-6           2.06GB/s ± 1%   2.17GB/s ± 5%   +5.19%        (p=0.000 n=18+20)
      IndexByte4K-6           11.4GB/s ± 0%   22.3GB/s ± 0%  +96.45%        (p=0.000 n=17+20)
      IndexByte4M-6           10.9GB/s ± 0%   16.4GB/s ± 1%  +50.17%        (p=0.000 n=20+20)
      IndexByte64M-6          10.8GB/s ± 0%   16.0GB/s ± 1%  +48.19%        (p=0.000 n=19+20)
      IndexBytePortable32-6    436MB/s ± 5%    422MB/s ± 3%   -3.27%        (p=0.000 n=20+19)
      IndexBytePortable4K-6    795MB/s ± 0%    795MB/s ± 0%     ~           (p=0.940 n=17+18)
      IndexBytePortable4M-6    798MB/s ± 0%    799MB/s ± 0%   +0.12%        (p=0.000 n=20+18)
      IndexBytePortable64M-6   798MB/s ± 0%    798MB/s ± 0%   +0.08%        (p=0.011 n=18+20)
      Index32-6               90.9MB/s ± 0%   90.9MB/s ± 0%   -0.00%        (p=0.025 n=20+20)
      Index4K-6               76.1MB/s ± 0%   76.1MB/s ± 0%   +0.03%        (p=0.000 n=14+15)
      Index4M-6               75.7MB/s ± 0%   75.7MB/s ± 0%     ~           (p=0.076 n=20+19)
      Index64M-6              75.7MB/s ± 0%   75.7MB/s ± 0%     ~           (p=0.456 n=20+17)
      IndexEasy32-6            399MB/s ± 0%    399MB/s ± 0%   +0.20%        (p=0.000 n=20+19)
      IndexEasy4K-6           9.60GB/s ± 0%  19.02GB/s ± 0%  +98.19%        (p=0.000 n=20+20)
      IndexEasy4M-6           10.8GB/s ± 0%   16.0GB/s ± 1%  +47.98%        (p=0.000 n=18+20)
      IndexEasy64M-6          10.8GB/s ± 0%   16.0GB/s ± 1%  +48.08%        (p=0.000 n=18+20)
      
      Change-Id: I46075921dde9f3580a89544c0b3a2d8c9181ebc4
      Reviewed-on: https://go-review.googlesource.com/16484Reviewed-by: 's avatarKeith Randall <khr@golang.org>
      Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
      Reviewed-by: 's avatarKlaus Post <klauspost@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      321a4072
    • Keith Randall's avatar
      runtime: teach peephole optimizer that duffcopy clobbers X0 · ffb20631
      Keith Randall authored
      Duffcopy now uses X0, as of 5cf281a9.  Teach the peephole
      optimizer that duffcopy clobbers X0 so that it does not
      rename registers use X0 across the duffcopy instruction.
      
      Fixes #13171
      
      Change-Id: I389cbf1982cb6eb2f51e6152ac96736a8589f085
      Reviewed-on: https://go-review.googlesource.com/16715
      Run-TryBot: Keith Randall <khr@golang.org>
      Reviewed-by: 's avatarMinux Ma <minux@golang.org>
      Reviewed-by: 's avatarIlya Tocar <ilya.tocar@intel.com>
      ffb20631
    • Marcel van Lohuizen's avatar
      doc: updated go1.6 with reflect change for unexported embedded structs · 59bac0be
      Marcel van Lohuizen authored
      Change-Id: I53c196925fb86784b31dea799c27e79574d35fcc
      Reviewed-on: https://go-review.googlesource.com/16304Reviewed-by: 's avatarMarcel van Lohuizen <mpvl@golang.org>
      59bac0be