    internal/bytealg: implement bytes.Count in asm for ppc64x · a0fad982
    Lynn Boger authored
    This adds an asm implementation of the Count function for ppc64x.
    The generic Go code processes one byte at a time, which is
    especially inefficient on ppc64x, so the asm implementation is a
    significant improvement.
    
    bytes:
    name               old time/op   new time/op    delta
    CountSingle/10-8    23.1ns ± 0%    18.6ns ± 0%    -19.48%  (p=1.000 n=1+1)
    CountSingle/32-8    60.4ns ± 0%    19.0ns ± 0%    -68.54%  (p=1.000 n=1+1)
    CountSingle/4K-8    7.29µs ± 0%    0.45µs ± 0%    -93.80%  (p=1.000 n=1+1)
    CountSingle/4M-8    7.49ms ± 0%    0.45ms ± 0%    -93.97%  (p=1.000 n=1+1)
    CountSingle/64M-8    127ms ± 0%       9ms ± 0%    -92.53%  (p=1.000 n=1+1)
    
    html:
    name              old time/op  new time/op  delta
    Escape-8          57.5µs ± 0%  36.1µs ± 0%  -37.13%  (p=1.000 n=1+1)
    EscapeNone-8      20.0µs ± 0%   2.0µs ± 0%  -90.14%  (p=1.000 n=1+1)
    
    Change-Id: Iadbf422c0e9a37b47d2d95fb8c778420f3aabb58
    Reviewed-on: https://go-review.googlesource.com/131695
    Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: Michael Munday <mike.munday@ibm.com>
count_ppc64x.s 2.71 KB