• Lynn Boger's avatar
    runtime: improve performance of memclr, memmove on ppc64x · aa9bcea3
    Lynn Boger authored
    This improves the asm implementations for memmove and memclr on
    ppc64x through use of vsx loads and stores when size is >= 32 bytes.
    For memclr, dcbz is used when the size is >= 512 and aligned to 128.
    
    Memclr/64       13.3ns ± 0%     10.7ns ± 0%   -19.55%  (p=0.000 n=8+7)
    Memclr/96       14.9ns ± 0%     11.4ns ± 0%   -23.49%  (p=0.000 n=8+8)
    Memclr/128      16.3ns ± 0%     12.3ns ± 0%   -24.54%  (p=0.000 n=8+8)
    Memclr/160      17.3ns ± 0%     13.0ns ± 0%   -24.86%  (p=0.000 n=8+8)
    Memclr/256      20.0ns ± 0%     15.3ns ± 0%   -23.62%  (p=0.000 n=8+8)
    Memclr/512      34.2ns ± 0%     10.2ns ± 0%   -70.20%  (p=0.000 n=8+8)
    Memclr/4096      178ns ± 0%       23ns ± 0%   -87.13%  (p=0.000 n=8+8)
    Memclr/65536    2.67µs ± 0%     0.30µs ± 0%   -88.89%  (p=0.000 n=7+8)
    Memclr/1M       43.2µs ± 0%     10.0µs ± 0%   -76.85%  (p=0.000 n=8+8)
    Memclr/4M        173µs ± 0%       40µs ± 0%   -76.88%  (p=0.000 n=8+8)
    Memclr/8M        349µs ± 0%       82µs ± 0%   -76.58%  (p=0.000 n=8+8)
    Memclr/16M       701µs ± 7%      672µs ± 0%    -4.05%  (p=0.040 n=8+7)
    Memclr/64M      2.70ms ± 0%     2.67ms ± 0%    -0.96%  (p=0.000 n=8+7)
    
    Memmove/32      6.59ns ± 0%    5.84ns ± 0%  -11.34%  (p=0.029 n=4+4)
    Memmove/64      7.91ns ± 0%    6.97ns ± 0%  -11.92%  (p=0.029 n=4+4)
    Memmove/128     10.5ns ± 0%     8.8ns ± 0%  -16.24%  (p=0.029 n=4+4)
    Memmove/256     21.0ns ± 0%    12.9ns ± 0%  -38.57%  (p=0.029 n=4+4)
    Memmove/512     28.4ns ± 0%    26.2ns ± 0%   -7.75%  (p=0.029 n=4+4)
    Memmove/1024    48.2ns ± 1%    39.4ns ± 0%  -18.26%  (p=0.029 n=4+4)
    Memmove/2048    85.4ns ± 0%    69.0ns ± 0%  -19.20%  (p=0.029 n=4+4)
    Memmove/4096     159ns ± 0%     128ns ± 0%  -19.50%  (p=0.029 n=4+4)
    
    Change-Id: I8c1adf88790845bf31444a15249456006eb5bf8b
    Reviewed-on: https://go-review.googlesource.com/c/141217
    Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: 's avatarMichael Munday <mike.munday@ibm.com>
    aa9bcea3
memclr_ppc64x.s 4.2 KB