    internal/bytealg: optimize Equal on arm64 · de28555c
    erifan01 authored
    Currently the 16-byte loop chunk16_loop is implemented with the NEON instructions LD1, VMOV and VCMEQ.
    Implementing this loop with the scalar instructions LDP and CMP instead reduces the number of clock cycles.
    For strings between 4 and 15 bytes long, load the last 8 or 4 bytes at a time to reduce the number
    of comparisons.
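
The short-string idea above can be sketched in portable Go. This is only an illustration of the overlapping-load technique, not the arm64 assembly from this change: `equalShort` is a hypothetical helper name, and inputs of 16 bytes or more simply fall back to `bytes.Equal` here instead of the LDP/CMP loop. For a length n between 8 and 15, comparing the first 8 bytes and the last 8 bytes covers every byte with two comparisons, because the two windows overlap; lengths 4 to 7 do the same with 4-byte windows.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// equalShort is a hypothetical helper that illustrates overlapping loads.
// It is a sketch of the idea, not the assembly implemented in this commit.
func equalShort(a, b []byte) bool {
	if len(a) != len(b) {
		return false
	}
	n := len(a)
	switch {
	case n >= 16:
		// Assumption: long inputs are out of scope for this sketch.
		return bytes.Equal(a, b)
	case n >= 8:
		// 8..15 bytes: compare the first 8 and the last 8 bytes.
		// The two windows overlap whenever n < 16.
		return binary.LittleEndian.Uint64(a) == binary.LittleEndian.Uint64(b) &&
			binary.LittleEndian.Uint64(a[n-8:]) == binary.LittleEndian.Uint64(b[n-8:])
	case n >= 4:
		// 4..7 bytes: compare the first 4 and the last 4 bytes.
		return binary.LittleEndian.Uint32(a) == binary.LittleEndian.Uint32(b) &&
			binary.LittleEndian.Uint32(a[n-4:]) == binary.LittleEndian.Uint32(b[n-4:])
	default:
		// 0..3 bytes: compare byte by byte.
		for i := range a {
			if a[i] != b[i] {
				return false
			}
		}
		return true
	}
}

func main() {
	fmt.Println(equalShort([]byte("hello, gopher"), []byte("hello, gopher")))
	fmt.Println(equalShort([]byte("hello, gopher"), []byte("hello, gophar")))
}
```

Two wide comparisons replace a chain of 8-, 4-, 2- and 1-byte steps, which is why the 6-, 9- and 15-byte cases are the ones expected to benefit most.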
    
    Benchmarks:
    name                 old time/op    new time/op    delta
    Equal/0-8              5.51ns ± 0%    5.84ns ±14%     ~     (p=0.246 n=7+8)
    Equal/1-8              10.5ns ± 0%    10.5ns ± 0%     ~     (all equal)
    Equal/6-8              14.0ns ± 0%    12.5ns ± 0%  -10.71%  (p=0.000 n=8+8)
    Equal/9-8              13.5ns ± 0%    12.5ns ± 0%   -7.41%  (p=0.000 n=8+8)
    Equal/15-8             15.5ns ± 0%    12.5ns ± 0%  -19.35%  (p=0.000 n=8+8)
    Equal/16-8             14.0ns ± 0%    13.0ns ± 0%   -7.14%  (p=0.000 n=8+8)
    Equal/20-8             16.5ns ± 0%    16.0ns ± 0%   -3.03%  (p=0.000 n=8+8)
    Equal/32-8             16.5ns ± 0%    15.3ns ± 0%   -7.27%  (p=0.000 n=8+8)
    Equal/4K-8              552ns ± 0%     553ns ± 0%     ~     (p=0.315 n=8+8)
    Equal/4M-8             1.13ms ±23%    1.20ms ±27%     ~     (p=0.442 n=8+8)
    Equal/64M-8            32.9ms ± 0%    32.6ms ± 0%   -1.15%  (p=0.000 n=8+8)
    CompareBytesEqual-8    12.0ns ± 0%    12.0ns ± 0%     ~     (all equal)
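
For reading the delta column: each value is the relative change from the old time to the new time, so a negative number means the new code is faster. A minimal sketch of that arithmetic follows; `delta` is a hypothetical helper, and because the table shows rounded times, recomputing from the displayed values may not reproduce every row exactly.

```go
package main

import "fmt"

// delta returns the relative change from oldT to newT as a percentage.
// Hypothetical helper for illustration; negative means newT is faster.
func delta(oldT, newT float64) float64 {
	return (newT - oldT) / oldT * 100
}

func main() {
	// Equal/6-8 row: 14.0ns before, 12.5ns after.
	fmt.Printf("%+.2f%%\n", delta(14.0, 12.5)) // prints "-10.71%"
}
```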
    
    Change-Id: If317ecdcc98e31883d37fd7d42b113b548c5bd2a
    Reviewed-on: https://go-review.googlesource.com/112496
    Reviewed-by: Cherry Zhang <cherryyz@google.com>
    Run-TryBot: Cherry Zhang <cherryyz@google.com>
Name
.github
api
doc
lib/time
misc
src
test
.gitattributes
.gitignore
AUTHORS
CONTRIBUTING.md
CONTRIBUTORS
LICENSE
PATENTS
README.md
favicon.ico
robots.txt