    runtime: amd64, use 4-byte ops for memmove of 4 bytes · a96e117a
    Keith Randall authored
    memmove used to use two 2-byte load/store pairs to move 4 bytes.
    When the result was then loaded with a single 4-byte load, it caused
    a store-to-load forwarding stall.  To avoid the stall,
    special-case memmove to use 4-byte ops for the 4-byte copy case.
    
    We already have a special case for 8-byte copies.
    386 already specializes 4-byte copies.
    I'll do 2-byte copies also, but not for 1.8.
    
    benchmark                 old ns/op     new ns/op     delta
    BenchmarkIssue18740-8     7567          4799          -36.58%
    
    3-byte copies get a bit slower.  Other copies are unchanged.
    name         old time/op   new time/op   delta
    Memmove/3-8   4.76ns ± 5%   5.26ns ± 3%  +10.50%  (p=0.000 n=10+10)
    
    Fixes #18740
    
    Change-Id: Iec82cbac0ecfee80fa3c8fc83828f9a1819c3c74
    Reviewed-on: https://go-review.googlesource.com/35567
    Run-TryBot: Keith Randall <khr@golang.org>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: David Chase <drchase@google.com>
Name
.github
api
doc
lib/time
misc
src
test
.gitattributes
.gitignore
AUTHORS
CONTRIBUTING.md
CONTRIBUTORS
LICENSE
PATENTS
README.md
favicon.ico
robots.txt