    math: fix sqrt regression on AMD64 · 6e703ae7
    Ilya Tocar authored
    1.7 introduced a significant regression compared to 1.6:
    
    SqrtIndirect-4  2.32ns ± 0%  7.86ns ± 0%  +238.79%        (p=0.000 n=20+18)
    
    This is caused by SQRTSD preserving the upper part of its destination register,
    which introduces a dependency on the previous value of X0.
    In 1.6 the benchmark loop didn't use X0 immediately after the call:
    
    callq  *%rbx
    movsd  0x8(%rsp),%xmm2
    movsd  0x20(%rsp),%xmm1
    addsd  %xmm2,%xmm1
    mov    0x18(%rsp),%rax
    inc    %rax
    jmp    loop
    
    In 1.7, however, xmm0 is used just after the call:
    
    callq  *%rbx
    mov    0x10(%rsp),%rcx
    lea    0x1(%rcx),%rax
    movsd  0x8(%rsp),%xmm0
    movsd  0x18(%rsp),%xmm1
    
    I've verified that the dependency is the cause by inserting
    XORPS X0,X0 at the beginning of math.Sqrt, which brings performance back to the 1.6 level.
    
    Splitting SQRTSD mem,reg into:
    MOVSD mem,reg
    SQRTSD reg,reg
    
    removes the dependency, because MOVSD (load version)
    doesn't need to preserve the upper part of the register,
    and the reg,reg operation is handled by the register renamer in the CPU.
    
    As a result of this change, the regression is gone:
    SqrtIndirect-4  7.86ns ± 0%  2.33ns ± 0%  -70.36%  (p=0.000 n=18+17)
    
    This also removes the old Sqrt benchmarks in favor of benchmarks measuring latency.
    Only SqrtIndirect is kept, to show the impact of this patch.
    
    Change-Id: Ic7eebe8866445adff5bc38192fa8d64c9a6b8872
    Reviewed-on: https://go-review.googlesource.com/28392
    Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
    Reviewed-by: Keith Randall <khr@golang.org>
Name
archive
bufio
builtin
bytes
cmd
compress
container
context
crypto
database/sql
debug
encoding
errors
expvar
flag
fmt
go
hash
html
image
index/suffixarray
internal
io
log
math
mime
net
os
path
reflect
regexp
runtime
sort
strconv
strings
sync
syscall
testing
text
time
unicode
unsafe
vendor/golang_org/x/net
Make.dist
all.bash
all.bat
all.rc
androidtest.bash
bootstrap.bash
buildall.bash
clean.bash
clean.bat
clean.rc
cmp.bash
iostest.bash
make.bash
make.bat
make.rc
naclmake.bash
nacltest.bash
race.bash
race.bat
run.bash
run.bat
run.rc