Mike Strosaker authored
Adds an assembly implementation of sha512.block for ppc64le to improve its
performance. This implementation is largely based on the original amd64
implementation, unrolling the 80 iterations of the inner loop.

Fixes #17660

benchmark               old ns/op     new ns/op     delta
BenchmarkHash8Bytes     1715          1133          -33.94%
BenchmarkHash1K         10098         5513          -45.41%
BenchmarkHash8K         68004         35278         -48.12%

benchmark               old MB/s     new MB/s     speedup
BenchmarkHash8Bytes     4.66         7.06         1.52x
BenchmarkHash1K         101.40       185.72       1.83x
BenchmarkHash8K         120.46       232.21       1.93x

Change-Id: Ifd55a49a24cb159b3a09a8e928c3f37727aca103
Reviewed-on: https://go-review.googlesource.com/32320
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
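
For readers checking the figures above, the following is a minimal sketch of how the delta, MB/s, and speedup columns relate to the ns/op measurements. It is not part of the change itself. The helper names (deltaPercent, throughputMBps, speedup) are hypothetical, and the input sizes of 8, 1024, and 8192 bytes are assumptions inferred from the benchmark names. Because the reported ns/op values are rounded, recomputed figures may differ from the listed ones in the last digit.

```go
package main

import "fmt"

// deltaPercent returns the relative change in ns/op as a percentage.
// Negative values mean the new code is faster.
func deltaPercent(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

// throughputMBps converts bytes processed per operation and ns/op into
// MB/s, taking 1 MB as 1e6 bytes: bytes/ns * 1e9 / 1e6 = bytes/ns * 1e3.
func throughputMBps(bytesPerOp, nsPerOp float64) float64 {
	return bytesPerOp / nsPerOp * 1e3
}

// speedup returns the ratio of new throughput to old throughput.
func speedup(oldMBps, newMBps float64) float64 {
	return newMBps / oldMBps
}

func main() {
	// Assumed input sizes (hypothetical, inferred from the benchmark names).
	rows := []struct {
		name         string
		bytes        float64
		oldNs, newNs float64
	}{
		{"BenchmarkHash8Bytes", 8, 1715, 1133},
		{"BenchmarkHash1K", 1024, 10098, 5513},
		{"BenchmarkHash8K", 8192, 68004, 35278},
	}
	for _, r := range rows {
		oldMB := throughputMBps(r.bytes, r.oldNs)
		newMB := throughputMBps(r.bytes, r.newNs)
		fmt.Printf("%-20s delta %.2f%%  %.2f -> %.2f MB/s  speedup %.2fx\n",
			r.name, deltaPercent(r.oldNs, r.newNs), oldMB, newMB, speedup(oldMB, newMB))
	}
}
```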
854ae03d