-
Iskander Sharipov authored
Rationale: small buffer optimization does not work and it has made things slower since 2014. Until we can make it work, we should prefer simpler code that also turns out to be more efficient. With this change, it's possible to use NewBuffer(make([]byte, 0, bootstrapSize)) to get the desired stack-allocated initial buffer since escape analysis can prove the created slice to be non-escaping. New implementation key points: - Zero value bytes.Buffer performs better than before - You can have a truly stack-allocated buffer, and it's not even limited to 64 bytes - The unsafe.Sizeof(bytes.Buffer{}) is reduced significantly - Empty writes don't cause allocations Buffer benchmarks from bytes package: name old time/op new time/op delta ReadString-8 9.20µs ± 1% 9.22µs ± 1% ~ (p=0.148 n=10+10) WriteByte-8 28.1µs ± 0% 26.2µs ± 0% -6.78% (p=0.000 n=10+10) WriteRune-8 64.9µs ± 0% 65.0µs ± 0% +0.16% (p=0.000 n=10+10) BufferNotEmptyWriteRead-8 469µs ± 0% 461µs ± 0% -1.76% (p=0.000 n=9+10) BufferFullSmallReads-8 108µs ± 0% 108µs ± 0% -0.21% (p=0.000 n=10+10) name old speed new speed delta ReadString-8 3.56GB/s ± 1% 3.55GB/s ± 1% ~ (p=0.165 n=10+10) WriteByte-8 146MB/s ± 0% 156MB/s ± 0% +7.26% (p=0.000 n=9+10) WriteRune-8 189MB/s ± 0% 189MB/s ± 0% -0.16% (p=0.000 n=10+10) name old alloc/op new alloc/op delta ReadString-8 32.8kB ± 0% 32.8kB ± 0% ~ (all equal) WriteByte-8 0.00B 0.00B ~ (all equal) WriteRune-8 0.00B 0.00B ~ (all equal) BufferNotEmptyWriteRead-8 4.72kB ± 0% 4.67kB ± 0% -1.02% (p=0.000 n=10+10) BufferFullSmallReads-8 3.44kB ± 0% 3.33kB ± 0% -3.26% (p=0.000 n=10+10) name old allocs/op new allocs/op delta ReadString-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) WriteByte-8 0.00 0.00 ~ (all equal) WriteRune-8 0.00 0.00 ~ (all equal) BufferNotEmptyWriteRead-8 3.00 ± 0% 3.00 ± 0% ~ (all equal) BufferFullSmallReads-8 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.000 n=10+10) The most notable thing in go1 benchmarks is reduced allocs in HTTPClientServer (-1 alloc): HTTPClientServer-8 64.0 ± 0% 63.0 ± 0% -1.56% (p=0.000 n=10+10) For more explanations and benchmarks see the referenced issue. Updates #7921 Change-Id: Ica0bf85e1b70fb4f5dc4f6a61045e2cf4ef72aa3 Reviewed-on: https://go-review.googlesource.com/133715Reviewed-by: Martin Möhrmann <moehrmann@google.com> Reviewed-by: Robert Griesemer <gri@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
9c2be4c2