• Filippo Valsorda's avatar
    crypto/tls: replace custom *block with standard buffers · ab51b1d6
    Filippo Valsorda authored
    The crypto/tls record layer used a custom buffer implementation with its
    own semantics, freelist, and offset management. Replace it all with
    per-task bytes.Buffer, bytes.Reader and byte slices, along with a
    refactor of all the encrypt and decrypt code.
    
    The main quirk of *block was to do a best-effort read past the record
    boundary, so that if a closeNotify was waiting it would be peeked and
    surfaced along with the last Read. Address that with atLeastReader and
    ReadFrom to avoid a useless copy (instead of a LimitReader or CopyN).
    
    There was also an optimization to split blocks along record boundary
    lines without having to copy in and out the data. Replicate that by
    aliasing c.input into consumed c.rawInput (after an in-place decrypt
    operation). This is safe because c.rawInput is not used until c.input is
    drained.
    
    The benchmarks are noisy but look like an improvement across the board,
    which is a nice side effect :)
    
    name                                       old time/op   new time/op   delta
    HandshakeServer/RSA-8                        817µs ± 2%    797µs ± 2%  -2.52%  (p=0.000 n=10+9)
    HandshakeServer/ECDHE-P256-RSA-8             984µs ±11%    897µs ± 0%  -8.89%  (p=0.000 n=10+9)
    HandshakeServer/ECDHE-P256-ECDSA-P256-8      206µs ±10%    199µs ± 3%    ~     (p=0.113 n=10+9)
    HandshakeServer/ECDHE-X25519-ECDSA-P256-8    204µs ± 3%    202µs ± 1%  -1.06%  (p=0.013 n=10+9)
    HandshakeServer/ECDHE-P521-ECDSA-P521-8     15.5ms ± 0%   15.6ms ± 1%    ~     (p=0.095 n=9+10)
    Throughput/MaxPacket/1MB-8                  5.35ms ±19%   5.39ms ±36%    ~     (p=1.000 n=9+10)
    Throughput/MaxPacket/2MB-8                  9.20ms ±15%   8.30ms ± 8%  -9.79%  (p=0.035 n=10+9)
    Throughput/MaxPacket/4MB-8                  13.8ms ± 7%   13.6ms ± 8%    ~     (p=0.315 n=10+10)
    Throughput/MaxPacket/8MB-8                  25.1ms ± 3%   23.2ms ± 2%  -7.66%  (p=0.000 n=10+9)
    Throughput/MaxPacket/16MB-8                 46.9ms ± 1%   43.0ms ± 3%  -8.29%  (p=0.000 n=9+10)
    Throughput/MaxPacket/32MB-8                 88.9ms ± 2%   82.3ms ± 2%  -7.40%  (p=0.000 n=9+9)
    Throughput/MaxPacket/64MB-8                  175ms ± 2%    164ms ± 4%  -6.18%  (p=0.000 n=10+10)
    Throughput/DynamicPacket/1MB-8              5.79ms ±26%   5.82ms ±22%    ~     (p=0.912 n=10+10)
    Throughput/DynamicPacket/2MB-8              9.23ms ±14%   9.50ms ±23%    ~     (p=0.971 n=10+10)
    Throughput/DynamicPacket/4MB-8              14.5ms ±11%   13.8ms ± 6%  -4.66%  (p=0.019 n=10+10)
    Throughput/DynamicPacket/8MB-8              25.6ms ± 4%   23.5ms ± 3%  -8.33%  (p=0.000 n=10+10)
    Throughput/DynamicPacket/16MB-8             47.3ms ± 3%   44.6ms ± 7%  -5.65%  (p=0.000 n=10+10)
    Throughput/DynamicPacket/32MB-8             91.9ms ±14%   85.0ms ± 4%  -7.55%  (p=0.000 n=10+10)
    Throughput/DynamicPacket/64MB-8              177ms ± 2%    168ms ± 4%  -4.97%  (p=0.000 n=8+10)
    Latency/MaxPacket/200kbps-8                  694ms ± 0%    694ms ± 0%    ~     (p=0.315 n=10+9)
    Latency/MaxPacket/500kbps-8                  279ms ± 0%    279ms ± 0%    ~     (p=0.447 n=9+10)
    Latency/MaxPacket/1000kbps-8                 140ms ± 0%    140ms ± 0%    ~     (p=0.661 n=9+10)
    Latency/MaxPacket/2000kbps-8                71.1ms ± 0%   71.1ms ± 0%  +0.05%  (p=0.019 n=9+9)
    Latency/MaxPacket/5000kbps-8                30.4ms ± 7%   30.5ms ± 4%    ~     (p=0.720 n=9+10)
    Latency/DynamicPacket/200kbps-8              134ms ± 0%    134ms ± 0%    ~     (p=0.075 n=10+10)
    Latency/DynamicPacket/500kbps-8             54.8ms ± 0%   54.8ms ± 0%    ~     (p=0.631 n=10+10)
    Latency/DynamicPacket/1000kbps-8            28.5ms ± 0%   28.5ms ± 0%    ~     (p=1.000 n=8+8)
    Latency/DynamicPacket/2000kbps-8            15.7ms ±12%   16.1ms ± 0%    ~     (p=0.109 n=10+7)
    Latency/DynamicPacket/5000kbps-8            8.20ms ±26%   8.17ms ±13%    ~     (p=1.000 n=9+9)
    
    name                                       old speed     new speed     delta
    Throughput/MaxPacket/1MB-8                 193MB/s ±14%  202MB/s ±30%    ~     (p=0.897 n=8+10)
    Throughput/MaxPacket/2MB-8                 230MB/s ±14%  249MB/s ±17%    ~     (p=0.089 n=10+10)
    Throughput/MaxPacket/4MB-8                 304MB/s ± 6%  309MB/s ± 7%    ~     (p=0.315 n=10+10)
    Throughput/MaxPacket/8MB-8                 334MB/s ± 3%  362MB/s ± 2%  +8.29%  (p=0.000 n=10+9)
    Throughput/MaxPacket/16MB-8                358MB/s ± 1%  390MB/s ± 3%  +9.08%  (p=0.000 n=9+10)
    Throughput/MaxPacket/32MB-8                378MB/s ± 2%  408MB/s ± 2%  +8.00%  (p=0.000 n=9+9)
    Throughput/MaxPacket/64MB-8                384MB/s ± 2%  410MB/s ± 4%  +6.61%  (p=0.000 n=10+10)
    Throughput/DynamicPacket/1MB-8             178MB/s ±24%  182MB/s ±24%    ~     (p=0.604 n=9+10)
    Throughput/DynamicPacket/2MB-8             228MB/s ±13%  225MB/s ±20%    ~     (p=0.971 n=10+10)
    Throughput/DynamicPacket/4MB-8             291MB/s ±10%  305MB/s ± 6%  +4.83%  (p=0.019 n=10+10)
    Throughput/DynamicPacket/8MB-8             327MB/s ± 4%  357MB/s ± 3%  +9.08%  (p=0.000 n=10+10)
    Throughput/DynamicPacket/16MB-8            355MB/s ± 3%  376MB/s ± 6%  +6.07%  (p=0.000 n=10+10)
    Throughput/DynamicPacket/32MB-8            366MB/s ±12%  395MB/s ± 4%  +7.91%  (p=0.000 n=10+10)
    Throughput/DynamicPacket/64MB-8            380MB/s ± 2%  400MB/s ± 4%  +5.26%  (p=0.000 n=8+10)
    
    Note that this reduced the buffer for the first read from 1024 to 5+512,
    so it triggered the issue described at #24198 when using a synchronous
    net.Pipe: the first server flight was not being consumed entirely by the
    first read anymore, causing a deadlock as both the client and the server
    were trying to send (the client a reply to the ServerHello, the server
    the rest of the buffer). Fixed by rebasing on top of CL 142817.
    
    Change-Id: Ie31b0a572b2ad37878469877798d5c6a5276f931
    Reviewed-on: https://go-review.googlesource.com/c/142818
    Run-TryBot: Filippo Valsorda <filippo@golang.org>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: 's avatarAdam Langley <agl@golang.org>
    ab51b1d6
Name
Last commit
Last update
..
aes Loading commit data...
cipher Loading commit data...
des Loading commit data...
dsa Loading commit data...
ecdsa Loading commit data...
elliptic Loading commit data...
hmac Loading commit data...
internal Loading commit data...
md5 Loading commit data...
rand Loading commit data...
rc4 Loading commit data...
rsa Loading commit data...
sha1 Loading commit data...
sha256 Loading commit data...
sha512 Loading commit data...
subtle Loading commit data...
tls Loading commit data...
x509 Loading commit data...
crypto.go Loading commit data...
issue21104_test.go Loading commit data...