• Shenghou Ma's avatar
    runtime: disable prefetching on 386 · 0adf6dce
    Shenghou Ma authored
    It doesn't seem to help on modern processors and it makes Go impossible to run
    on Pentium MMX (which is the documented minimum hardware requirement.)
    
    Old is with prefetch, new is w/o. Both are compiled with GO386=sse2.
    Benchmarking is done on Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz.
    
    name                     old time/op    new time/op    delta
    BinaryTree17-4              2.89s ± 2%     2.87s ± 0%    ~     (p=0.061 n=11+10)
    Fannkuch11-4                3.65s ± 0%     3.65s ± 0%    ~     (p=0.365 n=11+11)
    FmtFprintfEmpty-4          52.1ns ± 0%    52.1ns ± 0%    ~      (p=0.065 n=10+9)
    FmtFprintfString-4          168ns ± 0%     167ns ± 0%  -0.48%   (p=0.000 n=8+10)
    FmtFprintfInt-4             167ns ± 0%     167ns ± 1%    ~      (p=0.591 n=9+10)
    FmtFprintfIntInt-4          295ns ± 0%     292ns ± 0%  -0.99%   (p=0.000 n=9+10)
    FmtFprintfPrefixedInt-4     327ns ± 0%     326ns ± 0%  -0.24%  (p=0.007 n=10+10)
    FmtFprintfFloat-4           431ns ± 0%     431ns ± 0%  -0.07%  (p=0.000 n=10+11)
    FmtManyArgs-4              1.13µs ± 0%    1.13µs ± 0%  -0.37%  (p=0.009 n=11+11)
    GobDecode-4                9.36ms ± 1%    9.33ms ± 0%  -0.31%  (p=0.006 n=11+10)
    GobEncode-4                7.38ms ± 1%    7.38ms ± 1%    ~     (p=0.797 n=11+11)
    Gzip-4                      394ms ± 0%     395ms ± 1%    ~     (p=0.519 n=11+11)
    Gunzip-4                   65.4ms ± 0%    65.4ms ± 0%    ~     (p=0.739 n=10+10)
    HTTPClientServer-4         52.4µs ± 1%    52.5µs ± 1%    ~     (p=0.748 n=11+11)
    JSONEncode-4               19.0ms ± 0%    19.0ms ± 0%    ~      (p=0.780 n=9+10)
    JSONDecode-4               59.6ms ± 0%    59.6ms ± 0%    ~      (p=0.720 n=9+10)
    Mandelbrot200-4            4.09ms ± 0%    4.09ms ± 0%    ~      (p=0.295 n=11+9)
    GoParse-4                  3.45ms ± 1%    3.43ms ± 1%  -0.35%  (p=0.040 n=11+11)
    RegexpMatchEasy0_32-4       101ns ± 1%     101ns ± 1%    ~     (p=1.000 n=11+11)
    RegexpMatchEasy0_1K-4       796ns ± 0%     796ns ± 0%    ~      (p=0.954 n=10+8)
    RegexpMatchEasy1_32-4       110ns ± 0%     110ns ± 1%    ~      (p=0.289 n=9+11)
    RegexpMatchEasy1_1K-4       991ns ± 0%     991ns ± 0%    ~      (p=0.784 n=10+8)
    RegexpMatchMedium_32-4      131ns ± 0%     130ns ± 0%  -0.42%   (p=0.004 n=11+9)
    RegexpMatchMedium_1K-4     41.9µs ± 1%    41.6µs ± 0%    ~      (p=0.067 n=11+9)
    RegexpMatchHard_32-4       2.34µs ± 0%    2.34µs ± 0%    ~     (p=0.208 n=11+11)
    RegexpMatchHard_1K-4       70.9µs ± 0%    71.0µs ± 0%    ~      (p=0.968 n=9+10)
    Revcomp-4                   819ms ± 0%     818ms ± 0%    ~     (p=0.251 n=10+11)
    Template-4                 73.9ms ± 0%    73.8ms ± 0%  -0.25%  (p=0.013 n=10+11)
    TimeParse-4                 414ns ± 0%     414ns ± 0%    ~     (p=0.809 n=11+10)
    TimeFormat-4                485ns ± 0%     485ns ± 0%    ~      (p=0.404 n=11+7)
    
    name                     old speed      new speed      delta
    GobDecode-4              82.0MB/s ± 1%  82.3MB/s ± 0%  +0.31%  (p=0.007 n=11+10)
    GobEncode-4               104MB/s ± 1%   104MB/s ± 1%    ~     (p=0.797 n=11+11)
    Gzip-4                   49.2MB/s ± 0%  49.1MB/s ± 1%    ~     (p=0.507 n=11+11)
    Gunzip-4                  297MB/s ± 0%   297MB/s ± 0%    ~     (p=0.670 n=10+10)
    JSONEncode-4              102MB/s ± 0%   102MB/s ± 0%    ~      (p=0.794 n=9+10)
    JSONDecode-4             32.6MB/s ± 0%  32.6MB/s ± 0%    ~       (p=0.334 n=9+9)
    GoParse-4                16.8MB/s ± 1%  16.9MB/s ± 1%    ~     (p=0.052 n=11+11)
    RegexpMatchEasy0_32-4     314MB/s ± 0%   314MB/s ± 1%    ~     (p=0.618 n=11+11)
    RegexpMatchEasy0_1K-4    1.29GB/s ± 0%  1.29GB/s ± 0%    ~     (p=0.315 n=10+10)
    RegexpMatchEasy1_32-4     290MB/s ± 1%   290MB/s ± 1%    ~     (p=0.667 n=10+11)
    RegexpMatchEasy1_1K-4    1.03GB/s ± 0%  1.03GB/s ± 0%    ~      (p=0.829 n=10+8)
    RegexpMatchMedium_32-4   7.63MB/s ± 0%  7.65MB/s ± 0%    ~     (p=0.142 n=11+11)
    RegexpMatchMedium_1K-4   24.4MB/s ± 1%  24.6MB/s ± 0%    ~      (p=0.063 n=11+9)
    RegexpMatchHard_32-4     13.7MB/s ± 0%  13.7MB/s ± 0%    ~     (p=0.302 n=11+11)
    RegexpMatchHard_1K-4     14.4MB/s ± 0%  14.4MB/s ± 0%    ~      (p=0.784 n=9+10)
    Revcomp-4                 310MB/s ± 0%   311MB/s ± 0%    ~     (p=0.243 n=10+11)
    Template-4               26.2MB/s ± 0%  26.3MB/s ± 0%  +0.24%  (p=0.009 n=10+11)
    
    Update #12970.
    
    Change-Id: Id185080687a60c229a5cb2e5220e7ca1b53910e2
    Reviewed-on: https://go-review.googlesource.com/15999Reviewed-by: 's avatarAustin Clements <austin@google.com>
    Reviewed-by: 's avatarDmitry Vyukov <dvyukov@google.com>
    0adf6dce
asm_386.s 35.4 KB