    cmd/gc: simplify compiled code for explicit zeroing · 3c40ee0f
    Russ Cox authored
    Among other things, *x = T{} does not need a write barrier.
    The changes here avoid an unnecessary copy even when
    no pointers are involved, so it may have larger effects.
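For readers unfamiliar with the idiom, here is a minimal sketch of the explicit-zeroing form the message refers to. The `node` type and `reset` helper are hypothetical names made up for illustration; they do not come from the CL. The sketch only shows the source-level pattern, not the code the compiler generates for it.

```go
package main

import "fmt"

// node is a hypothetical type used only for illustration.
// It mixes a pointer field with plain data fields.
type node struct {
	next *node
	key  int
	buf  [4]uint64
}

// reset clears *n in place with the explicit-zeroing form
// discussed in the commit message: *x = T{}.
func reset(n *node) {
	*n = node{}
}

func main() {
	n := &node{key: 7}
	n.next = n
	n.buf[2] = 42

	reset(n)

	// Every field is back to its zero value.
	fmt.Println(n.next == nil, n.key, n.buf) // prints: true 0 [0 0 0 0]
}
```

Since the assignment stores only zero values (including a nil pointer), it is the kind of statement the message says needs neither a write barrier nor an intermediate copy.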
    
    In 6g and 8g, avoid manually repeated STOSQ in favor of
    writing explicit MOVs, under the theory that the MOVs
    should have fewer dependencies and pipeline better.
    
    Benchmarks compare best of 5 on a 2012 MacBook Pro Core i5
    with TurboBoost disabled. Most improvements can be explained
    by the changes in this CL.
    
    The effect in Revcomp is real but harder to explain: none of
    the instructions in the inner loop changed. I suspect loop
    alignment but really have no idea.
    
    benchmark                       old ns/op   new ns/op   delta
    BenchmarkBinaryTree17           3809027371  3819907076  +0.29%
    BenchmarkFannkuch11             3607547556  3686983012  +2.20%
    BenchmarkFmtFprintfEmpty        118         103         -12.71%
    BenchmarkFmtFprintfString       289         277         -4.15%
    BenchmarkFmtFprintfInt          304         290         -4.61%
    BenchmarkFmtFprintfIntInt       507         458         -9.66%
    BenchmarkFmtFprintfPrefixedInt  425         408         -4.00%
    BenchmarkFmtFprintfFloat        555         555         +0.00%
    BenchmarkFmtManyArgs            1835        1733        -5.56%
    BenchmarkGobDecode              14738209    14639331    -0.67%
    BenchmarkGobEncode              14239039    13703571    -3.76%
    BenchmarkGzip                   538211054   538701315   +0.09%
    BenchmarkGunzip                 135430877   134818459   -0.45%
    BenchmarkHTTPClientServer       116488      116618      +0.11%
    BenchmarkJSONEncode             28923406    29294334    +1.28%
    BenchmarkJSONDecode             105779820   104289543   -1.41%
    BenchmarkMandelbrot200          5791758     5771964     -0.34%
    BenchmarkGoParse                5376642     5310943     -1.22%
    BenchmarkRegexpMatchEasy0_32    195         190         -2.56%
    BenchmarkRegexpMatchEasy0_1K    477         455         -4.61%
    BenchmarkRegexpMatchEasy1_32    170         165         -2.94%
    BenchmarkRegexpMatchEasy1_1K    1410        1394        -1.13%
    BenchmarkRegexpMatchMedium_32   336         329         -2.08%
    BenchmarkRegexpMatchMedium_1K   108979      106328      -2.43%
    BenchmarkRegexpMatchHard_32     5854        5821        -0.56%
    BenchmarkRegexpMatchHard_1K     185089      182838      -1.22%
    BenchmarkRevcomp                834920364   780202624   -6.55%
    BenchmarkTemplate               137046937   129728756   -5.34%
    BenchmarkTimeParse              600         594         -1.00%
    BenchmarkTimeFormat             559         539         -3.58%
    
    LGTM=r
    R=r
    CC=golang-codereviews, iant, khr, rlh
    https://golang.org/cl/157910047
mparith2.c 9.35 KB