    cmd/6g: treat vardef-initialized fat variables as live at calls · 1afbceb5
    Russ Cox authored
    This CL forces the optimizer to preserve some memory stores
    that would be redundant except that a stack scan due to garbage
    collection or stack copying might look at them during a function call.
    As such, it forces additional memory writes and therefore slows
    down the execution of some programs, especially garbage-heavy
    programs that are already limited by memory bandwidth.
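As a rough illustration of the situation described above (not the CL's actual test case), the sketch below builds a multiword struct of pointers field by field and then makes a call that can trigger a collection while that struct is still being set up. The names `pair` and `build` are hypothetical, and `//go:noinline` is a directive from later Go releases, used here only to keep the call in place.

```go
package main

import (
	"fmt"
	"runtime"
)

// pair is a hypothetical multiword ("fat") value made only of pointers.
type pair struct {
	a, b, c *int
}

// build initializes the named result field by field and then makes a call.
// A stack scan during that call may inspect p's slots, so the stores to
// p.a, p.b and p.c must already be in memory when the call happens, even if
// an optimizer would otherwise consider some of them redundant.
//
//go:noinline
func build() (p pair) {
	x := new(int)
	*x = 42
	p.a = x
	p.b = p.a
	p.c = p.b
	runtime.GC() // a collection here scans this frame
	return
}

func main() {
	p := build()
	fmt.Println(*p.a, *p.b, *p.c)
}
```

The extra stores cost nothing semantically, but they are real memory writes, which is where the slowdown discussed here comes from.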
    
    The slowdown can be as much as 7% for end-to-end benchmarks.
    
    These numbers are from running go1.test -test.benchtime=5s three times,
    taking the best (lowest) ns/op for each benchmark. I am excluding
    benchmarks with time/op < 10us to focus on macro effects.
    All benchmarks are on amd64.
    
    Comparing tip (a27f34c771cb) against this CL on an Intel Core i5 MacBook Pro:
    
    benchmark                          old ns/op      new ns/op      delta
    BenchmarkBinaryTree17              3876500413     3856337341     -0.52%
    BenchmarkFannkuch11                2965104777     2991182127     +0.88%
    BenchmarkGobDecode                 8563026        8788340        +2.63%
    BenchmarkGobEncode                 5050608        5267394        +4.29%
    BenchmarkGzip                      431191816      434168065      +0.69%
    BenchmarkGunzip                    107873523      110563792      +2.49%
    BenchmarkHTTPClientServer          85036          86131          +1.29%
    BenchmarkJSONEncode                22143764       22501647       +1.62%
    BenchmarkJSONDecode                79646916       85658808       +7.55%
    BenchmarkMandelbrot200             4720421        4700108        -0.43%
    BenchmarkGoParse                   4651575        4712247        +1.30%
    BenchmarkRegexpMatchMedium_1K      71986          73490          +2.09%
    BenchmarkRegexpMatchHard_1K        111018         117495         +5.83%
    BenchmarkRevcomp                   648798723      659352759      +1.63%
    BenchmarkTemplate                  112673009      112819078      +0.13%
    
    Comparing tip (a27f34c771cb) against this CL on an Intel Xeon E5520:
    
    BenchmarkBinaryTree17              5461110720     5393104469     -1.25%
    BenchmarkFannkuch11                4314677151     4327177615     +0.29%
    BenchmarkGobDecode                 11065853       11235272       +1.53%
    BenchmarkGobEncode                 6500065        6959837        +7.07%
    BenchmarkGzip                      647478596      671769097      +3.75%
    BenchmarkGunzip                    139348579      141096376      +1.25%
    BenchmarkHTTPClientServer          69376          73610          +6.10%
    BenchmarkJSONEncode                30172320       31796106       +5.38%
    BenchmarkJSONDecode                113704905      114239137      +0.47%
    BenchmarkMandelbrot200             6032730        6003077        -0.49%
    BenchmarkGoParse                   6775251        6405995        -5.45%
    BenchmarkRegexpMatchMedium_1K      111832         113895         +1.84%
    BenchmarkRegexpMatchHard_1K        161112         168420         +4.54%
    BenchmarkRevcomp                   876363406      892319935      +1.82%
    BenchmarkTemplate                  146273096      148998339      +1.86%
    
    Just to get a sense of where we are compared to the previous release,
    here are the same benchmarks comparing Go 1.2 to this CL.
    
    Comparing Go 1.2 against this CL on an Intel Core i5 MacBook Pro:
    
    BenchmarkBinaryTree17              4370077662     3856337341     -11.76%
    BenchmarkFannkuch11                3347052657     2991182127     -10.63%
    BenchmarkGobDecode                 8791384        8788340        -0.03%
    BenchmarkGobEncode                 4968759        5267394        +6.01%
    BenchmarkGzip                      437815669      434168065      -0.83%
    BenchmarkGunzip                    94604099       110563792      +16.87%
    BenchmarkHTTPClientServer          87798          86131          -1.90%
    BenchmarkJSONEncode                22818243       22501647       -1.39%
    BenchmarkJSONDecode                97182444       85658808       -11.86%
    BenchmarkMandelbrot200             4733516        4700108        -0.71%
    BenchmarkGoParse                   5054384        4712247        -6.77%
    BenchmarkRegexpMatchMedium_1K      67612          73490          +8.69%
    BenchmarkRegexpMatchHard_1K        107321         117495         +9.48%
    BenchmarkRevcomp                   733270055      659352759      -10.08%
    BenchmarkTemplate                  109304977      112819078      +3.21%
    
    Comparing Go 1.2 against this CL on an Intel Xeon E5520:
    
    BenchmarkBinaryTree17              5986953594     5393104469     -9.92%
    BenchmarkFannkuch11                4861139174     4327177615     -10.98%
    BenchmarkGobDecode                 11830997       11235272       -5.04%
    BenchmarkGobEncode                 6608722        6959837        +5.31%
    BenchmarkGzip                      661875826      671769097      +1.49%
    BenchmarkGunzip                    138630019      141096376      +1.78%
    BenchmarkHTTPClientServer          71534          73610          +2.90%
    BenchmarkJSONEncode                30393609       31796106       +4.61%
    BenchmarkJSONDecode                139645860      114239137      -18.19%
    BenchmarkMandelbrot200             5988660        6003077        +0.24%
    BenchmarkGoParse                   6974092        6405995        -8.15%
    BenchmarkRegexpMatchMedium_1K      111331         113895         +2.30%
    BenchmarkRegexpMatchHard_1K        165961         168420         +1.48%
    BenchmarkRevcomp                   995049292      892319935      -10.32%
    BenchmarkTemplate                  145623363      148998339      +2.32%
    
    Fixes #8036.
    
    LGTM=khr
    R=golang-codereviews, josharian, khr
    CC=golang-codereviews, iant, r
    https://golang.org/cl/99660044
issue8036.go 674 Bytes