    runtime: use custom thunks for race calls instead of cgo · a1695d2e
    Dmitriy Vyukov authored
    Implement custom assembly thunks for hot race calls (memory accesses and function entry/exit).
    The thunks extract the caller pc, verify that the address is in the heap or in global data, and switch to the g0 stack.
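
The address check can be pictured with a small Go model. This is only an illustrative sketch, not the assembly from this change: the `addrRange` type, the `shouldReport` helper, the half-open bounds, and the concrete addresses are all assumptions made for the example.

```go
package main

import "fmt"

// addrRange is a hypothetical half-open interval [start, end) of addresses.
// The real thunks compare against runtime-provided bounds in assembly.
type addrRange struct {
	start, end uint64
}

// contains reports whether addr lies inside the interval.
func (r addrRange) contains(addr uint64) bool {
	return addr >= r.start && addr < r.end
}

// shouldReport models the filter described above: an access is forwarded
// to the race detector only if it touches the heap or global data.
// Anything else (for example, a stack address) is skipped.
func shouldReport(addr uint64, heap, globals addrRange) bool {
	return heap.contains(addr) || globals.contains(addr)
}

func main() {
	// Made-up ranges, for illustration only.
	heap := addrRange{start: 0xc000000000, end: 0xc100000000}
	globals := addrRange{start: 0x500000, end: 0x600000}

	for _, addr := range []uint64{0xc000001000, 0x500010, 0x7ffe0000} {
		fmt.Printf("%#x tracked=%v\n", addr, shouldReport(addr, heap, globals))
	}
}
```

Filtering this early keeps the common "not interesting" case cheap, since such accesses never need the stack switch or the call into the race runtime.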
    
    Before:
    ok  	regexp	3.692s
    ok  	compress/bzip2	9.461s
    ok  	encoding/json	6.380s
    After:
    ok  	regexp	2.229s (-40%)
    ok  	compress/bzip2	4.703s (-50%)
    ok  	encoding/json	3.629s (-43%)
    
    For comparison, normal non-race build:
    ok  	regexp	0.348s
    ok  	compress/bzip2	0.304s
    ok  	encoding/json	0.661s
    Race build:
    ok  	regexp	2.229s (+540%)
    ok  	compress/bzip2	4.703s (+1447%)
    ok  	encoding/json	3.629s (+449%)
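
The percentages above follow from the quoted timings. A minimal sketch of the arithmetic, assuming the usual relative-change formula (the `percentChange` helper and the `timing` struct are names invented for this example):

```go
package main

import "fmt"

// percentChange returns the relative change from base to value, in percent.
// A negative result means value is smaller (faster) than base.
func percentChange(base, value float64) float64 {
	return (value - base) / base * 100
}

// timing holds the wall-clock seconds quoted in the commit message.
type timing struct {
	pkg        string
	nonRace    float64 // normal build
	raceBefore float64 // race build before this change
	raceAfter  float64 // race build after this change
}

func main() {
	timings := []timing{
		{"regexp", 0.348, 3.692, 2.229},
		{"compress/bzip2", 0.304, 9.461, 4.703},
		{"encoding/json", 0.661, 6.380, 3.629},
	}
	for _, t := range timings {
		fmt.Printf("%-15s vs old race build: %+.1f%%  vs non-race build: %+.1f%%\n",
			t.pkg,
			percentChange(t.raceBefore, t.raceAfter),
			percentChange(t.nonRace, t.raceAfter))
	}
}
```

The computed values agree with the figures in the message to within about one percentage point; the message does not state how its figures were rounded.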
    
    Also removes some race-related special cases from cgocall and the scheduler.
    In the long term, this will make it possible to remove the cyclic runtime/race dependency on cmd/cgo.
    
    Fixes #4249.
    Fixes #7460.
    Update #6508
    Update #6688
    
    R=iant, rsc, bradfitz
    CC=golang-codereviews
    https://golang.org/cl/55100044
race_amd64.s 7.76 KB