    runtime: faster memclr on x86. · da7cf0ba
    Keith Randall authored
    Use explicit SSE writes instead of REP STOSQ.
    
    benchmark               old ns/op    new ns/op    delta
    BenchmarkMemclr5               22            5  -73.62%
    BenchmarkMemclr16              27            5  -78.49%
    BenchmarkMemclr64              28            6  -76.43%
    BenchmarkMemclr256             34            8  -74.94%
    BenchmarkMemclr4096           112           84  -24.73%
    BenchmarkMemclr65536         1902         1920   +0.95%
    
    LGTM=dvyukov
    R=golang-codereviews, dvyukov
    CC=golang-codereviews
    https://golang.org/cl/60090044
asm_amd64.s 30.4 KB