fanzha02 authored
The current assembler loads large constants from the constant pool. This CL
removes the need for the pool by using MOVZ/MOVN and MOVK to load large
constants.

This CL changes the assembler behavior as follows.

1. go assembly
   1, MOVD $0x1111222233334444, R1
   2, MOVD $0x1111ffff1111ffff, R1

   previous version:
   MOVD 0x9a4, R1 (loads constant from pool).

   optimized version:
   1, MOVD $0x4444, R1; MOVK $(0x3333<<16), R1; MOVK $(0x2222<<32), R1; MOVK $(0x1111<<48), R1.
   2, MOVN $(0xeeee<<16), R1; MOVK $(0x1111<<48), R1.

Add test cases. Binary size comparison and benchmark results are below.

1. Binary size before/after

binary                  size change
pkg/linux_arm64         +25.4KB
pkg/tool/linux_arm64    -2.9KB
go                      -2KB
gofmt                   no change

2. compiler benchmark.

name       old time/op       new time/op       delta
Template     574ms ±21%        577ms ±14%     ~     (p=0.853 n=10+10)
Unicode      327ms ±29%        353ms ±23%     ~     (p=0.360 n=10+8)
GoTypes      1.97s ± 8%        2.04s ±11%     ~     (p=0.143 n=10+10)
Compiler     9.13s ± 9%        9.25s ± 8%     ~     (p=0.684 n=10+10)
SSA          29.2s ± 5%        27.0s ± 4%   -7.40%  (p=0.000 n=10+10)
Flate        402ms ±40%        308ms ± 6%  -23.29%  (p=0.004 n=10+10)
GoParser     470ms ±26%        382ms ±10%  -18.82%  (p=0.000 n=9+10)
Reflect      1.36s ±16%        1.17s ± 7%  -13.92%  (p=0.001 n=9+10)
Tar          561ms ±19%        466ms ±15%  -17.08%  (p=0.000 n=9+10)
XML          745ms ±20%        679ms ±20%     ~     (p=0.123 n=10+10)
StdCmd       35.5s ± 6%        37.2s ± 3%   +4.81%  (p=0.001 n=9+8)

name       old user-time/op  new user-time/op  delta
Template     625ms ±14%        660ms ±18%     ~     (p=0.343 n=10+10)
Unicode      355ms ±10%        373ms ±20%     ~     (p=0.346 n=9+10)
GoTypes      2.39s ± 8%        2.37s ± 5%     ~     (p=0.897 n=10+10)
Compiler     11.1s ± 4%        11.4s ± 2%   +2.63%  (p=0.010 n=10+9)
SSA          35.4s ± 3%        34.9s ± 2%     ~     (p=0.113 n=10+9)
Flate        402ms ±13%        371ms ±30%     ~     (p=0.089 n=10+9)
GoParser     513ms ± 8%        489ms ±24%   -4.76%  (p=0.039 n=9+9)
Reflect      1.52s ±12%        1.41s ± 5%   -7.32%  (p=0.001 n=9+10)
Tar          607ms ±10%        558ms ± 8%   -7.96%  (p=0.009 n=9+10)
XML          828ms ±10%        789ms ±12%     ~     (p=0.059 n=10+10)

name       old text-bytes    new text-bytes    delta
HelloSize    714kB ± 0%        712kB ± 0%   -0.23%  (p=0.000 n=10+10)
CmdGoSize   8.26MB ± 0%       8.25MB ± 0%   -0.14%  (p=0.000 n=10+10)

name       old data-bytes    new data-bytes    delta
HelloSize   10.5kB ± 0%       10.5kB ± 0%     ~     (all equal)
CmdGoSize    258kB ± 0%        258kB ± 0%     ~     (all equal)

name       old bss-bytes     new bss-bytes     delta
HelloSize    125kB ± 0%        125kB ± 0%     ~     (all equal)
CmdGoSize    146kB ± 0%        146kB ± 0%     ~     (all equal)

name       old exe-bytes     new exe-bytes     delta
HelloSize   1.18MB ± 0%       1.18MB ± 0%     ~     (all equal)
CmdGoSize   11.2MB ± 0%       11.2MB ± 0%   -0.13%  (p=0.000 n=10+10)

3. go1 benchmark.

name                   old time/op    new time/op    delta
BinaryTree17             6.60s ±18%     7.36s ±22%    ~     (p=0.222 n=5+5)
Fannkuch11               4.04s ± 0%     4.05s ± 0%    ~     (p=0.421 n=5+5)
FmtFprintfEmpty         91.8ns ±14%    91.2ns ± 9%    ~     (p=0.667 n=5+5)
FmtFprintfString         145ns ± 0%     151ns ± 6%    ~     (p=0.397 n=4+5)
FmtFprintfInt            169ns ± 0%     176ns ± 5%  +4.14%  (p=0.016 n=4+5)
FmtFprintfIntInt         229ns ± 2%     243ns ± 6%    ~     (p=0.143 n=5+5)
FmtFprintfPrefixedInt    343ns ± 0%     350ns ± 3%  +1.92%  (p=0.048 n=5+5)
FmtFprintfFloat          400ns ± 3%     394ns ± 3%    ~     (p=0.063 n=5+5)
FmtManyArgs             1.04µs ± 0%    1.05µs ± 0%  +1.62%  (p=0.029 n=4+4)
GobDecode               13.9ms ± 4%    13.9ms ± 5%    ~     (p=1.000 n=5+5)
GobEncode               10.6ms ± 4%    10.6ms ± 5%    ~     (p=0.421 n=5+5)
Gzip                     567ms ± 1%     563ms ± 4%    ~     (p=0.548 n=5+5)
Gunzip                  60.2ms ± 1%    60.4ms ± 0%    ~     (p=0.056 n=5+5)
HTTPClientServer         114µs ± 4%     108µs ± 7%    ~     (p=0.095 n=5+5)
JSONEncode              18.4ms ± 2%    17.8ms ± 2%  -3.06%  (p=0.016 n=5+5)
JSONDecode               105ms ± 1%     103ms ± 2%    ~     (p=0.056 n=5+5)
Mandelbrot200           5.48ms ± 0%    5.49ms ± 0%    ~     (p=0.841 n=5+5)
GoParse                 6.05ms ± 1%    6.05ms ± 2%    ~     (p=1.000 n=5+5)
RegexpMatchEasy0_32      143ns ± 1%     146ns ± 4%  +2.10%  (p=0.048 n=4+5)
RegexpMatchEasy0_1K      499ns ± 1%     492ns ± 2%    ~     (p=0.079 n=5+5)
RegexpMatchEasy1_32      137ns ± 0%     136ns ± 1%  -0.73%  (p=0.016 n=4+5)
RegexpMatchEasy1_1K      826ns ± 4%     823ns ± 2%    ~     (p=0.841 n=5+5)
RegexpMatchMedium_32     224ns ± 5%     233ns ± 8%    ~     (p=0.119 n=5+5)
RegexpMatchMedium_1K    59.6µs ± 0%    59.3µs ± 1%  -0.66%  (p=0.016 n=4+5)
RegexpMatchHard_32      3.29µs ± 3%    3.26µs ± 1%    ~     (p=0.889 n=5+5)
RegexpMatchHard_1K      98.8µs ± 2%    99.0µs ± 0%    ~     (p=0.690 n=5+5)
Revcomp                  1.02s ± 1%     1.01s ± 1%    ~     (p=0.095 n=5+5)
Template                 135ms ± 5%     131ms ± 1%    ~     (p=0.151 n=5+5)
TimeParse                591ns ± 0%     593ns ± 0%  +0.20%  (p=0.048 n=5+5)
TimeFormat               655ns ± 2%     607ns ± 0%  -7.42%  (p=0.016 n=5+4)
[Geo mean]              93.5µs         93.8µs       +0.23%

name                   old speed      new speed      delta
GobDecode             55.1MB/s ± 4%  55.1MB/s ± 4%    ~     (p=1.000 n=5+5)
GobEncode             72.4MB/s ± 4%  72.3MB/s ± 5%    ~     (p=0.421 n=5+5)
Gzip                  34.2MB/s ± 1%  34.5MB/s ± 4%    ~     (p=0.548 n=5+5)
Gunzip                 322MB/s ± 1%   321MB/s ± 0%    ~     (p=0.056 n=5+5)
JSONEncode             106MB/s ± 2%   109MB/s ± 2%  +3.16%  (p=0.016 n=5+5)
JSONDecode            18.5MB/s ± 1%  18.8MB/s ± 2%    ~     (p=0.056 n=5+5)
GoParse               9.57MB/s ± 1%  9.57MB/s ± 2%    ~     (p=0.952 n=5+5)
RegexpMatchEasy0_32    223MB/s ± 1%   221MB/s ± 0%  -1.10%  (p=0.029 n=4+4)
RegexpMatchEasy0_1K   2.05GB/s ± 1%  2.08GB/s ± 2%    ~     (p=0.095 n=5+5)
RegexpMatchEasy1_32    232MB/s ± 0%   234MB/s ± 1%  +0.76%  (p=0.016 n=4+5)
RegexpMatchEasy1_1K   1.24GB/s ± 4%  1.24GB/s ± 2%    ~     (p=0.841 n=5+5)
RegexpMatchMedium_32  4.45MB/s ± 5%  4.20MB/s ± 1%  -5.63%  (p=0.000 n=5+4)
RegexpMatchMedium_1K  17.2MB/s ± 0%  17.3MB/s ± 1%  +0.66%  (p=0.016 n=4+5)
RegexpMatchHard_32    9.73MB/s ± 3%  9.83MB/s ± 1%    ~     (p=0.889 n=5+5)
RegexpMatchHard_1K    10.4MB/s ± 2%  10.3MB/s ± 0%    ~     (p=0.635 n=5+5)
Revcomp                249MB/s ± 1%   252MB/s ± 1%    ~     (p=0.095 n=5+5)
Template              14.4MB/s ± 4%  14.8MB/s ± 1%    ~     (p=0.151 n=5+5)
[Geo mean]            62.1MB/s       62.3MB/s       +0.34%

Fixes #10108

Change-Id: I79038f3c4c2ff874c136053d1a2b1c8a5a9cfac5
Reviewed-on: https://go-review.googlesource.com/c/118796
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
644ddaa8
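
The decomposition described in the commit message can be illustrated with a small Go sketch. It is not the assembler's actual code: `movSequence` and `imm` are hypothetical helpers, and the rule for choosing between MOVZ and MOVN (use MOVN when more 16-bit halfwords are 0xffff than 0x0000) is an assumption made for illustration. The sketch prints the first instruction as MOVZ, where the commit message spells it MOVD.

```go
package main

import "fmt"

// movSequence is a hypothetical helper (not the assembler's real code) that
// lists instructions building the 64-bit constant c in register reg.
func movSequence(c uint64, reg string) []string {
	// Count the 16-bit halfwords that are all zeros or all ones.
	zeros, ones := 0, 0
	for i := 0; i < 4; i++ {
		switch (c >> uint(16*i)) & 0xffff {
		case 0x0000:
			zeros++
		case 0xffff:
			ones++
		}
	}

	// MOVZ zeroes the halfwords it does not write; MOVN sets them to 0xffff.
	// Assumption: pick whichever leaves more halfwords already correct.
	useMOVN := ones > zeros
	first, skip := "MOVZ", uint64(0x0000)
	if useMOVN {
		first, skip = "MOVN", uint64(0xffff)
	}

	var seq []string
	for i := 0; i < 4; i++ {
		hw := (c >> uint(16*i)) & 0xffff
		if hw == skip {
			continue // the first instruction already produces this halfword
		}
		op, val := "MOVK", hw
		if len(seq) == 0 {
			op = first
			if useMOVN {
				val = ^hw & 0xffff // MOVN writes the bitwise NOT of its operand
			}
		}
		seq = append(seq, fmt.Sprintf("%s %s, %s", op, imm(val, 16*i), reg))
	}
	if len(seq) == 0 {
		// c is 0 or all ones: one MOVZ/MOVN with a zero operand is enough.
		seq = append(seq, fmt.Sprintf("%s $0x0, %s", first, reg))
	}
	return seq
}

// imm formats an immediate in the style used in the commit message.
func imm(v uint64, shift int) string {
	if shift == 0 {
		return fmt.Sprintf("$0x%x", v)
	}
	return fmt.Sprintf("$(0x%x<<%d)", v, shift)
}

func main() {
	for _, c := range []uint64{0x1111222233334444, 0x1111ffff1111ffff} {
		fmt.Printf("%#x:\n", c)
		for _, ins := range movSequence(c, "R1") {
			fmt.Println("\t" + ins)
		}
	}
}
```

For the two constants from the commit message, this yields one MOVZ followed by three MOVKs for 0x1111222233334444, and MOVN $(0xeeee<<16) followed by MOVK $(0x1111<<48) for 0x1111ffff1111ffff. The second case shows why MOVN helps: two of the four halfwords are already 0xffff after the first instruction.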