go.net/publicsuffix: tighten the encoding from 8 bytes per node to 4.

On the full list (running gen.go with -subset=false):

Before, there were 6086 nodes (at 8 bytes per node).
After, there are 6086 nodes (at 4 bytes per node) plus 354 children
entries (at 4 bytes per entry). The difference is 22928 bytes.

In comparison, the (crushed) text is 21082 bytes, and for the curious,
the longest label is 36 bytes: "xn--correios-e-telecomunicaes-ghc29a".

All 32 bits in the nodes table are used, but there's wiggle room to
accommodate future changes to effective_tld_names.dat:

The largest children index is 353 (in 9 bits, so max is 511).
The largest node type is 2 (in 2 bits, so max is 3).
The largest text offset is 21080 (in 15 bits, so max is 32767).
The largest text length is 36 (in 6 bits, so max is 63).

benchmark                old ns/op    new ns/op    delta
BenchmarkPublicSuffix        19948        19744   -1.02%

R=dr.volker.dobler
CC=golang-dev
https://golang.org/cl/6999045
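
For illustration, here is a minimal sketch of how four fields of 9, 2, 15 and 6 bits can share one uint32, along with the size arithmetic quoted above. The field widths come from this message. The order of the fields within the word, and the names node, pack, unpack and savedBytes, are assumptions made for this sketch and may not match what gen.go actually emits.

```go
package main

import "fmt"

// Field widths as stated in the commit message (9 + 2 + 15 + 6 = 32).
// The order in which they are laid out is an assumption of this sketch.
const (
	childrenBits   = 9
	nodeTypeBits   = 2
	textOffsetBits = 15
	textLengthBits = 6
)

// node is a hypothetical unpacked view of one nodes-table entry.
type node struct {
	children   uint32 // index into the children table
	nodeType   uint32
	textOffset uint32 // offset of the label in the crushed text
	textLength uint32 // length of the label in bytes
}

// pack stores the four fields in a single 32-bit word, most significant
// field first. Callers must keep each field within its bit width.
func pack(n node) uint32 {
	v := n.children
	v = v<<nodeTypeBits | n.nodeType
	v = v<<textOffsetBits | n.textOffset
	v = v<<textLengthBits | n.textLength
	return v
}

// unpack reverses pack by masking and shifting from the low bits up.
func unpack(v uint32) node {
	var n node
	n.textLength = v & (1<<textLengthBits - 1)
	v >>= textLengthBits
	n.textOffset = v & (1<<textOffsetBits - 1)
	v >>= textOffsetBits
	n.nodeType = v & (1<<nodeTypeBits - 1)
	v >>= nodeTypeBits
	n.children = v & (1<<childrenBits - 1)
	return n
}

// savedBytes compares 8 bytes per node against 4 bytes per node plus
// 4 bytes per children entry.
func savedBytes(nodes, childrenEntries int) int {
	before := nodes * 8
	after := nodes*4 + childrenEntries*4
	return before - after
}

func main() {
	// The largest values reported in the commit message.
	n := node{children: 353, nodeType: 2, textOffset: 21080, textLength: 36}
	fmt.Println(unpack(pack(n)) == n)
	fmt.Println(savedBytes(6086, 354)) // 22928
}
```

Because each field has headroom (for example 353 of 511 children indexes), the list can grow for a while before any width has to change.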