    all: make Unicode surrogate halves illegal as UTF-8 · c48b77b1
    Rob Pike authored
    Surrogate halves are part of UTF-16 and should never appear in UTF-8.
    (The rune that two combined halves represent in UTF-16 should
    be encoded directly.)
    
    Encoding: encode as RuneError.
    Decoding: convert to RuneError, consume one byte.
    
    This requires changing:
            package unicode/utf8
            runtime for range over string
    Also added utf8.ValidRune and fixed bug in utf8.RuneLen.
    
    Fixes #3927.
    
    R=golang-dev, rsc, bsiegert
    CC=golang-dev
    https://golang.org/cl/6458099
utf8.go 10.3 KB