    archive/zip: add FileHeader.NonUTF8 field · 4fcc8359
    Joe Tsai authored
    The NonUTF8 field provides users with a way to explicitly tell the
    ZIP writer to avoid setting the UTF-8 flag.
    This is necessary because many readers still do not support UTF-8
    and use the local system encoding instead.
    
    Thus, even though character encodings other than CP-437 and UTF-8
    are not officially supported by the ZIP specification, pragmatically
    the world has permitted use of them.
    
    When a non-standard encoding is used, it is the user's responsibility
    to ensure that the target system is expecting the encoding used
    (e.g., producing a ZIP file you know is used on a Chinese version of Windows).
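
A minimal sketch of how a caller might use the new field, assuming the name bytes have already been encoded in the target encoding. The helpers `writeLegacyZip` and `utf8FlagSet` are hypothetical names introduced for illustration, and the example name is the Shift-JIS byte sequence for two Japanese characters. Only `FileHeader.NonUTF8` comes from this change.

```go
package main

import (
	"archive/zip"
	"bytes"
	"errors"
	"fmt"
	"log"
)

// writeLegacyZip (hypothetical helper) writes a one-entry archive whose name
// is already encoded by the caller in some non-UTF-8 encoding.
func writeLegacyZip(rawName string, data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	fh := &zip.FileHeader{
		Name:    rawName,
		Method:  zip.Deflate,
		NonUTF8: true, // ask the writer not to set the UTF-8 flag
	}
	w, err := zw.CreateHeader(fh)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// utf8FlagSet (hypothetical helper) reports whether bit 11 (0x800) of the
// general-purpose flags is set on the first entry of the archive.
func utf8FlagSet(archive []byte) (bool, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return false, err
	}
	if len(zr.File) == 0 {
		return false, errors.New("archive has no entries")
	}
	return zr.File[0].Flags&0x800 != 0, nil
}

func main() {
	// Shift-JIS bytes for a two-character Japanese name, followed by ".txt".
	name := "\x93\xfa\x96\x7b.txt"
	archive, err := writeLegacyZip(name, []byte("hello"))
	if err != nil {
		log.Fatal(err)
	}
	set, err := utf8FlagSet(archive)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("UTF-8 flag set:", set)
}
```

The writer does no transcoding here; the caller stays responsible for producing bytes in the encoding the target system expects.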
    
    We adjust the detectUTF8 function to account for Shift-JIS and EUC-KR
    not being identical to ASCII for two characters.
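
The detection described above could be sketched as follows. This is an illustration, not necessarily the code in this change: `detectUTF8Sketch` is a hypothetical name, and the two excluded bytes are assumed to be 0x5c and 0x7e, which Shift-JIS and EUC-KR commonly map to a currency sign and an overline rather than to backslash and tilde.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// detectUTF8Sketch (hypothetical) reports whether s is valid UTF-8 and
// whether the UTF-8 flag would be needed to interpret it unambiguously.
func detectUTF8Sketch(s string) (valid, require bool) {
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		i += size
		// A RuneError of width 1 marks an invalid byte sequence.
		if r == utf8.RuneError && size == 1 {
			return false, false
		}
		// Only 0x20..0x7d, minus 0x5c, is treated as safe everywhere.
		// Assumption: 0x5c and 0x7e differ from ASCII in Shift-JIS and EUC-KR.
		if r < 0x20 || r > 0x7d || r == 0x5c {
			require = true
		}
	}
	return true, require
}

func main() {
	for _, name := range []string{"hello.txt", "a~b", "日本.txt", "\x93\xfa"} {
		valid, require := detectUTF8Sketch(name)
		fmt.Printf("%q valid=%v require=%v\n", name, valid, require)
	}
}
```

Under this rule a plain ASCII name never triggers the flag, while a name containing a tilde or backslash is treated the same as one containing multi-byte characters.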
    
    We don't need an API for users to explicitly specify that they are encoding
    with UTF-8 since all single-byte characters are compatible with all other
    common encodings (Windows-1256, Windows-1252, Windows-1251, Windows-1250,
    ISO-8859, EUC-KR, KOI8-R, Latin-1, Shift-JIS, GB-2312, GBK) except for
    the non-printable characters and the backslash character (all of which
    are invalid characters in a path name anyway).
    
    Fixes #10741
    
    Change-Id: I9004542d1d522c9137973f1b6e2b623fa54dfd66
    Reviewed-on: https://go-review.googlesource.com/75592
    Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
    Reviewed-by: Ian Lance Taylor <iant@golang.org>
reader_test.go 30.1 KB