    archive/tar: refactor Reader support for sparse files · 3bece2fa
    Joe Tsai authored
    This CL is the first step (of two) for adding sparse file support
    to the Writer. This CL only refactors the logic of sparse-file handling
    in the Reader so that common logic can be easily shared by the Writer.
    
    As a result of this CL, there are some new publicly visible API changes:
    	type SparseEntry struct { Offset, Length int64 }
    	type Header struct { ...; SparseHoles []SparseEntry }
    
    A new type is defined to represent a sparse fragment and a new field
    Header.SparseHoles is added to represent the sparse holes in a file.
    The API intentionally represents sparse files using hole fragments,
    rather than data fragments so that the zero value of SparseHoles
    naturally represents a normal file (i.e., a file without any holes).
    The Reader now populates SparseHoles for sparse files.
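
    To illustrate why the zero value is convenient, here is a minimal sketch.
    It redeclares SparseEntry locally and uses a trimmed-down, hypothetical
    Header stand-in (not the real tar.Header) so that it compiles on its own;
    isSparse is an illustrative helper, not part of this CL.

```go
package main

import "fmt"

// SparseEntry mirrors the type named in this CL. It is redeclared here so the
// sketch does not depend on any particular version of archive/tar.
type SparseEntry struct{ Offset, Length int64 }

// Header is a hypothetical, trimmed-down stand-in for tar.Header.
type Header struct {
	Name        string
	Size        int64
	SparseHoles []SparseEntry
}

// isSparse is an illustrative helper: a header with no recorded holes
// (including the nil zero value) describes a normal file.
func isSparse(h Header) bool { return len(h.SparseHoles) > 0 }

func main() {
	normal := Header{Name: "plain.txt", Size: 10}
	sparse := Header{
		Name:        "sparse.db",
		Size:        10,
		SparseHoles: []SparseEntry{{Offset: 2, Length: 3}},
	}
	fmt.Println(isSparse(normal), isSparse(sparse)) // false true
}
```

    Because holes are recorded rather than data, code that never touches
    SparseHoles keeps describing ordinary files without any extra setup.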
    
    It is necessary to export the sparse hole information, otherwise it would
    be impossible for the Writer to specify that it is trying to encode
    a sparse file, and what it looks like.
    
    Some unexported helper functions were added to common.go:
    	func validateSparseEntries(sp []SparseEntry, size int64) bool
    	func alignSparseEntries(src []SparseEntry, size int64) []SparseEntry
    	func invertSparseEntries(src []SparseEntry, size int64) []SparseEntry
    
    The validation logic that used to be in newSparseFileReader is now moved
    to validateSparseEntries so that the Writer can use it in the future.
    alignSparseEntries is currently unused by the Reader, but will be used
    by the Writer in the future. Since TAR represents sparse files by
    only recording the data fragments, we add the invertSparseEntries
    function to convert a list of data fragments to a normalized list
    of hole fragments (and vice-versa).
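
    The validation and inversion steps can be sketched roughly as follows.
    This is not the code from common.go: validateSketch and invertSketch are
    hypothetical names, the exact rules enforced by validateSparseEntries are
    assumed, and unlike a normalized list this sketch simply drops empty
    fragments.

```go
package main

import (
	"fmt"
	"math"
)

// SparseEntry is redeclared locally so the sketch is self-contained.
type SparseEntry struct{ Offset, Length int64 }

func (s SparseEntry) endOffset() int64 { return s.Offset + s.Length }

// validateSketch reports whether sp is a sorted, non-overlapping list of
// fragments that all lie within [0, size]. The rule set is an assumption.
func validateSketch(sp []SparseEntry, size int64) bool {
	if size < 0 {
		return false
	}
	var prevEnd int64
	for _, s := range sp {
		switch {
		case s.Offset < 0 || s.Length < 0:
			return false // negative values are never meaningful
		case s.Offset > math.MaxInt64-s.Length:
			return false // Offset+Length would overflow int64
		case s.endOffset() > size:
			return false // fragment extends past the end of the file
		case s.Offset < prevEnd:
			return false // out of order or overlapping
		}
		prevEnd = s.endOffset()
	}
	return true
}

// invertSketch returns the gaps between the fragments in src, so data
// fragments become hole fragments and vice versa. It assumes src has
// already passed validateSketch for the same size.
func invertSketch(src []SparseEntry, size int64) []SparseEntry {
	var dst []SparseEntry
	var pos int64 // end of the previous non-empty fragment
	for _, cur := range src {
		if cur.Length == 0 {
			continue // empty fragments contribute nothing
		}
		if cur.Offset > pos {
			dst = append(dst, SparseEntry{Offset: pos, Length: cur.Offset - pos})
		}
		pos = cur.endOffset()
	}
	if size > pos {
		dst = append(dst, SparseEntry{Offset: pos, Length: size - pos})
	}
	return dst
}

func main() {
	data := []SparseEntry{{Offset: 0, Length: 2}, {Offset: 5, Length: 3}}
	holes := invertSketch(data, 10)
	fmt.Println(validateSketch(data, 10), holes) // true [{2 3} {8 2}]
	fmt.Println(invertSketch(holes, 10))         // [{0 2} {5 3}]
}
```

    Inverting twice returns the original non-empty fragments, which is what
    lets one helper serve both the data-to-holes and holes-to-data direction.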
    
    Some other high-level changes:
    * skipUnread is deleted; most of its logic is moved to the
    Discard methods on regFileReader and sparseFileReader.
    * readGNUSparsePAXHeaders was rewritten to be simpler.
    * regFileReader and sparseFileReader were completely rewritten
    with simpler, easier-to-understand logic.
    * A bug was fixed in sparseFileReader.Read where it failed to
    report an error if the logical size of the file ends before
    consuming all of the underlying data.
    * The tests for sparse-file support were completely rewritten.
    
    Updates #13548
    
    Change-Id: Ic1233ae5daf3b3f4278fe1115d34a90c4aeaf0c2
    Reviewed-on: https://go-review.googlesource.com/56771
    Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: Ian Lance Taylor <iant@golang.org>