image/jpeg: don't unread a byte if we've already taken bits from it.

This rolls back most of golang.org/cl/8841, aka 2f98bac3, and makes a different fix. It keeps the TestTruncatedSOSDataDoesntPanic test introduced by that other CL, which obviously still passes after this CL. Fixes #11650, a regression (introduced by cl/8841) from Go 1.4. The original cl/8841 changed the image/jpeg not to panic on an input given in #10387. We still do not panic on that input, after this CL. I have a corpus of over 160,000 JPEG images, a sample of a web crawl. The image/jpeg code ran happily over that whole corpus both before and after this CL, although that corpus clearly didn't catch the regression in the first place. This code was otherwise tested manually. I don't think that it's trivial to synthesize a JPEG input that happens to run out of Huffman data at just the right place. The test image attached to #11650 obviously has that property, but I don't think we can simply add that test image to the repository: it's 227KiB, and I don't know its copyright status. I also looked back over the issue tracker for problematic JPEGs that people have filed. The Go code, after this CL, is still happy on these files in my directory: issue2362a.jpeg issue3916.jpeg issue3976.jpeg issue4084.jpeg issue4259.jpeg issue4291.jpeg issue4337.jpeg issue4500.jpeg issue4705.jpeg issue4975.jpeg issue5112.jpeg issue6767.jpeg issue9888.jpeg issue10133.jpeg issue10357.jpeg issue10447.jpeg issue11648.jpeg issue11650.jpeg There were other images attached in the issue tracker that aren't actually valid JPEGs. They failed both before and after this CL: broken-issue2362b.jpeg broken-issue6450.jpeg broken-issue8693.jpeg broken-issue10154.jpeg broken-issue10387.jpeg broken-issue10388.jpeg broken-issue10389.jpeg broken-issue10413.jpeg In summary, this CL fixes #11650 and, after some automated and manual testing, I don't think introduces new regressions. Change-Id: I30b67036e9b087f3051d57dac7ea05fb4fa36f66 Reviewed-on: https://go-review.googlesource.com/12163Reviewed-by: Rob Pike <r@golang.org>

image/jpeg: don't unread a byte if we've already taken bits from it.
This rolls back most of golang.org/cl/8841, aka 2f98bac3, and makes a different fix. It keeps the TestTruncatedSOSDataDoesntPanic test introduced by that other CL, which obviously still passes after this CL. Fixes #11650, a regression (introduced by cl/8841) from Go 1.4. The original cl/8841 changed the image/jpeg not to panic on an input given in #10387. We still do not panic on that input, after this CL. I have a corpus of over 160,000 JPEG images, a sample of a web crawl. The image/jpeg code ran happily over that whole corpus both before and after this CL, although that corpus clearly didn't catch the regression in the first place. This code was otherwise tested manually. I don't think that it's trivial to synthesize a JPEG input that happens to run out of Huffman data at just the right place. The test image attached to #11650 obviously has that property, but I don't think we can simply add that test image to the repository: it's 227KiB, and I don't know its copyright status. I also looked back over the issue tracker for problematic JPEGs that people have filed. The Go code, after this CL, is still happy on these files in my directory: issue2362a.jpeg issue3916.jpeg issue3976.jpeg issue4084.jpeg issue4259.jpeg issue4291.jpeg issue4337.jpeg issue4500.jpeg issue4705.jpeg issue4975.jpeg issue5112.jpeg issue6767.jpeg issue9888.jpeg issue10133.jpeg issue10357.jpeg issue10447.jpeg issue11648.jpeg issue11650.jpeg There were other images attached in the issue tracker that aren't actually valid JPEGs. They failed both before and after this CL: broken-issue2362b.jpeg broken-issue6450.jpeg broken-issue8693.jpeg broken-issue10154.jpeg broken-issue10387.jpeg broken-issue10388.jpeg broken-issue10389.jpeg broken-issue10413.jpeg In summary, this CL fixes #11650 and, after some automated and manual testing, I don't think introduces new regressions. Change-Id: I30b67036e9b087f3051d57dac7ea05fb4fa36f66 Reviewed-on: https://go-review.googlesource.com/12163Reviewed-by: Rob Pike <r@golang.org>
60b7d27c · Nigel Tao · 902345e5 · 60b7d27c · 60b7d27c
Commit 60b7d27c authored Jul 14, 2015 by Nigel Tao
Hide whitespace changes
Inline Side-by-side

Showing with 15 additions and 6 deletions

huffman.go src/image/jpeg/huffman.go +3 -1

reader.go src/image/jpeg/reader.go +12 -5

No files found.
--- a/src/image/jpeg/huffman.go
+++ b/src/image/jpeg/huffman.go
@@ -187,7 +187,9 @@ func (d *decoder) decodeHuffman(h *huffman) (uint8, error) {
 			// There are no more bytes of data in this segment, but we may still
 			// be able to read the next symbol out of the previously read bits.
 			// First, undo the readByte that the ensureNBits call made.
-			d.unreadByteStuffedByte()
+			if d.bytes.nUnreadable != 0 {
+				d.unreadByteStuffedByte()
+			}
 			goto slowPath
 		}
 	}

--- a/src/image/jpeg/reader.go
+++ b/src/image/jpeg/reader.go
@@ -169,9 +169,6 @@ func (d *decoder) fill() error {
 // sometimes overshoot and read one or two too many bytes. Two-byte overshoot
 // can happen when expecting to read a 0xff 0x00 byte-stuffed byte.
 func (d *decoder) unreadByteStuffedByte() {
-	if d.bytes.nUnreadable == 0 {
-		return
-	}
 	d.bytes.i -= d.bytes.nUnreadable
 	d.bytes.nUnreadable = 0
 	if d.bits.n >= 8 {
@@ -243,7 +240,12 @@ func (d *decoder) readByteStuffedByte() (x byte, err error) {
 // stuffing.
 func (d *decoder) readFull(p []byte) error {
 	// Unread the overshot bytes, if any.
-	d.unreadByteStuffedByte()
+	if d.bytes.nUnreadable != 0 {
+		if d.bits.n >= 8 {
+			d.unreadByteStuffedByte()
+		}
+		d.bytes.nUnreadable = 0
+	}

 	for {
 		n := copy(p, d.bytes.buf[d.bytes.i:d.bytes.j])
@@ -265,7 +267,12 @@ func (d *decoder) readFull(p []byte) error {
 // ignore ignores the next n bytes.
 func (d *decoder) ignore(n int) error {
 	// Unread the overshot bytes, if any.
-	d.unreadByteStuffedByte()
+	if d.bytes.nUnreadable != 0 {
+		if d.bits.n >= 8 {
+			d.unreadByteStuffedByte()
+		}
+		d.bytes.nUnreadable = 0
+	}

 	for {
 		m := d.bytes.j - d.bytes.i