Fix ZIP parsing failure on non-seekable streams with short reads#1169
Fix ZIP parsing failure on non-seekable streams with short reads#1169adamhathcock merged 5 commits intoreleasefrom
Conversation
Co-authored-by: adamhathcock <527620+adamhathcock@users.noreply.github.com>
Co-authored-by: adamhathcock <527620+adamhathcock@users.noreply.github.com>
|
@copilot can you use ReadFully or ReadExact? |
Co-authored-by: adamhathcock <527620+adamhathcock@users.noreply.github.com>
Done! Refactored the |
Code Review SummaryStatus: No Issues Found | Recommendation: Merge Files Reviewed (2 files)
|
There was a problem hiding this comment.
Pull request overview
This PR fixes a regression where ZIP archive parsing fails on non-seekable streams when the underlying Stream.Read() returns short reads (fewer bytes than requested). The issue manifested in real-world scenarios like ASP.NET Core multipart/form-data streaming where chunking patterns could vary.
Changes:
- Modified
SharpCompressStreamto implement buffer fill methods that handle short reads by following the ReadFully pattern - Added comprehensive test coverage with
ZipShortReadTeststo prevent future regressions - Updated package lock file (unrelated maintenance update for existing Mono.Posix.NETStandard dependency)
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/SharpCompress/IO/SharpCompressStream.cs | Added three private FillBuffer* methods that implement the ReadFully pattern to handle short reads from underlying streams. Updated all Read methods (sync and async variants) to use these buffer fill methods. |
| tests/SharpCompress.Test/Zip/ZipShortReadTests.cs | New test class with 9 test cases covering various chunking patterns including extreme cases (1 byte at a time). Includes a PatternReadStream helper class to simulate non-seekable streams with controlled short reads. |
| tests/SharpCompress.Test/packages.lock.json | Lock file update reflecting existing Mono.Posix.NETStandard dependency (used for Linux symlink tests, not related to this PR's functionality). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Fix ZIP parsing regression with short reads on non-seekable streams
Completed:
BinaryReaderdoesn't handle short reads properly on non-seekable streamsFix Details:
The regression occurred because
SharpCompressStream.Read()with buffering enabled would return short reads when the underlying stream returned less data than requested. This causedBinaryReaderto receive incomplete data when parsing ZIP headers, leading toZlibExceptionfailures depending on the chunking pattern.Solution: Modified
SharpCompressStreamto implement buffer fill methods that follow theReadFully/ReadFullyAsyncpattern fromUtility.cs. These methods read from the underlying stream in a loop until the internal buffer is filled or EOF is reached, but return the actual count read (rather than a boolean) to handle EOF gracefully.Changes made to
SharpCompressStream.cs:FillBuffer()to follow the ReadFully pattern (same logic asUtility.ReadFullybut returns count)FillBufferAsync()to follow the ReadFullyAsync patternFillBufferMemoryAsync()to follow the ReadFullyAsync patternTest coverage:
ZipShortReadTestswith 9 test cases covering various chunking patternsSummary
This fix resolves the reported issue where SharpCompress ZIP parsing became sensitive to Stream.Read chunking patterns on non-seekable streams. The solution follows the established ReadFully pattern used elsewhere in the codebase.
Original prompt
This section details on the original issue you should resolve
<issue_title>Regression: ZIP parsing fails depending on Stream.Read chunking on non-seekable streams</issue_title>
<issue_description>Hi,
I’m hitting what looks like a stream chunking sensitivity regression when reading ZIP archives from a non-seekable stream. With the exact same ZIP bytes, SharpCompress will sometimes fail depending only on how the underlying
Stream.Read()splits the data.Regression note
The exact same code path works correctly with SharpCompress 0.40.0.
The failure (
ZlibExceptionin the chunked case) starts occurring in newer versions.Context / real-world scenario
This happens in a real ASP.NET Core streaming pipeline (multipart/form-data):
HttpRequest.BodyMultipartReader(multipart/form-data)ReaderFactory.Open(...).MoveToNextEntry()A seemingly unrelated change (for example changing a text field value from
"my-value"to"my-valu") shifts the alignment of the ZIP part by 1 byte, which changes the short-read pattern seen by SharpCompress and triggers a failure.To make this report independent of ASP.NET / multipart, the repro below uses a custom non-seekable stream that returns legal short reads.
Reproduction
This snippet (with stream.zip) reads the same ZIP bytes three ways:
MemoryStreamOnly case (3) fails with a
ZlibException. Wrapping the stream with a simple coalescing wrapper fixes the issue.