Fix DataErrorException when extracting LZMA-compressed zero-byte ZIP entries #1237
adamhathcock merged 3 commits into master
Co-authored-by: adamhathcock <527620+adamhathcock@users.noreply.github.com>
Code Review Summary
Status: No Issues Found | Recommendation: Merge
Overview
The fix addresses a bug where extracting LZMA-compressed zero-byte ZIP entries throws a DataErrorException.
Changes Reviewed
Code Quality
Files Reviewed (5 files)
Pull request overview
Fixes extraction of ZIP entries that are LZMA-compressed but have a known uncompressed size of 0 bytes, preventing DataErrorException during reads and adding regression coverage for both seekable and streaming readers.
Changes:
- Short-circuit LZMA decompression for entries with explicitly-known zero uncompressed size (sync + async paths).
- Add a new LZMA ZIP fixture containing a single zero-byte entry.
- Add regression tests for both ArchiveFactory.OpenArchive (seekable) and ReaderFactory.OpenReader (streaming/forward-only).
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/SharpCompress/Common/Zip/ZipFilePart.cs | Adds early-return logic for known-empty LZMA entries during decompression stream creation. |
| src/SharpCompress/Common/Zip/ZipFilePart.Async.cs | Async equivalent of the LZMA known-empty early-return logic. |
| tests/TestArchives/Archives/Zip.lzma.empty.zip | New test archive fixture containing a zero-byte LZMA-compressed entry. |
| tests/SharpCompress.Test/Zip/ZipArchiveTests.cs | Regression test covering extraction via seekable archive API. |
| tests/SharpCompress.Test/Zip/ZipReaderTests.cs | Regression test covering extraction via streaming reader API with a forward-only stream. |
Extracting a ZIP archive containing an LZMA-compressed entry with 0 uncompressed bytes throws DataErrorException from LzmaStream.Read().

Root cause
LZMA entries in ZIP always carry a 4-byte header (version + property length) plus 5-byte properties, followed by actual LZMA stream data, even for empty files. So CompressedSize > 0 while UncompressedSize == 0. LzmaStream is constructed with outputSize=0, setting _availableBytes=0. The first Read() call immediately sets _endReached=true without running the decoder, then the end-of-stream check throws because _inputPosition (0, never incremented) ≠ _inputSize (> 0).

Fix
In ZipFilePart.CreateDecompressionStream (sync and async), after parsing the LZMA header, short-circuit when the output size is definitively zero, i.e., Bit1 (EOS-marker flag) is not set and UncompressedSize == 0.

stream.Skip() is necessary because the stream at this point is a ReadOnlySubStream scoped to CompressedSize bytes; draining it keeps the underlying archive stream correctly positioned for sequential (non-seekable) reads.

Changes
- ZipFilePart.cs / ZipFilePart.Async.cs: early return for known-empty LZMA entries in both sync and async decompression paths
- Zip.lzma.empty.zip: new test archive containing a single LZMA-compressed zero-byte entry
- ZipArchiveTests: regression test via seekable ArchiveFactory.OpenArchive
- ZipReaderTests: regression test via streaming ReaderFactory.OpenReader with a forward-only stream
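
To make the failure mode and the short-circuit easier to follow, here is a small toy model in Python. It is not the SharpCompress code: `DataError`, `ToyLzmaStream`, `open_lzma_entry`, and the `short_circuit` switch are hypothetical names invented for illustration, the stream only imitates the end-of-stream bookkeeping described under "Root cause" (it does not decode LZMA), and the payload bytes in `ENTRY` are placeholders rather than a real LZMA stream.

```python
import io
import struct


class DataError(Exception):
    """Hypothetical stand-in for DataErrorException."""


class ToyLzmaStream:
    """Models only the end-of-stream bookkeeping; it does not decode LZMA."""

    def __init__(self, input_size: int, output_size: int):
        self._input_size = input_size
        self._input_position = 0  # would advance as a real decoder consumes input
        self._available_bytes = output_size
        self._end_reached = False

    def read(self, count: int) -> bytes:
        if self._available_bytes == 0:
            # Nothing left to produce, so the decoder never runs.
            self._end_reached = True
        if self._end_reached:
            if self._input_position != self._input_size:
                raise DataError("compressed input was not fully consumed")
            return b""
        raise NotImplementedError("real decoding is out of scope for this sketch")


def open_lzma_entry(archive, compressed_size, uncompressed_size, flags, short_circuit):
    """Parse the ZIP LZMA header, then optionally short-circuit empty entries."""
    start = archive.tell()
    # ZIP LZMA header: 2 version bytes, then a 2-byte little-endian property length.
    _major, _minor, props_len = struct.unpack("<BBH", archive.read(4))
    archive.read(props_len)  # the 5 property bytes
    remaining = compressed_size - (archive.tell() - start)
    has_eos_marker = bool(flags & 0x0002)  # general-purpose flag bit 1
    if short_circuit and not has_eos_marker and uncompressed_size == 0:
        archive.read(remaining)  # drain the entry so the next one starts here
        return io.BytesIO(b"")
    return ToyLzmaStream(remaining, uncompressed_size)


# 4-byte header + 5 property bytes + placeholder bytes standing in for LZMA data.
ENTRY = bytes([9, 20, 5, 0]) + bytes([0x5D, 0, 0, 0x10, 0]) + bytes(5)
```

In this model, calling `open_lzma_entry(..., short_circuit=False)` on `ENTRY` with an uncompressed size of 0 and then reading raises `DataError`, because the input position never moves while the input size is positive. With `short_circuit=True` the caller gets an empty stream, and the shared archive stream is left at the first byte after the entry, which is the role the description assigns to stream.Skip() for forward-only reads.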