[WIP] Improve I/O operations while reading big solid archives by julianxhokaxhiu · Pull Request #1163 · adamhathcock/sharpcompress

julianxhokaxhiu · 2026-01-27T00:36:12Z

This PR is a tentative effort to improve the extraction time for 7z archives, especially ones that are compressed as 1 block solid using 16MB dictionary.

The main idea here is to reduce the number of access to cache ( while doing Skip ), improve performance in writing files to the disk ( use 1MB buffer by default ) and aggressively inline codepaths that are reused across executions ( avoiding jumps in the code but instead favoring a more linear execution at the expense of code size ).

Feel free to test it and report back. In my own case I can finally see the extraction of files in the archive mentioned in #1105

Below Claude Sonnet 4.5 summary

Performance Optimization Summary for 7Zip Solid Archive Extraction

Problem

7Zip extraction with large solid archives (16MB dictionaries, 1-block compression) was extremely slow in version 0.42.0+ compared to 0.41.0, taking hours instead of minutes on high-end hardware.

Root Causes Identified Through Profiling

StreamExtensions.Skip() consuming 94.54% CPU - byte-by-byte reading via CopyTo(Stream.Null)
Excessive Path.GetFullPath() calls - Called 3+ times per extracted file
Small I/O buffers - 80KB buffers insufficient for modern NVMe drives
Method call overhead - Hot-path LZMA decompression methods not inlined

Optimizations Implemented

1. Skip Operation Optimization

File: StreamExtensions.cs

Changed from ReadOnlySubStream + CopyTo(Stream.Null) (byte-by-byte) to buffered reading
Increased buffer size from implicit default to 1MB using ArrayPool
Added fast path for BufferedSubStream to skip via internal method
Impact: Reduced Skip CPU usage from 94.54% → 82.24% → negligible

File: BufferedSubStream.cs

Added SkipInternal() method with 1MB buffer for efficient skipping
Skips cached data instantly, uses large buffered reads for remainder
Impact: Eliminates repeated RefillCache() calls when skipping large amounts of data

2. Increased I/O Buffer Sizes (80KB → 1MB)

Files Modified:

IArchiveEntryExtensions.cs - BufferSize constant
AbstractReader.cs - CopyTo/CopyToAsync calls

Rationale: Modern NVMe Gen 5 drives benefit significantly from larger sequential I/O operations

Impact: Better disk I/O throughput, reduced system calls

3. FileStream Optimization

File: IArchiveEntryExtensions.cs

Replaced File.Open() with explicit FileStream constructor
Specified 1MB buffer size explicitly
Added useAsync: true for async operations to enable overlapped I/O on Windows
Impact: Better async I/O performance, reduced context switching

4. Path Processing Optimization

File: ExtractionMethods.cs

Cached entry.Key to avoid multiple property accesses
Reduced Path.GetFullPath() calls from 3 per file to 1 by consolidating path operations
Combined Path.Combine calls: Combine(folder, file) instead of separate operations
Moved security validation before filesystem calls to avoid unnecessary I/O
Impact: WriteEntryToDirectory CPU reduced from 84.95% → 31.74% (63% reduction)

5. LZMA Decompression Micro-optimizations

Files Modified:

RangeCoder.cs
RangeCoderBit.cs
LzmaDecoder.cs

Added [MethodImpl(MethodImplOptions.AggressiveInlining)] to hot-path methods:

RangeCoder.Decoder: GetThreshold, Decode, DecodeBit, DecodeDirectBits
BitDecoder.Decode (called millions of times)
LzmaDecoder.LenDecoder.Decode
LzmaDecoder.Decoder2: DecodeNormal, DecodeWithMatchByte
LzmaDecoder.LiteralDecoder: GetState, DecodeNormal, DecodeWithMatchByte

Rationale: These methods are in tight decompression loops; inlining reduces call overhead and enables cross-method JIT optimizations

Impact: Reduced LZMA decompression overhead, though fundamental decompression work remains CPU-intensive as expected

Performance Results

Skip operation: No longer a bottleneck (was 94.54% CPU)
Path processing: 63% reduction in WriteEntryToDirectory overhead
Overall extraction: Significantly faster, especially noticeable with large solid archives
Hot path: Now dominated by actual LZMA decompression work (unavoidable)

Technical Notes

All buffer allocations use ArrayPool<byte> to minimize GC pressure
Changes maintain backward compatibility
Security validations (path traversal checks) preserved
Code formatted with CSharpier per project standards

Testing

Tested with 700MB+ solid 7zip archive with 16MB dictionary on AMD 9800X3D with NVMe Gen 5 drive. Extraction time improved from hours to minutes, matching performance of version 0.41.0.

Copilot

Pull request overview

Improves extraction performance for large solid 7z archives by reducing skip overhead, increasing I/O buffer sizes, and adding aggressive inlining in hot LZMA decode paths.

Changes:

Increase stream copy/write buffer sizes to 1MB for faster disk and stream I/O.
Optimize skipping on non-seekable streams, including a fast-path for BufferedSubStream.
Add AggressiveInlining attributes to selected LZMA range coder / decoder methods.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
src/SharpCompress/Readers/AbstractReader.cs	Uses larger CopyTo/CopyToAsync buffers when writing entry data.
src/SharpCompress/Polyfills/StreamExtensions.cs	Reworks `Skip(long)` to reduce read calls and adds a `BufferedSubStream` fast path.
src/SharpCompress/IO/BufferedSubStream.cs	Adds `SkipInternal` to skip efficiently using cache + large reads.
src/SharpCompress/Compressors/LZMA/RangeCoder/RangeCoderBit.cs	Adds inlining attribute to bit decode hot path.
src/SharpCompress/Compressors/LZMA/RangeCoder/RangeCoder.cs	Adds inlining attributes to decoder methods used heavily in LZMA.
src/SharpCompress/Compressors/LZMA/LzmaDecoder.cs	Adds inlining attributes to nested decoder hot paths.
src/SharpCompress/Common/ExtractionMethods.cs	Refactors destination path computation/security checks and directory creation flow.
src/SharpCompress/Archives/IArchiveEntryExtensions.cs	Increases extraction buffer size and uses larger `FileStream` buffers for file writes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-27T00:43:48Z

src/SharpCompress/Archives/IArchiveEntryExtensions.cs

+                    using var fs = new FileStream(
+                        destinationFileName,
+                        fm,
+                        FileAccess.Write,
+                        FileShare.None,
+                        bufferSize: 1048576
+                    ); // 1MB buffer


These FileStream constructors set bufferSize to 1MB, which typically causes a 1MB managed buffer allocation per extracted file (often LOH) in addition to the CopyTo buffer. This can lead to large transient allocations/LOH fragmentation when extracting many files. Consider relying on the default FileStream buffering or making the larger FileStream buffer size conditional/opt-in (and ideally reuse pooled buffers where possible).

Copilot · 2026-01-27T00:43:48Z

src/SharpCompress/Archives/IArchiveEntryExtensions.cs

+                    using var fs = new FileStream(
+                        destinationFileName,
+                        fm,
+                        FileAccess.Write,
+                        FileShare.None,
+                        bufferSize: 1048576
+                    ); // 1MB buffer


The 1MB literal buffer size is duplicated here even though the file already defines BufferSize. Using the constant would prevent accidental divergence if the buffer size is tuned later.

Copilot · 2026-01-27T00:43:48Z

src/SharpCompress/IO/BufferedSubStream.cs

+        // For remaining data, we must actually read it from the underlying stream
+        // Use very large reads to minimize LZMA decompression call overhead
+        var skipBuffer = ArrayPool<byte>.Shared.Rent(1048576); // 1MB for skipping
+        try
+        {
+            while (advanceAmount > 0 && BytesLeftToRead > 0)
+            {
+                var toRead = (int)
+                    Math.Min(Math.Min(advanceAmount, BytesLeftToRead), skipBuffer.Length);


SkipInternal always rents a 1MB buffer even when BytesLeftToRead is already 0 or the remaining skip amount is small (after consuming cached bytes). Consider returning early when there’s nothing left to read/skip, and renting a buffer sized to min(remainingSkip, 1MB) to reduce unnecessary large rentals for small skips.

Copilot · 2026-01-27T00:43:49Z

src/SharpCompress/Readers/AbstractReader.cs

    internal async ValueTask WriteAsync(Stream writeStream, CancellationToken cancellationToken)
    {
 #if LEGACY_DOTNET
        using Stream s = await OpenEntryStreamAsync(cancellationToken).ConfigureAwait(false);
        var sourceStream = WrapWithProgress(s, Entry);
-        await sourceStream.CopyToAsync(writeStream, 81920, cancellationToken).ConfigureAwait(false);
+        await sourceStream
+            .CopyToAsync(writeStream, 1048576, cancellationToken)
+            .ConfigureAwait(false); // 1MB buffer
 #else
        await using Stream s = await OpenEntryStreamAsync(cancellationToken).ConfigureAwait(false);
        var sourceStream = WrapWithProgress(s, Entry);
-        await sourceStream.CopyToAsync(writeStream, 81920, cancellationToken).ConfigureAwait(false);
+        await sourceStream
+            .CopyToAsync(writeStream, 1048576, cancellationToken)
+            .ConfigureAwait(false); // 1MB buffer


CopyToAsync with a 1MB buffer will allocate a new 1MB buffer per call (LOH) in the framework implementation. For many-entry extractions this can create substantial GC/LOH pressure. Consider using a pooled-buffer async copy loop (ArrayPool) or keeping the buffer size under the LOH threshold and making the larger size opt-in/configurable.

Copilot · 2026-01-27T00:43:49Z

src/SharpCompress/Archives/IArchiveEntryExtensions.cs

 public static class IArchiveEntryExtensions
 {
-    private const int BufferSize = 81920;
+    private const int BufferSize = 1048576; // 1MB buffer for better disk I/O performance


Setting BufferSize to 1MB affects CopyTo/CopyToAsync below, which will allocate a new 1MB byte[] per call (LOH) in the framework implementation. This can significantly increase GC pressure for archives with many (especially small) entries. Consider keeping the CopyTo buffer below the LOH threshold or using a pooled-buffer copy implementation so the larger buffer doesn’t allocate per extraction call.

Suggested change

private const int BufferSize = 1048576; // 1MB buffer for better disk I/O performance

private const int BufferSize = 81920; // 80KB buffer to avoid LOH allocations while maintaining good I/O performance

julianxhokaxhiu · 2026-01-27T00:46:26Z

I'll take care of Copilot suggestions tomorrow. Thanks

adamhathcock · 2026-01-27T07:28:28Z

There's probably a few things here that might need to be separate PRs to be consumable.

making buffer size an option (ideally pooling) which I want to do
improvements for copying/Skip extension methods
improvements to LZMA in general

I definitely want to do them, but just break them down. It helps me as I'm in the middle of a large refactor to get async working all the way.

I'll make a stab at #1 today

adamhathcock · 2026-01-27T07:54:10Z

I just realized this PR is against master which is currently dev. You probably want release so we can get these kind of fixes out faster

julianxhokaxhiu · 2026-01-27T08:29:07Z

I'll have to rework the entire patch then as the PR states there are 249 file changes now which is not what I want. I didn't knew release is the branch currently used to release current nuget artifacts. I'll update this to point again to master and refine the patch, and eventually see how we can backport that to release after if it's ok with you. Let me know, cheers

adamhathcock · 2026-01-27T11:03:16Z

I think the release branch is now have buffer size centralized into a single static which should make things easier.

I'll back port this to master which should have further changes later

#1165

julianxhokaxhiu · 2026-01-27T16:01:23Z

I'll close this PR then in favor of yours, so we keep this one only as a ref.

Improve skip call CPU hotpath

bae32c4

Copilot AI review requested due to automatic review settings January 27, 2026 00:36

Copilot started reviewing on behalf of julianxhokaxhiu January 27, 2026 00:36 View session

Copilot AI reviewed Jan 27, 2026

View reviewed changes

julianxhokaxhiu changed the base branch from master to release January 27, 2026 08:26

julianxhokaxhiu changed the base branch from release to master January 27, 2026 08:29

julianxhokaxhiu closed this Jan 27, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Improve I/O operations while reading big solid archives#1163

[WIP] Improve I/O operations while reading big solid archives#1163
julianxhokaxhiu wants to merge 1 commit intoadamhathcock:masterfrom
julianxhokaxhiu:fix/lzma-slow

julianxhokaxhiu commented Jan 27, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Jan 27, 2026

Uh oh!

Copilot AI Jan 27, 2026

Uh oh!

Copilot AI Jan 27, 2026

Uh oh!

Copilot AI Jan 27, 2026

Uh oh!

Copilot AI Jan 27, 2026

Uh oh!

julianxhokaxhiu commented Jan 27, 2026

Uh oh!

adamhathcock commented Jan 27, 2026

Uh oh!

adamhathcock commented Jan 27, 2026

Uh oh!

julianxhokaxhiu commented Jan 27, 2026

Uh oh!

adamhathcock commented Jan 27, 2026

Uh oh!

julianxhokaxhiu commented Jan 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	private const int BufferSize = 1048576; // 1MB buffer for better disk I/O performance
	private const int BufferSize = 81920; // 80KB buffer to avoid LOH allocations while maintaining good I/O performance

Conversation

julianxhokaxhiu commented Jan 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Performance Optimization Summary for 7Zip Solid Archive Extraction

Problem

Root Causes Identified Through Profiling

Optimizations Implemented

1. Skip Operation Optimization

2. Increased I/O Buffer Sizes (80KB → 1MB)

3. FileStream Optimization

4. Path Processing Optimization

5. LZMA Decompression Micro-optimizations

Performance Results

Technical Notes

Testing

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

julianxhokaxhiu commented Jan 27, 2026

Uh oh!

adamhathcock commented Jan 27, 2026

Uh oh!

adamhathcock commented Jan 27, 2026

Uh oh!

julianxhokaxhiu commented Jan 27, 2026

Uh oh!

adamhathcock commented Jan 27, 2026

Uh oh!

julianxhokaxhiu commented Jan 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

julianxhokaxhiu commented Jan 27, 2026 •

edited

Loading