Memory is leaking when multithreading #899

@VoidXH

For a ~2 GB 7z archive containing ~40k files (this one specifically), running the following code consumes 64 GB of memory within a minute:

static void Extract(string archive, string output) {
    Console.WriteLine($"Extracting {Path.GetFileName(archive)}...");
    using SevenZipArchive release = SevenZipArchive.Open(archive);
    SevenZipArchiveEntry[] toExtract = release.Entries.Where(entry => !entry.IsDirectory).ToArray();

    ExtractionOptions options = new() {
        ExtractFullPath = true,
        Overwrite = true
    };
    int done = 0;
    DateTime nextUpdate = default;
    Parallel.For(0, toExtract.Length, i => {
        Interlocked.Increment(ref done);
        SevenZipArchiveEntry entry = toExtract[i];
        string path = Path.Combine(output, entry.Key);
        // Skip files that already exist on disk with the expected size.
        if (File.Exists(path) && new FileInfo(path).Length == entry.Size) {
            return;
        }
        try {
            // Open a separate handle per entry, because a shared handle is not safe to use across threads.
            SevenZipArchive handle = SevenZipArchive.Open(archive);
            handle.Entries.First(x => x.Key == entry.Key).WriteToDirectory(output, options);
            handle.Dispose();
        } catch {
            Console.WriteLine($"[WARN] Couldn't extract {entry.Key}.");
        }

        if (nextUpdate < DateTime.Now || done == toExtract.Length) {
            nextUpdate = DateTime.Now + TimeSpan.FromSeconds(1);
            //ProgressBar(done, toExtract.Length);
        }
    });
    Console.WriteLine("\nExtraction completed.");
}

Note that opening a new archive handle inside the try block is necessary: sharing a single handle to the archive across threads makes WriteToDirectory throw a lot of exceptions.
