High-performance parallelized implementations of common zip file operations.
See discussion in pex-tool/pex#2158.
This crate adds some hacks to the widely-used `zip` crate (see the diff at https://github.com/zip-rs/zip/compare/master...cosmicexplorer:zip:merge-entries?expand=1). When the `merge` feature is enabled on this fork of `zip`, two crimes are unveiled:
- `merge_archive()`:
  - This will copy over the contents of another zip file into the current one without deserializing any data.
  - This enables parallelization of arbitrary zip commands, as multiple zip files can be created in parallel and then merged afterwards.
- `finish_into_readable()`:
  - Creating a writable `ZipWriter` and then converting it into a readable `ZipArchive` is a very common operation when merging zip files.
  - This likely has zero performance benefit, but it is a good example of the types of investigations you can do with the zip format, especially against the well-written `zip` crate.
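
The create-in-parallel-then-merge idea can be sketched with a toy model that uses only the standard library. This is an illustration under stated assumptions, not this crate's code and not the fork's API: `ToyArchive`, `IndexEntry`, and `build_in_parallel` are hypothetical names, and the byte layout is a simplified stand-in for the zip format. The point it tries to show is that a merge only has to copy already-encoded bytes and shift the recorded offsets, so the expensive per-entry work can happen on separate threads beforehand.

```rust
use std::thread;

/// Hypothetical index record: where one entry's bytes live in the archive.
#[derive(Debug, Clone, PartialEq)]
pub struct IndexEntry {
    pub name: String,
    pub offset: usize,
    pub len: usize,
}

/// Toy stand-in for a zip archive: raw entry bytes plus an index.
/// This is NOT the real zip format.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ToyArchive {
    pub data: Vec<u8>,
    pub index: Vec<IndexEntry>,
}

impl ToyArchive {
    /// Append one entry. In a real zip writer, this is where the
    /// expensive work (compression, checksumming) would happen.
    pub fn add(&mut self, name: &str, contents: &[u8]) {
        let offset = self.data.len();
        self.data.extend_from_slice(contents);
        self.index.push(IndexEntry {
            name: name.to_string(),
            offset,
            len: contents.len(),
        });
    }

    /// Merge another archive into this one without decoding its entries:
    /// copy the bytes verbatim and shift each offset by the current length.
    pub fn merge(&mut self, other: ToyArchive) {
        let base = self.data.len();
        self.data.extend_from_slice(&other.data);
        self.index.extend(other.index.into_iter().map(|mut entry| {
            entry.offset += base;
            entry
        }));
    }

    /// Look up an entry's bytes by name.
    pub fn read(&self, name: &str) -> Option<&[u8]> {
        self.index
            .iter()
            .find(|entry| entry.name == name)
            .map(|entry| &self.data[entry.offset..entry.offset + entry.len])
    }
}

/// Build one archive per chunk on its own thread, then merge the
/// partial archives sequentially in input order.
pub fn build_in_parallel(chunks: &[Vec<(String, Vec<u8>)>]) -> ToyArchive {
    let parts: Vec<ToyArchive> = thread::scope(|s| {
        // Spawn every worker first so they all run concurrently.
        let handles: Vec<_> = chunks
            .iter()
            .map(|chunk| {
                s.spawn(move || {
                    let mut part = ToyArchive::default();
                    for (name, contents) in chunk {
                        part.add(name, contents);
                    }
                    part
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect::<Vec<_>>()
    });

    let mut merged = ToyArchive::default();
    for part in parts {
        merged.merge(part);
    }
    merged
}
```

Merging in input order keeps the output deterministic regardless of which worker finishes first. A real zip merge also has to rewrite the offsets stored in the central directory, which is the role the offset shift plays in this sketch.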
We mainly need compatibility with Python's `zipfile` and `zipimport` modules (see pex-tool/pex#2158 (comment)). Also see the zipimport PEP (PEP 273). I currently believe that this program's output will work perfectly against both `zipfile` and `zipimport`.
- benchmark zip creation (vs `zip` crate)
- benchmark zip merging (vs `zip` crate)
  - this should also really be done in the `zip-merge` crate, too