
@albertvillanova commented Aug 16, 2021

Previous PR #2798 fixed streaming of remote zip files passed via the data_files parameter.

However, that change broke streaming of zip files used in canonical datasets scripts, which normally call join() (patched with xjoin()) after StreamingDownloadManager.download_and_extract().

This PR fixes this issue and allows streaming zip files both from:

  • canonical datasets scripts and
  • data files.
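
For context, here is a minimal sketch of why a plain join() fails on a streamed zip and what a patched join has to do instead. It assumes the streaming manager represents an extracted archive as an fsspec-style chained URL such as zip://::https://example.com/data.zip; that URL and the helper name xjoin_sketch are hypothetical, and this is an illustration rather than the library's actual xjoin() implementation.

```python
import posixpath


def xjoin_sketch(a: str, *p: str) -> str:
    """Join path components onto the first hop of a chained URL.

    Hypothetical stand-in for a patched join: the new components must go
    inside the archive (the first hop), not after the remote URL.
    """
    head, *rest = a.split("::")
    if "://" in head:
        # e.g. "zip://" splits into scheme "zip" and an empty inner path
        scheme, inner = head.split("://", 1)
        head = scheme + "://" + posixpath.join(inner, *p)
    else:
        # plain local path: behave like a normal join
        head = posixpath.join(head, *p)
    return "::".join([head] + rest)


# Assumed shape of what download_and_extract() yields in streaming mode
extracted = "zip://::https://example.com/data.zip"

# A plain join appends to the remote URL, which does not address a zip member
naive = posixpath.join(extracted, "train", "data.csv")

# The patched join places the member path inside the zip hop
patched = xjoin_sketch(extracted, "train", "data.csv")

print(naive)
print(patched)
```

With this shape, a dataset script can keep its usual pattern of joining a relative file path after download_and_extract(), and the member path ends up inside the zip hop of the URL.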

@albertvillanova changed the title from "Fix streaming zip files with subsequent join" to "Fix streaming zip files from canonical datasets" Aug 16, 2021
@albertvillanova marked this pull request as ready for review August 16, 2021 09:56
@albertvillanova merged commit a5a3f48 into huggingface:master Aug 16, 2021