Describe the bug
When loading a dataset with streaming=True, some background process keeps the script from exiting: it either never returns or takes a very long time to do so.
This did not happen with huggingface-hub < 1.
Steps to reproduce the bug
from datasets import load_dataset
ds = load_dataset("IRIIS-RESEARCH/Nepali-Text-Corpus", split="train", streaming=True)
print(next(iter(ds)))
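To narrow down what keeps the interpreter alive, it may help to list the threads that are still running after the snippet above. This is only a diagnostic sketch using the standard library. The helper name lingering_threads is made up for illustration, and it assumes (not confirmed) that the hang comes from a leftover thread rather than, say, a child process.

```python
import threading


def lingering_threads():
    """Return (name, daemon) for every live thread except the main one."""
    main = threading.main_thread()
    return [(t.name, t.daemon) for t in threading.enumerate() if t is not main]


if __name__ == "__main__":
    # Run the reproduction steps earlier in the same script, then inspect
    # which threads are still alive. Non-daemon threads are the ones the
    # interpreter waits for at shutdown.
    for name, daemon in lingering_threads():
        print(f"still alive: {name} (daemon={daemon})")
```

If a non-daemon thread shows up here, its name might hint at which library started it.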
Expected behavior
Background resources should be cleaned up as soon as the iterable dataset goes out of scope, so that the script can exit.
Environment info
- datasets version: 4.8.5
- Platform: macOS-26.3.1-arm64-arm-64bit
- Python version: 3.11.13
- huggingface_hub version: 1.13.0
- PyArrow version: 24.0.0
- Pandas version: 3.0.2
- fsspec version: 2026.2.0