Cannot load dataset with 2.14.5: FileNotFound error #6305

@finiteautomata

Describe the bug

I'm trying to load [piuba-bigdata/articles_and_comments], but `load_dataset` raises a `FileNotFoundError` on `datasets` 2.14.5. The same call works on 2.10.0.

Steps to reproduce the bug

Colab link

Downloading readme: 100% 1.19k/1.19k [00:00<00:00, 30.9kB/s]
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-2-807c3583d297> in <cell line: 3>()
      1 from datasets import load_dataset
      2 
----> 3 load_dataset("piuba-bigdata/articles_and_comments", split="train")

2 frames
/usr/local/lib/python3.10/dist-packages/datasets/load.py in load_dataset(path, name, data_dir, data_files, split, cache_dir, features, download_config, download_mode, verification_mode, ignore_verifications, keep_in_memory, save_infos, revision, token, use_auth_token, task, streaming, num_proc, storage_options, **config_kwargs)
   2127 
   2128     # Create a dataset builder
-> 2129     builder_instance = load_dataset_builder(
   2130         path=path,
   2131         name=name,

/usr/local/lib/python3.10/dist-packages/datasets/load.py in load_dataset_builder(path, name, data_dir, data_files, cache_dir, features, download_config, download_mode, revision, token, use_auth_token, storage_options, **config_kwargs)
   1813         download_config = download_config.copy() if download_config else DownloadConfig()
   1814         download_config.storage_options.update(storage_options)
-> 1815     dataset_module = dataset_module_factory(
   1816         path,
   1817         revision=revision,

/usr/local/lib/python3.10/dist-packages/datasets/load.py in dataset_module_factory(path, revision, download_config, download_mode, dynamic_modules_path, data_dir, data_files, **download_kwargs)
   1506                     raise e1 from None
   1507                 if isinstance(e1, FileNotFoundError):
-> 1508                     raise FileNotFoundError(
   1509                         f"Couldn't find a dataset script at {relative_to_absolute_path(combined_path)} or any data file in the same directory. "
   1510                         f"Couldn't find '{path}' on the Hugging Face Hub either: {type(e1).__name__}: {e1}"

FileNotFoundError: Couldn't find a dataset script at /content/piuba-bigdata/articles_and_comments/articles_and_comments.py or any data file in the same directory. Couldn't find 'piuba-bigdata/articles_and_comments' on the Hugging Face Hub either: FileNotFoundError: No (supported) data files or dataset script found in piuba-bigdata/articles_and_comments.

Expected behavior

The dataset should load without errors, as it does on 2.10.0.

Environment info

- `datasets` version: 2.14.5
- Platform: Linux-5.15.120+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.18.0
- PyArrow version: 9.0.0
- Pandas version: 1.5.3
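
For anyone comparing a working and a failing setup, the version list above can be regenerated with a few lines of standard-library Python. This is only a rough sketch: `collect_versions` is a hypothetical helper written for this report, not part of `datasets`, and it reports "not installed" for any package it cannot find rather than raising.

```python
import platform
from importlib import metadata


def collect_versions(packages):
    """Map each distribution name to its installed version.

    Hypothetical helper for this report; packages that are not
    installed are reported as "not installed" instead of raising.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # Print in roughly the same bullet layout as the list above.
    print(f"- Platform: {platform.platform()}")
    print(f"- Python version: {platform.python_version()}")
    packages = ["datasets", "huggingface_hub", "pyarrow", "pandas"]
    for name, version in collect_versions(packages).items():
        print(f"- {name} version: {version}")
```

Running this in both the 2.10.0 and the 2.14.5 environments makes it easier to see whether anything other than `datasets` itself differs between them.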
