Datasets created with push_to_hub can't be accessed in offline mode #3547

@TevenLeScao

Describe the bug

In offline mode, previously cached datasets can normally still be loaded. This fails for datasets created with push_to_hub.

Steps to reproduce the bug

In Python (online, to populate the cache):

import datasets
mpwiki = datasets.load_dataset("teven/matched_passages_wikidata")

In bash:

export HF_DATASETS_OFFLINE=1

In Python (new session, with the variable set):

import datasets
mpwiki = datasets.load_dataset("teven/matched_passages_wikidata")
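
For convenience, the three steps above can be combined into one script. This is only a sketch: the helper names (offline_env, run_step, reproduce) are made up for illustration and are not part of datasets. Each step runs in a fresh interpreter to mirror the separate sessions above, on the assumption that HF_DATASETS_OFFLINE is read when datasets is imported.

```python
import os
import subprocess
import sys

# Dataset name taken from the report above.
DATASET = "teven/matched_passages_wikidata"

# One-line program that both steps run in a fresh interpreter.
LOAD_SNIPPET = f"import datasets; datasets.load_dataset({DATASET!r})"


def offline_env(base=None):
    """Return a copy of the environment with offline mode switched on."""
    env = dict(os.environ if base is None else base)
    env["HF_DATASETS_OFFLINE"] = "1"
    return env


def run_step(env=None):
    """Run the load in a new Python process (hypothetical helper)."""
    return subprocess.run(
        [sys.executable, "-c", LOAD_SNIPPET],
        env=env,
        capture_output=True,
        text=True,
    )


def reproduce():
    """Step 1 fills the cache (needs network); step 2 repeats the call offline."""
    online = run_step()
    offline = run_step(offline_env())
    # Print the last line of stderr from the offline run, if any.
    print(offline.stderr.strip().splitlines()[-1:] or "offline load succeeded")
    return online.returncode, offline.returncode
```

Calling reproduce() with network access should return a non-zero second exit code while the bug is present, and (0, 0) once cached push_to_hub datasets load offline.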

Expected results

datasets should find the previously-cached dataset.

Actual results

ConnectionError: Couln't reach the Hugging Face Hub for dataset 'teven/matched_passages_wikidata': Offline mode is enabled

Environment info

  • datasets version: 1.16.2.dev0
  • Platform: Linux-4.18.0-193.70.1.el8_2.x86_64-x86_64-with-glibc2.17
  • Python version: 3.8.10
  • PyArrow version: 3.0.0

Labels: bug