Load dataset with non-existent file #6183

@freQuensy23-coder

Describe the bug

When loading a dataset with datasets and passing a wrong path to the JSON data file, the error message says nothing about a "wrong path" or "file does not exist". Instead, it is:
SchemaInferenceError: Please pass `features` or at least one example when writing data

Steps to reproduce the bug

from datasets import load_dataset
load_dataset('json', data_files='/home/alexey/unreal_file.json')

Expected behavior

Raise an OS-level FileNotFoundError or a custom error with an informative message.
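Until the library reports this itself, a possible user-side workaround is to validate the paths before calling load_dataset. The sketch below is only an illustration: the helper name `ensure_local_files_exist` is hypothetical and not part of datasets, and it assumes `data_files` is a plain local path or a list of such paths (no glob patterns, URLs, or split dictionaries).

```python
import os
from typing import List, Sequence, Union


def ensure_local_files_exist(data_files: Union[str, Sequence[str]]) -> List[str]:
    """Return data_files as a list, raising FileNotFoundError if any path is missing.

    Hypothetical helper (not part of the datasets library). It only handles
    plain local file paths, not globs, URLs, or per-split dictionaries.
    """
    # Normalize a single path into a one-element list.
    paths = [data_files] if isinstance(data_files, str) else list(data_files)
    # Collect every path that does not point to an existing regular file.
    missing = [p for p in paths if not os.path.isfile(p)]
    if missing:
        raise FileNotFoundError(f"Data file(s) not found: {', '.join(missing)}")
    return paths
```

With this in place, `load_dataset('json', data_files=ensure_local_files_exist('/home/alexey/unreal_file.json'))` would fail early with a FileNotFoundError that names the missing path, rather than reaching the schema inference step.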

Environment info

# packages in environment at /home/alexey/.conda/envs/alex_LoRA:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             5.1                       1_gnu  
accelerate                0.21.0                   pypi_0    pypi
aiohttp                   3.8.5                    pypi_0    pypi
aiosignal                 1.3.1                    pypi_0    pypi
antlr4-python3-runtime    4.9.3                    pypi_0    pypi
appdirs                   1.4.4                    pypi_0    pypi
asttokens                 2.0.5              pyhd3eb1b0_0  
async-timeout             4.0.3                    pypi_0    pypi
attrs                     23.1.0                   pypi_0    pypi
backcall                  0.2.0              pyhd3eb1b0_0  
bitsandbytes              0.41.1                   pypi_0    pypi
bzip2                     1.0.8                h7b6447c_0  
ca-certificates           2023.05.30           h06a4308_0  
certifi                   2023.7.22                pypi_0    pypi
charset-normalizer        3.2.0                    pypi_0    pypi
click                     8.1.6                    pypi_0    pypi
cmake                     3.27.2                   pypi_0    pypi
comm                      0.1.2           py310h06a4308_0  
contourpy                 1.1.0                    pypi_0    pypi
cycler                    0.11.0                   pypi_0    pypi
datasets                  2.14.4                   pypi_0    pypi
debugpy                   1.6.7           py310h6a678d5_0  
decorator                 5.1.1              pyhd3eb1b0_0  
dill                      0.3.7                    pypi_0    pypi
docker-pycreds            0.4.0                    pypi_0    pypi
executing                 0.8.3              pyhd3eb1b0_0  
filelock                  3.12.2                   pypi_0    pypi
fire                      0.5.0                    pypi_0    pypi
fonttools                 4.42.0                   pypi_0    pypi
frozenlist                1.4.0                    pypi_0    pypi
fsspec                    2023.6.0                 pypi_0    pypi
gitdb                     4.0.10                   pypi_0    pypi
gitpython                 3.1.32                   pypi_0    pypi
huggingface-hub           0.16.4                   pypi_0    pypi
idna                      3.4                      pypi_0    pypi
ipykernel                 6.25.0          py310h2f386ee_0  
ipython                   8.12.2          py310h06a4308_0  
ipython-genutils          0.2.0                    pypi_0    pypi
ipywidgets                8.0.4           py310h06a4308_0  
jedi                      0.18.1          py310h06a4308_1  
jinja2                    3.1.2                    pypi_0    pypi
jsonschema                4.19.0                   pypi_0    pypi
jsonschema-specifications 2023.7.1                 pypi_0    pypi
jupyter_client            8.1.0           py310h06a4308_0  
jupyter_core              5.3.0           py310h06a4308_0  
jupyterlab_widgets        3.0.5           py310h06a4308_0  
kiwisolver                1.4.4                    pypi_0    pypi
ld_impl_linux-64          2.38                 h1181459_1  
libffi                    3.3                  he6710b0_2  
libgcc-ng                 11.2.0               h1234567_1  
libgomp                   11.2.0               h1234567_1  
libsodium                 1.0.18               h7b6447c_0  
libstdcxx-ng              11.2.0               h1234567_1  
libuuid                   1.41.5               h5eee18b_0  
lightning-utilities       0.9.0                    pypi_0    pypi
lit                       16.0.6                   pypi_0    pypi
markupsafe                2.1.3                    pypi_0    pypi
matplotlib                3.7.2                    pypi_0    pypi
matplotlib-inline         0.1.6           py310h06a4308_0  
mpmath                    1.3.0                    pypi_0    pypi
multidict                 6.0.4                    pypi_0    pypi
multiprocess              0.70.15                  pypi_0    pypi
nbformat                  4.2.0                    pypi_0    pypi
ncurses                   6.4                  h6a678d5_0  
nest-asyncio              1.5.6           py310h06a4308_0  
networkx                  3.1                      pypi_0    pypi
numpy                     1.25.2                   pypi_0    pypi
nvidia-cublas-cu11        11.10.3.66               pypi_0    pypi
nvidia-cuda-cupti-cu11    11.7.101                 pypi_0    pypi
nvidia-cuda-nvrtc-cu11    11.7.99                  pypi_0    pypi
nvidia-cuda-runtime-cu11  11.7.99                  pypi_0    pypi
nvidia-cudnn-cu11         8.5.0.96                 pypi_0    pypi
nvidia-cufft-cu11         10.9.0.58                pypi_0    pypi
nvidia-curand-cu11        10.2.10.91               pypi_0    pypi
nvidia-cusolver-cu11      11.4.0.1                 pypi_0    pypi
nvidia-cusparse-cu11      11.7.4.91                pypi_0    pypi
nvidia-nccl-cu11          2.14.3                   pypi_0    pypi
nvidia-nvtx-cu11          11.7.91                  pypi_0    pypi
omegaconf                 2.3.0                    pypi_0    pypi
openssl                   1.1.1v               h7f8727e_0  
packaging                 23.0            py310h06a4308_0  
pandas                    2.0.3                    pypi_0    pypi
parso                     0.8.3              pyhd3eb1b0_0  
pathtools                 0.1.2                    pypi_0    pypi
peft                      0.4.0                    pypi_0    pypi
pexpect                   4.8.0              pyhd3eb1b0_3  
pickleshare               0.7.5           pyhd3eb1b0_1003  
pillow                    10.0.0                   pypi_0    pypi
pip                       23.2.1          py310h06a4308_0  
platformdirs              2.5.2           py310h06a4308_0  
plotly                    5.16.1                   pypi_0    pypi
prompt-toolkit            3.0.36          py310h06a4308_0  
protobuf                  4.24.0                   pypi_0    pypi
psutil                    5.9.0           py310h5eee18b_0  
ptyprocess                0.7.0              pyhd3eb1b0_2  
pure_eval                 0.2.2              pyhd3eb1b0_0  
pyarrow                   12.0.1                   pypi_0    pypi
pygments                  2.15.1          py310h06a4308_1  
pyparsing                 3.0.9                    pypi_0    pypi
python                    3.10.0               h12debd9_5  
python-dateutil           2.8.2              pyhd3eb1b0_0  
pytorch-lightning         2.0.6                    pypi_0    pypi
pytz                      2023.3                   pypi_0    pypi
pyyaml                    6.0.1                    pypi_0    pypi
pyzmq                     25.1.0          py310h6a678d5_0  
readline                  8.2                  h5eee18b_0  
referencing               0.30.2                   pypi_0    pypi
regex                     2023.8.8                 pypi_0    pypi
requests                  2.31.0                   pypi_0    pypi
rpds-py                   0.9.2                    pypi_0    pypi
safetensors               0.3.2                    pypi_0    pypi
scipy                     1.11.1                   pypi_0    pypi
sentencepiece             0.1.99                   pypi_0    pypi
sentry-sdk                1.29.2                   pypi_0    pypi
setproctitle              1.3.2                    pypi_0    pypi
setuptools                68.0.0          py310h06a4308_0  
six                       1.16.0             pyhd3eb1b0_1  
smmap                     5.0.0                    pypi_0    pypi
sqlite                    3.41.2               h5eee18b_0  
stack_data                0.2.0              pyhd3eb1b0_0  
sympy                     1.12                     pypi_0    pypi
tenacity                  8.2.3                    pypi_0    pypi
termcolor                 2.3.0                    pypi_0    pypi
tk                        8.6.12               h1ccaba5_0  
tokenizers                0.13.3                   pypi_0    pypi
torch                     2.0.1                    pypi_0    pypi
torchmetrics              1.0.3                    pypi_0    pypi
tornado                   6.3.2           py310h5eee18b_0  
tqdm                      4.66.1                   pypi_0    pypi
traitlets                 5.7.1           py310h06a4308_0  
transformers              4.31.0                   pypi_0    pypi
triton                    2.0.0                    pypi_0    pypi
typing-extensions         4.7.1                    pypi_0    pypi
tzdata                    2023.3                   pypi_0    pypi
urllib3                   2.0.4                    pypi_0    pypi
wandb                     0.15.8                   pypi_0    pypi
wcwidth                   0.2.5              pyhd3eb1b0_0  
wheel                     0.38.4          py310h06a4308_0  
widgetsnbextension        4.0.5           py310h06a4308_0  
xxhash                    3.3.0                    pypi_0    pypi
xz                        5.4.2                h5eee18b_0  
yarl                      1.9.2                    pypi_0    pypi
zeromq                    4.3.4                h2531618_0  
zlib                      1.2.13               h5eee18b_0  
    active environment : None
       user config file : /home/alexey/.condarc
 populated config files : 
          conda version : 23.1.0
    conda-build version : 3.22.0
         python version : 3.9.13.final.0
       virtual packages : __archspec=1=x86_64
                          __cuda=12.0=0
                          __glibc=2.35=0
                          __linux=5.19.0=0
                          __unix=0=0
       base environment : /opt/anaconda/anaconda3  (read only)
      conda av data dir : /opt/anaconda/anaconda3/etc/conda
  conda av metadata url : None
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /opt/anaconda/anaconda3/pkgs
                          /home/alexey/.conda/pkgs
       envs directories : /home/alexey/.conda/envs
                          /opt/anaconda/anaconda3/envs
               platform : linux-64
             user-agent : conda/23.1.0 requests/2.31.0 CPython/3.9.13 Linux/5.19.0-46-generic ubuntu/22.04.2 glibc/2.35
                UID:GID : 1009:1009
             netrc file : /home/alexey/.netrc
           offline mode : False
