@albertvillanova
Member

@albertvillanova albertvillanova commented Jul 7, 2022

Currently, when _resolve_single_pattern_locally is called from a different drive than the one in pattern, os.path.relpath raises a ValueError:

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\datasets\io\parquet.py:35: in __init__
    **kwargs,
C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\datasets\builder.py:287: in __init__
    sanitize_patterns(data_files), base_path=base_path, use_auth_token=use_auth_token
C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\datasets\data_files.py:761: in from_local_or_remote
    if not isinstance(patterns_for_key, DataFilesList)
C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\datasets\data_files.py:723: in from_local_or_remote
    data_files = resolve_patterns_locally_or_by_urls(base_path, patterns, allowed_extensions)
C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\datasets\data_files.py:321: in resolve_patterns_locally_or_by_urls
    for path in _resolve_single_pattern_locally(base_path, pattern, allowed_extensions):
C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\datasets\data_files.py:239: in _resolve_single_pattern_locally
    for filepath in glob_iter
C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\datasets\data_files.py:242: in <listcomp>
    os.path.relpath(filepath, base_path), os.path.relpath(pattern, base_path)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = 'C:\\Users\\runneradmin\\AppData\\Local\\Temp\\pytest-of-runneradmin\\pytest-0\\popen-gw0\\data6\\dataset.parquet'
start = '/'

...

E                   ValueError: path is on mount 'C:', start on mount 'D:'
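
For illustration, the same kind of error can be reproduced on any OS with `ntpath`, the Windows flavor of `os.path`. This is only a minimal sketch: the shortened path and the helper name `relpath_or_error` are hypothetical and are not part of `datasets`.

```python
import ntpath  # Windows path semantics, importable on any platform


def relpath_or_error(path, start):
    """Hypothetical helper: return the relative path, or the error text if it fails."""
    try:
        return ntpath.relpath(path, start)
    except ValueError as err:
        # Windows cannot express a relative path between two different drives.
        return f"ValueError: {err}"


# Shortened, made-up path standing in for the pytest temp file in the traceback.
path = "C:\\Users\\runneradmin\\data\\dataset.parquet"

same_drive = relpath_or_error(path, "C:\\")    # start on the same drive: succeeds
other_drive = relpath_or_error(path, "D:\\")   # start on another drive: fails
print(same_drive)
print(other_drive)
```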

This PR makes sure that base_path is on the same drive as pattern.
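
One way to obtain a base path on the same drive as the pattern is to rebuild it from the pattern's own drive. The sketch below is an assumption about how this could be done, not necessarily the exact change merged here; `base_path_on_pattern_drive` and the shortened path are hypothetical.

```python
import ntpath  # Windows path semantics, importable on any platform


def base_path_on_pattern_drive(pattern):
    """Hypothetical helper: return the root of the drive that `pattern` lives on."""
    drive, _ = ntpath.splitdrive(pattern)  # e.g. ("C:", "\\Users\\...")
    return drive + ntpath.sep


# Shortened, made-up absolute pattern on drive C:.
pattern = "C:\\Users\\runneradmin\\data\\dataset.parquet"

base_path = base_path_on_pattern_drive(pattern)
# Both arguments now share a drive, so relpath no longer raises.
relative = ntpath.relpath(pattern, base_path)
print(base_path, relative)
```

Because the base path is derived from the pattern itself, the result does not depend on which drive the current working directory is on.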

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jul 7, 2022

The documentation is not available anymore as the PR was closed or merged.

@albertvillanova albertvillanova changed the title from "Fix _resolve_single_pattern_locally for Windows" to "Fix _resolve_single_pattern_locally on Windows with multiple drives" Jul 7, 2022
@lhoestq
Member

lhoestq commented Jul 7, 2022

Good catch! Sorry, I forgot (again) about Windows paths when writing this x)

@albertvillanova albertvillanova marked this pull request as ready for review July 7, 2022 16:47
@albertvillanova albertvillanova merged commit 9d49dd7 into huggingface:main Jul 7, 2022
@albertvillanova albertvillanova deleted the fix-windows-data-files branch July 7, 2022 16:52