Skip to content

Conversation

@mariosasko
Copy link
Collaborator

This PR removes a check that's been added in #2684. My intention with this check was to capture an URL in the error message, but instead, it captures a substring of the previous regex match in the test function. Another option would be to replace this check with:

m_paths = re.findall(r"\S*_dummy/_dummy.py\b", str(exc_info.value))  # on Linux this will match an URL as well as a local_path due to different os.sep, so take the last element (an URL always comes last in the list)
assert len(m_paths) > 0 and is_remote_url(m_paths[-1])  # is_remote_url comes from datasets.utils.file_utils

@lhoestq Let me know which one of these two approaches (delete or replace) do you prefer?

@lhoestq
Copy link
Member

lhoestq commented Jul 28, 2021

Hi ! I did a change for this test already in #2662 :

datasets/tests/test_load.py

Lines 312 to 316 in 00686c4

with pytest.raises(FileNotFoundError) as exc_info:
datasets.load_dataset(SAMPLE_DATASET_NAME_THAT_DOESNT_EXIST)
m_combined_path = re.search(fr"http\S*{re.escape(os.path.join(SAMPLE_DATASET_NAME_THAT_DOESNT_EXIST, SAMPLE_DATASET_NAME_THAT_DOESNT_EXIST + '.py'))}\b", str(exc_info.value))
assert m_combined_path is not None and is_remote_url(m_combined_path.group())
assert os.path.abspath(SAMPLE_DATASET_NAME_THAT_DOESNT_EXIST) in str(exc_info.value)

(though I have to change the variable name m_combined_path to m_url or something)

I guess it's ok to remove this check for now :)

@lhoestq lhoestq merged commit 35b5e4b into huggingface:master Jul 28, 2021
@mariosasko mariosasko deleted the fix-test-load-check branch July 28, 2021 09:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants