Closed
Description
Describe the bug
Hello, when I ran the code snippet from the documentation, I got the following error:
Python 3.10.9 (main, Mar 1 2023, 18:23:06) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from datasets import load_dataset
>>> base_url = "https://rajpurkar.github.io/SQuAD-explorer/dataset/"
>>> dataset = load_dataset("json", data_files={"train": base_url + "train-v1.1.json", "validation": base_url + "dev-v1.1.json"}, field="data")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/liushuai/anaconda3/lib/python3.10/site-packages/datasets/load.py", line 2112, in load_dataset
    builder_instance = load_dataset_builder(
  File "/home/liushuai/anaconda3/lib/python3.10/site-packages/datasets/load.py", line 1798, in load_dataset_builder
    dataset_module = dataset_module_factory(
  File "/home/liushuai/anaconda3/lib/python3.10/site-packages/datasets/load.py", line 1413, in dataset_module_factory
    ).get_module()
  File "/home/liushuai/anaconda3/lib/python3.10/site-packages/datasets/load.py", line 949, in get_module
    data_files = DataFilesDict.from_patterns(
  File "/home/liushuai/anaconda3/lib/python3.10/site-packages/datasets/data_files.py", line 672, in from_patterns
    DataFilesList.from_patterns(
  File "/home/liushuai/anaconda3/lib/python3.10/site-packages/datasets/data_files.py", line 578, in from_patterns
    resolve_pattern(
  File "/home/liushuai/anaconda3/lib/python3.10/site-packages/datasets/data_files.py", line 340, in resolve_pattern
    for filepath, info in fs.glob(pattern, detail=True).items()
  File "/home/liushuai/anaconda3/lib/python3.10/site-packages/fsspec/asyn.py", line 113, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/home/liushuai/anaconda3/lib/python3.10/site-packages/fsspec/asyn.py", line 98, in sync
    raise return_result
  File "/home/liushuai/anaconda3/lib/python3.10/site-packages/fsspec/asyn.py", line 53, in _runner
    result[0] = await coro
  File "/home/liushuai/anaconda3/lib/python3.10/site-packages/fsspec/implementations/http.py", line 449, in _glob
    elif await self._exists(path):
  File "/home/liushuai/anaconda3/lib/python3.10/site-packages/fsspec/implementations/http.py", line 306, in _exists
    r = await session.get(self.encode_url(path), **kw)
  File "/home/liushuai/anaconda3/lib/python3.10/site-packages/aiohttp/client.py", line 922, in get
    self._request(hdrs.METH_GET, url, allow_redirects=allow_redirects, **kwargs)
TypeError: ClientSession._request() got an unexpected keyword argument 'https'
Steps to reproduce the bug
from datasets import load_dataset
base_url = "https://rajpurkar.github.io/SQuAD-explorer/dataset/"
dataset = load_dataset("json", data_files={"train": base_url + "train-v1.1.json", "validation": base_url + "dev-v1.1.json"}, field="data")
Expected behavior
The dataset should load normally, without raising an error.
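A possible workaround, which I have not verified in this environment: download the two JSON files with the standard library first, then pass local paths to `load_dataset`. This assumes the failure is confined to the remote-URL path through `fsspec` shown in the traceback. The helper names `download_files` and `load_squad_from_disk`, and the `squad_v1.1` directory, are hypothetical and not part of `datasets`.

```python
import urllib.request
from pathlib import Path

BASE_URL = "https://rajpurkar.github.io/SQuAD-explorer/dataset/"
FILES = {"train": "train-v1.1.json", "validation": "dev-v1.1.json"}


def download_files(base_url, files, target_dir):
    """Download each file into target_dir and return {split: local path}."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    local_paths = {}
    for split, filename in files.items():
        destination = target / filename
        # Skip files that are already present from an earlier run.
        if not destination.exists():
            urllib.request.urlretrieve(base_url + filename, destination)
        local_paths[split] = str(destination)
    return local_paths


def load_squad_from_disk(target_dir="squad_v1.1"):
    """Hypothetical helper: same load_dataset call as above, but with local paths."""
    # Imported here so the download helper can be used without datasets installed.
    from datasets import load_dataset

    data_files = download_files(BASE_URL, FILES, target_dir)
    return load_dataset("json", data_files=data_files, field="data")
```

Calling `load_squad_from_disk()` should then take the place of the failing call, since the `data_files` values are ordinary local paths rather than URLs.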
Environment info
- datasets version: 2.14.4
- Platform: Linux-5.4.54-2-x86_64-with-glibc2.27
- Python version: 3.10.9
- Huggingface_hub version: 0.16.4
- PyArrow version: 12.0.1
- Pandas version: 1.5.3
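The traceback ends inside `fsspec` and `aiohttp`, whose versions are not in the list above. A minimal sketch for collecting them with the standard library's `importlib.metadata`; the package tuple and the `installed_versions` helper are my own choices for illustration, not something `datasets` provides.

```python
from importlib import metadata

# Packages that appear in the traceback or in the environment list.
PACKAGES = ("datasets", "fsspec", "aiohttp", "huggingface_hub", "pyarrow", "pandas")


def installed_versions(packages=PACKAGES):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"- {name} version: {version or 'not installed'}")
```

The output uses the same `- name version: x.y.z` form as the list above, so it can be pasted into the report directly.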