🐛 Describe the bug
While attempting to download the MNIST dataset using torchvision.datasets.MNIST, I encountered an error that prevents the dataset from downloading successfully. The error indicates an issue with accessing one of the download URLs.
```python
import torchvision.datasets as datasets
from torch.utils.data import DataLoader

val_ds = datasets.MNIST(root='.', train=False, download=True)
val_dl = DataLoader(val_ds, batch_size=128, shuffle=True)
```
Expected Behavior
The MNIST dataset should be downloaded successfully without encountering any HTTP errors.
Actual Behavior
The download fails with a 403 Forbidden error when attempting to access http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz.
Observations
- Unencrypted HTTP Resource: The download is attempting to access a resource over HTTP instead of HTTPS, which may not be secure.
- 403 Forbidden Error: The server is returning a 403 Forbidden error, indicating that access to the resource is not allowed.
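To check whether the 403 comes from the server rather than from torchvision, the URL from the error above can be requested directly. The following is only a sketch using the standard library: the helper names `probe` and `probe_all` are hypothetical, and the use of a `HEAD` request is an assumption (a server may answer `HEAD` differently from the `GET` that torchvision issues).

```python
import urllib.error
import urllib.request


def probe(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for url, or a short error description.

    `opener` is injectable so the function can be exercised offline.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # HTTPError must be handled before the broader OSError:
        # it carries the status code (e.g. 403) that the server sent.
        return error.code
    except OSError as error:
        # Covers URLError (DNS failure, refused connection) and timeouts.
        return f"unreachable: {error}"


def probe_all(urls, opener=urllib.request.urlopen):
    """Map each URL to the result of probe()."""
    return {url: probe(url, opener=opener) for url in urls}
```

Calling `probe_all(["http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz"])` would be expected to report `403` for that URL if the server still refuses the request.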
This has been the case for some time, so I suggest updating the list of mirrors in https://github.com/pytorch/vision/blob/main/torchvision/datasets/mnist.py so that it no longer points to an insecure, broken endpoint.
    mirrors = [
        "http://yann.lecun.com/exdb/mnist/",
        "https://ossci-datasets.s3.amazonaws.com/mnist/",
    ]
Notably, trying to download the same files directly from @ylecun's page, https://yann.lecun.com/exdb/mnist/index.html, fails with the same error.
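As a possible stopgap until the mirror list is updated, the HTTP entry could be filtered out on the user side. The sketch below shows only the filtering step on the two mirrors quoted above; `https_only` is a hypothetical helper, not part of torchvision.

```python
from urllib.parse import urlsplit


def https_only(mirrors):
    """Return the mirrors whose URL scheme is HTTPS, preserving order."""
    return [m for m in mirrors if urlsplit(m).scheme == "https"]


# The mirror list quoted above from torchvision/datasets/mnist.py.
mirrors = [
    "http://yann.lecun.com/exdb/mnist/",
    "https://ossci-datasets.s3.amazonaws.com/mnist/",
]

safe_mirrors = https_only(mirrors)
print(safe_mirrors)
```

Assuming `mirrors` is a class attribute that is read when the download starts, assigning `datasets.MNIST.mirrors = safe_mirrors` before constructing the dataset might skip the broken endpoint. That is an untested assumption about torchvision internals, not a documented API.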
Versions
PyTorch version: 2.3.1+cu118
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A
OS: Microsoft Windows 11 Pro
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A
Python version: 3.12.4 (tags/v3.12.4:8e8a4ba, Jun 6 2024, 19:30:16) [MSC v.1940 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-11-10.0.22631-SP0
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3070
Nvidia driver version: 556.12
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture=9
CurrentClockSpeed=3696
DeviceID=CPU0
Family=207
L2CacheSize=2560
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=3696
Name=Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz
ProcessorType=3
Revision=
Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.3.1+cu118
[pip3] torchinfo==1.8.0
[pip3] torchvision==0.18.1+cu118
[pip3] torchviz==0.0.2
[conda] Could not collect