[Feedback] (the dataset download link gets 403 error) docs/components/training/user-guides/pytorch.md | #3927

Open
itaynvn-runai opened this issue Nov 20, 2024 · 1 comment

itaynvn-runai commented Nov 20, 2024

issue:

following this guide:
https://www.kubeflow.org/docs/components/training/user-guides/pytorch/

which is using this image:

gcr.io/kubeflow-ci/pytorch-dist-mnist_test:1.0

that attempts to download this file:

http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz

but as of today, requesting this link returns an HTTP 403 status.

here you can see the proper output for this image:
https://developer-qa.nvidia.com/blog/gpu-containers-runtime/#:~:text=Try%20running%20the%20MNIST%20training%20example%20included%20with%20the%20container%3A

suggestions:

  1. use links from this mirror instead, which is hosted on github and will probably be more reliable:
https://github.com/fgnt/mnist
  2. allow these file links to be provided via env vars, to avoid hardcoding links that might go dead at some point.
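
a rough sketch of what suggestion 2 could look like, in case it helps. this is not the script used in the image (i couldn't find that one); the env var name `MNIST_BASE_URLS` and the helper names are made up for illustration. the only real URL here is the default, which is the one from this issue that currently returns 403:

```python
import os
import urllib.error
import urllib.request

# Default taken from this issue; it is the URL that currently returns 403.
DEFAULT_BASE_URL = "http://yann.lecun.com/exdb/mnist/"
# Hypothetical variable name; the existing image does not read it.
ENV_VAR = "MNIST_BASE_URLS"


def candidate_urls(filename, environ=os.environ):
    """Return download URLs for filename, env-provided mirrors first."""
    raw = environ.get(ENV_VAR, "")
    # Comma-separated list of base URLs; blank entries are ignored.
    bases = [b.strip() for b in raw.split(",") if b.strip()]
    # The hardcoded default is kept as the last fallback.
    bases.append(DEFAULT_BASE_URL)
    return [b.rstrip("/") + "/" + filename for b in bases]


def download(filename, dest, environ=os.environ):
    """Try each candidate URL in order; raise if all of them fail."""
    errors = []
    for url in candidate_urls(filename, environ):
        try:
            urllib.request.urlretrieve(url, dest)
            return url
        except (urllib.error.URLError, OSError) as exc:
            # HTTP errors such as 403 land here; move on to the next mirror.
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all mirrors failed:\n" + "\n".join(errors))
```

with something like this, a dead link could be worked around by setting the variable in the job spec instead of rebuilding the image.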

notes:
i assume this link is hardcoded in a script that the dockerfile copies into this image at build time.
i found several references to this link across the kubeflow github:
https://github.com/search?q=org%3Akubeflow%20%22train-images-idx3-ubyte.gz%22&type=code
but couldn't trace the dockerfile used to build this image, nor detect which of these scripts was used in it.

itaynvn-runai commented Nov 23, 2024

tested with this image: kubeflow/pytorch-dist-mnist:latest (latest tag, pushed at 22/11/2024)
https://hub.docker.com/r/kubeflow/pytorch-dist-mnist/tags

the links were switched to a public S3 bucket, and the download process completes:

Using distributed PyTorch with gloo backend
World Size: 2. Rank: 1
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to ../data/FashionMNIST/raw/train-images-idx3-ubyte.gz

  0%|          | 0/26421880 [00:00<?, ?it/s]
  0%|          | 65536/26421880 [00:00<01:12, 365219.76it/s]
  1%|          | 229376/26421880 [00:00<00:38, 685094.04it/s]
  3%|▎         | 917504/26421880 [00:00<00:09, 2610886.88it/s]
  7%|▋         | 1933312/26421880 [00:00<00:05, 4111033.66it/s]
 26%|██▌       | 6848512/26421880 [00:00<00:01, 16200010.18it/s]
 38%|███▊      | 10059776/26421880 [00:00<00:00, 20608644.80it/s]
 47%|████▋     | 12517376/26421880 [00:01<00:00, 17876773.56it/s]
 64%|██████▍   | 16973824/26421880 [00:01<00:00, 24547329.01it/s]
 84%|████████▍ | 22315008/26421880 [00:01<00:00, 26412748.88it/s]
 98%|█████████▊| 25985024/26421880 [00:01<00:00, 24075278.44it/s]
100%|██████████| 26421880/26421880 [00:01<00:00, 16889476.36it/s]
Extracting ../data/FashionMNIST/raw/train-images-idx3-ubyte.gz to ../data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to ../data/FashionMNIST/raw/train-labels-idx1-ubyte.gz

  0%|          | 0/29515 [00:00<?, ?it/s]
100%|██████████| 29515/29515 [00:00<00:00, 325193.23it/s]
Extracting ../data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to ../data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to ../data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz

  0%|          | 0/4422102 [00:00<?, ?it/s]
  1%|▏         | 65536/4422102 [00:00<00:12, 361558.72it/s]
  5%|▌         | 229376/4422102 [00:00<00:06, 681986.84it/s]
 21%|██        | 917504/4422102 [00:00<00:01, 2593771.27it/s]
 44%|████▎     | 1933312/4422102 [00:00<00:00, 4090096.69it/s]
100%|██████████| 4422102/4422102 [00:00<00:00, 6085832.68it/s]
Extracting ../data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to ../data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to ../data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz

FYI this new image should replace these 2 old images, which are currently used in a lot of the examples across the repo:

gcr.io/kubeflow-ci/pytorch-dist-mnist_test:1.0 (latest tag, pushed at 07/03/2019)
https://console.cloud.google.com/gcr/images/kubeflow-ci/global/pytorch-dist-mnist_test

gcr.io/kubeflow-ci/pytorch-dist-mnist-test:v1.0 (latest tag, pushed at 03/03/2019)
https://console.cloud.google.com/gcr/images/kubeflow-ci/global/pytorch-dist-mnist-test

@itaynvn-runai itaynvn-runai reopened this Nov 23, 2024