Closed
2 changes: 1 addition & 1 deletion torchtrain/datasets/download_tokenizer.py
@@ -13,7 +13,7 @@ def hf_download(repo_id: Optional[str] = None, hf_token: Optional[str] = None) -
from huggingface_hub import hf_hub_download
os.makedirs(f"checkpoints/{repo_id}", exist_ok=True)
try:
- hf_hub_download(repo_id, "tokenizer.model", local_dir=f"torchtrain/datasets/tokenizer/", local_dir_use_symlinks=False, token=hf_token)
+ hf_hub_download(repo_id, "tokenizer.model", local_dir=f"./tokenizer/", local_dir_use_symlinks=False, token=hf_token)
Collaborator

Hmm, I don't think this would solve the problem. This is still a relative dir, but people run this script either from the repo root or from inside the datasets folder, so the files end up in different places. If you want to fix this, I'd prefer something like an absolute path as the default, and exposing a `--path` arg.
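One way the suggestion above could look, as a rough sketch rather than the repo's actual code: the default directory is anchored to the script's own location so it no longer depends on the current working directory, and a `--path` flag overrides it. The names `DEFAULT_TOKENIZER_DIR`, `resolve_tokenizer_dir`, and `parse_args` are hypothetical, and placing the default in a `tokenizer` folder next to the script is an assumption.

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence

# Assumption: the default lives in a "tokenizer" folder next to this script.
# Anchoring to __file__ makes the default independent of the working directory.
try:
    _SCRIPT_DIR = Path(__file__).resolve().parent
except NameError:  # e.g. pasted into an interactive session without __file__
    _SCRIPT_DIR = Path.cwd()

DEFAULT_TOKENIZER_DIR = _SCRIPT_DIR / "tokenizer"  # hypothetical name


def resolve_tokenizer_dir(path: Optional[str] = None) -> Path:
    """Return an absolute download directory, falling back to the default."""
    if path:
        # A user-supplied path is expanded and made absolute.
        return Path(path).expanduser().resolve()
    return DEFAULT_TOKENIZER_DIR


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the command line; --path is the hypothetical flag from the review."""
    parser = argparse.ArgumentParser(description="Download a tokenizer.")
    parser.add_argument(
        "--path",
        type=str,
        default=None,
        help="Directory to download into (defaults to a folder next to this script).",
    )
    return parser.parse_args(argv)
```

The resolved directory would then be passed as `local_dir` to `hf_hub_download`, for example `local_dir=str(resolve_tokenizer_dir(args.path))`, so the download lands in the same place whether the script is run from the repo root or from the datasets folder.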

except HTTPError as e:
if e.response.status_code == 401:
print("You need to pass a valid `--hf_token=...` to download private checkpoints.")