[Bug] Offline Training and Exporting Is broken

Reply to https://www.reddit.com/r/unsloth/comments/1u5v4qc/comment/ov489z5/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

Unsloth Studio in offline mode (no internet) can neither start training nor export.
The logs show that when a model is selected by repo name, Studio makes many HTTP requests to huggingface.co to determine whether the model is an embedding model and how large it is. It also tries to download config.json, tokenizer_config.json, adapter_config.json, and the files of the embedding model unsloth/bge-small-en-v1.5.
Every request fails with the DNS error [Errno 11001] getaddrinfo failed, the requests are retried, and training never starts.
The model is already downloaded to the local cache, but these checks ignore both the local files and the HF_HUB_OFFLINE environment variable.
Please either skip these preliminary checks (model type, size, and file presence) in offline mode, or have them use the local cache without going to the network.
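
To illustrate what I mean by "use the local cache", here is a minimal sketch of a file-presence check that never touches the network. It is not Studio's actual code: is_offline and find_cached_file are hypothetical names, and the sketch assumes the standard Hugging Face hub cache layout (models--<org>--<name>/snapshots/<revision>/<file>) and the truthy values huggingface_hub accepts for HF_HUB_OFFLINE.

```python
import os
from pathlib import Path
from typing import Optional

# Values huggingface_hub treats as "true" for boolean environment variables.
_TRUE_VALUES = {"1", "ON", "YES", "TRUE"}


def is_offline() -> bool:
    """Return True when HF_HUB_OFFLINE is set to a truthy value."""
    return os.environ.get("HF_HUB_OFFLINE", "").upper() in _TRUE_VALUES


def find_cached_file(repo_id: str, filename: str, cache_dir: Path) -> Optional[Path]:
    """Look for `filename` in any cached snapshot of `repo_id`, using the disk only."""
    # Hub cache layout: <cache_dir>/models--<org>--<name>/snapshots/<revision>/<filename>
    snapshots = cache_dir / ("models--" + repo_id.replace("/", "--")) / "snapshots"
    if not snapshots.is_dir():
        return None
    for revision in sorted(snapshots.iterdir()):
        candidate = revision / filename
        if candidate.exists():
            return candidate
    return None
```

A pre-flight check could call is_offline() first and, when it returns True, answer "is config.json present?" from find_cached_file(...) instead of sending a HEAD request.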

I asked Qwen 3.7 to make an accurate extract of the technical data strictly from my logs:

Exact cause of the Export crash ("tokenizer is weirdly not loaded"):
In the export subprocess logs (offline mode), right before the crash, it prints this exact warning:
Unsloth: Warning - VLM processor fallback returned None for model_type=gemma4
Then it crashes with:
RuntimeError: Unsloth: The tokenizer is weirdly not loaded? Please check if there is one.
The traceback shows the exact failure point is inside the transformers library making a hidden network call:
transformers/tokenization_utils_tokenizers.py line 1297 in _patch_mistral_regex calls is_base_mistral().
is_base_mistral() (line 1287) calls model_info(model_id).
This model_info() makes an HTTP GET request and throws httpx.ConnectError: [Errno 11001] getaddrinfo failed.
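
The kind of guard that would avoid this crash can be sketched as follows. This is only an illustration, not the transformers implementation: lookup_or_default is a hypothetical helper, and it assumes that a safe default exists for the metadata question being asked (for example, treating the model as "not a base Mistral model" when the answer cannot be fetched).

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def lookup_or_default(lookup: Callable[[], T], default: T, offline: bool) -> T:
    """Run a network-backed metadata lookup, falling back to `default` when it cannot succeed."""
    if offline:
        # Offline mode was requested explicitly: never attempt the request.
        return default
    try:
        return lookup()
    except Exception:
        # Broad on purpose: HTTP clients wrap connection failures in their own exception types.
        return default
```

With a guard like this, a failed lookup degrades to the default answer instead of aborting tokenizer loading.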

Exact files requested during Training pre-flight checks (Offline):
When trying to start training for llmfan46/gemma-4-E4B-it-ultra-uncensored-heretic offline, the backend hangs making HEAD requests for these exact files:
config.json, adapter_config.json, tokenizer_config.json, preprocessor_config.json, processor_config.json.
This generates these exact warnings in the log:
Could not determine if llmfan46/gemma-4-E4B-it-ultra-uncensored-heretic is embedding model
Could not check GGUF files for 'llmfan46/gemma-4-E4B-it-ultra-uncensored-heretic' after 3 attempts
Could not get model size for llmfan46/gemma-4-E4B-it-ultra-uncensored-heretic
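
Two of these three questions (model size and GGUF presence) could be answered from the files already on disk. The sketch below shows one way, under the assumption that the standard hub cache layout is in use; the function names are hypothetical, and picking the most recently modified snapshot is my own simplification rather than how the hub resolves revisions.

```python
from pathlib import Path
from typing import List, Optional


def _latest_snapshot(repo_id: str, cache_dir: Path) -> Optional[Path]:
    """Return the most recently modified cached snapshot directory, if any."""
    snapshots = cache_dir / ("models--" + repo_id.replace("/", "--")) / "snapshots"
    if not snapshots.is_dir():
        return None
    revisions = [p for p in snapshots.iterdir() if p.is_dir()]
    if not revisions:
        return None
    return max(revisions, key=lambda p: p.stat().st_mtime)


def cached_model_size_bytes(repo_id: str, cache_dir: Path) -> Optional[int]:
    """Sum the sizes of all files in the cached snapshot (stat follows symlinks to blobs)."""
    snapshot = _latest_snapshot(repo_id, cache_dir)
    if snapshot is None:
        return None
    return sum(f.stat().st_size for f in snapshot.rglob("*") if f.is_file())


def cached_gguf_files(repo_id: str, cache_dir: Path) -> List[str]:
    """List GGUF files in the cached snapshot, as paths relative to the snapshot root."""
    snapshot = _latest_snapshot(repo_id, cache_dir)
    if snapshot is None:
        return []
    return sorted(
        f.relative_to(snapshot).as_posix()
        for f in snapshot.rglob("*.gguf")
        if f.is_file()
    )
```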

Exact files requested by the background RAG thread (Offline):
The _warm_rag_embedder thread blocks the process trying to load unsloth/bge-small-en-v1.5 via SentenceTransformer. It makes HEAD requests for these exact files:
modules.json, config_sentence_transformers.json, README.md, sentence_bert_config.json, adapter_config.json.
It retries 5 times for each file, causing massive delays.
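
Retrying makes little sense when the failure is name resolution, since a missing network will not come back within a few seconds. A rough sketch of a retry loop that gives up immediately in that case is shown below. The helper names are hypothetical, and the sketch assumes the underlying socket.gaierror is reachable through the exception chain (via __cause__ or __context__) of whatever error the HTTP client raises.

```python
import socket
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def is_dns_failure(exc: Optional[BaseException]) -> bool:
    """Walk the exception chain looking for a name-resolution error."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        if isinstance(exc, socket.gaierror):
            return True
        seen.add(id(exc))
        exc = exc.__cause__ or exc.__context__
    return False


def retry_unless_dns_failure(fn: Callable[[], T], attempts: int = 5) -> T:
    """Retry `fn` up to `attempts` times, but give up at once if DNS cannot resolve."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:
            if is_dns_failure(exc):
                # No network: further attempts would only add delay.
                raise
            last_error = exc
    assert last_error is not None
    raise last_error
```
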

Important note about the local cache:
The model files are already fully downloaded to my local Hugging Face cache (C:\Users\User\.cache\huggingface\hub). However, when offline, both the Studio pre-flight checks and the transformers tokenizer initialization ignore that cache. They do not pass local_files_only=True and do not respect the offline environment variables, so they issue network requests (HEAD requests and model_info()) instead of reading the existing files from disk.



Logs: 
[Export to GGUF. The first 2 attempts are without internet, the third is with internet..txt](https://github.com/user-attachments/files/29602573/Export.to.GGUF.The.first.2.attempts.are.without.internet.the.third.is.with.internet.txt)
[Training with the internet.txt](https://github.com/user-attachments/files/29602574/Training.with.the.internet.txt)
[Training without the internet.txt](https://github.com/user-attachments/files/29602572/Training.without.the.internet.txt)


[Bug] Offline Training and Exporting Is broken #6817

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Uh oh!

[Bug] Offline Training and Exporting Is broken #6817

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions