Hey everyone, please update Unsloth to use the latest updates! 🦥
- Unsloth now has its own Docker image! Start training with no setup: Read our Guide • Docker image
- We collaborated with NVIDIA for Blackwell and DGX Spark support. Read our Blackwell guide and DGX guide.
  
New model updates
- Qwen3-VL models are all now supported: Blogpost • SFT 8B notebook • GRPO 8B notebook
- IBM Granite-4.0 models are now supported. Granite-4.0 guide • Notebook
- OpenAI showcased our new gpt-oss RL notebook for autonomously solving the 2048 game. Blogpost • Notebook
- Read about our GLM-4.6 chat template fixes and how to run the model here
New features
- Introducing Quantization-Aware Training: We collaborated with PyTorch for QAT, recovering as much as 70% of lost accuracy. Read blog
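Conceptually, QAT simulates low-precision rounding during the forward pass while keeping full-precision weights, so the model learns to tolerate quantization error. A minimal illustrative sketch of the fake-quantization step (plain Python, not Unsloth's or TorchAO's actual implementation):

```python
def fake_quantize(weights, num_bits=8):
    """Round weights to a low-precision integer grid, then return them as
    floats again ("fake" quantization). Training against these rounded
    values teaches the model to tolerate quantization error."""
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for int8, 7 for int4
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / qmax                    # symmetric per-tensor scale
    # quantize (snap to the integer grid), then dequantize back to float
    return [round(w / scale) * scale for w in weights]

weights = [0.31, -0.74, 0.05, 1.20]
quantized = fake_quantize(weights, num_bits=4)  # each value moves by at most scale/2
```

Real QAT applies this per layer inside the training loop with straight-through gradients; frameworks like TorchAO handle that bookkeeping.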
  
- Unsloth supports OpenEnv to allow for open RL environments. Blog coming soon • Notebook
- New customer support agent notebook to enable real-time analysis & solving of customer interactions. You'll also learn how to train models using data from Google Sheets.
- Python 3.13, PyTorch 2.9, and the latest Hugging Face TRL and transformers versions are now supported.
- Saving to TorchAO is supported as well:
```python
from torchao.quantization import Int4WeightOnlyConfig

model.save_pretrained_torchao("model", tokenizer, torchao_config = Int4WeightOnlyConfig())
```

Tip: Update Unsloth via `pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo`
If you want PyTorch 2.9: `pip install --upgrade unsloth unsloth_zoo`
RL Improvements
- Fixed Standby consuming more VRAM than usual. It now auto-selects 80% to 95% of maximum GPU utilization when `import os; os.environ["UNSLOTH_VLLM_STANDBY"] = "1"` is used.
- Fixed GRPO training hangs with better environment timers - works on DGX Spark and all other GPUs.
- Fixed GRPO `RuntimeError: shape '[1, 887, 1, 128]' is invalid for input of size 3633152` for all models
RL Environment functions
- New `execute_with_time_limit` function to force functions to execute within a time limit. E.g. with a 2 second time limit, use:
```python
from unsloth import execute_with_time_limit

@execute_with_time_limit(2)
def execute_strategy(strategy, game):
    return _execute_strategy(strategy, game)

try:
    execute_strategy(strategy, game)
except TimeoutError as e:
    print(f"Timed out with error = {str(e)}")
```
- To check if only Python standard modules are used in a function, use `check_python_modules`.
- Use `create_locked_down_function` to create a function without leakage of global variables.
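One way to prevent global-variable leakage (a sketch of the general technique, not necessarily Unsloth's implementation) is to rebuild the function with fresh globals containing only builtins:

```python
import builtins
import types

def lock_down(func):
    """Recreate `func` with fresh globals so it cannot read module-level
    variables; only builtins and its own locals/arguments remain visible."""
    sandbox_globals = {"__builtins__": builtins}
    return types.FunctionType(
        func.__code__, sandbox_globals, func.__name__,
        func.__defaults__, func.__closure__,
    )

SECRET = "leaky global"

def peek():
    return SECRET

locked_peek = lock_down(peek)
# peek() reads the module-level SECRET; locked_peek() raises NameError instead
```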
- Use `Benchmarker` (i.e. `from unsloth import Benchmarker`) to benchmark functions accurately. It approximately wipes the L1 to L3 caches to reduce the chance of benchmark cheating.
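Cache wiping matters because repeated runs of the same function hit warm CPU caches and look artificially fast. A rough standalone illustration of the idea (a 32 MB scrub buffer is an assumption here, not the actual `Benchmarker` internals):

```python
import time

# buffer at least as large as a typical L3 cache (~32 MB assumed here)
_scrub = bytearray(32 * 1024 * 1024)

def wipe_caches():
    """Write through the whole scrub buffer so previously cached data
    is (approximately) evicted before the next timed run."""
    for i in range(0, len(_scrub), 64):   # one write per 64-byte cache line
        _scrub[i] = i & 0xFF

def benchmark(func, repeats=5):
    """Return the best wall-clock time over `repeats` cold-ish runs."""
    timings = []
    for _ in range(repeats):
        wipe_caches()                     # start each run with cold caches
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)
```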
- Use `launch_openenv` to launch a continuously reloaded OpenEnv environment process (to stop it from shutting down), i.e. `from unsloth import launch_openenv`. It will automatically find an unused port.
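The usual way to auto-find an unused port is to bind to port 0 and let the OS pick one; a small standard-library sketch of that technique (not the actual `launch_openenv` internals):

```python
import socket

def find_free_port():
    """Bind to port 0 so the OS assigns an unused ephemeral port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

port = find_free_port()  # safe to hand to a server started immediately after
```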
Bug fixes
- GPT-OSS BF16: the router now works with `load_in_4bit = True`, fixing `AttributeError: 'GptOssTopKRouter' object has no attribute 'weight'`
- Mistral training fixed - the sentencepiece proto issue is resolved (any protobuf version now works)
- Fixed evaluation, i.e. `UNSLOTH_RETURN_LOGITS="1"` works. Fixes #3126 #3071
- Fixed `Output 0 of UnslothFusedLossBackward is a view and is being modified inplace.` for Gemma 3 and `transformers>=4.57.1`
- If you see `ImportError: cannot import name '_Ink' from 'PIL._typing' (/usr/local/lib/python3.12/dist-packages/PIL/_typing.py)`, please update and use our new notebooks
Don't forget to also join our Reddit: r/unsloth 🥰
What's Changed
- Fix loading as 8bit by @Etherll in #3384
- Nightly by @danielhanchen in #3392
- Nightly by @danielhanchen in #3394
- Update int8-int4 QAT config to use Int8DynamicActivationIntxWeightConfig by @metascroy in #3391
- Gemma 3 bug fixes by @danielhanchen in #3410
- Transformers Fix v4.57 rename from PretrainedConfig to PreTrainedConfig by @mmathew23 in #3445
- improve qat by @Etherll in #3446
- Fix eval metric issue by @pluesclues in #3420
- [Part2] Reinstate llama.cpp Compatibility and GGUF Conversion with Multiple Quantizations and Automated Ollama Modelfile Creation by @rolandtannous in #3356
- vLLM FP8 quantized support for SFT/GRPO by @Datta0 in #3414
- Fix by @danielhanchen in #3466
- AMD fixes by @danielhanchen in #3467
- Fix transformers 4.57.1 by @danielhanchen in #3473
- GRPO bug fixes by @danielhanchen in #3474
- EOL LF (unix line endings) normalization by @djsaunde in #3478
- Fix out of resources issue for llama3.2 sft on amd gpu by @wangxunx in #3455
- Bug fixes by @danielhanchen in #3483
- Bug fixes by @danielhanchen in #3484
- Patch sleep mode properly for trl by @Datta0 in #3492
- Sleep trl patch by @Datta0 in #3494
- fix cross entropy loss issue for small vocab size on amd gpu by @wangxunx in #3503
- Gemma 3n fix by @mmathew23 in #3499
- enable intel for torch2.8 by @leizhenyuan in #3381
- add code for intel qlora by @leizhenyuan in #3370
- fix for intel memory calculation by @leizhenyuan in #3513
- [intel] enable support 2.9 for intel xpu by @leizhenyuan in #3514
- FP8 training enhancements by @Datta0 in #3496
New Contributors
- @metascroy made their first contribution in #3391
- @djsaunde made their first contribution in #3478
- @wangxunx made their first contribution in #3455
Full Changelog: September-2025-v3...October-2025