Hey everyone, please update Unsloth to use the latest updates! 🦥
- Unsloth now has its own Docker image! Start training with no setup: Read our Guide • Docker image
- We collaborated with NVIDIA for Blackwell and DGX Spark support. Read our Blackwell guide and DGX guide.
  
New model updates
- Qwen3-VL models are all now supported: Blogpost • SFT 8B notebook • GRPO 8B notebook
- IBM Granite-4.0 models are now supported. Granite-4.0 guide • Notebook
- OpenAI showcased our new gpt-oss RL notebook for autonomously solving the 2048 game. Blogpost • Notebook
- Read about our GLM-4.6 chat template fixes and how to run the model here
New features
- Introducing Quantization-Aware Training: We collaborated with PyTorch for QAT, recovering as much as 70% of lost accuracy. Read blog
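Conceptually, QAT simulates low-precision rounding during the forward pass while keeping full-precision weights, so the model learns to tolerate quantization error. A minimal illustrative sketch of the fake-quantization step (plain Python, not Unsloth's or TorchAO's actual implementation):

```python
def fake_quantize(weights, num_bits=8):
    """Round weights to a low-precision integer grid, then return them as
    floats again ("fake" quantization). Training against these rounded
    values teaches the model to tolerate quantization error."""
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for int8, 7 for int4
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / qmax                    # symmetric per-tensor scale
    # quantize (snap to the integer grid), then dequantize back to float
    return [round(w / scale) * scale for w in weights]

weights = [0.31, -0.74, 0.05, 1.20]
quantized = fake_quantize(weights, num_bits=4)  # each value moves by at most scale/2
```

Real QAT applies this per layer inside the training loop with straight-through gradients; frameworks like TorchAO handle that bookkeeping.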
  
- Unsloth supports OpenEnv to allow for open RL environments. Blog coming soon • Notebook
- New customer support agent notebook to enable real-time analysis & solving of customer interactions. You'll also learn how to train models using data from Google Sheets.
- Python 3.13, PyTorch 2.9, and the latest Hugging Face TRL and transformers versions are now supported.
- Saving to TorchAO is supported as well:
```python
from torchao.quantization import Int4WeightOnlyConfig

model.save_pretrained_torchao("model", tokenizer, torchao_config = Int4WeightOnlyConfig())
```

Tip: Update Unsloth via `pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo`
If you want PyTorch 2.9: `pip install --upgrade unsloth unsloth_zoo`
RL Improvements
- Fixed Standby consuming more VRAM than usual. It now auto-selects 80% to 95% of maximum GPU utilization when `import os; os.environ["UNSLOTH_VLLM_STANDBY"] = "1"` is used.
- Fixed GRPO training hangs with better environment timers - works on DGX Spark and all other GPUs.
- Fixed GRPO `RuntimeError: shape '[1, 887, 1, 128]' is invalid for input of size 3633152` for all models
RL Environment functions
- New `execute_with_time_limit` function to force functions to execute within a time limit. E.g. with a 2 second time limit, use:
```python
from unsloth import execute_with_time_limit

@execute_with_time_limit(2)
def execute_strategy(strategy, game):
    return _execute_strategy(strategy, game)

try:
    execute_strategy(strategy, game)
except TimeoutError as e:
    print(f"Timed out with error = {str(e)}")
```
- To check if only Python standard modules are used in a function, use `check_python_modules`.
- Use `create_locked_down_function` to create a function without leakage of global variables.
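One way to prevent global-variable leakage (a sketch of the general technique, not necessarily Unsloth's implementation) is to rebuild the function with fresh globals containing only builtins:

```python
import builtins
import types

def lock_down(func):
    """Recreate `func` with fresh globals so it cannot read module-level
    variables; only builtins and its own locals/arguments remain visible."""
    sandbox_globals = {"__builtins__": builtins}
    return types.FunctionType(
        func.__code__, sandbox_globals, func.__name__,
        func.__defaults__, func.__closure__,
    )

SECRET = "leaky global"

def peek():
    return SECRET

locked_peek = lock_down(peek)
# peek() reads the module-level SECRET; locked_peek() raises NameError instead
```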
- Use `Benchmarker` (i.e. `from unsloth import Benchmarker`) to benchmark functions accurately. It approximately wipes the L1 to L3 caches to reduce the chance of benchmark cheating.
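Cache wiping matters because repeated runs of the same function hit warm CPU caches and look artificially fast. A rough standalone illustration of the idea (a 32 MB scrub buffer is an assumption here, not the actual `Benchmarker` internals):

```python
import time

# buffer at least as large as a typical L3 cache (~32 MB assumed here)
_scrub = bytearray(32 * 1024 * 1024)

def wipe_caches():
    """Write through the whole scrub buffer so previously cached data
    is (approximately) evicted before the next timed run."""
    for i in range(0, len(_scrub), 64):   # one write per 64-byte cache line
        _scrub[i] = i & 0xFF

def benchmark(func, repeats=5):
    """Return the best wall-clock time over `repeats` cold-ish runs."""
    timings = []
    for _ in range(repeats):
        wipe_caches()                     # start each run with cold caches
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)
```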
- Use `launch_openenv` to launch a continuously reloaded OpenEnv environment process (to stop it from shutting down), i.e. `from unsloth import launch_openenv`. It will automatically find an unused port.
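The usual way to auto-find an unused port is to bind to port 0 and let the OS pick one; a small standard-library sketch of that technique (not the actual `launch_openenv` internals):

```python
import socket

def find_free_port():
    """Bind to port 0 so the OS assigns an unused ephemeral port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

port = find_free_port()  # safe to hand to a server started immediately after
```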
Bug fixes
- GPT-OSS BF16: the router now works with `load_in_4bit = True`, fixing `AttributeError: 'GptOssTopKRouter' object has no attribute 'weight'`
- Mistral training fixed - the sentencepiece proto issue is resolved (any protobuf version now works)
- Fixed evaluation, i.e. `UNSLOTH_RETURN_LOGITS="1"` works. Fixes #3126 #3071
- Fixed `Output 0 of UnslothFusedLossBackward is a view and is being modified inplace.` for Gemma 3 and `transformers>=4.57.1`
- If you see `ImportError: cannot import name '_Ink' from 'PIL._typing' (/usr/local/lib/python3.12/dist-packages/PIL/_typing.py)`, please update and use our new notebooks
Don't forget to also join our Reddit: r/unsloth 🥰
What's Changed
- Fix loading as 8bit by @Etherll in #3384
- Nightly by @danielhanchen in #3392
- Nightly by @danielhanchen in #3394
- Update int8-int4 QAT config to use Int8DynamicActivationIntxWeightConfig by @metascroy in #3391
- Gemma 3 bug fixes by @danielhanchen in #3410
- Transformers Fix v4.57 rename from PretrainedConfig to PreTrainedConfig by @mmathew23 in #3445
- improve qat by @Etherll in #3446
- Fix eval metric issue by @pluesclues in #3420
- [Part2] Reinstate llama.cpp Compatibility and GGUF Conversion with Multiple Quantizations and Automated Ollama Modelfile Creation by @rolandtannous in #3356
- vLLM FP8 quantized support for SFT/GRPO by @Datta0 in #3414
- Fix by @danielhanchen in #3466
- AMD fixes by @danielhanchen in #3467
- Fix transformers 4.57.1 by @danielhanchen in #3473
- GRPO bug fixes by @danielhanchen in #3474
- EOL LF (unix line endings) normalization by @djsaunde in #3478
- Fix out of resources issue for llama3.2 sft on amd gpu by @wangxunx in #3455
- Bug fixes by @danielhanchen in #3483
- Bug fixes by @danielhanchen in #3484
- Patch sleep mode properly for trl by @Datta0 in #3492
- Sleep trl patch by @Datta0 in #3494
- fix cross entropy loss issue for small vocab size on amd gpu by @wangxunx in #3503
- Gemma 3n fix by @mmathew23 in #3499
- enable intel for torch2.8 by @leizhenyuan in #3381
- add code for intel qlora by @leizhenyuan in #3370
- fix for intel memory calculation by @leizhenyuan in #3513
- [intel] enable support 2.9 for intel xpu by @leizhenyuan in #3514
- FP8 training enhancements by @Datta0 in #3496
New Contributors
- @metascroy made their first contribution in #3391
- @djsaunde made their first contribution in #3478
- @wangxunx made their first contribution in #3455
Full Changelog: September-2025-v3...October-2025