No immediate action

v0.1.50-beta Mixed 6d

AMD support, MCP, bugfixes, UI improvements

Open

Upgrade now

v0.1.49-beta Mixed 11d

Breaking upgrade

Inkling, Studio UX, CLI MLX

Open

No immediate action

v0.1.48-beta Breaking risk 19d

Studio export, Llama-swap API, performance boosts

Open

Review required

v0.1.47-beta Breaking risk 1mo

Auth RBAC

GLM 5.2, longer context, Chat Canvas, Hub redesign, Secure mode

Open

No immediate action

v0.1.46-beta New feature 1mo

DiffusionGemma, Gemma 4 MTP, Hub, Chat‑with‑Files

Open

Upgrade now

v0.1.45-beta Maintenance 1mo

Breaking upgrade

Studio, CI, Installer, GPU, UI

Open

Upgrade now

v0.1.44-beta Mixed 1mo

Breaking upgrade

MCP tools, Chat UI, Projects, Canvas, Runtime

Open

Upgrade now

v0.1.43-beta Mixed 1mo

Dependencies Breaking upgrade

Mac, Windows, CUDA, Blackwell, Studio updates

Open

Review required

v0.1.42-beta Breaking risk 2mo

Auth RBAC RCE / SSRF

API calls + Studio security + language support

Open

No immediate action

v0.1.41-beta Bug fix 2mo

Studio update fix + UX fixes

Open

Review required

v0.1.40-beta Breaking risk 2mo

Auth RBAC

MTP speculative decoding

Open

v0.1.38-beta Bug fix 2mo

Studio chat template no longer disappears after browser refresh.

Full changelog

You can use local LLMs with tools like Claude Code and Codex by connecting them to Unsloth’s API endpoint. This lets you run models like Qwen and Gemma locally, with additional features such as self-healing tool calling, code execution, and web search. Unsloth makes it easy to deploy a fast API inference endpoint that provides:

Self-healing tool calling, which helps reduce broken or malformed tool calls by 50%
Code execution support, allowing Bash and Python execution for more accurate code outputs.
Advanced Web search that visits and actually reads webpages to gather in-depth info.
Automatic inference settings for GGUF models (temp, top-k etc.)

Models loaded in Unsloth (including GGUFs) are exposed as an authenticated API via llama-server. A long API key is generated for security reasons like how OpenAI provides one. Your local models can then be used directly in your preferred AI agent, SDK, or chat client. Unsloth speaks two dialects on the same port:

Anthropic-compatible /v1/messages for Claude Code, OpenClaw, the Anthropic SDK, and any client that expects the Messages API.
OpenAI-compatible /v1/chat/completions and /v1/responses for the OpenAI SDK, OpenCode, Cursor, Continue, Cline, Open WebUI, SillyTavern, and any OpenAI-compatible tool.
Both support streaming, tool calling (OpenAI tools / Anthropic tools), and vision inputs.

New models

We've also got a handful of new models to run including NVIDIA Nemotron 3 Nano Omni, IBM Granite 4.1 and Mistral 3.5 Medium. We helped Mistral solve some issues with implementation in transformers and GGUFs.

Unsloth Updates

Stopped Studio training runs can now resume from checkpoints.
Chat threads now autosave and persist more reliably.
DPO training hangs in multi-process setups were fixed.
VLM GRPO support improved with MROPE updates.
Studio’s stop button now properly stops generation.
Fix chat template disappearing after browser refresh

What's Changed in Unsloth

Studio: use (gguf) context length before max seq length by @G07cha in https://github.com/unslothai/unsloth/pull/5111
chore: fix typo cleanup across tests and backend strings by @luojiyin1987 in https://github.com/unslothai/unsloth/pull/5152
fix: guard resolve_model_class fallback against unresolvable transformers AutoModel entries by @Etherll in https://github.com/unslothai/unsloth/pull/5155
Studio: kill in-flight llama-server before spawning a new one by @danielhanchen in https://github.com/unslothai/unsloth/pull/5171
Studio: stop currency escape from breaking inline LaTeX by @danielhanchen in https://github.com/unslothai/unsloth/pull/5170
Studio: probe AMD GPUs in llama-server VRAM detection by @danielhanchen in https://github.com/unslothai/unsloth/pull/5172
Studio: make stop button actually stop generation by @danielhanchen in https://github.com/unslothai/unsloth/pull/5069
Studio: add github_repo seed reader and GitHub Support Bot recipe by @danielhanchen in https://github.com/unslothai/unsloth/pull/5169
fix(studio): use endswith for mmproj F16 variant selection by @LeoBorcherding in https://github.com/unslothai/unsloth/pull/5184
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/5204
Fix Windows install when paths contain spaces or Python 3.14 is on PATH by @Etherll in https://github.com/unslothai/unsloth/pull/5201
Studio: Preserve transparency in uploaded profile avatars by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5200
UX: single chat header error placement and selector alignment by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5173
Studio: Refine chat preset and group built-in presets by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5159
Studio: Fix image-only chat requests failing validation by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5212
Studio: fix 7 failing studio_unit_tests on main by @danielhanchen in https://github.com/unslothai/unsloth/pull/5216
Patch checkpoint reload init functions to strip unsupported args by @Datta0 in https://github.com/unslothai/unsloth/pull/5167
Studio: Fix clipped model selector text descenders by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5210
Fix DPO trainer multi process hang by @Datta0 in https://github.com/unslothai/unsloth/pull/5199
Studio: Pin assistant-ui core for fresh installs by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5229
Fix local model scanner to handle ollama cloud models by @Anish9901 in https://github.com/unslothai/unsloth/pull/5220
Fix Studio desktop tray installer and titlebar and bux fixes by @wasimysaid in https://github.com/unslothai/unsloth/pull/5179
MROPE for VLM GRPO by @Datta0 in https://github.com/unslothai/unsloth/pull/5198
install: overlay unsloth-zoo from git main on --local by @rolandtannous in https://github.com/unslothai/unsloth/pull/5242
Studio: Fix chat template disappearing after browser refresh by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5209
studio: add --local to setup.sh + overlay unsloth-zoo from git main by @rolandtannous in https://github.com/unslothai/unsloth/pull/5252
Fix/windowsprebuilt by @mmathew23 in https://github.com/unslothai/unsloth/pull/5241
Studio: Add dataset upload dropzone and update preserve think copy by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5253
Add Qwen3.6 support by @rolandtannous in https://github.com/unslothai/unsloth/pull/5257
Studio: Chat thread autosave persistence by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5256
Studio: Enable deleting fine-tuned chat models by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5234
Studio: Add checkpoint resume for stopped training runs by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5255
Studio: Polish spacing and profile input radius by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5222
Fix check for libcurl headers in install.sh by @LFd3v in https://github.com/unslothai/unsloth/pull/5251
Default Studio host to 127.0.0.1 and prompt before auto-start by @rolandtannous in https://github.com/unslothai/unsloth/pull/5267
Studio: forward llama-server args from unsloth studio run , activate unsloth run , and allow passing model:quant to load models by @rolandtannous in https://github.com/unslothai/unsloth/pull/5271
Studio: Always show API usage examples and docs links by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5270
Studio: Change API Keys settings to API Access by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5268
unsloth run: add --enable-tools/--disable-tools server-side tool policy by @rolandtannous in https://github.com/unslothai/unsloth/pull/5277
fix: use % 8 instead of // 8 in FP8 weight shape check by @Ricardo-M-L in https://github.com/unslothai/unsloth/pull/5243
Pin Studio GGUF export to llama.cpp's local convert script by @mmathew23 in https://github.com/unslothai/unsloth/pull/5275
fix KVCache estimates for gemma4 style sliding window models by @Datta0 in https://github.com/unslothai/unsloth/pull/5225
Update VRAM estimator to cater to broader model configs by @Datta0 in https://github.com/unslothai/unsloth/pull/5175
Fix FastSentenceTransformer loading with newer sentence-transformers by @Etherll in https://github.com/unslothai/unsloth/pull/5259
Studio: Preserve chat history during autosave by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5278

What's changed in Unsloth-Zoo

Fix fused CE grad scaling under DDP by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/434
Fused CE backward: guard scaling=0, drop tensor path, use out-of-place mul by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/610
Fix/gemma4moefix by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/612
MROPE for VLM GRPO by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/614
Double-buffer GPU activations for overlapping H2D copy with backward compute by @ruixiang63 in https://github.com/unslothai/unsloth-zoo/pull/534
fix(temporary_patches/utils): add missing comma in all (raise_error / Unpack) by @Anai-Guo in https://github.com/unslothai/unsloth-zoo/pull/617
Fix qwen lora extractor for diff peft versions by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/618
fix: use backend device type in GGUF merge path by @andomeder in https://github.com/unslothai/unsloth-zoo/pull/615
Add unsloth_compiled_cache to gitignore by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/622
Allow local convert_hf_to_gguf.py via UNSLOTH_LLAMA_CPP_SCRIPTS_DIR by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/621

Full Changelog: https://github.com/unslothai/unsloth/compare/v0.1.37-beta...v0.1.38-beta

View release on GitHub

v0.1.37-beta Breaking risk 3mo

Notable features

Collapsible sidebar
Chat deletion and search
Preserve Thinking toggle for compatible models

Full changelog

Hey guys, we revamped the entire Unsloth Studio UI and UX experience to put an emphasis on chat and training:

Added a collapsible sidebar based on community feedback
You can now delete chats and search past conversations
New Preserve Thinking toggle for models that support it like Qwen3.6
Cleaner, more consistent design with easier navigation
Expanded Settings page with options to change your profile picture, name, and more
No more entering your Hugging Face token twice
gpt-oss now has low, medium and high thinking toggles.
Now uses latest llama.cpp prebuilt, even on Linux CUDA
Lots of bug, consistency and stability fixes
Kimi-K2.6 can now be run!
We also added experimental API support. Guides, announcement etc will come next week.

Qwen3.6 was also also previously already supported in Unsloth Studio for running and training. You can train and run Qwen3.6-27B right now!

What's Changed

Only run ldconfig CUDA-linking recovery when we have permission by @danielhanchen in https://github.com/unslothai/unsloth/pull/4930
Fix Mistral DPO/preference training crash on non-xformers platforms (e.g. Intel XPU) by @cheehook in https://github.com/unslothai/unsloth/pull/4889
Fix raw text paragraph break normalization by @kiankyars in https://github.com/unslothai/unsloth/pull/4884
Studio: keep chat input visible and fix compare pane clipping by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4924
fix: check find() return value before adding offset in try_fix_tokenizer by @Ricardo-M-L in https://github.com/unslothai/unsloth/pull/4923
updated models template mappers. added lfm2.5vl450m to transformers 5… by @rolandtannous in https://github.com/unslothai/unsloth/pull/4939
Revert "updated models template mappers. added lfm2.5vl450m to transformers 5…" by @rolandtannous in https://github.com/unslothai/unsloth/pull/4945
Add AMD ROCm/HIP support across installer and hardware detection by @danielhanchen in https://github.com/unslothai/unsloth/pull/4720
Pin bitsandbytes to continuous-release_main on ROCm (4-bit decode fix) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4954
Fix Gemma-4 GRPO catastrophic KL divergence with TRL 1.0.0+ by @danielhanchen in https://github.com/unslothai/unsloth/pull/4934
Add ROCm test suite (companion to #4720) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4824
updating gemma4 script by @Manan17 in https://github.com/unslothai/unsloth/pull/4992
Move gemma4 script by @Manan17 in https://github.com/unslothai/unsloth/pull/4994
studio: fix route transition DOM duplication via AnimatePresence mode="wait" by @AdamPlatin123 in https://github.com/unslothai/unsloth/pull/4987
Studio: Prompt manager, message deletion, and chat UI improvements by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4938
Pin kernels==0.12.1 to fix training import failure by @rolandtannous in https://github.com/unslothai/unsloth/pull/5000
Studio: Expose openai and anthropic compatible external API end points by @danielhanchen in https://github.com/unslothai/unsloth/pull/4956
studio: skip training status/metrics polling when idle by @AdamPlatin123 in https://github.com/unslothai/unsloth/pull/4988
studio: fix api-keys access + refresh by @wasimysaid in https://github.com/unslothai/unsloth/pull/5005
Studio: Polish API key copy button and harden async clipboard fallback by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5006
fix(studio): default chart view to full training history by @Barath19 in https://github.com/unslothai/unsloth/pull/5007
[Studio] Show non exported models in chat UI by @Datta0 in https://github.com/unslothai/unsloth/pull/4892
[Studio] Install flash attn at setup time for linux by @Datta0 in https://github.com/unslothai/unsloth/pull/4979
fix(studio): remove 300s cap on load_checkpoint (inherits 3600s default) by @TF-MTGE in https://github.com/unslothai/unsloth/pull/4922
Studio: honor explicit GGUF ctx and default to 4096 when weights exceed VRAM by @danielhanchen in https://github.com/unslothai/unsloth/pull/5011
Studio: make GGUF disk-space preflight cache-aware by @danielhanchen in https://github.com/unslothai/unsloth/pull/5012
Studio: anchor ctx-slider warning threshold at 4096 when weights exceed VRAM by @danielhanchen in https://github.com/unslothai/unsloth/pull/5014
studio: show HF model download progress in training start overlay by @danielhanchen in https://github.com/unslothai/unsloth/pull/4894
studio: stream export worker output into the export dialog by @danielhanchen in https://github.com/unslothai/unsloth/pull/4897
Fix num_items_in_batch GA for Gemma4 by @Datta0 in https://github.com/unslothai/unsloth/pull/4998
studio: pin peft to 0.18.1 to fix export subprocess issues by @rolandtannous in https://github.com/unslothai/unsloth/pull/5015
Studio: live model-load progress + rate/ETA on download and load by @danielhanchen in https://github.com/unslothai/unsloth/pull/5017
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/5004
Fix bitsandbytes ROCm install by using pip instead of uv by @edamamez in https://github.com/unslothai/unsloth/pull/4966
Studio: split model-load progress label across two rows by @danielhanchen in https://github.com/unslothai/unsloth/pull/5020
Studio: hard-stop at n_ctx with a 'Context limit reached' toast by @danielhanchen in https://github.com/unslothai/unsloth/pull/5021
[moe][gemma4] Target MoE for gemma4 by @Datta0 in https://github.com/unslothai/unsloth/pull/4913
Add configurable PyTorch mirror via UNSLOTH_PYTORCH_MIRROR env var by @rolandtannous in https://github.com/unslothai/unsloth/pull/5024
Studio: support GGUF variant selection for non-suffixed repos by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5023
fix: prevent offline freeze by fixing stats retry and forwarding local_files_only by @DavidSolanas in https://github.com/unslothai/unsloth/pull/5016
Respect classification head skip list on pre-quantized 4-bit checkpoints (#5027) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5034
fix(rocm): tighten gfx regex to ignore generic ISA lines by @danielhanchen in https://github.com/unslothai/unsloth/pull/5033
Fix grad-accum accepts_loss_kwargs detection for vision wrappers by @danielhanchen in https://github.com/unslothai/unsloth/pull/5036
grpo_compute_loss_slow called with wrong positional args by @jonahsamost in https://github.com/unslothai/unsloth/pull/4887
Gate trl disable_gradient_checkpointing warning on UNSLOTH_ENABLE_LOGGING by @danielhanchen in https://github.com/unslothai/unsloth/pull/5038
Studio: refresh Downloaded GGUF list and recurse into variant subdirs by @danielhanchen in https://github.com/unslothai/unsloth/pull/5032
feat: Add support for OLMo-3 model by @OnePunchMonk in https://github.com/unslothai/unsloth/pull/4678
feat: Add cactus QAT scheme support by @OnePunchMonk in https://github.com/unslothai/unsloth/pull/4679
Re-apply #4939: updated models template mappers by @rolandtannous in https://github.com/unslothai/unsloth/pull/4950
Studio: add folder browser modal for Custom Folders by @danielhanchen in https://github.com/unslothai/unsloth/pull/5035
Bump Studio installer minimum to 2026.4.5 by @danielhanchen in https://github.com/unslothai/unsloth/pull/5041
fix Gemma4 flash attn disable by @mmathew23 in https://github.com/unslothai/unsloth/pull/5045
BUG: fix _fix_chat_template for ChatML templates missing add_generation_prompt (#4150) by @kimimgo in https://github.com/unslothai/unsloth/pull/4426
fix: use direct registry API for PATH writes instead of SetEnvironmentVariable by @Etherll in https://github.com/unslothai/unsloth/pull/4961
Chat-template repair: warn-by-default, AST classification, dict support by @danielhanchen in https://github.com/unslothai/unsloth/pull/5049
Restrict flash attn to <=256 head dim. Consolidate attn impl checks by @Datta0 in https://github.com/unslothai/unsloth/pull/5051
Remove legacy venv Scripts entry from User PATH on upgrade by @danielhanchen in https://github.com/unslothai/unsloth/pull/5060
Fix review findings for chat-template repair (#5049) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5056
Studio: Ollama support, recommended folders, Custom Folders UX polish by @danielhanchen in https://github.com/unslothai/unsloth/pull/5050
feat(studio): replace navbar with collapsible sidebar by @wasimysaid in https://github.com/unslothai/unsloth/pull/4936
fix audio dataset preview and finetuning by @CodeMan62 in https://github.com/unslothai/unsloth/pull/5043
Chat first onboarding by @wasimysaid in https://github.com/unslothai/unsloth/pull/5063
Fix onboarding followups by @wasimysaid in https://github.com/unslothai/unsloth/pull/5064
Studio: Default Gemma fallback for chat + AI assist by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5066
fix: multi-GPU inference crash for bnb 4-bit/8-bit models by @danielhanchen in https://github.com/unslothai/unsloth/pull/5068
Add Qwen3.6 inference defaults for Studio by @danielhanchen in https://github.com/unslothai/unsloth/pull/5065
Add qwen3.6 script by @Manan17 in https://github.com/unslothai/unsloth/pull/5084
Studio: forward standard OpenAI tools / tool_choice to llama-server by @rolandtannous in https://github.com/unslothai/unsloth/pull/5099
fix(studio/chat): stop stream when trashing a thread from sidebar by @rolandtannous in https://github.com/unslothai/unsloth/pull/5067
Studio: Local profile customization in settings and sync sidebar identity by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5088
Studio: Show LoRA live logs and update GGUF quant options by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5058
Studio: prefer mainstream clipboard copy over deprecated one by @G07cha in https://github.com/unslothai/unsloth/pull/5109
Studio: Improve chat composition, fix scroll behaviour, and refine sidebar UX by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5089
Studio: forward standard OpenAI tools / tool_choice on /v1/responses (Codex compat) by @rolandtannous in https://github.com/unslothai/unsloth/pull/5122
Studio: support images on /v1/messages (Anthropic-compat) by @rolandtannous in https://github.com/unslothai/unsloth/pull/5128
Coerce TRL's tuple-cached _*_available flags to bool by @danielhanchen in https://github.com/unslothai/unsloth/pull/5129
Studio: Smoother thread switching in chat by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5126
Studio: Replace assistant UI shared autoscroll with per-panel scrolling by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5127
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/5117
Fix tokenizer save gemma by @Datta0 in https://github.com/unslothai/unsloth/pull/5115
update gema4 chat templates by @Datta0 in https://github.com/unslothai/unsloth/pull/5116
Bump installer floor to 2026.4.7 by @danielhanchen in https://github.com/unslothai/unsloth/pull/5134
fix/llamacpp_prebuilt_install by @mmathew23 in https://github.com/unslothai/unsloth/pull/5135
Studio: fix stale test_exception_result_cached test for vision cache by @danielhanchen in https://github.com/unslothai/unsloth/pull/5145
fix: patch CONTROL type for special tokens in sentencepiece GGUF export by @octo-patch in https://github.com/unslothai/unsloth/pull/5080
fix(install): clear STUDIO_LOCAL_* env on POSIX normal install by @danielhanchen in https://github.com/unslothai/unsloth/pull/5146
Add tauri by @wasimysaid in https://github.com/unslothai/unsloth/pull/5144
Studio: detect reasoning_effort and preserve_thinking in chat templates by @danielhanchen in https://github.com/unslothai/unsloth/pull/5149

New Contributors

@cheehook made their first contribution in https://github.com/unslothai/unsloth/pull/4889
@Ricardo-M-L made their first contribution in https://github.com/unslothai/unsloth/pull/4923
@Barath19 made their first contribution in https://github.com/unslothai/unsloth/pull/5007
@TF-MTGE made their first contribution in https://github.com/unslothai/unsloth/pull/4922
@edamamez made their first contribution in https://github.com/unslothai/unsloth/pull/4966
@DavidSolanas made their first contribution in https://github.com/unslothai/unsloth/pull/5016
@jonahsamost made their first contribution in https://github.com/unslothai/unsloth/pull/4887
@kimimgo made their first contribution in https://github.com/unslothai/unsloth/pull/4426
@CodeMan62 made their first contribution in https://github.com/unslothai/unsloth/pull/5043
@G07cha made their first contribution in https://github.com/unslothai/unsloth/pull/5109
@octo-patch made their first contribution in https://github.com/unslothai/unsloth/pull/5080

Full Changelog: https://github.com/unslothai/unsloth/compare/v0.1.36-beta...v0.1.37-beta

View release on GitHub

v0.1.36-beta Bug fix 3mo

Notable features

Speculative decoding support
Gemma 4 training stability improvements

Full changelog

Hey everyone, we’ve updated Gemma 4 training and quants with many fixes. The bugs are universal and affected all packages and implementations and did NOT originate from Unsloth. We identified the bugs, fixed them, and Gemma 4 training now works properly only in Unsloth.

You need 8GB VRAM to train Gemma-4-E2B locally. Unsloth trains Gemma 4 ~1.5x faster with ~60% less VRAM than FA2 setups.

You can also train 26B-A4B and 31B or train via Unsloth Studio. Studio and the notebooks work for Vision, Text, Audio and inference.
For more details, guide + notebooks on training Gemma 4, view our blog: https://unsloth.ai/docs/models/gemma-4/train

Gemma 4 Training Fixes:

For fix details see our blog.

Grad accumulation no longer causes losses to explode - before you might see losses of 300 to 400 - it should be 10 to 15 - Unsloth has this fixed.
Index Error for 26B and 31B for inference - this will fail inference for 26B and 31B when using transformers - we fixed it.
use_cache=False had gibberish for E2B, E4B - see https://github.com/huggingface/transformers/issues/45242
float16 audio -1e9 overflows on float16

If you see losses higher than 13-15 (like 100 or 300) most likely gradient accumulation is not being accounted properly - we have fixed this as part of Unsloth and Unsloth Studio.

Gemma 4 Quant Re-uploads

We also updated our Gemma 4 GGUFs so you will need to re-download. Once again, the quant issues are NOT related to or originated from Unsloth:

CUDA: check for buffer overlap before fusing - CRITICAL fixes <unused24> tokens https://github.com/ggml-org/llama.cpp/pull/21566
kv-cache : support attention rotation for heterogeneous iSWA https://github.com/ggml-org/llama.cpp/pull/21513
vocab : add byte token handling to BPE detokenizer for Gemma4 https://github.com/ggml-org/llama.cpp/pull/21488
convert : set "add bos" == True for Gemma 4 https://github.com/ggml-org/llama.cpp/pull/21500
common : add gemma 4 specialized parser https://github.com/ggml-org/llama.cpp/pull/21418
llama-model: read final_logit_softcapping for Gemma 4 https://github.com/ggml-org/llama.cpp/pull/21390
llama: add custom newline split for Gemma 4 https://github.com/ggml-org/llama.cpp/pull/21406

Unsloth Studio Updates

Add speculative decoding support (ngram-mod, on by default)
Llama.cpp binaries updated to use latest version which includes all Gemma 4 Fixes
Fix Qwen3.5 and Gemma 4 training issues
Enable exporting and saving of Gemma 4 models
Harden sandbox security for terminal and python tools
Let recipes use the model loaded in Chat
Fix empty chat threads on navigation (and whenever switching tabs) and stabilize new chat flow
Allow non-LLM recipes to run and move Data tab first in executions
Reuse HF cached repo casing to prevent duplicate downloads

What's Changed

fix(studio): lazy-import transformers in model_config to fix 5.x version switch by @rolandtannous in https://github.com/unslothai/unsloth/pull/4806
fix: patch PEFT for Gemma4ClippableLinear in loader checkpoint path (fixes export) by @rolandtannous in https://github.com/unslothai/unsloth/pull/4807
Fix/gemma4 install script by @Manan17 in https://github.com/unslothai/unsloth/pull/4815
Fix/llama.cppbuilding by @mmathew23 in https://github.com/unslothai/unsloth/pull/4804
Add tests for simplified llama.cpp install policy (from PR #4804) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4817
Differentiate web search and URL fetch in chat tool UI by @Shine1i in https://github.com/unslothai/unsloth/pull/4802
Allow non-LLM recipes to run and move Data tab first in executions by @Shine1i in https://github.com/unslothai/unsloth/pull/4805
studio: reuse HF cached repo casing to prevent duplicate downloads by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4822
fix(studio): ensure first chat tool call starts in session sandbox by @neodon in https://github.com/unslothai/unsloth/pull/4810
fix(studio): harden sandbox security for terminal and python tools by @danielhanchen in https://github.com/unslothai/unsloth/pull/4827
studio: add speculative decoding support (ngram-mod, on by default) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4836
Add Gemma 4 model sampling defaults by @danielhanchen in https://github.com/unslothai/unsloth/pull/4838
Add tests for cache case resolution (from PR #4822) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4823
Bump minimum unsloth version to 2026.4.2 in install scripts by @danielhanchen in https://github.com/unslothai/unsloth/pull/4842
Fix/studio colab button message: Add fallback message for Colab Studio button when proxy URL fails by @LeoBorcherding in https://github.com/unslothai/unsloth/pull/4866
[Studio][Optimization]Add vision detection cache to is_vision_model() by @rolandtannous in https://github.com/unslothai/unsloth/pull/4853
Add tests for is_vision_model() caching behaviour by @danielhanchen in https://github.com/unslothai/unsloth/pull/4855
Remove Gemma-4 from FORCE_FLOAT32 by @danielhanchen in https://github.com/unslothai/unsloth/pull/4875
fix: skip redundant HfFileSystem().glob() calls in loader.py by @rolandtannous in https://github.com/unslothai/unsloth/pull/4852
fix(studio): custom folder scan fails to find GGUF variants when pointing directly at a model directory by @JYYYYYT in https://github.com/unslothai/unsloth/pull/4860
Add unit tests for loader glob skip guard (from PR #4852) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4854
Studio: Fix empty chat threads on navigation and stabilize new chat flow by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4872
Bump minimum unsloth version to 2026.4.4 in install scripts by @danielhanchen in https://github.com/unslothai/unsloth/pull/4876
split venv_t5 into tiered 5.3.0/5.5.0 and fix trust_remote_code by @rolandtannous in https://github.com/unslothai/unsloth/pull/4878
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/4879
build(deps): bump oxc-parser from 0.121.0 to 0.123.0 in /studio/backend/core/data_recipe/oxc-validator in the npm-oxc-validator group by @dependabot[bot] in https://github.com/unslothai/unsloth/pull/4776
Update dependabot.yml by @danielhanchen in https://github.com/unslothai/unsloth/pull/4915
Let recipes use the model loaded in Chat by @Shine1i in https://github.com/unslothai/unsloth/pull/4840
build(deps): bump the bun-frontend group across 1 directory with 16 updates by @dependabot[bot] in https://github.com/unslothai/unsloth/pull/4586

New Contributors

@neodon made their first contribution in https://github.com/unslothai/unsloth/pull/4810
@JYYYYYT made their first contribution in https://github.com/unslothai/unsloth/pull/4860

Full Changelog: https://github.com/unslothai/unsloth/compare/v0.1.35-beta...v0.1.36-beta

View release on GitHub

v0.1.35-beta New feature 3mo

Notable features

Gemma 4 model support
Tool calling +30% to +80% accuracy improvement
Web search content retrieval

Full changelog

Google releases Gemma 4 with four new models: E2B, E4B, 26B-A4B, 31B.

You can now run and train the Gemma 4 models in Unsloth. Guide / Blog: https://unsloth.ai/docs/models/gemma-4
Run E2B and E4B on 6GB RAM, and on phones. Run 26B-A4B and 31B on ~18GB.
GGUFs: https://huggingface.co/collections/unsloth/gemma-4

Updates

Tool calls for smaller models are now more stable and don't cut off anymore
Pre-compiled binaries for llama.cpp for 2 Gemma 4 fixes:
- vocab: fix Gemma4 tokenizer - (#21343)
- fix: gemma 4 template - (#21326)
Pre-compiled binaries for Windows, Linux, Mac, WSL devices - CPU and GPU
90% reduced HF API calls - less rate limits
Intel Mac works
All Gemma 4 models are re-converted.
Tool Calling more robust
Speculative Decoding added for non vision models (Gemma-4 is vision sadly and Qwen3.5)
Context length is now properly applied.
Tool calls for all models are now +30% to +80% more accurate.
Web search now actually gets web content and not just summaries
Number of tool calls allowed are increased to 25 from 10
Tool calls now terminate much better, so looping / repetitions will be reduced
More tool call healing and de-duplication logic to stop tool callings from leaking XML as well
Tested with unsloth/Qwen3.5-4B-GGUF (UD-Q4_K_XL), web search + code execution + thinking enabled.

| Metric | Before | After |
|--------|--------|-------|
| XML leaks in response | 10/10 | 0/10 |
| URL fetches used | 0 | 4/10 runs |
| Runs with correct song names | 0/10 | 2/10 |
| Avg tool calls | 5.5 | 3.8 |
| Avg response time | 12.3s | 9.8s |

Run Gemma 4 in Unsloth Studio:

What's Changed

studio: Polish Windows installer/setup logs by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4736
feat: move folder management into model selector dropdown by @Shine1i in https://github.com/unslothai/unsloth/pull/4731
fix: clear tool status badge immediately after tool execution by @Shine1i in https://github.com/unslothai/unsloth/pull/4733
refactor flex attn to prefer flash if possible by @Datta0 in https://github.com/unslothai/unsloth/pull/4734
Fix Windows local GGUF model loading crash by @danielhanchen in https://github.com/unslothai/unsloth/pull/4730
Fix OOM model styling in Studio model selectors by @LeoBorcherding in https://github.com/unslothai/unsloth/pull/4738
feat(studio): strip org prefix in model search to surface unsloth variants by @rolandtannous in https://github.com/unslothai/unsloth/pull/4749
Fix forward compatibility with transformers 5.x by @danielhanchen in https://github.com/unslothai/unsloth/pull/4752
Architecture-aware KV cache VRAM estimation (5-path) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4757
Fix save_pretrained_merged for full-finetuned models by @danielhanchen in https://github.com/unslothai/unsloth/pull/4755
Feat/prebuiltllamacpp by @mmathew23 in https://github.com/unslothai/unsloth/pull/4741
Add installer test coverage for prebuilt llama.cpp changes by @danielhanchen in https://github.com/unslothai/unsloth/pull/4756
fix: studio web search SSL failures and empty page content by @danielhanchen in https://github.com/unslothai/unsloth/pull/4754
fix: add tokenizers to no-torch deps and TORCH_CONSTRAINT for arm64 macOS py313+ by @danielhanchen in https://github.com/unslothai/unsloth/pull/4748
fix(studio): allow context length slider to reach model's native limit by @danielhanchen in https://github.com/unslothai/unsloth/pull/4746
Tests for architecture-aware KV cache estimation by @danielhanchen in https://github.com/unslothai/unsloth/pull/4760
Fix custom llama.cpp source builds and macos metal source builds by @mmathew23 in https://github.com/unslothai/unsloth/pull/4762
studio: align composer/code, unify fonts, and remove tool collapse jitter by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4763
fix(chat): correct loading text for cached models during inference by @AdamPlatin123 in https://github.com/unslothai/unsloth/pull/4764
fix(security): shell injection in GGML export conversion by @mateeaaaaaaa in https://github.com/unslothai/unsloth/pull/4768
Add regression test for shell injection fix in GGML conversion by @danielhanchen in https://github.com/unslothai/unsloth/pull/4773
fix(studio): prevent small models from stalling on tool-calling tasks by @danielhanchen in https://github.com/unslothai/unsloth/pull/4769
Add regression tests for custom llama prebuilt installer by @danielhanchen in https://github.com/unslothai/unsloth/pull/4772
Feat/custom llama prebuilt by @mmathew23 in https://github.com/unslothai/unsloth/pull/4771
studio: fix chat font changes leaking outside chat page by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4775
feat(studio): display images from Python tool execution in chat UI by @danielhanchen in https://github.com/unslothai/unsloth/pull/4778
ui improvement by @rolandtannous in https://github.com/unslothai/unsloth/pull/4781
UI Changes by @danielhanchen in https://github.com/unslothai/unsloth/pull/4782
fix(studio): improve tool-calling re-prompt for small models by @danielhanchen in https://github.com/unslothai/unsloth/pull/4783
Pin Gemma-4 transformers requirement to 5.5.0 stable by @danielhanchen in https://github.com/unslothai/unsloth/pull/4784
Switch llama.cpp default to mainline ggml-org by @danielhanchen in https://github.com/unslothai/unsloth/pull/4785
Use transformers v5.5-release branch, pin to 5.5.0 by @danielhanchen in https://github.com/unslothai/unsloth/pull/4786
Fix: pin transformers==4.57.6 in main Studio venv by @danielhanchen in https://github.com/unslothai/unsloth/pull/4788
fix(studio): build llama.cpp from master for Gemma 4 support by @danielhanchen in https://github.com/unslothai/unsloth/pull/4790
fix name fixed name by @rolandtannous in https://github.com/unslothai/unsloth/pull/4791
fix(studio): prioritize curated defaults in Recommended model list by @danielhanchen in https://github.com/unslothai/unsloth/pull/4792
fix windows llama.cpp compile from source issue by @mmathew23 in https://github.com/unslothai/unsloth/pull/4793
fix(studio): pin llama.cpp to b8637 (Gemma 4 support) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4796
fix(studio): don't set trust_remote_code for Gemma 4 training by @danielhanchen in https://github.com/unslothai/unsloth/pull/4795
fix(studio): revert llama.cpp default tag to latest by @danielhanchen in https://github.com/unslothai/unsloth/pull/4797
fix(studio): suppress fatal error when ggml-org has no prebuilt manifest by @danielhanchen in https://github.com/unslothai/unsloth/pull/4799

New Contributors

@AdamPlatin123 made their first contribution in https://github.com/unslothai/unsloth/pull/4764
@mateeaaaaaaa made their first contribution in https://github.com/unslothai/unsloth/pull/4768

Full Changelog: https://github.com/unslothai/unsloth/compare/v0.1.3-beta...v0.1.35-beta

View release on GitHub

v0.1.3-beta New feature 3mo

Notable features

Custom GGUF folder scanning
Automatic multi-GPU support
Tool calling XML deduplication

Full changelog

We did many new improvements and fixes to Studio!

Tool calls for all models are now +30% to +80% more accurate.
Web search now actually gets web content and not just summaries
Number of tool calls allowed are increased to 25 from 10
Tool calls now terminate much better, so looping / repetitions will be reduced
More tool call healing and de-duplication logic to stop tool callings from leaking XML as well
Tested with unsloth/Qwen3.5-4B-GGUF (UD-Q4_K_XL), web search + code execution + thinking enabled.

| Metric | Before | After |
|--------|--------|-------|
| XML leaks in response | 10/10 | 0/10 |
| URL fetches used | 0 | 4/10 runs |
| Runs with correct song names | 0/10 | 2/10 |
| Avg tool calls | 5.5 | 3.8 |
| Avg response time | 12.3s | 9.8s |

New features

Update button now visible
Install script styling all updated!
Added custom folders so you can use any GGUFs in any folder - for now access in Advanced Settings in Chat and Custom Folders
Preliminary Automatic Multi GPU support for inference and training - useful for large models that don't fit on 1 GPU - Studio auto will allocate GPU resources
Intel Macs should work out of the box

Much smoother and faster Studio

Fixed timeouts of downloads of large models - no more timeouts seen.
Fixed Hugging Face rate limiting - HF API calls reduced by 90%
Fixed bun on Windows and faster installs

To update Studio:

For Linux, WSL, Mac, do: unsloth studio update
For Windows native, do: irm https://unsloth.ai/install.ps1 | iex
For Linux, WSL, Mac reinstalls, do: curl -fsSL https://unsloth.ai/install.sh | sh

What's Changed

Fix LM Studio GGUF loading on native Windows (no GPU) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4665
studio: add HF/local model selection UI for GGUF export by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4365
Fix blank page on Windows due to broken .js MIME type by @rolandtannous in https://github.com/unslothai/unsloth/pull/4674
fix: [Studio] setup.ps1 update-flow for windows by @rolandtannous in https://github.com/unslothai/unsloth/pull/4667
studio: unify Windows installer/setup logging style, verbosity controls, and startup messaging by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4651
studio: preserve GGUF context max after apply and refresh by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4691
[Studio] multi gpu finetuning/inference via "balanced_low0/sequential" device_map by @Datta0 in https://github.com/unslothai/unsloth/pull/4602
Fix editable install scanning 6,500+ node_modules dirs by @danielhanchen in https://github.com/unslothai/unsloth/pull/4697
fix(studio): avoid UnicodeEncodeError on Windows cp1252 consoles by @danielhanchen in https://github.com/unslothai/unsloth/pull/4699
Fix/bun windows bin detection by @Etherll in https://github.com/unslothai/unsloth/pull/4703
fix: skip download progress polling for exported GGUF models by @rolandtannous in https://github.com/unslothai/unsloth/pull/4709
[Studio] Fix: replace hard timeout with inactivity timeout for model loading by @rolandtannous in https://github.com/unslothai/unsloth/pull/4707
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/4705
studio: prevent false multimodal warning during model loading by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4704
fix(studio): open tour ReadMore links in new tab by @danielhanchen in https://github.com/unslothai/unsloth/pull/4694
[studio] multi gpu: revert to balanced for inference. by @Datta0 in https://github.com/unslothai/unsloth/pull/4698
fix: throttle and cache HuggingFace modelInfo API calls by @Shine1i in https://github.com/unslothai/unsloth/pull/4696
fix(studio): correct default weight_decay and learning rate by @danielhanchen in https://github.com/unslothai/unsloth/pull/4695
fix: auto-retry stalled HF downloads with HF_HUB_DISABLE_XET=1 by @rolandtannous in https://github.com/unslothai/unsloth/pull/4712
studio: add update button to navbar with guided commands and cross-platform support by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4721
studio: improve GGUF tool calling accuracy and reliability by @danielhanchen in https://github.com/unslothai/unsloth/pull/4700
studio: fix export HF model dropdown clearing on enter/click-away by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4726
Studio: simplify tool-call dedup and replace html2text with builtin converter by @danielhanchen in https://github.com/unslothai/unsloth/pull/4722
feat: custom scan folders for GGUF model discovery by @Shine1i in https://github.com/unslothai/unsloth/pull/4723
Bump installer minimum version pin to 2026.3.18 by @danielhanchen in https://github.com/unslothai/unsloth/pull/4729

Full Changelog: https://github.com/unslothai/unsloth/compare/v0.1.25-beta...v0.1.3-beta

View release on GitHub

v0.1.25-beta New feature 4mo

Notable features

20-30% faster inference
Model auto-detection from LM Studio/HuggingFace
Training history viewer

Full changelog

Hey guys, it's only been 2 days since our last release, but we’ve got a lot more important updates:

Inference is now 20–30% faster. Previously, tool-calling and repeat penalty could slow inference below normal speeds. Inference tokens/s should now perform similar to llama-server / llama.cpp.
Now Auto-detects older or pre-existing models downloaded from LM Studio, Hugging Face, and similar sources.
Inference token/s speed is now calculated correctly. Previously, tokens/s included startup time, which made the displayed speed look slower than it actually was. It should now reflect 'true' inference speed.
CPU usage no longer spikes. Previously, inline querier identity changed every render, causing useLiveQuery to resubscribe continuously.
Unsloth Studio now has a shutdown x button and shuts down properly. Previously, closing it after opening from the desktop icon would not close it properly. Now, launching from the shortcut also opens the terminal, and closing that terminal fully exits Unsloth Studio. If you still have it open from a previous session you can restart your computer or run lsof -i :8888 then kill -9 <PID>.
Even better tool-calling and websearch with reduced errors.
Updated documentation with lots of new info on deleting models, uninstalling etc.
Cleaner, smarter install and setup logging across Windows and Linux. Output is now easier to read with consistent formatting, quieter by default for a smoother experience, and supports richer --verbose diagnostics when you want full technical detail.
{% endupdate %}
You can now view your training history

What's Changed

Bump installer min version to 2026.3.12 by @danielhanchen in https://github.com/unslothai/unsloth/pull/4600
Fix Colab Studio launch and setup.ps1 box alignment by @danielhanchen in https://github.com/unslothai/unsloth/pull/4601
Fix Colab huggingface-hub conflict, ensurepip fallback, bump to 2026.3.14 by @danielhanchen in https://github.com/unslothai/unsloth/pull/4603
Update README.md by @rolandtannous in https://github.com/unslothai/unsloth/pull/4604
fix: skip flex_attention for models with non-zero attention_dropout by @Abhinavexists in https://github.com/unslothai/unsloth/pull/4605
Fix Colab setup skipping llama.cpp installation by @rolandtannous in https://github.com/unslothai/unsloth/pull/4618
fix: show recommended models in search results by @Shine1i in https://github.com/unslothai/unsloth/pull/4615
studio: align Dataset/Parameters/Training cards, fix expandable height, animate LoRA settings by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4614
fix: Windows installer fails on _yaml.pyd Access Denied (os error 5) by @Etherll in https://github.com/unslothai/unsloth/pull/4617
studio: humanize ETA display for long training runs by @RadouaneElhajali in https://github.com/unslothai/unsloth/pull/4608
fix: add python-json-logger to data-designer-deps by @Shine1i in https://github.com/unslothai/unsloth/pull/4627
[Studio] Colab fix - Allow install_python_stack to run on Colab by @rolandtannous in https://github.com/unslothai/unsloth/pull/4633
Fix repetition_penalty default causing 24% TPS drop in GGUF inference by @danielhanchen in https://github.com/unslothai/unsloth/pull/4634
fix: install.sh Mac Intel compatibility + Studio no-torch support by @danielhanchen in https://github.com/unslothai/unsloth/pull/4624
tests: add no-torch / Intel Mac test suite by @danielhanchen in https://github.com/unslothai/unsloth/pull/4646
fix: use unsloth[huggingfacenotorch] instead of --no-deps in no-torch mode by @danielhanchen in https://github.com/unslothai/unsloth/pull/4647
Fix Gemma3N audio training stride assertion with non-reentrant checkpointing by @danielhanchen in https://github.com/unslothai/unsloth/pull/4629
Fix missing num_items_in_batch in unsloth_prediction_step by @danielhanchen in https://github.com/unslothai/unsloth/pull/4616
Make Studio shortcuts launch in a visible terminal by @danielhanchen in https://github.com/unslothai/unsloth/pull/4638
studio: setup log styling by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4494
Fix ~1.2s TTFT penalty when tools are enabled in Studio by @danielhanchen in https://github.com/unslothai/unsloth/pull/4639
Fix GGUF GPU fit check to account for KV cache VRAM by @danielhanchen in https://github.com/unslothai/unsloth/pull/4623
feat: update app icons to rounded logo by @Shine1i in https://github.com/unslothai/unsloth/pull/4640
Streaming tool detection: guard late tool_calls, filter incomplete fragments by @danielhanchen in https://github.com/unslothai/unsloth/pull/4648
fix: install no-torch runtime deps via requirements file by @danielhanchen in https://github.com/unslothai/unsloth/pull/4649
Fix orphan server cleanup killing user's own llama-server by @danielhanchen in https://github.com/unslothai/unsloth/pull/4622
fix: add auth + UX improvements to shutdown button by @Shine1i in https://github.com/unslothai/unsloth/pull/4642
Fix inference failing for transformers 5.x models (trust_remote_code) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4652
fix: no-torch install deps without pulling torch transitively by @danielhanchen in https://github.com/unslothai/unsloth/pull/4650
Detect always-on reasoning models and show Think button as locked-on by @danielhanchen in https://github.com/unslothai/unsloth/pull/4654
fix: replace navbar shutdown text button with icon-only button by @Shine1i in https://github.com/unslothai/unsloth/pull/4655
Fall back to parsing model name when HF API has no param count by @danielhanchen in https://github.com/unslothai/unsloth/pull/4656
fix: disable OCR in pymupdf4llm PDF extraction by @Shine1i in https://github.com/unslothai/unsloth/pull/4659
Fix HF cache default and show LM Studio models in chat/inference by @rolandtannous in https://github.com/unslothai/unsloth/pull/4653
Bump minimum unsloth version to 2026.3.16 in install scripts by @danielhanchen in https://github.com/unslothai/unsloth/pull/4663

New Contributors

@Abhinavexists made their first contribution in https://github.com/unslothai/unsloth/pull/4605
@RadouaneElhajali made their first contribution in https://github.com/unslothai/unsloth/pull/4608

Full Changelog: https://github.com/unslothai/unsloth/compare/v0.1.2-beta...v0.1.25-beta

View release on GitHub

v0.1.2-beta Breaking risk 4mo

Notable features

In-place Studio update command
App shortcuts for Windows/Mac/Linux
Pre-compiled llama.cpp binaries

Full changelog

Hey guys, this is our first release since we launched Unsloth Studio last week. From now on you can directly access all our updates through our changelog here: https://unsloth.ai/docs/new/changelog

You can now update Unsloth Studio! Just use: unsloth studio update. Please update to use all the newest fixes and features.

Tool calling improved. Better llama.cpp parsing, no raw tool markup in chat, faster inference, a new Tool Outputs panel, timers.
Windows CPU or GPU now works seamlessly. Please reinstall!
App shortcuts. Once installed, you can now launch in Windows, MacOS and Linux via a shortcut icon in the Start / Launch and Desktop.
Pre-compiled llama.cpp binaries and mamba_ssm for finetuning - 6x faster installs! Also <300MB in size for binaries.
50% reduced installation sizes (-7GB or more savings), 2x faster installs and faster resolving. 50% smaller pypi sizes.
Colab with free T4 GPUs with Unsloth Studio now fixed! Try it here. Due to pre-compiled binaries, it's also 20x faster!
You can now properly use old GGUFs from Hugging Face or LM Studio
MacOS and CPU now have Data Recipes enabled with multi-file uploading.
AMD support preliminary for Linux only machines - auto detects.
Settings sidebar redesign. Settings are now grouped into Model, Sampling, Tools, and Preferences
Context length now adjustable. Keep in mind this is not needed as llama.cpp smartly uses the exact context you need via --fit on
Persistent system prompts and presets. Custom system prompts and chat presets now persist across reloads and page changes.
Multi-file upload. Data recipes now support multiple drag-and-drop uploads for PDF, DOCX, TXT, and MD, with backend extraction, saved uploads, and improved previews.
Better chat observability. Studio now shows llama-server timings and usage, a context-window usage bar, and richer source hover cards.
Better UX overall - clickable links, better LaTeX parsing, tool / code / web tooltips for default cards and much more!
LiteLLM - Unsloth Studio and Unsloth were NOT affected by the recent LiteLLM compromise. Nemo Data Designer used LiteLLM only up to 1.80, not the affected 1.82.7 or 1.82.8, and has since removed it entirely.
We now have a new one line install command, just run: Copycurl -fsSL https://unsloth.ai/install.sh | sh

Fixes:

Windows/setup improvements. Fixed silent Windows exits, Anaconda/conda-forge startup crashes, broken non-NVIDIA Windows installs, and missing early CUDA/stale-venv setup checks.
System prompts fixed. They work again for non-GGUF text and vision inference.
GGUF export expanded. Full fine-tunes, not just LoRA/PEFT, can now export to GGUF. Base model resolution is more reliable, and unsupported export options are disabled in the UI.
Chat scroll/layout fixes. Fixed scroll-position issues during generation, thinking-panel layout shift, and viewport jumps when collapsing reasoning panels.
Smarter port conflict detection. Studio now detects loopback conflicts, can identify the blocking process when possible, and gives clearer fallback-port messages.

Example of automatic parameter settings for context length etc:

https://github.com/user-attachments/assets/6a70a680-fccd-4d50-ad47-eb45d6827a06

What's Changed

[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/4542
fix: store embedding_learning_rate on self in UnslothTrainingArguments by @GoldenGrapeGentleman in https://github.com/unslothai/unsloth/pull/4531
studio: persist system prompt and preset settings across navigation by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4538
studio: stop scroll hijack during generation and fix thinking panel layout shift by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4543
Fix Studio port conflict detection for loopback addresses by @danielhanchen in https://github.com/unslothai/unsloth/pull/4532
fix(studio): show Windows-specific reset-password command by @Shine1i in https://github.com/unslothai/unsloth/pull/4529
fix(studio): restore scroll lock on reasoning panel collapse by @danielhanchen in https://github.com/unslothai/unsloth/pull/4545
fix: always show chat tool icons by @Shine1i in https://github.com/unslothai/unsloth/pull/4525
fix: system prompt ignored in unsloth inference by @Shine1i in https://github.com/unslothai/unsloth/pull/4528
fix: handle prompt/completion datasets in slow-path BOS detection by @danielhanchen in https://github.com/unslothai/unsloth/pull/4548
fix: give @0xKushwaha git history credit for completion_only_loss fix by @danielhanchen in https://github.com/unslothai/unsloth/pull/4552
⚠️Remove quarantined litellm for precaution -- Unsloth Studio NOT affected by @danielhanchen in https://github.com/unslothai/unsloth/pull/4553
fix: pin unsloth>=2026.3.11 in install scripts by @danielhanchen in https://github.com/unslothai/unsloth/pull/4556
Regroup chat settings sidebar into focused sections by @Shine1i in https://github.com/unslothai/unsloth/pull/4551
Add GRPO resume vLLM cleanup guard by @MagellaX in https://github.com/unslothai/unsloth/pull/4411
fix: prevent UnicodeEncodeError on Windows CP1252 consoles in studio setup by @Krishnachaitanyakc in https://github.com/unslothai/unsloth/pull/4563
studio: windows desktop shortcut launcher by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4558
Remove duplicate frontend assets from wheel (~31 MB savings) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4567
feat(studio): training history persistence and past runs viewer by @Shine1i in https://github.com/unslothai/unsloth/pull/4501
fix: remove auto wandb.finish() after train() to allow post-training evaluate() by @Krishnachaitanyakc in https://github.com/unslothai/unsloth/pull/4564
feat: Implement Q-GaLore optimizer and custom embedding learning rate… by @OnePunchMonk in https://github.com/unslothai/unsloth/pull/4511
Bump Data Designer to 0.5.4 (removes litellm dependency) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4569
feat(chat): cleaner tool UI, inline LaTeX, clickable links by @Shine1i in https://github.com/unslothai/unsloth/pull/4561
[Studio] Try installing causal-conv1d from prebuilt wheels if avialable by @Datta0 in https://github.com/unslothai/unsloth/pull/4547
Feature/add dependabot and codeql security checks by @pkloehn1 in https://github.com/unslothai/unsloth/pull/4479
build(deps): bump the actions group with 2 updates by @dependabot[bot] in https://github.com/unslothai/unsloth/pull/4570
build(deps): bump oxc-parser from 0.116.0 to 0.121.0 in /studio/backend/core/data_recipe/oxc-validator in the npm-oxc-validator group by @dependabot[bot] in https://github.com/unslothai/unsloth/pull/4571
Remove advanced CodeQL workflow (conflicts with default setup) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4584
Add macOS and Linux desktop shortcuts to install.sh by @danielhanchen in https://github.com/unslothai/unsloth/pull/4568
perf(studio): upgrade to Vite 8 + auto-install bun for faster frontend builds by @Etherll in https://github.com/unslothai/unsloth/pull/4522
feat(tokenizer): add get_tokenizer_info() diagnostic helper by @cz-03 in https://github.com/unslothai/unsloth/pull/4436
Add ROCm (AMD GPU) support to studio setup by @danielhanchen in https://github.com/unslothai/unsloth/pull/4585
Consolidate dual venvs and separate install from update by @rolandtannous in https://github.com/unslothai/unsloth/pull/4530
studio: stabilize reasoning panel scroll behavior and prevent composer overlap by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4587
Use prebuilt llama.cpp for unsloth studio setup by @mmathew23 in https://github.com/unslothai/unsloth/pull/4562
fix(studio): add -ngl flag for GPU offloading in llama-server by @danielhanchen in https://github.com/unslothai/unsloth/pull/4588
fix(studio): add pip nvidia CUDA libs to LD_LIBRARY_PATH for llama-server by @danielhanchen in https://github.com/unslothai/unsloth/pull/4590
fix(studio): validate bun install and retry from official source on failure by @danielhanchen in https://github.com/unslothai/unsloth/pull/4589
fix(studio): clear bun cache on failure and retry before falling back to npm by @danielhanchen in https://github.com/unslothai/unsloth/pull/4594
Pin torch>=2.4,<2.11.0 in Studio installers by @danielhanchen in https://github.com/unslothai/unsloth/pull/4595
fix(studio): source-build fallback prefers Unsloth's tested tag over upstream latest by @danielhanchen in https://github.com/unslothai/unsloth/pull/4593
fix(studio): add bun cache validation to Windows setup.ps1 by @danielhanchen in https://github.com/unslothai/unsloth/pull/4596
feat: multi-source model discovery (HF default, legacy cache, LM Studio) by @rolandtannous in https://github.com/unslothai/unsloth/pull/4591
Add unsloth to User PATH on Windows after install by @danielhanchen in https://github.com/unslothai/unsloth/pull/4597
Add PID file tracking and unsloth studio stop command by @danielhanchen in https://github.com/unslothai/unsloth/pull/4598
feat(studio): editable context length with Apply/Reset for GGUF settings by @danielhanchen in https://github.com/unslothai/unsloth/pull/4592

New Contributors

@MagellaX made their first contribution in https://github.com/unslothai/unsloth/pull/4411
@Krishnachaitanyakc made their first contribution in https://github.com/unslothai/unsloth/pull/4563
@OnePunchMonk made their first contribution in https://github.com/unslothai/unsloth/pull/4511
@pkloehn1 made their first contribution in https://github.com/unslothai/unsloth/pull/4479
@dependabot[bot] made their first contribution in https://github.com/unslothai/unsloth/pull/4570
@cz-03 made their first contribution in https://github.com/unslothai/unsloth/pull/4436

Full Changelog: https://github.com/unslothai/unsloth/compare/b8475...v0.1.2-beta

View release on GitHub

b8475 Feature 4mo

Notable features

Install-ready llama.cpp bundles for Unsloth Studio

Changelog

Install-ready Unsloth Studio llama.cpp bundles for b8475.

View release on GitHub

b8457 Feature 4mo

Notable features

Install-ready llama.cpp bundles for streamlined setup

Changelog

Install-ready Unsloth Studio llama.cpp bundles for b8457.

View release on GitHub

v0.1.0-beta New feature 4mo

Notable features

Unsloth Studio web UI launch
500+ model support
70% VRAM reduction vs standard training

Full changelog

Hey guys, we're super excited to launch Unsloth Studio (Beta), a new open-source web UI to train and run LLMs.

Blog + everything you need to know: https://unsloth.ai/docs/new/studio

Run models locally on Mac, Windows, Linux
Compare and battle models side-by-side
Train 500+ models 2x faster with 70% less VRAM
Supports GGUF, vision, audio, embedding models
Self-healing Tool calling / web search + code execution
Auto-create datasets from PDF, CSV, DOCX
Export models to GGUF, safetensor and more formats

MacOS, Linux, WSL:

For MacOS, ensure you have cmake installed. If not, run brew install cmake.

curl -fsSL https://unsloth.ai/install.sh | sh

Then to launch every time:

source unsloth_studio/bin/activate
unsloth studio -H 0.0.0.0 -p 8888

Windows:

Run in Windows Powershell:

irm https://unsloth.ai/install.ps1 | iex

Then to launch every time:

.\unsloth_studio\Scripts\activate
unsloth studio -H 0.0.0.0 -p 8888

Docker

Use our Docker image unsloth/unsloth container. Run:

docker run -d -e JUPYTER_PASSWORD="mypassword" \
  -p 8888:8888 -p 8000:8000 -p 2222:22 \
  -v $(pwd)/work:/workspace/work \
  --gpus all \
  unsloth/unsloth

https://github.com/user-attachments/assets/4f48e6ed-5ef9-42d8-8404-64a4d8b36846

What's Changed

Update CODEOWNERS for studio and cli by @danielhanchen in https://github.com/unslothai/unsloth/pull/4266
[Feature] Support Sequence Classification by @danielhanchen in https://github.com/unslothai/unsloth/pull/4264
[Feature] VLMs support for GRPO by @danielhanchen in https://github.com/unslothai/unsloth/pull/4265
[Fix] Respect llm_int8_skip_modules for VLM by @danielhanchen in https://github.com/unslothai/unsloth/pull/4249
ROCM support by @danielhanchen in https://github.com/unslothai/unsloth/pull/4271
Remove Blackwell flex attention disable workaround from studio by @danielhanchen in https://github.com/unslothai/unsloth/pull/4273
ROCM support by @danielhanchen in https://github.com/unslothai/unsloth/pull/4272
fix: prevent ai-assist model config RCE via untrusted Hugging Face repos by @danielhanchen in https://github.com/unslothai/unsloth/pull/4274
fix(seed): disable remote code execution in seed inspect dataset loads by @danielhanchen in https://github.com/unslothai/unsloth/pull/4275
Update CODEOWNERS by @danielhanchen in https://github.com/unslothai/unsloth/pull/4279
fix: install data-designer plugin non-editable for Colab compatibility by @LeoBorcherding in https://github.com/unslothai/unsloth/pull/4268
Arch/mixtral by @danielhanchen in https://github.com/unslothai/unsloth/pull/4283
Improve documentation on how to export model from Colab by @danielhanchen in https://github.com/unslothai/unsloth/pull/4284
feat: Add Mixtral model support by @danielhanchen in https://github.com/unslothai/unsloth/pull/4285
Initial changes: Refactor Attention by @danielhanchen in https://github.com/unslothai/unsloth/pull/4286
patch vlm trainer to resize images by @danielhanchen in https://github.com/unslothai/unsloth/pull/4287
[WIP] add support for mixtral by @danielhanchen in https://github.com/unslothai/unsloth/pull/4288
studio: speed up setup -- uv for installs (8x), Ninja for llama.cpp (1.7x) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4289
fix: remove old comments by @Shine1i in https://github.com/unslothai/unsloth/pull/4292
PR: Windows Setup Improvements by @rolandtannous in https://github.com/unslothai/unsloth/pull/4299
miscallenous studio by @Shine1i in https://github.com/unslothai/unsloth/pull/4293
Fix: Compare Mode Deadlock, Cancel Event Poisoning & IPC Optimization by @rolandtannous in https://github.com/unslothai/unsloth/pull/4303
studio: fix GGUF inference -- reasoning tokens, max_tokens, server flags, GPU allocation by @danielhanchen in https://github.com/unslothai/unsloth/pull/4290
chat only with gguf for mac devices by @Manan17 in https://github.com/unslothai/unsloth/pull/4300
studio: add max steps and epochs toggle switch by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4296
Fix/colab plugin editable install by @LeoBorcherding in https://github.com/unslothai/unsloth/pull/4281
Graceful shutdown on Windows (signal handlers for Ctrl+C) by @rolandtannous in https://github.com/unslothai/unsloth/pull/4306
studio: simplify auth UX to password-only login by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4305
studio: preserve save_steps when toggling to epochs mode by @Imagineer99 in https://github.com/unslothai/unsloth/pull/4308
Fix studio frontend build producing empty Tailwind CSS by @danielhanchen in https://github.com/unslothai/unsloth/pull/4311
Fix setup.sh crash on Mac with empty gitignore array by @danielhanchen in https://github.com/unslothai/unsloth/pull/4313
[Feature] studio: user can upload eval dataset by @Manan17 in https://github.com/unslothai/unsloth/pull/4307
fix: Ctrl+C not terminating backend on Linux by @rolandtannous in https://github.com/unslothai/unsloth/pull/4316
Add download progress bar for non-GGUF models in Chat by @danielhanchen in https://github.com/unslothai/unsloth/pull/4314
Apply use_reentrant removal to all TRL trainer configs by @danielhanchen in https://github.com/unslothai/unsloth/pull/4321
Fix VLM GRPO matmul shape mismatch in _get_per_token_logps_and_entropies by @danielhanchen in https://github.com/unslothai/unsloth/pull/4301
Improve AI Assist: Update default model, model output parsing, logging, and dataset mapping UX by @rolandtannous in https://github.com/unslothai/unsloth/pull/4323
studio: per-model inference defaults, GGUF slider fix, reasoning toggle by @danielhanchen in https://github.com/unslothai/unsloth/pull/4325
fix: Resolve CUDA toolkit mismatch on multi-CUDA Windows systems by @rolandtannous in https://github.com/unslothai/unsloth/pull/4324
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/4332
Fix/colab comment edits by @LeoBorcherding in https://github.com/unslothai/unsloth/pull/4317
fix: add Qwen3.5 version gate in loader dispatch by @danielhanchen in https://github.com/unslothai/unsloth/pull/4335
Fix xformers Blackwell guard: broader coverage and root cause docs by @danielhanchen in https://github.com/unslothai/unsloth/pull/4338
studio: improve Colab notebook, redesign ready popup, and clean up install output by @LeoBorcherding in https://github.com/unslothai/unsloth/pull/4339
Add check to disable xformers on newer GPUs by @pluesclues in https://github.com/unslothai/unsloth/pull/4342
studio: training progress, CUDA lib path, dataset_num_proc fix by @danielhanchen in https://github.com/unslothai/unsloth/pull/4336
studio: fix stale GGUF metadata, update helper model, auth improvements by @danielhanchen in https://github.com/unslothai/unsloth/pull/4346
studio: show "Off" for repetition penalty = 1 by @danielhanchen in https://github.com/unslothai/unsloth/pull/4349
studio: update Creative/Precise presets, show "Off" for disabled samplers by @danielhanchen in https://github.com/unslothai/unsloth/pull/4350
studio: fix slow cancellation of GGUF generation by @danielhanchen in https://github.com/unslothai/unsloth/pull/4352
Fix: Remove unused warmupToastShown variable (TS6133) by @rolandtannous in https://github.com/unslothai/unsloth/pull/4353
Studio: SVG preview, fix streaming and model selector bugs by @danielhanchen in https://github.com/unslothai/unsloth/pull/4354
fix: comment out debug print statements by @rolandtannous in https://github.com/unslothai/unsloth/pull/4357
fix(llm_assist): disable thinking mode for helper model JSON output by @rolandtannous in https://github.com/unslothai/unsloth/pull/4358
studio: improve onboarding UX, tooltips, and training defaults by @danielhanchen in https://github.com/unslothai/unsloth/pull/4355

New Contributors

@LeoBorcherding made their first contribution in https://github.com/unslothai/unsloth/pull/4268
@Shine1i made their first contribution in https://github.com/unslothai/unsloth/pull/4292
@Manan17 made their first contribution in https://github.com/unslothai/unsloth/pull/4300
@Imagineer99 made their first contribution in https://github.com/unslothai/unsloth/pull/4296

Full Changelog: https://github.com/unslothai/unsloth/commits/March-2026

View release on GitHub

February-2026 New feature 5mo

Notable features

12x faster MoE training with 35% less VRAM
1.8-3.3x faster embedding model training
7x longer context RL training

Full changelog

Our first release of 2026! This year we’ve got a lot of exciting things coming and to kick things off, we’re introducing faster MoE training, embedding model support, and ultra long context for Reinforcement Learning. We’ll also be launching our brand new UI very soon.

We’d like to thank all of you for 50K stars on GitHub! ⭐

We’ve also added support for many new models that you can now run and fine-tune locally, including DeepSeek-OCR 2, GLM-4.7-Flash, Kimi-2.5, and more.

🚀 Faster MoE training

You can now train MoE models 12× faster with 35% less VRAM and 6x longer context via our new Triton and math kernels (no accuracy loss). gpt-oss-20b works on 12.8GB VRAM. Qwen3-30B-A3B (16-bit LoRA) uses 63GB.

Unsloth supports fast training for gpt-oss, Qwen3 (30B, 235B, VL, Coder), DeepSeek R1/V3 arch and GLM (4.7, Flash) models.

Faster MoE Blog

🔎 Embedding models now train 2× faster

We collaborated with Hugging Face to enable 1.8-3.3x faster embedding, BERT and classifier model training with 20% less VRAM, 2x longer context & no accuracy loss vs. FA2 setups.

Embedding model Blog

💡 Ultra Long Context RL is here

We’re introducing new batching algorithms to enable ~7x longer context (can be more than 12x) RL training with no accuracy or speed degradation vs. other optimized setups that use FA3, kernels & chunked losses.

Unsloth now trains gpt-oss QLoRA with 380K context on a single 192GB NVIDIA B200 GPU

Long Context RL Blog

🔮 New models

🐳 DeepSeek-OCR 2 - Run and fine-tune the new OCR model.
🥝 Kimi 2.5 - Run the SOTA model locally with Unsloth GGUFs.
⚡ GLM-4.7-Flash - Run and fine-tune the best-in-class 30B LLM.

🎉 Extra Updates

As part of our MoE release, we also made Gemma-3 now use Flex-Attention by default, and this works in float16 settings as well (there were infinities which we solved a while back). Gemma-3 now uses O(N) memory and not O(N^2) memory, and trains >3x faster (scales even better with context length). Previous Unsloth versions would OOM.
Vision fine-tuning now accepts mixed data of only images and text data!
trl==0.27.1 and transformers==5.1.0 are supported well - previous coverage was 30% of all our 120 notebooks, but now we have >80% coverage - we plan to make it 100% over the next few days.
And many many other bug fixes and other updates!

📖 New Guides

</> How To Use Claude Code + Codex with local LLMs: Guide
👾 Train & deploy to LM Studio for local inference: Guide
🎨 Run Diffusion image models with Unsloth GGUFs: Guide

[!TIP]
Update Unsloth via pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo
If you want PyTorch 2.9: pip install --upgrade unsloth unsloth_zoo

February is shaping up to be an amazing month for LLM releases, and we hope you’re just as excited as we are. 😊

What's Changed

[FIX] [Transformers] VLM input embeds fix for gradients by @Datta0 in https://github.com/unslothai/unsloth/pull/3715
[fbgemm] Silence tma fbgemm by @Datta0 in https://github.com/unslothai/unsloth/pull/3735
[hf_hub] Token login by @Datta0 in https://github.com/unslothai/unsloth/pull/3739
Do not overwrite slots by @Datta0 in https://github.com/unslothai/unsloth/pull/3752
Fix VLM + DDP checkpointing by @djsaunde in https://github.com/unslothai/unsloth/pull/3751
Enable 4-bit quantization on AMD Radeon GPUs by @sstamenk in https://github.com/unslothai/unsloth/pull/3748
Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3753
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3760
Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3767
Add missing import of inspect by @sstamenk in https://github.com/unslothai/unsloth/pull/3778
Clarify NotImplementedError for fast_inference with full_finetuning by @Fizza-Mukhtar in https://github.com/unslothai/unsloth/pull/3768
Update FUNDING.yml by @danielhanchen in https://github.com/unslothai/unsloth/pull/3792
fix(trainer): import psutil to prevent NameError in _prepare_dataset by @alkinun in https://github.com/unslothai/unsloth/pull/3780
fastrope fix for zero strided tensors by @f14-bertolotti in https://github.com/unslothai/unsloth/pull/3782
Fix crash when trl.experimental.openenv is unavailable by @Fizza-Mukhtar in https://github.com/unslothai/unsloth/pull/3787
Fix Boolean value of Tensor ambiguity error in mistral.py by @yurekami in https://github.com/unslothai/unsloth/pull/3790
fix: add support for init_lora_weights="corda" in get_peft_model by @majiayu000 in https://github.com/unslothai/unsloth/pull/3794
Fix correctness bugs in rl.py, rl_replacements.py, and vision.py by @danielhanchen in https://github.com/unslothai/unsloth/pull/3811
Fix correctness bugs across multiple model files by @danielhanchen in https://github.com/unslothai/unsloth/pull/3813
Fix 3D tensor support for bitsandbytes 8-bit matmul in forward pass by @Fizza-Mukhtar in https://github.com/unslothai/unsloth/pull/3806
FIX: weight tying for LoRA embeddings and lm_head by @oKatanaaa in https://github.com/unslothai/unsloth/pull/3711
Fix Gemma3 QAT training instability with int8-int4 scheme by @danielhanchen in https://github.com/unslothai/unsloth/pull/3818
Add helpful error messages for fast_generate when fast_inference=False by @danielhanchen in https://github.com/unslothai/unsloth/pull/3820
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3821
Make llama.cpp CURL dependency optional when building from source by @Fizza-Mukhtar in https://github.com/unslothai/unsloth/pull/3822
remove redundant code of has_block by @ykaitao in https://github.com/unslothai/unsloth/pull/3832
rl.py fixes: buffer reset, safer attribute access, typo fix by @danielhanchen in https://github.com/unslothai/unsloth/pull/3834
Respect user quantization_config by @danielhanchen in https://github.com/unslothai/unsloth/pull/3835
Fix vLLM PDL bug on Blackwell GPUs (B200/B100) by @danielhanchen in https://github.com/unslothai/unsloth/pull/3841
Sync chat_template from tokenizer to vLLM by @danielhanchen in https://github.com/unslothai/unsloth/pull/3842
remove unused variable BlockDiagonalCausalMask by @ykaitao in https://github.com/unslothai/unsloth/pull/3836
Replace GitHub API check with vLLM version check for PDL fix by @danielhanchen in https://github.com/unslothai/unsloth/pull/3849
GRPO: restore model mode after generate (stacked on #3754) by @danielhanchen in https://github.com/unslothai/unsloth/pull/3851
Fix model training state restoration in GRPO trainer by @numb3r33 in https://github.com/unslothai/unsloth/pull/3754
Unify Version usage and fix TRL version handling by @danielhanchen in https://github.com/unslothai/unsloth/pull/3843
[ModelScope] Disable stats when modelscope is being used by @Datta0 in https://github.com/unslothai/unsloth/pull/3857
Fix FBGEMM/CUTLASS errors on SM100 (Blackwell) GPUs by @danielhanchen in https://github.com/unslothai/unsloth/pull/3863
Feature/raw text dataprep by @Vangmay in https://github.com/unslothai/unsloth/pull/3612
Fix Kaggle telemetry misclassification when COLAB_ keys exist by @hnxnq7 in https://github.com/unslothai/unsloth/pull/3869
reduce code duplication by _offload_frozen_module_for_training by @ykaitao in https://github.com/unslothai/unsloth/pull/3865
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3881
wrong number of dimensions by @f14-bertolotti in https://github.com/unslothai/unsloth/pull/3880
Disable gradient checkpointing when explicitly off for vision by @ducviet00 in https://github.com/unslothai/unsloth/pull/3879
[trl] use non lora model as base for RL by @Datta0 in https://github.com/unslothai/unsloth/pull/3895
Chunk Across Batch and Context length for logprob calculations for grpo by @pluesclues in https://github.com/unslothai/unsloth/pull/3628
add weight-only int8 QAT scheme and update tests for torchao 0.15.0 by @electroglyph in https://github.com/unslothai/unsloth/pull/3859
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3905
Fix vllm ipykernel patch by @pluesclues in https://github.com/unslothai/unsloth/pull/3907
Handle Transformers 5 vLLM import errors by @danielhanchen in https://github.com/unslothai/unsloth/pull/3908
add FastSentenceTransformer for easily finetuning SentenceTransformer models by @electroglyph in https://github.com/unslothai/unsloth/pull/3719
Guard torch.compile on ROCm when triton_key is missing by @hnxnq7 in https://github.com/unslothai/unsloth/pull/3923
Grpo compile settings update by @pluesclues in https://github.com/unslothai/unsloth/pull/3927
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3937
chore: Update outdated GitHub Actions version by @pgoslatara in https://github.com/unslothai/unsloth/pull/3936
[trl] vllm trl topk fixup by @Datta0 in https://github.com/unslothai/unsloth/pull/3935
[fix] qwen3-guard tokenizer by @Datta0 in https://github.com/unslothai/unsloth/pull/3959
fix for intel devices torch compile configs by @leizhenyuan in https://github.com/unslothai/unsloth/pull/3952
Use standard gradient checkpointing for small sequence lengths by @danielhanchen in https://github.com/unslothai/unsloth/pull/3867
reduce code duplication by @ykaitao in https://github.com/unslothai/unsloth/pull/3877
Fix TRL 0.27.0 GRPO compatibility and PEFT model handling by @danielhanchen in https://github.com/unslothai/unsloth/pull/3969
Fix Vision GRPO string prompts and OpenEnv async compatibility by @danielhanchen in https://github.com/unslothai/unsloth/pull/3964
Fix num_train_epochs=None causing TypeError in GRPOConfig by @danielhanchen in https://github.com/unslothai/unsloth/pull/3972
Add TRL truncation regression and metadata loss fixes (Fixes 1 and 3) by @danielhanchen in https://github.com/unslothai/unsloth/pull/3971
Add vLLM + torch < 2.9.0 + SM100 compatibility check by @danielhanchen in https://github.com/unslothai/unsloth/pull/3973
Fix torchvision compatibility check for source builds and future torch versions by @danielhanchen in https://github.com/unslothai/unsloth/pull/3978
Trl 0.27.0 update by @pluesclues in https://github.com/unslothai/unsloth/pull/3965
Prefer flex attention when available by @danielhanchen in https://github.com/unslothai/unsloth/pull/3979
Fix GPT-OSS BlockMask error during inference by @danielhanchen in https://github.com/unslothai/unsloth/pull/3982
Silence third-party deprecation warnings and fix socket leak by @danielhanchen in https://github.com/unslothai/unsloth/pull/3983
Silence non-actionable TRL trainer import failures by @danielhanchen in https://github.com/unslothai/unsloth/pull/3980
Add PyTorch 2.10 and xformers 0.0.34 support by @danielhanchen in https://github.com/unslothai/unsloth/pull/3985
[MoE] Improve moe kernels for unsloth fine tuning by @Datta0 in https://github.com/unslothai/unsloth/pull/3812
Fix RuntimeError not caught when torchcodec fails to load by @danielhanchen in https://github.com/unslothai/unsloth/pull/3987
Fix cutlass inductor options for PyTorch < 2.8.0 by @danielhanchen in https://github.com/unslothai/unsloth/pull/3988
Disable torchcodec in transformers when FFmpeg is missing by @danielhanchen in https://github.com/unslothai/unsloth/pull/3989
Update rl_replacements.py to filter through correct trl version by @pluesclues in https://github.com/unslothai/unsloth/pull/3990
Fix multiprocessing crash on Windows/macOS and unify num_proc logic by @danielhanchen in https://github.com/unslothai/unsloth/pull/3999
Fix triton 3.6.0 + torch 2.9.x torch.compile crash (missing cluster_dims) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4001
Add push_to_hub_gguf support for FastSentenceTransformer by @Etherll in https://github.com/unslothai/unsloth/pull/4002
[Feature] seperate gguf file path by @RektPunk in https://github.com/unslothai/unsloth/pull/3934
Refactor Ollama template wiring and harden packing helpers by @mmangkad in https://github.com/unslothai/unsloth/pull/3890
Fix multi-GPU loading for quantized models in distributed training by @Fizza-Mukhtar in https://github.com/unslothai/unsloth/pull/3917
Fix broken documentation links, typos, and formatting in README by @danielhanchen in https://github.com/unslothai/unsloth/pull/4003
fix: inputs_embeds ignored when input_ids is not None in _fast_prepare_inputs_for_generation by @siddhudonda in https://github.com/unslothai/unsloth/pull/3814
Fix notebook compatibility for transformers 4.57.6 and TRL 0.22-0.27 by @danielhanchen in https://github.com/unslothai/unsloth/pull/3998
Fix VLM model + text-only dataset ValueError in TRL 0.22.x by @danielhanchen in https://github.com/unslothai/unsloth/pull/4004
Fix trl.experimental thin wrapper compilation and OOM from peft_config overwrite by @danielhanchen in https://github.com/unslothai/unsloth/pull/4006
Fix dtype mismatch in fp16 + 4-bit/8-bit LoRA training by @danielhanchen in https://github.com/unslothai/unsloth/pull/4005
Silence TRL's batch_size=1 padding-free warning in compiled trainer source by @danielhanchen in https://github.com/unslothai/unsloth/pull/4007
Silence peft target_parameters RuntimeWarning for MoE models by @danielhanchen in https://github.com/unslothai/unsloth/pull/4008
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/4009
Suppress vLLM v1 executor sleep/wake log messages by @danielhanchen in https://github.com/unslothai/unsloth/pull/4011
Inject model reference for dynamic token_type_ids detection in SFTTrainer by @danielhanchen in https://github.com/unslothai/unsloth/pull/4012
Fix EmbeddingGemma float16 NaN via FORCE_FLOAT32 for gemma3_text by @danielhanchen in https://github.com/unslothai/unsloth/pull/4014
Fix #3397: Prevent trainer tokenization hang with safe num_proc by @Fizza-Mukhtar in https://github.com/unslothai/unsloth/pull/4013
add llama.cpp prefix to gguf conversion help messages by @rolandtannous in https://github.com/unslothai/unsloth/pull/4016
[Misc] Fixes by @Datta0 in https://github.com/unslothai/unsloth/pull/4015
FP8: Load model on-the-fly in vLLM by @andrewor14 in https://github.com/unslothai/unsloth/pull/3717
Fix Gemma3 4B training on transformers 5.x (token_type_ids) by @danielhanchen in https://github.com/unslothai/unsloth/pull/4017
Fix warmup_ratio deprecation for transformers >= 5.0 by @danielhanchen in https://github.com/unslothai/unsloth/pull/4019
Misc fixes by @Datta0 in https://github.com/unslothai/unsloth/pull/4018

Unsloth Zoo Changes

Fix training crash when using DoRA + 4-bit quantization by @Etherll in https://github.com/unslothai/unsloth-zoo/pull/394
fix for #392, transformers 5 by @electroglyph in https://github.com/unslothai/unsloth-zoo/pull/393
fix: adds missing import for torch.distributed by @namekian-mystifier in https://github.com/unslothai/unsloth-zoo/pull/422
Fix dtype mismatch in full finetuning + float16 inference by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/424
Fix undefined variable 'e' in Version() function by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/425
Fix correctness bugs in logging_utils.py and loss_utils.py by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/426
Fix execute_with_time_limit start_method bug by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/428
Fix OpenEnv PYTHONPATH auto-detection for compatibility by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/429
Fix VARIANT_KWARG_KEYS import for peft >= 0.18.0 by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/430
Fix ZeroDivisionError in fused cross entropy when GPU memory exhausted by @GabrielArpini in https://github.com/unslothai/unsloth-zoo/pull/432
Only enable gradient checkpointing when requested by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/433
Removing import check in compiler.py by @Vidit-Ostwal in https://github.com/unslothai/unsloth-zoo/pull/431

Unsloth Notebooks changes

Add Gemma phone deployment notebook by @glee2429 in https://github.com/unslothai/notebooks/pull/146
Use stable executorch 1.0.0 and optimum-executorch v0.1.0 by @danielhanchen in https://github.com/unslothai/notebooks/pull/151
Update 2048 RL notebook with training results by @danielhanchen in https://github.com/unslothai/notebooks/pull/152
Update 2048 RL notebook with extended training results by @danielhanchen in https://github.com/unslothai/notebooks/pull/153
new GRPO update notebooks by @pluesclues in https://github.com/unslothai/notebooks/pull/155
gemma3 1b changes by @pluesclues in https://github.com/unslothai/notebooks/pull/156
nemo gym multi environment notebook by @cmunley1 in https://github.com/unslothai/notebooks/pull/158
Add LFM2.5 notebooks by @mlabonne in https://github.com/unslothai/notebooks/pull/159
Revert "Add LFM2.5 notebooks" by @danielhanchen in https://github.com/unslothai/notebooks/pull/161
Restore UNSLOTH_VLLM_STANDBY in Kaggle Gemma3 Vision GRPO by @danielhanchen in https://github.com/unslothai/notebooks/pull/163
Grpo update gemma notebooks correctly and news lines for notebooks by @pluesclues in https://github.com/unslothai/notebooks/pull/157
Add LFM2.5 notebooks (reopen #159) by @danielhanchen in https://github.com/unslothai/notebooks/pull/164
GLM 4.7 Flash finetuning notebook by @Datta0 in https://github.com/unslothai/notebooks/pull/166
Embedding models notebooks by @Etherll in https://github.com/unslothai/notebooks/pull/160
add Qwen3_Embedding_0.6B notebook by @Etherll in https://github.com/unslothai/notebooks/pull/167
[UPDATE] Update openenv notebooks to use the latest implementation by @burtenshaw in https://github.com/unslothai/notebooks/pull/165
Fix Vision GRPO chat template and Orpheus column removal by @danielhanchen in https://github.com/unslothai/notebooks/pull/171
update nemo gym notebooks by @cmunley1 in https://github.com/unslothai/notebooks/pull/169
Fix Vision GRPO notebooks and Orpheus TTS compatibility by @danielhanchen in https://github.com/unslothai/notebooks/pull/172
Add AMD known issues note by @hnxnq7 in https://github.com/unslothai/notebooks/pull/168
Update Dockerfile_DGX_Spark by @XEL-Maker in https://github.com/unslothai/notebooks/pull/162
Revert PR #165 - OpenEnv notebooks by @danielhanchen in https://github.com/unslothai/notebooks/pull/179
Fix update_all_notebooks.py script improvements by @danielhanchen in https://github.com/unslothai/notebooks/pull/176
Makign qwen 2.5 7b compatible with old trl versions. by @pluesclues in https://github.com/unslothai/notebooks/pull/177
Fix Ministral VL installation cells by @danielhanchen in https://github.com/unslothai/notebooks/pull/181
Improve update_all_notebooks.py: format preservation, cross-platform fixes, parallelization by @danielhanchen in https://github.com/unslothai/notebooks/pull/183
Refactor update_all_notebooks.py: reorder sections, CRLF handling, README categories by @danielhanchen in https://github.com/unslothai/notebooks/pull/184
Separate OCR into its own README section by @danielhanchen in https://github.com/unslothai/notebooks/pull/185
[MoE] notebooks for Colab by @Datta0 in https://github.com/unslothai/notebooks/pull/187

New Contributors

@sstamenk made their first contribution in https://github.com/unslothai/unsloth/pull/3748
@Fizza-Mukhtar made their first contribution in https://github.com/unslothai/unsloth/pull/3768
@alkinun made their first contribution in https://github.com/unslothai/unsloth/pull/3780
@f14-bertolotti made their first contribution in https://github.com/unslothai/unsloth/pull/3782
@yurekami made their first contribution in https://github.com/unslothai/unsloth/pull/3790
@majiayu000 made their first contribution in https://github.com/unslothai/unsloth/pull/3794
@ykaitao made their first contribution in https://github.com/unslothai/unsloth/pull/3832
@numb3r33 made their first contribution in https://github.com/unslothai/unsloth/pull/3754
@Vangmay made their first contribution in https://github.com/unslothai/unsloth/pull/3612
@hnxnq7 made their first contribution in https://github.com/unslothai/unsloth/pull/3869
@ducviet00 made their first contribution in https://github.com/unslothai/unsloth/pull/3879
@electroglyph made their first contribution in https://github.com/unslothai/unsloth/pull/3859
@pgoslatara made their first contribution in https://github.com/unslothai/unsloth/pull/3936
@RektPunk made their first contribution in https://github.com/unslothai/unsloth/pull/3934
@mmangkad made their first contribution in https://github.com/unslothai/unsloth/pull/3890
@siddhudonda made their first contribution in https://github.com/unslothai/unsloth/pull/3814

Full Changelog: https://github.com/unslothai/unsloth/compare/December-2025...February-2026

View release on GitHub

December-2025 New feature 7mo

Notable features

3x faster training with padding-free packing
500K context length support
PyTorch phone deployment guide

Full changelog

Thanks for all the love and support this year! We're wishing you all a lovely Christmas. Please update Unsloth & our Docker to use the latest updates! 🦥

Introducing 3x faster training & 30% less VRAM. New Triton kernels, padding-free & packing. Blog
500K Context training and reinforcement learning is now possible on a single 80GB GPU. Blog • Notebook
Fine-tune then Deploy LLMs on your Phone with PyTorch and Unsloth. Tweet • Read Guide
🤗 Transformers v5 is now supported! It's not enabled by default due to possible instability issues.
Preliminary multi-GPU support: DDP Guide (not representative of the official release early next year)
More: Sudoku RL nb • Paddle-OCR nb • New NVIDIA blog
Lots of bug fixes! See further below.

:crystal_ball: New Models + Guides

:sparkles:FunctionGemma: Google new 270M tool-calling LLM. Guide • Notebook
Nemotron 3: NVIDIA new 30B reasoning model. Guide • GGUF
Mistral: new coding & instruct VLMs. Ministral 3 • Devstral 2
GLM-4.6V: new vision models. Guide • 4.6V • 4.6V-Flash
More: Qwen3-Next • Mistral Large 3 • FLUX.2-dev

[!TIP]
Update Unsloth via pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo
If you want PyTorch 2.9: pip install --upgrade unsloth unsloth_zoo

Bug Fixes and Enhancements

Supports rollout_func allowing multi turn RL to work
Supports vllm>=0.12.0 and efficient GRPO for it
Supports transformers>=5.0.0, first shown via our Ministral notebooks
Fix HuggingFace token logins not working for private repos
Fixes TorchAO and QAT not working during saving
Fixed DeepSeek OCR finetuning not loading finetuned models
Improved vision utilities for vision VLM finetuning

What's Changed

Fix llama tokenizer padding_side when using model.generate in inference mode by @dmsuehir in https://github.com/unslothai/unsloth/pull/3644
Fix indefinite article usage in comments and docstrings by @mk0walsk in https://github.com/unslothai/unsloth/pull/3648
fix rope_theta -> rope_parameters['rope_theta'] by @mmathew23 in https://github.com/unslothai/unsloth/pull/3651
Fix broken link for advanced pip installation in README by @gitpullpull in https://github.com/unslothai/unsloth/pull/3652
Fix: prevent load_in_fp8 kwarg from reaching Qwen3MoeForCausalLM constructor (Fix #3649) by @bhuvanprakash in https://github.com/unslothai/unsloth/pull/3654
make unsloth_tiled_mlp a from_pretrained arg by @mmathew23 in https://github.com/unslothai/unsloth/pull/3655
FIX set defualt [128, 128] insted of none by @ved1beta in https://github.com/unslothai/unsloth/pull/3658
Fix: Pass gradient_checkpointing parameter to model.for_training() by @sbhavani in https://github.com/unslothai/unsloth/pull/3659
[FIX] Vllm guided decoding params by @Datta0 in https://github.com/unslothai/unsloth/pull/3662
Vllm guided decoding by @Datta0 in https://github.com/unslothai/unsloth/pull/3663
Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3664
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3666
Update transformers version constraint in pyproject.toml by @noah1510 in https://github.com/unslothai/unsloth/pull/3689
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3694
Remove reload_weights rpc call from grpo trainer by @Datta0 in https://github.com/unslothai/unsloth/pull/3673
[Fix] [TRL] load_lora for multi line llm.chat/generate by @Datta0 in https://github.com/unslothai/unsloth/pull/3696
Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3698
SFT sample packing by @djsaunde in https://github.com/unslothai/unsloth/pull/3566
Auto-enable padding-free SFT by @djsaunde in https://github.com/unslothai/unsloth/pull/3672
[FIX] fbgemm version check by @Datta0 in https://github.com/unslothai/unsloth/pull/3704
Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3706
update TRL filter by @djsaunde in https://github.com/unslothai/unsloth/pull/3707
[intel] skip xpu fbgemm fp8 by @leizhenyuan in https://github.com/unslothai/unsloth/pull/3625
Mistral packing, train on completions only, simplifications by @djsaunde in https://github.com/unslothai/unsloth/pull/3709
Update torchao save by @metascroy in https://github.com/unslothai/unsloth/pull/3679
Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3720
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3731
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3734
Update FUNDING.yml by @danielhanchen in https://github.com/unslothai/unsloth/pull/3736
Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3737
Fix Deepseek OCR Lora Model Load by @mmathew23 in https://github.com/unslothai/unsloth/pull/3738

Unsloth Zoo Changes

updates for vLLM compativility with lora by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/359
Nightly by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/355
Add logging to tiled mlp and fix target chunk size calculation by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/361
Remove include_buffers from init_empty_weights by @pluesclues in https://github.com/unslothai/unsloth-zoo/pull/363
packed seq lengths token count correction by @djsaunde in https://github.com/unslothai/unsloth-zoo/pull/348
Configure ce target gb by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/365
[FIX] vLLM LoRA extra vocab by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/367
Nightly by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/368
[FIX] vLLM local lora tensor loading by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/370
vllm lora_dir rename and make embedding padding optional by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/373
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/375
Update e to error by @ChetanKrishna07 in https://github.com/unslothai/unsloth-zoo/pull/374
Vision utils decode image improvement by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/372
[FIX] [DDP] Fix compile for distributed training by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/379
Nightly by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/382
update compiler for XLMRobertaModel by @electroglyph in https://github.com/unslothai/unsloth-zoo/pull/383
Fix Deepseek OCR Lora Model Load by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/386
fix for non-generation models in transformers 5 by @electroglyph in https://github.com/unslothai/unsloth-zoo/pull/388

New Contributors

@dmsuehir made their first contribution in https://github.com/unslothai/unsloth/pull/3644
@gitpullpull made their first contribution in https://github.com/unslothai/unsloth/pull/3652
@bhuvanprakash made their first contribution in https://github.com/unslothai/unsloth/pull/3654
@ved1beta made their first contribution in https://github.com/unslothai/unsloth/pull/3658
@sbhavani made their first contribution in https://github.com/unslothai/unsloth/pull/3659
@noah1510 made their first contribution in https://github.com/unslothai/unsloth/pull/3689
@ChetanKrishna07 made their first contribution in https://github.com/unslothai/unsloth-zoo/pull/374
@electroglyph made their first contribution in https://github.com/unslothai/unsloth-zoo/pull/383

Full Changelog: https://github.com/unslothai/unsloth/compare/November-2025...December-2025

View release on GitHub

November-2025 New feature 8mo

Notable features

FP8 RL training with 1.4x speedup
DeepSeek-OCR fine-tuning support
Qwen3-VL model support

Full changelog

We’re getting close to our final release of 2025! Thanks so much for sticking with us this year. We’ve got lots of new features so please update Unsloth & our Docker to use the latest updates! 🦥

Introducing FP8 Reinforcement Learning in Unsloth! Train on any FP8 supported GPU and get 1.4x faster with 60% less VRAM: Read our Blog/Guide • Notebooks: Qwen3-8B FP8 GRPO and Llama-3.2-1B FP8 GRPO
You may notice Unsloth now uses much less VRAM than before, enabling even longer context. We’re also implementing faster training very soon and we’ll share all the details in an upcoming blog.
DeepSeek-OCR fine-tuning is here! We fine-tuned DeepSeek-OCR, improving its language understanding by 89%. Read our Blog • Free notebook
Qwen3-VL models supported including GGUFs to run locally: Blogpost + fixes • GGUFs
We analyzed RL training-inference mismatch for FP16 vs. BF16 and concluded that Unsloth does not have this issue: Analysis and Results
We’ve partnered with Docker to let you run LLMs locally with zero setup. Docker GGUFs are now powered by Unsloth Dynamic.
Example: docker model run hf.co/unsloth/gpt-oss-20b-GGUF:F16 Read guide
Baidu ERNIE models are now supported. Notebooks coming soon.
Unsloth now supports SGLang. Read our guide
We wrote guides for LoRA Hot Swapping and vLLM Engine Arguments
Run Kimi-K2-Thinking the most powerful open model locally. Kimi-K2 Guide
Lots of bug fixes! See further below.

[!TIP]
Update Unsloth via pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo
If you want PyTorch 2.9: pip install --upgrade unsloth unsloth_zoo

Bug Fixes and Enhancements

Supports trl>=0.25.0 and vllm>=0.11.2 and transformers>=4.57.1
Fixed gpt-oss GRPO, RL excessive re-compilations on torch>=2.9.0
Fixes Sleep mode and reduces memory usage by 5 to 15% further for RL, GRPO
Fix propagation of trust_remote_code = True
Fix Unsloth offloaded gradient checkpointing not offloading on 1st step - reduces VRAM by >20%
Add logits.detach() to GRPO to solve double backwards on some pathways
Add int64 kernels & fixed RoPE embeddings to allow super ultra long context training
Fixed 📓 OpenEnv gpt-oss RL notebook
DGX Spark docker image fixed

What's Changed

Grpo gradient accumulation edits by @pluesclues in https://github.com/unslothai/unsloth/pull/3390
Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3532
Handle TRL version compatibility in rl_replacements.py by @pluesclues in https://github.com/unslothai/unsloth/pull/3540
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3546
Sleep trl patch by @Datta0 in https://github.com/unslothai/unsloth/pull/3517
Detach logits before returning from function by @pluesclues in https://github.com/unslothai/unsloth/pull/3554
Fix typos in comment by @mk0walsk in https://github.com/unslothai/unsloth/pull/3557
Formatting & bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3563
DeepseekOCR: add trust_remote_code kwarg by @mmathew23 in https://github.com/unslothai/unsloth/pull/3564
pre-commit CI config by @djsaunde in https://github.com/unslothai/unsloth/pull/3565
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3576
Resize rope embeddings for long sequence training by @mmathew23 in https://github.com/unslothai/unsloth/pull/3586
Patch in tiled mlp by @mmathew23 in https://github.com/unslothai/unsloth/pull/3584
Support for out-of-source quantizers by @Giuseppe5 in https://github.com/unslothai/unsloth/pull/3534
Fix: prevent rope_embedding AssertionError by checking kv_seq_len before reuse by @jarrycyx in https://github.com/unslothai/unsloth/pull/3578
Extend TorchAOConfig to support mobile usecases by @metascroy in https://github.com/unslothai/unsloth/pull/3587
fix qwen3 vl gradient accumulation by @mmathew23 in https://github.com/unslothai/unsloth/pull/3598
Do not force set beta to 0 for DAPO by @Datta0 in https://github.com/unslothai/unsloth/pull/3604
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3606
Fix broken links and typo in README by @mk0walsk in https://github.com/unslothai/unsloth/pull/3611
remove pre-commit workflow (covered by pre-commit app) by @djsaunde in https://github.com/unslothai/unsloth/pull/3618
Add an int64 path for mlp kernels by @mmathew23 in https://github.com/unslothai/unsloth/pull/3614
Remove grpo requirement bs=num_generations by @mmathew23 in https://github.com/unslothai/unsloth/pull/3609
Enable FP8 + RL training for bf16 models by @andrewor14 in https://github.com/unslothai/unsloth/pull/3440
Fix/save torchao model loading logic by @rolandtannous in https://github.com/unslothai/unsloth/pull/3621
Fix LlamaModel_fast_forward signature to match HF Transformers (Support inputs_embeds) by @MercuryYen in https://github.com/unslothai/unsloth/pull/3623
Add 128x128 PerBlock FP8 + RL by @andrewor14 in https://github.com/unslothai/unsloth/pull/3629
Add trust_remote_code parameter to tokenizer by @Etherll in https://github.com/unslothai/unsloth/pull/3631
[intel] change windows to remove windows-triton for intel xpu by @leizhenyuan in https://github.com/unslothai/unsloth/pull/3168

Unsloth Zoo Changes

Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/327
Fix GRPO by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/328
fix gpt oss memory calculation for intel device by @leizhenyuan in https://github.com/unslothai/unsloth-zoo/pull/330
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/331
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/332
fixed unbound local error tokenizer-model from cache by @rolandtannous in https://github.com/unslothai/unsloth-zoo/pull/333
Now it works on a uv venv by @kittawere in https://github.com/unslothai/unsloth-zoo/pull/336
Gemma3n fix by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/338
[Intel] remove triton windows for intel by @leizhenyuan in https://github.com/unslothai/unsloth-zoo/pull/243
FP8 training enhancements by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/337
GRPO gradient accumulation steps update and DAPO support by @pluesclues in https://github.com/unslothai/unsloth-zoo/pull/308
Fix/video collate by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/342
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/344
FP8, Standby and vLLM updates by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/340
Put importance sampling into no grad by @pluesclues in https://github.com/unslothai/unsloth-zoo/pull/343
Detach hidden states to avoid gradient carry by @pluesclues in https://github.com/unslothai/unsloth-zoo/pull/345
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/347
MoE: Cast routing_weights dtype correctly by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/349
return local model in determine_base_model_source with any quantization by @noah1510 in https://github.com/unslothai/unsloth-zoo/pull/334
Enable FP8 + RL training by @andrewor14 in https://github.com/unslothai/unsloth-zoo/pull/351
Tiled MLP Implementation by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/350
Fix gradient checkpointing layer caller kwargs by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/353
vLLM weight scale FP8 and standby override by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/354
Fix docstring removing regex to support empty parentheses by @noisycat3 in https://github.com/unslothai/unsloth-zoo/pull/360

Unsloth Notebooks Changes

Feat/qwen3 vl by @Erland366 in https://github.com/unslothai/notebooks/pull/119
Feat/double footer fix by @Erland366 in https://github.com/unslothai/notebooks/pull/121
Add GGUF section for Qwen3-VL by @Etherll in https://github.com/unslothai/notebooks/pull/123
Fix TypeError in unsloth_push_to_hub_gguf() when pushing GGUF model to Hugging Face by @samanta-sc in https://github.com/unslothai/notebooks/pull/125
fix TorchAOConfig' object has no attribute 'base_config' error by @rolandtannous in https://github.com/unslothai/notebooks/pull/129
Updated Dockerfile for DGX Spark by @sameersegal in https://github.com/unslothai/notebooks/pull/133
gemma3-270m: reduce batch size for sample packing by @djsaunde in https://github.com/unslothai/notebooks/pull/135
fix dataset formatting and mapping for Magistral reasoning by @rolandtannous in https://github.com/unslothai/notebooks/pull/136
fix magistral inference by @rolandtannous in https://github.com/unslothai/notebooks/pull/138

Full Changelog: https://github.com/unslothai/unsloth/compare/October-2025...November-2025

What's Changed

Grpo gradient accumulation edits by @pluesclues in https://github.com/unslothai/unsloth/pull/3390
Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3532
Handle TRL version compatibility in rl_replacements.py by @pluesclues in https://github.com/unslothai/unsloth/pull/3540
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3546
Sleep trl patch by @Datta0 in https://github.com/unslothai/unsloth/pull/3517
Detach logits before returning from function by @pluesclues in https://github.com/unslothai/unsloth/pull/3554
Fix typos in comment by @mk0walsk in https://github.com/unslothai/unsloth/pull/3557
Formatting & bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3563
DeepseekOCR: add trust_remote_code kwarg by @mmathew23 in https://github.com/unslothai/unsloth/pull/3564
pre-commit CI config by @djsaunde in https://github.com/unslothai/unsloth/pull/3565
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3576
Resize rope embeddings for long sequence training by @mmathew23 in https://github.com/unslothai/unsloth/pull/3586
Patch in tiled mlp by @mmathew23 in https://github.com/unslothai/unsloth/pull/3584
Support for out-of-source quantizers by @Giuseppe5 in https://github.com/unslothai/unsloth/pull/3534
Fix: prevent rope_embedding AssertionError by checking kv_seq_len before reuse by @jarrycyx in https://github.com/unslothai/unsloth/pull/3578
Extend TorchAOConfig to support mobile usecases by @metascroy in https://github.com/unslothai/unsloth/pull/3587
fix qwen3 vl gradient accumulation by @mmathew23 in https://github.com/unslothai/unsloth/pull/3598
Do not force set beta to 0 for DAPO by @Datta0 in https://github.com/unslothai/unsloth/pull/3604
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3606
Fix broken links and typo in README by @mk0walsk in https://github.com/unslothai/unsloth/pull/3611
remove pre-commit workflow (covered by pre-commit app) by @djsaunde in https://github.com/unslothai/unsloth/pull/3618
Add an int64 path for mlp kernels by @mmathew23 in https://github.com/unslothai/unsloth/pull/3614
Remove grpo requirement bs=num_generations by @mmathew23 in https://github.com/unslothai/unsloth/pull/3609
Enable FP8 + RL training for bf16 models by @andrewor14 in https://github.com/unslothai/unsloth/pull/3440
Fix/save torchao model loading logic by @rolandtannous in https://github.com/unslothai/unsloth/pull/3621
Fix LlamaModel_fast_forward signature to match HF Transformers (Support inputs_embeds) by @MercuryYen in https://github.com/unslothai/unsloth/pull/3623
Add 128x128 PerBlock FP8 + RL by @andrewor14 in https://github.com/unslothai/unsloth/pull/3629
Add trust_remote_code parameter to tokenizer by @Etherll in https://github.com/unslothai/unsloth/pull/3631
[intel] change windows to remove windows-triton for intel xpu by @leizhenyuan in https://github.com/unslothai/unsloth/pull/3168
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3634
Float8 GRPO, RL by @danielhanchen in https://github.com/unslothai/unsloth/pull/3640

New Contributors

@mk0walsk made their first contribution in https://github.com/unslothai/unsloth/pull/3557
@pre-commit-ci[bot] made their first contribution in https://github.com/unslothai/unsloth/pull/3576
@Giuseppe5 made their first contribution in https://github.com/unslothai/unsloth/pull/3534
@jarrycyx made their first contribution in https://github.com/unslothai/unsloth/pull/3578
@MercuryYen made their first contribution in https://github.com/unslothai/unsloth/pull/3623

Full Changelog: https://github.com/unslothai/unsloth/compare/October-2025...November-2025

View release on GitHub

October-2025 New feature 9mo

Notable features

Unsloth Docker image on Docker Hub
Quantization-Aware Training (QAT) with 70% accuracy recovery
Qwen3-VL and Granite-4.0 support

Full changelog

Hey everyone, please update Unsloth to use the latest updates! 🦥

Unsloth now has its own 🐋 Docker image! Start training with no setup: Read our Guide • Docker image
We collabed with NVIDIA for Blackwell and DGX Spark support. Read our Blackwell guide and DGX guide.

New model updates

Qwen3-VL models are all now supported: Blogpost • SFT 8B notebook • GRPO 8B notebook
IBM Granite-4.0 models are now supported. Granite-4.0 guide • Notebook
OpenAI showcased our new gpt-oss RL notebook for autonomously solving the 2048 game. Blogpost • Notebook
Read about our GLM-4.6 chat template fixes and how to run the model here

New features

Introducing Quantization-Aware Training: We collabed with Pytorch for QAT, recovering as much 70% accuracy. Read blog
Unsloth supports OpenEnv to allow for open RL environments. Blog coming soon • Notebook
New customer support agent notebook to enable real-time analysis & solving of customer interactions. You'll also learn how to train models using data from Google Sheets.
Support for Python 3.13, PyTorch 2.9 and the latest Hugging Face TRL and transformers are now fixed.
Save to TorchAO supported as well:

from torchao.quantization import Int4WeightOnlyConfig
model.save_pretrained_torchao("model", tokenizer, torchao_config = Int4WeightOnlyConfig())

[!TIP]
Update Unsloth via pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo
If you want PyTorch 2.9: pip install --upgrade unsloth unsloth_zoo

RL Improvements

Fixed Standby consuming more VRAM than usual. Auto selects the maximum 80% to 95% of GPU utilization if import os; os.environ["UNSLOTH_VLLM_STANDBY"] = "1" is used.
Fixed GRPO training hangs with better environment timers - works on DGX Spark and all other GPUs.
Fixes GRPO RuntimeError: shape '[1, 887, 1, 128]' is invalid for input of size 3633152 for all models

RL Environment functions

New execute_with_time_limit function to force functions to execute within a time limit. E.g. with a 2 second time limit, use:

from unsloth import execute_with_time_limit
@execute_with_time_limit(2)
def execute_strategy(strategy, game):
    return _execute_strategy(strategy, game)
try:
    execute_strategy(strategy, game)
except TimeoutError as e:
    print(f"Timed out with error = {str(e)}")

To check if only Python standard modules are used in a function, use check_python_modules.
Use create_locked_down_function to create a function without leakage of global variables.
Use Benchmarker ie from unsloth import Benchmarker to benchmark functions accurately. It wipes the L1 to L3 cache approximately to reduce chances of benchmark cheating.
Use launch_openenv to launch a continuous reloaded OpenEnv environment process (to stop it from closing down) ie from unsloth import launch_openenv It will auto find a port that is not used.

Bug fixes

GPT-OSS BF16 The GPTOSSRouter works with load_in_4bit = True AttributeError: 'GptOssTopKRouter' object has no attribute 'weight'
Mistral training fixed - sentencepiece proto issue fixed (any protobuf version works)
Fix evaluation ie UNSLOTH_RETURN_LOGITS="1" works. Fixes https://github.com/unslothai/unsloth/issues/3126 https://github.com/unslothai/unsloth/issues/3071
Fixes Output 0 of UnslothFusedLossBackward is a view and is being modified inplace. for Gemma 3 and transformers>=4.57.1
If you see ImportError: cannot import name '_Ink' from 'PIL._typing' (/usr/local/lib/python3.12/dist-packages/PIL/_typing.py) please update and use our new notebooks

Don't forget to also join our Reddit: r/unsloth 🥰

What's Changed

Fix loading as 8bit by @Etherll in https://github.com/unslothai/unsloth/pull/3384
Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3392
Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3394
Update int8-int4 QAT config to use Int8DynamicActivationIntxWeightConfig by @metascroy in https://github.com/unslothai/unsloth/pull/3391
Gemma 3 bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3410
Transformers Fix v4.57 rename from PretrainedConfig to PreTrainedConfig by @mmathew23 in https://github.com/unslothai/unsloth/pull/3445
improve qat by @Etherll in https://github.com/unslothai/unsloth/pull/3446
Fix eval metric issue by @pluesclues in https://github.com/unslothai/unsloth/pull/3420
[Part2] Reinstate llama.cpp Compatibility and GGUF Conversion with Multiple Quantizations and Automated Ollama Modelfile Creation by @rolandtannous in https://github.com/unslothai/unsloth/pull/3356
vLLM FP8 quantized support for SFT/GRPO by @Datta0 in https://github.com/unslothai/unsloth/pull/3414
Fix by @danielhanchen in https://github.com/unslothai/unsloth/pull/3466
AMD fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3467
Fix transformers 4.57.1 by @danielhanchen in https://github.com/unslothai/unsloth/pull/3473
GRPO bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3474
EOL LF (unix line endings) normalization by @djsaunde in https://github.com/unslothai/unsloth/pull/3478
Fix out of resources issue for llama3.2 sft on amd gpu by @wangxunx in https://github.com/unslothai/unsloth/pull/3455
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3483
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3484
Patch sleep mode properly for trl by @Datta0 in https://github.com/unslothai/unsloth/pull/3492
Sleep trl patch by @Datta0 in https://github.com/unslothai/unsloth/pull/3494
fix cross entropy loss issue for small vocab size on amd gpu by @wangxunx in https://github.com/unslothai/unsloth/pull/3503
Gemma 3n fix by @mmathew23 in https://github.com/unslothai/unsloth/pull/3499
enable intel for torch2.8 by @leizhenyuan in https://github.com/unslothai/unsloth/pull/3381
add code for intel qlora by @leizhenyuan in https://github.com/unslothai/unsloth/pull/3370
fix for intel memory calculation by @leizhenyuan in https://github.com/unslothai/unsloth/pull/3513
[intel] enable support 2.9 for intel xpu by @leizhenyuan in https://github.com/unslothai/unsloth/pull/3514
FP8 training enhancements by @Datta0 in https://github.com/unslothai/unsloth/pull/3496

New Contributors

@metascroy made their first contribution in https://github.com/unslothai/unsloth/pull/3391
@djsaunde made their first contribution in https://github.com/unslothai/unsloth/pull/3478
@wangxunx made their first contribution in https://github.com/unslothai/unsloth/pull/3455

Full Changelog: https://github.com/unslothai/unsloth/compare/September-2025-v3...October-2025

View release on GitHub

September-2025-v3 Breaking risk 10mo

Notable features

gpt-oss RL with 3x faster inference
Custom matrix multiplication kernels
Reward-hacking mitigation

Full changelog

We’re introducing gpt-oss RL support and the fastest RL inference and lowest VRAM use vs. any implementation. Blog: https://docs.unsloth.ai/new/gpt-oss-reinforcement-learning

Unsloth now offers the fastest inference (~3x faster), lowest VRAM (50% less) and most context (8x longer) for gpt-oss RL vs. any implementation - with no accuracy loss.
Since RL on gpt-oss isn't yet vLLM compatible, we rewrote Transformers inference code to enable faster inference
gpt-oss-20b GSPO free Colab notebook
This notebook automatically creates faster matrix multiplication kernels and uses a new Unsloth reward function. We also show how to counteract reward-hacking which is one of RL's biggest challenges.

We previously released Vision RL with GSPO support
⚠️ Reminder to NOT use Flash Attention 3 for gpt-oss as it'll make your training loss wrong.
DeepSeek-V3.1-Terminus is here and you can run locally via our GGUF
Read how our 3-bit GGUF beats Claude-4-Opus (thinking) on Aider Polyglot here
Magistral 1.2 is here and you can run it locally here or fine-tune it for free by using our Kaggle notebook
Fine-tuning the new Qwen3 models including Qwen3-VL, Qwen3-Omni and Qwen3-Next should work in Unsloth if you install the latest transformers. The models are big however so ensure you have enough VRAM.
BERT is now fixed! Feel free to use our BERT fine-tuning notebook

Don't forget to also join our Reddit: r/unsloth 🥰

What's Changed

Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3329
Fix QAT + LoRA fast path, add tests by @andrewor14 in https://github.com/unslothai/unsloth/pull/3307
Use gemma3n embedder patch + adjust FORCE_FLOAT32 match logic by @mmathew23 in https://github.com/unslothai/unsloth/pull/3332
Synthetic Data updates by @mmathew23 in https://github.com/unslothai/unsloth/pull/3333
Fix loading issues for BERT by @Etherll in https://github.com/unslothai/unsloth/pull/3339
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3335
peft_config before model_config by @mmathew23 in https://github.com/unslothai/unsloth/pull/3342
specify different tokenizer_path/name by @mmathew23 in https://github.com/unslothai/unsloth/pull/3343
correct python support statement by @laz-001 in https://github.com/unslothai/unsloth/pull/3374
GPT OSS RL by @danielhanchen in https://github.com/unslothai/unsloth/pull/3362

New Contributors

@laz-001 made their first contribution in https://github.com/unslothai/unsloth/pull/3374

Full Changelog: https://github.com/unslothai/unsloth/compare/September-2025-v2...September-2025-v3

View release on GitHub

September-2025-v2 New feature 10mo

Notable features

Vision/multimodal RL support
GSPO algorithm implementation
Unsloth Standby for RL memory efficiency

Full changelog

We're excited to support Vision models for RL and even more memory efficient + faster RL!

Unsloth now supports vision/multimodal RL with Gemma 3, Qwen2.5-VL and other vision models. Due to Unsloth's unique weight sharing and custom kernels, Unsloth makes VLM RL 1.5–2× faster, uses 90% less VRAM, and enables 10× longer context lengths than FA2 setups, with no accuracy loss. Qwen2.5-VL GSPO notebook
Gemma 3 (4B) Vision GSPO notebook

Full details in our blogpost: https://docs.unsloth.ai/new/vision-reinforcement-learning-vlm-rl

This update also introduces Qwen's GSPO algorithm.
Our new vision RL support also comes now even faster & more memory efficient! Our new kernels & algos allows faster RL for text and vision LLMs with 50% less VRAM & 10× more context.
Introducing a new RL feature called 'Standby'. Before, RL requires GPU splitting between training & inference. With Unsloth Standby, you no longer have to & 'Unsloth Standby' uniquely limits speed degradation compared to other implementations and sometimes makes training even faster! Read our Blog
We released Aider Polyglot benchmarks for our DeepSeek-V3.1 Dynamic GGUFs and Unsloth quants perform consistently better than others. Blog

Don't forget to also join our Reddit: r/unsloth 🥰

What's Changed

GPT OSS Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3231
tests for mxfp4 and quantized models merge fix unsloth zoo pr 254 by @rolandtannous in https://github.com/unslothai/unsloth/pull/3223
Update mistral.py, showed flag to not call cut cross entropy by @pluesclues in https://github.com/unslothai/unsloth/pull/3233
Remove old version constraint in dependency list by @timkpaine in https://github.com/unslothai/unsloth/pull/3237
chore: Fix Typos by @DefiWimar7 in https://github.com/unslothai/unsloth/pull/3246
Fix incorrect function call in test_qwen3_grpo.py by @stevenxdavis in https://github.com/unslothai/unsloth/pull/3212
[Intel] make intel device support ROPE by @leizhenyuan in https://github.com/unslothai/unsloth/pull/3164
Support saving locally in model.save_pretrained_torchao by @jerryzh168 in https://github.com/unslothai/unsloth/pull/3263
fixed save_pretrained_torchao and associated tests by @rolandtannous in https://github.com/unslothai/unsloth/pull/3264
patch sftrainer to disable _is_vlm by @mmathew23 in https://github.com/unslothai/unsloth/pull/3265
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3266
Filter vllm executor log by @Datta0 in https://github.com/unslothai/unsloth/pull/3268
llama vision inference fix by @mmathew23 in https://github.com/unslothai/unsloth/pull/3270
Add TorchAO quantization tests with FP16 models and serialization workarounds by @rolandtannous in https://github.com/unslothai/unsloth/pull/3269
GptAttention turn training off during inference by @mmathew23 in https://github.com/unslothai/unsloth/pull/3289
Add support for QAT full fine-tuning by @andrewor14 in https://github.com/unslothai/unsloth/pull/3238
simplify unsloth_base_fast_generate by @mmathew23 in https://github.com/unslothai/unsloth/pull/3291
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3295
[ROCm] add hip device path by @billishyahao in https://github.com/unslothai/unsloth/pull/3301
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3322
Add support for modules_to_save in FastModel.get_peft_model by @l1ghtsource in https://github.com/unslothai/unsloth/pull/3317
Fast Inference with vLLM for VLMs by @Datta0 in https://github.com/unslothai/unsloth/pull/2975
TRL Updated version of VLM GRPO update along with GSPO by @pluesclues in https://github.com/unslothai/unsloth/pull/3132

New Contributors

@timkpaine made their first contribution in https://github.com/unslothai/unsloth/pull/3237
@stevenxdavis made their first contribution in https://github.com/unslothai/unsloth/pull/3212
@l1ghtsource made their first contribution in https://github.com/unslothai/unsloth/pull/3317

Full Changelog: https://github.com/unslothai/unsloth/compare/August-2025-v2...September-2025-v2

View release on GitHub

August-2025-v2 New feature 11mo

Notable features

Flex Attention for 8x longer context
Export QLoRA to llama.cpp/vLLM/HF
MXFP4 inference swiglu fix

Full changelog

We’re excited to introduce Unsloth Flex Attention support for OpenAI gpt-oss training that enables >8× longer context lengths, >50% less VRAM usage and >1.5× faster training compared to all implementations including those using Flash Attention 3 (FA3). Unsloth Flex Attention makes it possible to train with a 60K context length on just 80GB of VRAM for BF16 LoRA. Also:

You can now export/save your QLoRA fine-tuned gpt-oss model to llama.cpp, vLLM, or HF.
We fixed gpt-oss training losses going to infinity on float16 GPUs (like T4 Colab)
We fixed gpt-oss implementation issues, most notably ensuring that swiglu_limit = 7.0 is properly applied during MXFP4 inference in transformers
Unsloth Flex Attention scales with context, longer sequences yield bigger savings in both VRAM and training time

Full details in our blogpost: https://docs.unsloth.ai/basics/long-context-gpt-oss-training

What's Changed

Add Qwen3 Instruct / Thinking chat templates by @Etherll in https://github.com/unslothai/unsloth/pull/3110
Add Qwen3 4B to mapper.py by @Etherll in https://github.com/unslothai/unsloth/pull/3120
Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3148
Fix GPT OSS by @danielhanchen in https://github.com/unslothai/unsloth/pull/3154
Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3169
Update Blackwell install instructions for latest vLLM release by @qingy1337 in https://github.com/unslothai/unsloth/pull/3175
Fix potential generator exhaustion bug in model loading file detection by @rolandtannous in https://github.com/unslothai/unsloth/pull/3167
Fix vision model GGUF quantization_method error type by @rolandtannous in https://github.com/unslothai/unsloth/pull/3173
Replace back ticks with single quotes by @rnowling in https://github.com/unslothai/unsloth/pull/3157
Fix original_push_to_hub fallback by @Thiraput01 in https://github.com/unslothai/unsloth/pull/3115
Add support for QAT + LoRA by @andrewor14 in https://github.com/unslothai/unsloth/pull/2976
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3180
Torch 2.8 by @danielhanchen in https://github.com/unslothai/unsloth/pull/3186
Fix extras transformers typo in pyproject.toml by @parth2510 in https://github.com/unslothai/unsloth/pull/3187
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3195
allow torch.float32 dtype in FastLanguageModel by @mmathew23 in https://github.com/unslothai/unsloth/pull/3204
fix is casual for qwen3 by @leizhenyuan in https://github.com/unslothai/unsloth/pull/3213
Support model.save_pretrained_torchao by @jerryzh168 in https://github.com/unslothai/unsloth/pull/3111
Fix gemma-3n by @mmathew23 in https://github.com/unslothai/unsloth/pull/3219
Handle transformers move to dtype from torch_dtype by @mmathew23 in https://github.com/unslothai/unsloth/pull/3225
chore: Fix Typos by @DefiWimar7 in https://github.com/unslothai/unsloth/pull/3224

New Contributors

@rnowling made their first contribution in https://github.com/unslothai/unsloth/pull/3157
@Thiraput01 made their first contribution in https://github.com/unslothai/unsloth/pull/3115
@andrewor14 made their first contribution in https://github.com/unslothai/unsloth/pull/2976
@parth2510 made their first contribution in https://github.com/unslothai/unsloth/pull/3187
@jerryzh168 made their first contribution in https://github.com/unslothai/unsloth/pull/3111
@DefiWimar7 made their first contribution in https://github.com/unslothai/unsloth/pull/3224

Full Changelog: https://github.com/unslothai/unsloth/compare/August-2025...August-2025-v2

View release on GitHub

August-2025 New feature 11mo

Notable features

gpt-oss training on 14GB VRAM (Colab compatible)
1.5x faster training, 50% less VRAM
Blackwell RTX 50 support

Full changelog

gpt-oss is here! ✨

Finetune gpt-oss for free with our Unsloth Colab notebook!

We’ve managed to make gpt-oss train on just 14GB of VRAM, making it possible to work on free Colab due to our linear conversions. For more details, Read our Guide/Blogpost
Fine-tuning gpt-oss is 1.5x faster and uses 50% less VRAM with Unsloth. gpt-oss-120b model fits on 65GB of VRAM.
Model uploads: 20b GGUF • 120b GGUF • All uploads

:sloth: Unsloth updates

We’ve made algorithmic updates to Unsloth so every model now trains faster and with less VRAM, no matter which.
Unsloth now works on RTX 50 and Blackwell GPUs. Read our guide.
Official Unsloth Docker image coming very soon!
You can now run Unsloth models directly via Docker: docker model pull hf.co/unsloth/gpt-oss-20b-GGUF

:stars: Qwen3-Coder + Qwen3-2507

Qwen made July, 2025 updates called 'Qwen3-2507' and launched their SOTA coding models!

Qwen3-Coder (with Unsloth fixes): Guide • Coder uploads
Qwen3-2507: Guide • 2507 uploads
Fine-tune Qwen3-4B-2507 with our Colab notebook

:crystal_ball: New models + Support:

Run these new models:

Kimi-K2: Guide • GGUF
GLM: 4.5-Air • 4.5 • 4-32B-0414
Orpheus-3B • Hunyuan-A13B

Unsloth also now supports running + training for:

We collabed with the Liquid & TII teams to support training for Falcon-H1-7B and LFM2-1.2B! Notebooks here
Devstral-2507 • Magistral-2507 • SmolLM3-3B

Don't forget to also join our Reddit: r/unsloth 🥰

What's Changed

Fix argument mismatch in GRPO _get_per_token_logps lambda function by @rolandtannous in https://github.com/unslothai/unsloth/pull/2929
patch falcon h1 inference by @mmathew23 in https://github.com/unslothai/unsloth/pull/2932
Fix falcon H1 dropout issue by @Datta0 in https://github.com/unslothai/unsloth/pull/2938
fix: change lora_dropout from int to float for type consistency by @muzzlol in https://github.com/unslothai/unsloth/pull/2949
GRPO fix dataloader_num_workers value error in GRPOTrainer by @rolandtannous in https://github.com/unslothai/unsloth/pull/2944
GRPO Fix - Support vllm pre-dequantized quantization states in fast_dequantize kernel by @rolandtannous in https://github.com/unslothai/unsloth/pull/2943
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/2982
Update unsloth-cli.py by @qgallouedec in https://github.com/unslothai/unsloth/pull/2985
use fastmodel falcon h1 by @mmathew23 in https://github.com/unslothai/unsloth/pull/2987
Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model merge error by @rolandtannous in https://github.com/unslothai/unsloth/pull/2986
Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model merge error" by @danielhanchen in https://github.com/unslothai/unsloth/pull/2988
Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized … by @danielhanchen in https://github.com/unslothai/unsloth/pull/2990
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/2998
Update README.md by @qgallouedec in https://github.com/unslothai/unsloth/pull/2991
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3017
[bugs] fix for casual mask by @leizhenyuan in https://github.com/unslothai/unsloth/pull/3011
[intel] add for intel path for llama.py by @leizhenyuan in https://github.com/unslothai/unsloth/pull/3012
Fix Gemma 2 by @danielhanchen in https://github.com/unslothai/unsloth/pull/3024
falcon h1 force float32 when dtype is torch.float16 by @mmathew23 in https://github.com/unslothai/unsloth/pull/3026
Fix torch compile issues by @danielhanchen in https://github.com/unslothai/unsloth/pull/3028
Fix Llama and Gemma inference by @Erland366 in https://github.com/unslothai/unsloth/pull/3034
Fixup multi GPU workload. by @Datta0 in https://github.com/unslothai/unsloth/pull/3049
Bug Fixes and Enhancements for Model Loading by @Etherll in https://github.com/unslothai/unsloth/pull/3052
Add gemma-3n chat template to chat_templates.py by @Etherll in https://github.com/unslothai/unsloth/pull/3051
Fix: Added specific check for Gemma so models like BERT properly init… by @Sekinal in https://github.com/unslothai/unsloth/pull/3055
fixup rope sync for everything by @Datta0 in https://github.com/unslothai/unsloth/pull/3061
get_per_token_logps_and_entropies: return tuple instead of dict by @mmathew23 in https://github.com/unslothai/unsloth/pull/3080
Docs: Add WSL Installation Guide for Blackwell / RTX 5090 GPU by @dongbin-lunark in https://github.com/unslothai/unsloth/pull/3079
GPT-OSS support by @mmathew23 in https://github.com/unslothai/unsloth/pull/3099
Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3102
gpt-oss manually call temporary patch by @mmathew23 in https://github.com/unslothai/unsloth/pull/3104

New Contributors

@muzzlol made their first contribution in https://github.com/unslothai/unsloth/pull/2949
@Sekinal made their first contribution in https://github.com/unslothai/unsloth/pull/3055
@dongbin-lunark made their first contribution in https://github.com/unslothai/unsloth/pull/3079

Full Changelog: https://github.com/unslothai/unsloth/compare/July-2025...August-2025

View release on GitHub

July-2025 Bug fix 1y

Notable features

10-25% VRAM reduction across all models
GRPO works with latest TRL main
Qwen 2.5 and GLM fixes

Full changelog

More VRAM reduction, faster & bug fixes

Please update Unsloth! pip install --upgrade --force-reinstall --no-deps --no-cache-dir unsloth unsloth_zoo

Gemma 3N Vision now works and is fixed! Please re-download all model checkpoints (Unsloth will auto do it) Try Kaggle Notebook! There is also a challenge with a prize pool of $100,000!
Gemma 3 text and vision are all fixed for T4, and is much faster. Losses of 6 to 7 are now fixed - it should be 1 to 2.
10 to 25% less VRAM consumption for all models. Also faster compiling and less errors. Unsloth is now more stable!
Downloads stuck at 90% to 95% fixed!
Qwen 2.5, Qwen 2, GLM all fixed as well.
GRPO now works with latest main TRL
Main TRL, PEFT, Transformers all work
Forced upgrading transformers is now fixed.
Falcon H1 finetuning should work great! Notebooks incoming
Devstral 1.1 and MedGemma 27B, 4B support with vision
Many many many more bug fixes - this release of Unsloth should be much more stable and error tolerant!

Please update Unsloth! pip install --upgrade --force-reinstall --no-deps --no-cache-dir unsloth unsloth_zoo

What's Changed

Gemma 3N by @danielhanchen in https://github.com/unslothai/unsloth/pull/2809
Add instructions for installing unsloth on RTX 5090 by @jeromeku in https://github.com/unslothai/unsloth/pull/2812
Add falcon h1 by @dhiaEddineRhaiem in https://github.com/unslothai/unsloth/pull/2650
Granite4 support by @mmathew23 in https://github.com/unslothai/unsloth/pull/2799
import undefined transformers_version for falcon model by @mmathew23 in https://github.com/unslothai/unsloth/pull/2822
Fix LoftQ with FastBaseModel by @mehmetoguzderin in https://github.com/unslothai/unsloth/pull/2826
Create stale.yml by @danielhanchen in https://github.com/unslothai/unsloth/pull/2832
Create stale.yml by @danielhanchen in https://github.com/unslothai/unsloth/pull/2836
Added conda/mamba section to blackwell installation readme by @rolandtannous in https://github.com/unslothai/unsloth/pull/2817
Gemma 3N bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/2842
Fix loftq None config for FastBaseModel by @mmathew23 in https://github.com/unslothai/unsloth/pull/2848
Convert torch.bfloat16, torch.float16, etc. to vLLM valid dtypes by @rishabh135 in https://github.com/unslothai/unsloth/pull/2811
[Feature] enable unsloth on amd gpu by @billishyahao in https://github.com/unslothai/unsloth/pull/2520
Fix Gemma 3N by @danielhanchen in https://github.com/unslothai/unsloth/pull/2854
fix quantized model parameter count method by @rolandtannous in https://github.com/unslothai/unsloth/pull/2855
Update CSM for faster inference (no compile) by @mmathew23 in https://github.com/unslothai/unsloth/pull/2865
Fix UnslothTrainingArguments not patching trl.Config properly by @Erland366 in https://github.com/unslothai/unsloth/pull/2873
Fix unnecessary warning for transformers >= 4.53.0 by @mmathew23 in https://github.com/unslothai/unsloth/pull/2867
Update README.md by @danielhanchen in https://github.com/unslothai/unsloth/pull/2885
Many bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/2908
silenty skip falcon h1 import if transformers_version < 4.53.0 by @mmathew23 in https://github.com/unslothai/unsloth/pull/2912
Dynamically adjust get_per_token_logps [trl main upgrade] by @Datta0 in https://github.com/unslothai/unsloth/pull/2911
[Intel] add intel gpu with vllm support by @leizhenyuan in https://github.com/unslothai/unsloth/pull/2903
[bugs] fix for casual mask by @leizhenyuan in https://github.com/unslothai/unsloth/pull/2868
Explicitly check if xformers exists for attention by @Datta0 in https://github.com/unslothai/unsloth/pull/2889
Falcon H1: if mlp doesn't exist in layer module check for feed_forward by @mmathew23 in https://github.com/unslothai/unsloth/pull/2913
Move inputs to right devices. by @Datta0 in https://github.com/unslothai/unsloth/pull/2919
Many bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/2927

New Contributors

@dhiaEddineRhaiem made their first contribution in https://github.com/unslothai/unsloth/pull/2650
@mehmetoguzderin made their first contribution in https://github.com/unslothai/unsloth/pull/2826
@rishabh135 made their first contribution in https://github.com/unslothai/unsloth/pull/2811
@billishyahao made their first contribution in https://github.com/unslothai/unsloth/pull/2520

Full Changelog: https://github.com/unslothai/unsloth/compare/June-2025...July-2025

View release on GitHub

June-2025 New feature 1y

Notable features

Gemma 3n text-image-video-audio support
TTS model fine-tuning (Orpheus, Whisper)
DeepSeek-R1 GRPO support

Full changelog

✨ Gemma 3n now available

Google's new Gemma 3n multimodal models that support text, image, video & audio. Guide
Gemma 3n finetuning notebook + audio, vision, text inference Colab notebook
Gemma 3n collection in dynamic GGUF, safetensor 4-bit etc formats: Gemma-3n

🎵 Text-to-Speech (TTS) Fine-tuning

Train TTS/STT models like Sesame-CSM, Orpheus-TTS and OpenAI's Whisper locally! Guide
Clone voices, learn new emotions, tones & styles with 1.5x faster training and -50% VRAM. Notebooks

[!TIP]
Update Unsloth via pip install --upgrade --force-reinstall unsloth unsloth_zoo

🧠 DeepSeek-R1-0528 Support with Dynamic 1-bit GGUFs

Fine-tune DeepSeek-R1-0528-Qwen3 with GRPO! Our new reward function increases multilingual response rates by 40%+ Notebook
Dynamic 1-bit GGUFs shrink the full 715GB model to just 175GB (-80% size)

📈 Dynamic 2.0 GGUFs

New quantization method that achieves SOTA performance. More info
Sets new benchmarks for 5-shot MMLU and KL Divergence and selectively quantizes layers for optimal accuracy

⚡ Advanced Qwen3 GRPO notebook

Proximity scoring for more better reward functions. Advanced GRPO notebook
New Prefinetuning/priming to skip GRPO format learning

🎯 Magistral Conversational Reasoning

Fine-tune Magistral-24B for advanced conversational reasoning. Notebook

👁️ Gemma3 Vision Support

Fine-tune Gemma3 vision models for multimodal tasks Notebook

Documentation & Guides

Reinforcement Learning Guide: Complete guide on RL for LLMs covering GRPO, RLHF, DPO. Guide
LoRA Hyperparameters Guide: Master optimal learning rates, epochs, LoRA rank & alpha settings. Guide

What's Changed

Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/2448
Added k_norm & q_norm to merged Qwen3 layers by @cblomert in https://github.com/unslothai/unsloth/pull/2452
MoE Kernel by @jeromeku in https://github.com/unslothai/unsloth/pull/2465
Blackwell Support by @johnnynunez in https://github.com/unslothai/unsloth/pull/2458
Added missing code of conduct by @rolandtannous in https://github.com/unslothai/unsloth/pull/2416
Fix readme example by @yuanzhedong in https://github.com/unslothai/unsloth/pull/2492
the pixtral vision notebook fails during inference by @mmathew23 in https://github.com/unslothai/unsloth/pull/2466
[1/N] Enable intel GPU for unsloth by @leizhenyuan in https://github.com/unslothai/unsloth/pull/2350
[2/N] Enable intel GPU for unsloth by @leizhenyuan in https://github.com/unslothai/unsloth/pull/2388
vLLM Windows CUDA support [tested] by @fenglui in https://github.com/unslothai/unsloth/pull/2158
Add Sesame CSM by @mmathew23 in https://github.com/unslothai/unsloth/pull/2527
Add Qwen-3 chat template and Ollama template support by @kiankyars in https://github.com/unslothai/unsloth/pull/2537
Fix typos by @omahs in https://github.com/unslothai/unsloth/pull/2540
Add use_rslora reference to LoraConfig inititalisation by @jkumz in https://github.com/unslothai/unsloth/pull/2539
TTS by @danielhanchen in https://github.com/unslothai/unsloth/pull/2545
Quick fix on the CompileConfig error by @Erland366 in https://github.com/unslothai/unsloth/pull/2554
Fix trust remote code by @Etherll in https://github.com/unslothai/unsloth/pull/2357
fix issue with qwen3 template double quote escapes by @davedgd in https://github.com/unslothai/unsloth/pull/2563
Display the model name in RoPE scaling unsupported error by @emmanuel-ferdman in https://github.com/unslothai/unsloth/pull/2564
Fix Whisper, ModernBERT by @danielhanchen in https://github.com/unslothai/unsloth/pull/2565
fix: improved error handling when llama.cpp build fails #2358 by @Hansehart in https://github.com/unslothai/unsloth/pull/2603
Remove dataset_text_field from SFTConfig by @qgallouedec in https://github.com/unslothai/unsloth/pull/2609
Upgrade trl fix by @Datta0 in https://github.com/unslothai/unsloth/pull/2544
Check the skip_prepare_dataset before accessing dataset fields. #2496 by @Premik in https://github.com/unslothai/unsloth/pull/2633
Llama4 MoE Grouped GEMM by @jeromeku in https://github.com/unslothai/unsloth/pull/2639
Latest TRL, GRPO + Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/2645
Fix SFTtraining for new trl by @mmathew23 in https://github.com/unslothai/unsloth/pull/2647
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/2651
Fix quant model param fetch regex by @Datta0 in https://github.com/unslothai/unsloth/pull/2662
Fix batched generation for prompts of different lengths by @RunFMe in https://github.com/unslothai/unsloth/pull/2216
reroute merge logic language models + comprehensive tests + eval kits by @rolandtannous in https://github.com/unslothai/unsloth/pull/2673
unsloth checkpointing fix for latest transformers==4.52.x by @mmathew23 in https://github.com/unslothai/unsloth/pull/2674
patch sft_trainer to favor max_seq_length over max_length in config by @mmathew23 in https://github.com/unslothai/unsloth/pull/2669
Update prepare 4d causal attention call by @mmathew23 in https://github.com/unslothai/unsloth/pull/2678
Ignore None Values when building vllm subprocess_command by @Salpingopharyngeus in https://github.com/unslothai/unsloth/pull/2680
add support for torch270 with Intel GPU by @leizhenyuan in https://github.com/unslothai/unsloth/pull/2709
Making protobuf version more flexible by @user799595 in https://github.com/unslothai/unsloth/pull/2637
tests for additional merge fix unsloth zoo pr 163 by @rolandtannous in https://github.com/unslothai/unsloth/pull/2719
Reward modeling update (There seems to be another patch) by @pluesclues in https://github.com/unslothai/unsloth/pull/2710
Fix Typos in Documentation and Comments by @leopardracer in https://github.com/unslothai/unsloth/pull/2721
Fix renaming on other model than Llama by @Erland366 in https://github.com/unslothai/unsloth/pull/2762
Enable vLLM to share memory space by @Datta0 in https://github.com/unslothai/unsloth/pull/2712
Fix TRL 1.8.2 by @marcandrelarochelle in https://github.com/unslothai/unsloth/pull/2774
Fix AttributeError in GRPO trainer for models without llm attribute by @rolandtannous in https://github.com/unslothai/unsloth/pull/2780
Additional tests for unsloth-zoo PR#174 by @rolandtannous in https://github.com/unslothai/unsloth/pull/2779
Update pyproject.toml by @amrothemich in https://github.com/unslothai/unsloth/pull/2778
Fix for grpo_compute_loss_slow by @simpissa in https://github.com/unslothai/unsloth/pull/2702
Fix GRPO by @danielhanchen in https://github.com/unslothai/unsloth/pull/2787
Docs: Fix typo and improve MoE docstrings by @kilavvy in https://github.com/unslothai/unsloth/pull/2784
[5/N] Enable intel GPU for unsloth by @leizhenyuan in https://github.com/unslothai/unsloth/pull/2768
Sequence Classification Bug Fixes by @pluesclues in https://github.com/unslothai/unsloth/pull/2793
intel 5/N fix patch by @mmathew23 in https://github.com/unslothai/unsloth/pull/2792
[3/N] Enable intel GPU for unsloth by @leizhenyuan in https://github.com/unslothai/unsloth/pull/2620
[4/N] Enable intel GPU for unsloth by @mmathew23 in https://github.com/unslothai/unsloth/pull/2801
[intel] use DeviceProperties instead of torch.xxx.deviceproperties by @leizhenyuan in https://github.com/unslothai/unsloth/pull/2803
Fix grpo sleep regex and indentation by @Datta0 in https://github.com/unslothai/unsloth/pull/2804
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/2805
Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/2807

New Contributors

@cblomert made their first contribution in https://github.com/unslothai/unsloth/pull/2452
@johnnynunez made their first contribution in https://github.com/unslothai/unsloth/pull/2458
@rolandtannous made their first contribution in https://github.com/unslothai/unsloth/pull/2416
@yuanzhedong made their first contribution in https://github.com/unslothai/unsloth/pull/2492
@mmathew23 made their first contribution in https://github.com/unslothai/unsloth/pull/2466
@leizhenyuan made their first contribution in https://github.com/unslothai/unsloth/pull/2350
@fenglui made their first contribution in https://github.com/unslothai/unsloth/pull/2158
@kiankyars made their first contribution in https://github.com/unslothai/unsloth/pull/2537
@omahs made their first contribution in https://github.com/unslothai/unsloth/pull/2540
@jkumz made their first contribution in https://github.com/unslothai/unsloth/pull/2539
@davedgd made their first contribution in https://github.com/unslothai/unsloth/pull/2563
@emmanuel-ferdman made their first contribution in https://github.com/unslothai/unsloth/pull/2564
@qgallouedec made their first contribution in https://github.com/unslothai/unsloth/pull/2609
@Premik made their first contribution in https://github.com/unslothai/unsloth/pull/2633
@RunFMe made their first contribution in https://github.com/unslothai/unsloth/pull/2216
@Salpingopharyngeus made their first contribution in https://github.com/unslothai/unsloth/pull/2680
@user799595 made their first contribution in https://github.com/unslothai/unsloth/pull/2637
@pluesclues made their first contribution in https://github.com/unslothai/unsloth/pull/2710
@leopardracer made their first contribution in https://github.com/unslothai/unsloth/pull/2721
@marcandrelarochelle made their first contribution in https://github.com/unslothai/unsloth/pull/2774
@amrothemich made their first contribution in https://github.com/unslothai/unsloth/pull/2778
@simpissa made their first contribution in https://github.com/unslothai/unsloth/pull/2702
@kilavvy made their first contribution in https://github.com/unslothai/unsloth/pull/2784

Full Changelog: https://github.com/unslothai/unsloth/compare/May-2025...June-2025

View release on GitHub

All releases

New models

Unsloth Updates

What's Changed in Unsloth

What's changed in Unsloth-Zoo

What's Changed

New Contributors

Gemma 4 Training Fixes:

Gemma 4 Quant Re-uploads

Unsloth Studio Updates

What's Changed

New Contributors

Updates

What's Changed

New Contributors

New features

Much smoother and faster Studio

To update Studio:

What's Changed

What's Changed

New Contributors

Fixes:

What's Changed

New Contributors

MacOS, Linux, WSL:

Windows:

Docker

What's Changed

New Contributors

🚀 Faster MoE training

🔎 Embedding models now train 2× faster

💡 Ultra Long Context RL is here

🔮 New models

🎉 Extra Updates

📖 New Guides

What's Changed

Unsloth Zoo Changes

Unsloth Notebooks changes

New Contributors

:crystal_ball: New Models + Guides

Bug Fixes and Enhancements

What's Changed

Unsloth Zoo Changes

New Contributors

Bug Fixes and Enhancements

What's Changed

Unsloth Zoo Changes

Unsloth Notebooks Changes

What's Changed

New Contributors

New model updates

New features

RL Improvements

RL Environment functions

Bug fixes

What's Changed

New Contributors

What's Changed

New Contributors

What's Changed

New Contributors

What's Changed

New Contributors

gpt-oss is here! ✨

:sloth: Unsloth updates

:stars: Qwen3-Coder + Qwen3-2507

:crystal_ball: New models + Support:

What's Changed

New Contributors

More VRAM reduction, faster & bug fixes

What's Changed

New Contributors

✨ Gemma 3n now available

🎵 Text-to-Speech (TTS) Fine-tuning

🧠 DeepSeek-R1-0528 Support with Dynamic 1-bit GGUFs

📈 Dynamic 2.0 GGUFs

⚡ Advanced Qwen3 GRPO notebook

🎯 Magistral Conversational Reasoning

👁️ Gemma3 Vision Support

Documentation & Guides