Unsloth

v0.1.40-beta Breaking

This release includes breaking changes; platform teams should review them before upgrading.

Published 2mo Model Serving & MLOps

✓ No known CVEs patched

Topics

agent deepseek fine-tuning gemma gemma3 gpt-oss llama llama3 llm llms mistral openai qwen reinforcement-learning self-hosted text-to-speech tts ui unsloth

Affected surfaces

auth rbac

Summary

AI summary

Auto‑enabled MTP speculative decoding makes GGUF inference up to 2× faster.

Changes in this release

Type	Severity	Summary	CVE
Security	Medium	Security improvements across Unsloth Studio and runtime	-
Security	Medium	Unsloth Studio security improvements: authentication rate-limiting, sandboxed workers, path containment, strict CSP, removed torch.load fallback on training_args.bin	-
Breaking	Medium	Auto MTP speculative decoding enabled by default for MTP GGUFs; warns on stale llama.cpp prebuilt	-
Feature	Medium	API support for OpenAI, Anthropic etc. with auto prompt caching, web search, code execution	-
Feature	Medium	Connect to external inference backends: vLLM, Ollama, llama-server	-
Feature	Medium	Experimental MLX inference on Mac machines	-
Feature	Medium	Proper support for non-English languages (e.g., Japanese, Chinese)	-
Feature	Medium	Built-in web search for OpenAI, Anthropic, OpenRouter and Kimi	-
Feature	Medium	Built-in code execution for OpenAI and Anthropic (containers persist across turns)	-
Feature	Medium	Prompt caching enabled for OpenAI and Anthropic models, saving 50-90% of costs	-
Feature	Medium	API key now optional for local providers (llama.cpp / vLLM / Ollama)	-
Feature	Medium	Auto-load models when adding a cloud provider	-
Feature	Medium	Dark theme refactor, right sidebar redesign, time-of-day sloth mascot, dismissable copyable toasts, larger chat composer, code-execution config polish, composer action pill styling, narrower Discord button	-
Feature	Medium	Auto-install flash-linear-attention and tilelang for Qwen3.5 family	-
Feature	Medium	OpenDocument chat attachments support	-
Feature	Medium	IME composer hardening, RTL dir="auto"	-
Performance	Medium	~2x faster GGUF inference with automatically enabled MTP	-
Performance	Medium	GPU pinned at 95% headroom with warning on silent CPU fallback	-

Summary source for all rows: granite4.1:8b-q6_K@2026-05-19 (confidence: high)

Full changelog

We've got lots of new updates:

~2x faster GGUF inference with automatically enabled MTP
API support for OpenAI, Anthropic etc. with auto prompt caching, web search, code execution
Connect to external inference backends: vLLM, Ollama, llama-server
Experimental MLX inference
Proper support for non-English languages
Security improvements

MTP speculative decoding support: 1.4 to 2x faster inference!

MTP speculative decoding is now enabled automatically for MTP GGUFs; Studio warns when the bundled llama.cpp prebuilt is stale or too old for MTP
New pre-built llama.cpp binaries for MTP support!
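
The stale-prebuilt warning can be pictured with a small sketch. This is not Studio's implementation: the function name, the message text, and the sample dates are hypothetical, and the 3-day threshold is only borrowed from the PR title "warn when llama.cpp prebuilt is at least 3 days behind".

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: threshold taken from the PR title "at least 3 days behind".
STALE_AFTER = timedelta(days=3)


def stale_warning(installed_build: datetime, latest_release: datetime) -> Optional[str]:
    """Return a warning if the installed build lags the latest release (hypothetical helper)."""
    lag = latest_release - installed_build
    if lag >= STALE_AFTER:
        return f"llama.cpp prebuilt is {lag.days} days behind the latest release"
    return None


# Hypothetical dates, for illustration only.
installed = datetime(2026, 5, 10, tzinfo=timezone.utc)
latest = datetime(2026, 5, 14, tzinfo=timezone.utc)
print(stale_warning(installed, latest))
```

A build that matches the latest release returns None, so a caller would only surface the message when an update is likely worth it.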

API provider calling & external connections

You can now connect Unsloth to any cloud API provider (OpenAI, Anthropic, OpenRouter, etc.)
Built-in web search for OpenAI, Anthropic, OpenRouter and Kimi
Built-in code execution for OpenAI and Anthropic (Anthropic containers persist and are reused across turns)
Prompt caching is enabled for OpenAI and Anthropic models, saving 50 to 90% of costs
API key now optional for local providers (llama.cpp / vLLM / Ollama)
Auto-load models when adding a cloud provider
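
To illustrate what an optional API key means for a client, here is a minimal sketch. It assumes an OpenAI-compatible endpoint that accepts Bearer tokens, which is how local servers are commonly addressed; `build_headers` is a hypothetical helper, not part of Unsloth, and providers with a different auth scheme would need different headers.

```python
from typing import Dict, Optional


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Build request headers, adding Authorization only when a key is supplied."""
    headers = {"Content-Type": "application/json"}
    if api_key:  # None or "" means a local provider with no key configured
        headers["Authorization"] = f"Bearer {api_key}"
    return headers


print(build_headers())            # local provider, no key
print(build_headers("sk-test"))   # cloud provider, placeholder key
```

Leaving the header out entirely, rather than sending an empty token, avoids confusing servers that reject malformed Authorization values.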

MLX inference (Experimental)

MLX quants and models can now run locally on your Mac!
We'll be adding thinking, tools and web search soon!
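
MLX targets Apple Silicon, so a launcher would typically check the host before offering this path. The sketch below is an assumption about how such a check could look, not how Studio does it; `is_apple_silicon` is a hypothetical name, and a Python process running under Rosetta may report a different machine string.

```python
import platform


def is_apple_silicon(system: str = "", machine: str = "") -> bool:
    """Return True for macOS on arm64; arguments default to the current host."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return system == "Darwin" and machine == "arm64"


print(is_apple_silicon())
```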

Other Unsloth Studio updates

OpenDocument chat attachments
o3 reasoning summary payload
Sending prompts in non-English languages (e.g. Japanese, Chinese) now works properly
IME composer hardening, RTL dir="auto", long log-line truncation fix
Tool reasoning trace rendering in UI
Fully offline support: cached GGUF discovery and offline DNS auto-detect for both inference and training
Lots of UI/UX polish: dark theme refactor, right sidebar redesign, time-of-day sloth mascot, dismissable copyable toasts, larger chat composer, code-execution config polish, composer action pill styling, narrower Discord button
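
Offline auto-detection via DNS can be sketched as a single name lookup. This is only an illustration under stated assumptions: the probe host, the port, and the `looks_offline` helper are hypothetical, and the real detection logic in Studio may differ.

```python
import socket
from typing import Callable


def looks_offline(host: str = "huggingface.co",
                  resolver: Callable = socket.getaddrinfo) -> bool:
    """Treat a failed DNS lookup as a sign that the machine is offline."""
    try:
        resolver(host, 443)
        return False
    except OSError:  # socket.gaierror is a subclass of OSError
        return True
```

The resolver is injectable so the behaviour can be exercised without touching the network, for example by passing a function that raises `socket.gaierror`.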

Training updates

Gemma attention mask fixes
Multi Image GRPO
GRPO hidden-state return experiments
New Continued Pretraining (CPT) training method as a first-class option
Gemma-4 MoE LoRA extractor registered to fix grouped_mm contraction crash
Opt-in fused lm_head + cross-entropy forward, with single-matmul path under UNSLOTH_RETURN_LOGITS=1
Pass batch size for eval
Eval/training paths now honour HF_DATASETS_OFFLINE alongside HF_HUB_OFFLINE
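
The offline flags above are plain environment variables. As a rough sketch, assuming either variable alone should be enough to request offline mode and that the usual truthy spellings are accepted, a check could look like this; `offline_requested` and the `TRUTHY` set are hypothetical.

```python
import os
from typing import Mapping

# Assumption: these spellings count as "enabled".
TRUTHY = {"1", "true", "yes", "on"}


def offline_requested(env: Mapping[str, str] = os.environ) -> bool:
    """Return True if either Hugging Face offline variable is set to a truthy value."""
    names = ("HF_HUB_OFFLINE", "HF_DATASETS_OFFLINE")
    return any(env.get(name, "").strip().lower() in TRUTHY for name in names)


print(offline_requested({"HF_DATASETS_OFFLINE": "1"}))
```

Passing a mapping instead of reading `os.environ` directly keeps the check easy to test.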

Unsloth Studio security improvements

Authentication rate-limiting, proxy-aware so reverse proxies don't bypass it
Sandboxed worker with a tightened blocklist (bash, hf upload, NOFILE)
Path containment so workers can't escape their in-flight tmp dirs
Strict schema validation across the Studio API
Tightened CSP / security headers (only legitimate favicon hosts allowed)
Removed the torch.load fallback on training_args.bin so untrusted pickles can never execute on model load
Hardened Tauri desktop release flow
Frontend auth: singleflight token refresh, current-password input on changes, working logout, shared 422 helper
Cancel cleanup now scoped strictly to in-flight tmp dirs so it can never delete user state
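
Path containment is usually enforced by resolving a requested path and checking that it still sits under the allowed directory. The following is a generic sketch of that technique, not the code from the hardening pass; `is_contained` and the example directories are hypothetical.

```python
from pathlib import Path


def is_contained(candidate: str, base_dir: str) -> bool:
    """Return True if candidate, resolved against base_dir, stays inside base_dir."""
    base = Path(base_dir).resolve()
    # Joining with an absolute candidate discards base, which the check below rejects.
    target = (base / candidate).resolve()
    return target == base or base in target.parents


print(is_contained("job/out.txt", "/tmp/work"))
print(is_contained("../etc/passwd", "/tmp/work"))
```

Resolving before comparing matters: a plain string-prefix check would be fooled by `..` segments and symlinks.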

Bug fixes and correctness

Layout-aware MoE LoRA merge with loud-fail on fallback (no more silent wrong saves)
num_logits_to_keep regression fixed on transformers >= 4.52
Preserve tokenizer EOS token on merged saves
Resume PEFT checkpoints under sentence-transformers >= 5.4
Restore Flash > SDPA > Flex attention priority for non-Gemma3 models
ORPO text-only tokenization now works with processors
Embedding matrix size mismatch fix
Vicuna chat template fix
fast_generate unifies legacy and new logits kwargs (fixes Mistral merge site)
higher_precision_softmax made idempotent
Patch every LOSS_MAPPING key aliased to ForCausalLMLoss (covers transformers 5.x)
GGUF converter sibling imports fixed
UTF-8 encoding added to all text-mode file operations
Serialise GGUF reload and inherit unsloth-run extra args
Fix /recommended-folders 500 on unreadable model directories under Python 3.12+
Cross-family GGUF projector blocked in flat local dirs (no more wrong-vision-tower loads)
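
The UTF-8 fix comes down to passing an explicit encoding whenever a file is opened in text mode, instead of relying on the platform default. A minimal sketch with a hypothetical helper and sample text:

```python
import tempfile
from pathlib import Path


def write_then_read(path: Path, text: str) -> str:
    """Write and read back text with an explicit UTF-8 encoding."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    sample = "日本語 / 中文"
    print(write_then_read(Path(tmp) / "note.txt", sample) == sample)
```

Without `encoding="utf-8"`, the same code can fail or corrupt text on systems whose default locale encoding is not UTF-8.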

Installer and platform reliability

Custom install paths via STUDIO_HOME / UNSLOTH_STUDIO_HOME
CPU-only Linux x86_64 routed to ggml-org/llama.cpp prebuilts
Windows CUDA install fixes: paired cudart bundle and Torch NVIDIA DLL paths added to PATH
Skip flash-attn install on Blackwell GPUs (sm_100+)
Refresh Intel XPU extras for torch 2.7.1 / 2.9.1 / 2.10 / 2.11.0 / 2.12.0; torch upper cap raised to <2.13.0
HIP source builds on Ubuntu 24.04 now inject --gcc-install-dir
Linux prebuilt fixes for branch-based llama.cpp releases (mangled symlink repair, top-level dir strip)
New uninstallers for Linux, macOS (uninstall.sh) and Windows (uninstall.ps1)
Mac desktop shortcut spawning and lifecycle fixed
unsloth --version flag
Studio web update banner and release version display
GPU pinned at 95% headroom, with a warning on silent CPU fallback
Auto-install flash-linear-attention and tilelang for Qwen3.5 family
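
The Blackwell rule (sm_100+) maps to a CUDA compute capability with a major version of 10 or higher. The sketch below shows that comparison only; `should_skip_flash_attn` is a hypothetical helper, and the installer's real detection code is not reproduced here.

```python
from typing import Tuple


def should_skip_flash_attn(capability: Tuple[int, int]) -> bool:
    """Return True for compute capability 10.0 and above (sm_100+)."""
    major, _minor = capability
    return major >= 10


print(should_skip_flash_attn((10, 0)))  # sm_100
print(should_skip_flash_attn((9, 0)))   # sm_90
```

With PyTorch installed, `torch.cuda.get_device_capability()` returns a `(major, minor)` tuple that could be passed straight into this function.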

What's Changed in Unsloth

Bump installer floor to 2026.5.2 by @danielhanchen in https://github.com/unslothai/unsloth/pull/5297
install: support STUDIO_HOME / UNSLOTH_STUDIO_HOME for custom install paths by @danielhanchen in https://github.com/unslothai/unsloth/pull/5190
Route CPU-only Linux x86_64 to ggml-org/llama.cpp prebuilts by @danielhanchen in https://github.com/unslothai/unsloth/pull/5302
feat(studio): MLX training tab on Apple Silicon (LoRA / full FT, VLM, export) by @Manan17 in https://github.com/unslothai/unsloth/pull/5265
feat(studio): add Continued Pretraining (CPT) as a training method by @OnePunchMonk in https://github.com/unslothai/unsloth/pull/4677
Fix 14 stale tests under tests/studio/install/ that drifted from code by @danielhanchen in https://github.com/unslothai/unsloth/pull/5305
Add Studio PR-time CI: pin enforcement, frontend, backend, wheel smoke by @danielhanchen in https://github.com/unslothai/unsloth/pull/5298
Studio: restore Studio API and Help menu UI by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5310
[studio]: Fix tool reasoning trace in UI by @CodeMan62 in https://github.com/unslothai/unsloth/pull/5314
fix: 3 patch_* helpers — fast_lora import, sft_trainer Union, openenv OSError by @danielhanchen in https://github.com/unslothai/unsloth/pull/5319
Studio: API settings overflow with long Colab URLs by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5286
tests/studio/install: parallel UNSLOTH_STUDIO_HOME smoke test by @danielhanchen in https://github.com/unslothai/unsloth/pull/5306
Studio: Dark theme refactor, right sidebar redesign, and chat UI polish by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5150
fix: harden Studio IME composer sends by @Etherll in https://github.com/unslothai/unsloth/pull/5327
Studio: stop truncating long log lines as suspected base64 by @rolandtannous in https://github.com/unslothai/unsloth/pull/5335
fix(gh_client): fail fast on 401/403 auth errors instead of retrying forever (#5325) by @Anai-Guo in https://github.com/unslothai/unsloth/pull/5329
fix: unblock 4 tests deselected/skipped in #5312 (real bugs) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5359
fix(tests/sh): accept pinned tokenizers line after #5359 by @danielhanchen in https://github.com/unslothai/unsloth/pull/5361
CI: scope GITHUB_TOKEN permissions, add MLX CI, unblock ~60 skipped tests by @danielhanchen in https://github.com/unslothai/unsloth/pull/5312
studio/tests: make Playwright model-selector probe best-effort by @danielhanchen in https://github.com/unslothai/unsloth/pull/5371
Studio: download paired cudart bundle on Windows CUDA installs by @danielhanchen in https://github.com/unslothai/unsloth/pull/5322
Studio: add torch's pip nvidia DLL dirs to PATH on Windows by @danielhanchen in https://github.com/unslothai/unsloth/pull/5324
studio: authenticate HF downloads across Studio CI workflows by @danielhanchen in https://github.com/unslothai/unsloth/pull/5370
dependabot: group security updates and cover /studio/frontend npm advisories by @danielhanchen in https://github.com/unslothai/unsloth/pull/5372
Add Studio web update banner and release version display by @wasimysaid in https://github.com/unslothai/unsloth/pull/5308
ci/install: retry transient github.com 5xx on unsloth-zoo git fetches by @danielhanchen in https://github.com/unslothai/unsloth/pull/5389
studio/ci: pre-install lockfile supply-chain audit (npm + cargo) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5392
studio/ci: npm tarball content scanner (no-install, hostile-input safe) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5393
studio/tests: AbortSignal-bound in-page fetches and wall-clock watchdog for Playwright probes by @danielhanchen in https://github.com/unslothai/unsloth/pull/5391
chore: remove unused .semgrep/unsloth-rules.yml by @danielhanchen in https://github.com/unslothai/unsloth/pull/5395
studio/ci: sweep actions/cache v5 hardening across sibling smoke workflows by @danielhanchen in https://github.com/unslothai/unsloth/pull/5399
studio/ci: harden HF_HOME cache against actions/cache v5 silent restore failures by @danielhanchen in https://github.com/unslothai/unsloth/pull/5396
Harden Tauri release flow by @wasimysaid in https://github.com/unslothai/unsloth/pull/5341
Gemma attn by @Datta0 in https://github.com/unslothai/unsloth/pull/5346
Multi Image GRPO by @Datta0 in https://github.com/unslothai/unsloth/pull/5197
[GRPO] Try returning hidden statex for GRPO by @Datta0 in https://github.com/unslothai/unsloth/pull/5142
Studio: pin GPU at 95% headroom and warn on silent CPU fallback by @danielhanchen in https://github.com/unslothai/unsloth/pull/5323
Chore(deps): bump the actions group across 1 directory with 4 updates by @dependabot[bot] in https://github.com/unslothai/unsloth/pull/5394
security: NOT affected by Mini Shai-Hulud (May-12 wave) -- forward-looking hardening only by @danielhanchen in https://github.com/unslothai/unsloth/pull/5397
studio: security and hardening pass (auth rate-limit, sandbox, path containment, schema validation, headers) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5375
studio: fix training page regressions from the security hardening pass by @rolandtannous in https://github.com/unslothai/unsloth/pull/5409
Studio: parity of thinking trace icon with Think toggle icon by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5407
Studio: vary empty chat sloth mascot by local time of day by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5354
security: persist-credentials:false on every actions/checkout (org-wide sweep) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5413
import_fixes: stub transformers.conversion_mapping so peft 0.19.x imports on transformers 4.x by @danielhanchen in https://github.com/unslothai/unsloth/pull/5416
chore: trim verbose comments added in PR #5416 (commit 12295c1f) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5418
studio/ci: flat GGUF+mmproj cache for Mac json-images smoke, save partial caches on cancel by @danielhanchen in https://github.com/unslothai/unsloth/pull/5417
studio: comment out training_args.bin torch.load fallback in model_config by @danielhanchen in https://github.com/unslothai/unsloth/pull/5419
tests: import_fixes drift detectors (HARD GATE on Core matrix) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5414
tests: drift detector parity with unsloth-zoo (fix Core matrix RED on triton + vllm) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5421
scripts: ship deterministic comment / docstring-only diff verifier by @danielhanchen in https://github.com/unslothai/unsloth/pull/5422
studio: API external provider support for chat (OpenAI, Mistral, Gemini, Cohere, Anthropic, OpenRouter, DeepSeek, custom providers) by @rolandtannous in https://github.com/unslothai/unsloth/pull/4706
import_fixes + drift detectors: cover transformers 5.x drift (unblocks PR #5376) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5423
MLX training support for Studio on Apple Silicon by @mmathew23 in https://github.com/unslothai/unsloth/pull/5340
studio: drop unused max_grad_value schema + route plumbing by @danielhanchen in https://github.com/unslothai/unsloth/pull/5424
Studio: Passing batch size for eval by @uderbashi in https://github.com/unslothai/unsloth/pull/5168
studio: skip flash-attn install on Blackwell GPUs (sm_100+) by @rolandtannous in https://github.com/unslothai/unsloth/pull/5420
Fix: Add missing utf-8 encoding to text-mode file operations by @Tenith01 in https://github.com/unslothai/unsloth/pull/5356
Fix/issue 3667 vicuna template by @Tenith01 in https://github.com/unslothai/unsloth/pull/5357
tests: public-api surface drift detector (companion to test_import_fixes_drift.py) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5428
add UNSLOTH_ALLOW_CPU=1 path for CPU-only CI / source-inspection tests by @danielhanchen in https://github.com/unslothai/unsloth/pull/5429
fix(studio/mmproj): block cross-family projectors in flat local GGUF dirs (#5347) by @Anai-Guo in https://github.com/unslothai/unsloth/pull/5350
studio/mmproj: skip unwanted GGUF values via seek instead of read by @danielhanchen in https://github.com/unslothai/unsloth/pull/5431
ci: install ipython so transformers.utils.notebook imports cleanly in zoo pytest by @danielhanchen in https://github.com/unslothai/unsloth/pull/5437
studio/mlx: lower per-element grad clip default from 5.0 to 1.0 by @danielhanchen in https://github.com/unslothai/unsloth/pull/5440
studio/frontend: drop unused next dependency by @danielhanchen in https://github.com/unslothai/unsloth/pull/5438
Update version-compat-ci.yml by @rolandtannous in https://github.com/unslothai/unsloth/pull/5445
ci: merge duplicate with: keys in notebooks-ci checkout steps by @rolandtannous in https://github.com/unslothai/unsloth/pull/5447
studio/chat: built-in web search for OpenAI, Anthropic, OpenRouter, Kimi by @rolandtannous in https://github.com/unslothai/unsloth/pull/5443
ci: make compiler-cache shim test order-independent by @danielhanchen in https://github.com/unslothai/unsloth/pull/5449
Studio: o3 reasoning summary payload by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5426
ci: compiler-cache-shim must mutate live module globals + skip rerun by @danielhanchen in https://github.com/unslothai/unsloth/pull/5452
Polish/cloud to providers by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5450
ci: cap each compiler-sweep iteration with SIGALRM + log progress by @danielhanchen in https://github.com/unslothai/unsloth/pull/5456
ci: add tx >=5,<6 slow compile model_types to KNOWN_BROKEN_COMPILE by @danielhanchen in https://github.com/unslothai/unsloth/pull/5458
Restore Flash > SDPA > Flex priority for non-gemma3 models by @mmathew23 in https://github.com/unslothai/unsloth/pull/5455
ci: stop a partial mmproj cache from poisoning Mac Studio GGUF CI by @danielhanchen in https://github.com/unslothai/unsloth/pull/5459
ci: make Windows Stop Studio teardown tolerate Git Bash signal exit by @danielhanchen in https://github.com/unslothai/unsloth/pull/5460
Studio: make API key optional for local providers (llama.cpp/vLLM/Ollama) by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5457
studio/chat: built-in code execution for OpenAI + Anthropic by @rolandtannous in https://github.com/unslothai/unsloth/pull/5461
ci: switch Windows Stop Studio to a cmd no-op marker by @danielhanchen in https://github.com/unslothai/unsloth/pull/5462
tests: raise pwsh/bash subprocess timeout from 10s to 60s by @danielhanchen in https://github.com/unslothai/unsloth/pull/5463
studio/install: repair upstream llama.cpp prebuilt mangled symlinks by @danielhanchen in https://github.com/unslothai/unsloth/pull/5465
studio/chat: OpenAI container picker delete reliability by @rolandtannous in https://github.com/unslothai/unsloth/pull/5466
studio/install: strip top-level dir from repaired symlink target by @danielhanchen in https://github.com/unslothai/unsloth/pull/5467
Stop: drop Ollama API key, clean up code execution UI by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5464
tests/openai: patch httpx.AsyncClient ctor so delete tests hit mock by @danielhanchen in https://github.com/unslothai/unsloth/pull/5469
revert: stop touching DEVICE_TYPE == cuda branches for CPU CI by @danielhanchen in https://github.com/unslothai/unsloth/pull/5473
ci: drop cache: 'npm' from setup-node (silent abort on Windows) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5474
ci: bump Mac json-images timeout 30 -> 45 min (cache-miss path) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5475
ci: wrap hf download in xet-tuned stall-retry loop (root-cause Mac 30-min hang) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5476
ci: deterministic check for studio/frontend dep removals by @danielhanchen in https://github.com/unslothai/unsloth/pull/5478
studio/frontend: drop unused dependencies, move type pkg to devDeps by @danielhanchen in https://github.com/unslothai/unsloth/pull/5477
intel-gpu: refresh xpu extras (fix torch 2.10, add 2.7.1 / 2.9.1 / 2.11.0 / 2.12.0) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5484
Studio: auto-load models when adding a cloud provider by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5472
Studio: code execution config visual polish by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5471
disable_torchcodec_if_broken: also patch datasets and clean sys.modules by @danielhanchen in https://github.com/unslothai/unsloth/pull/5483
tests: pinned-symbol canary for unsloth-zoo save_pretrained_merged guards (#5410) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5433
intel-gpu: pin unsloth_zoo>=2026.5.2 (fixes #5494) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5499
fix(sentence_transformer): resume PEFT checkpoints under sentence-transformers >= 5.4 by @Etherll in https://github.com/unslothai/unsloth/pull/5454
Studio: serialise GGUF reload and inherit unsloth-run extra args by @danielhanchen in https://github.com/unslothai/unsloth/pull/5427
Studio: IME / multilingual composer regression test + RTL dir="auto" by @danielhanchen in https://github.com/unslothai/unsloth/pull/5485
fix: preserve tokenizer eos token on merged saves by @anmolxlight in https://github.com/unslothai/unsloth/pull/5451
studio/chat: reuse Anthropic code_execution container across turns by @rolandtannous in https://github.com/unslothai/unsloth/pull/5519
Fix Linux prebuilt installs for branch-based llama.cpp releases by @mmathew23 in https://github.com/unslothai/unsloth/pull/5493
Studio: stop hint, Uvicorn log rename, reachability check + Mac UI CI retry hardening by @danielhanchen in https://github.com/unslothai/unsloth/pull/5503
Studio composer action pill styling by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5522
Fix /recommended-folders 500 on unreadable model directories (Python 3.12+) by @mmathew23 in https://github.com/unslothai/unsloth/pull/5523
studio/chat: persist Anthropic container id on first turn of new thread by @rolandtannous in https://github.com/unslothai/unsloth/pull/5526
studio/openai: align chat completions docstring with stream=false default (closes #5047) by @wtfashwin in https://github.com/unslothai/unsloth/pull/5524
Add a simple --version flag by @melroy89 in https://github.com/unslothai/unsloth/pull/5516
studio: load cached GGUF models when fully offline by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5505
studio: expose launcher capability bits on unauth /api/health by @danielhanchen in https://github.com/unslothai/unsloth/pull/5486
studio: tighten sandbox blocklist precision (bash, hf upload, NOFILE) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5487
studio: scope cancel-cleanup to in-flight tmp dirs; walk back tool_call_id by @danielhanchen in https://github.com/unslothai/unsloth/pull/5488
studio: proxy-aware login rate-limit; allow google favicons in CSP by @danielhanchen in https://github.com/unslothai/unsloth/pull/5489
studio/frontend: wire logout, singleflight refresh, shared 422 helper, current-password input by @danielhanchen in https://github.com/unslothai/unsloth/pull/5490
tests/studio: lock in Windows GPU detection fix (#5106) with a synthetic CI test by @danielhanchen in https://github.com/unslothai/unsloth/pull/5376
Studio: auto-enable MTP speculative decoding for MTP GGUFs by @danielhanchen in https://github.com/unslothai/unsloth/pull/5527
Studio: warn when llama.cpp prebuilt is too old for MTP by @danielhanchen in https://github.com/unslothai/unsloth/pull/5528
Studio: warn when llama.cpp prebuilt is at least 3 days behind by @danielhanchen in https://github.com/unslothai/unsloth/pull/5529
studio: extend offline DNS auto-detect to inference parent + training by @danielhanchen in https://github.com/unslothai/unsloth/pull/5512
Fix ORPO text-only tokenization with processors by @alkinun in https://github.com/unslothai/unsloth/pull/5501
fix(studio/worker): inject --gcc-install-dir for HIP source builds on Ubuntu 24.04 by @h34v3nzc0dex in https://github.com/unslothai/unsloth/pull/5517
studio: gate image input on effective vision capability by @Etherll in https://github.com/unslothai/unsloth/pull/5492
studio/install: fix mac desktop shortcut spawning and lifecycle by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5496
studio: add uninstall.sh and document it in README by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5497
Studio update CI: round-trip install -> update -> uninstall by @danielhanchen in https://github.com/unslothai/unsloth/pull/5536
studio: fix Connections dialog UX issues surfaced by image-gate probe by @danielhanchen in https://github.com/unslothai/unsloth/pull/5518
studio: add uninstall.ps1 for Windows by @danielhanchen in https://github.com/unslothai/unsloth/pull/5513
Fix num_logits_to_keep regression on transformers >= 4.52 by @danielhanchen in https://github.com/unslothai/unsloth/pull/5538
Uninstaller script by @PTFOPlayer in https://github.com/unslothai/unsloth/pull/4611
Add OpenDocument chat attachments by @alkinun in https://github.com/unslothai/unsloth/pull/5510
studio/frontend: stop showing Generating spinner on empty welcome view by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5530
studio/frontend: swap Hugeicons spokes spinner for CSS ring by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5531
studio/frontend: grow chat composer to 16 rows and inset scrollbar by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5540
studio/frontend: make toast and inline error text selectable and copyable by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5506
studio: add dismissable toasts with corner close button by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5509
studio: install flash-linear-attention and tilelang for Qwen3.5 family by @danielhanchen in https://github.com/unslothai/unsloth/pull/5434
studio/frontend: soften toast shadow and tighten vertical padding by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5511
fast_generate: unify legacy/new logits kwarg + fix Mistral merge site by @danielhanchen in https://github.com/unslothai/unsloth/pull/5543
studio/frontend: hide Current password input on first boot by @danielhanchen in https://github.com/unslothai/unsloth/pull/5545
tests/studio: tighten MLX smoke gates (loss + round-trip, _on_step grad_norm) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5537
tests + CI: callback signature drift detector by @danielhanchen in https://github.com/unslothai/unsloth/pull/5498
images: use narrower Discord button and drop duplicate by @danielhanchen in https://github.com/unslothai/unsloth/pull/5552
fix(studio): handle expired OpenAI shell-tool containers without surfacing error in chat by @rolandtannous in https://github.com/unslothai/unsloth/pull/5547
studio/chat: release stuck IME flag when compositionend never fires by @wtfashwin in https://github.com/unslothai/unsloth/pull/5551

New Contributors

@Anai-Guo made their first contribution in https://github.com/unslothai/unsloth/pull/5329
@uderbashi made their first contribution in https://github.com/unslothai/unsloth/pull/5168
@Tenith01 made their first contribution in https://github.com/unslothai/unsloth/pull/5356
@anmolxlight made their first contribution in https://github.com/unslothai/unsloth/pull/5451
@wtfashwin made their first contribution in https://github.com/unslothai/unsloth/pull/5524
@melroy89 made their first contribution in https://github.com/unslothai/unsloth/pull/5516
@h34v3nzc0dex made their first contribution in https://github.com/unslothai/unsloth/pull/5517
@PTFOPlayer made their first contribution in https://github.com/unslothai/unsloth/pull/4611

Full Changelog: https://github.com/unslothai/unsloth/compare/v0.1.39-beta...v0.1.40-beta

What's Changed in Unsloth-Zoo

Register Gemma-4 MoE LoRA extractor to fix grouped_mm contraction crash by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/624
feat(mlx): Apple Silicon training (text + VLM, LoRA / full FT, CCE, export) by @Manan17 in https://github.com/unslothai/unsloth-zoo/pull/620
tests: skip MoE LoRA extractor coverage when discovery finds zero classes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/628
tests: pivot MoE-coverage canary to _unsloth_already_patched marker by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/630
fix(compiler): make higher_precision_softmax idempotent by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/631
fix(mlx): unblock GGUF export and LoRA reload on Apple Silicon by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/627
fix(compiler): unblock all model_types across transformers 4.57.6 and 5.x by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/632
Mask for gemma3 attn by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/635
Multi Image GRPO by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/613
[GRPO] Try returning hidden statex for GRPO by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/609
Refactor and consolidate moe lora extractors by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/629
security + CI: mirror unsloth's hardening stack onto zoo (greenfield .github/) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/637
remove unsloth_zoo/import_fixes.py: redundant with unsloth's by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/639
chore: trim verbose comments across PR #637 landing by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/640
scripts: ship deterministic comment / docstring-only diff verifier by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/641
fix mlx: Adds the MLX training path used by Studio on Apple Silicon by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/634
tests: drift detectors cover transformers 5.x (mirror unsloth PR #5423) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/642
gpt_oss: reorder helpers before patch_gpt_oss_bnb4bit_auto by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/643
init: lazy-load legacy MLX aliases on every host by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/644
tests: contain security-conftest network block; fix stale mlx paths; skip GPU import in trainer-exec-marker by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/648
fix CI fallout from MLX subpackage refactor (#634) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/646
tests: tolerate transformers 5.x source/signature drift in two zoo drift detectors by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/650
tests: skip _assert_params_superset when upstream forward is (*args, **kwargs) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/651
mlx: lower max_grad_value default from 5.0 to 1.0 by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/652
saving: layout-aware MoE LoRA merge + loud-fail on fallback (#5410) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/647
tests: follow MoE merge wrapper delegation in drift detector by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/653
additional import try except handling for mlx by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/654
Patch every LOSS_MAPPING key aliased to ForCausalLMLoss by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/656
deps: bump torch upper cap to <2.13.0 (allow xpu 2.11.0 / 2.12.0) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/658
Auto-install fused lm_head + cross_entropy forward (opt-in) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/657
tests: CPU regression detectors for the MoE merge / save path (#5410) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/655
Fix GGUF converter sibling imports by @alkinun in https://github.com/unslothai/unsloth-zoo/pull/661
fix embedding matrix size mismatch bug by @CodeMan62 in https://github.com/unslothai/unsloth-zoo/pull/645
Honor UNSLOTH_RETURN_LOGITS in fused forward by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/665
init: include HF_DATASETS_OFFLINE in the offline env cross-sync by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/664
compiler: single-matmul opt-in for UNSLOTH_RETURN_LOGITS=1 by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/666

Earlier breaking changes

v0.1.43-beta Do not use `unsloth studio update`; it does not fetch latest updates.