Unsloth

v0.1.40-beta Breaking

This release includes breaking changes. Platform teams should review them before upgrading.

Published 16d ago · Model Serving & MLOps
✓ No known CVEs patched

Topics

agent deepseek fine-tuning gemma gemma3 gpt-oss llama llama3 llm llms mistral openai qwen reinforcement-learning self-hosted text-to-speech tts ui unsloth

Affected surfaces

auth rbac

Summary

AI summary

Auto‑enabled MTP speculative decoding makes GGUF inference up to 2× faster.

Changes in this release

Security Medium

Security improvements across Unsloth Studio and runtime

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Security Medium

Unsloth Studio security improvements: authentication rate‑limiting, sandboxed workers, path containment, strict CSP, removed torch.load fallback on training_args.bin

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high
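
Why the removed torch.load fallback matters: pickle-based loading can invoke arbitrary callables while deserializing. The sketch below uses only the Python standard library to illustrate the general risk and one common mitigation (a restricted unpickler that refuses to resolve any global). It is not Unsloth's code; `Payload` and `SafeUnpickler` are hypothetical names chosen for this example.

```python
import io
import pickle


class Payload:
    # __reduce__ tells pickle to call eval("1 + 1") at load time.
    # A real attacker would pick something far less harmless.
    def __reduce__(self):
        return (eval, ("1 + 1",))


class SafeUnpickler(pickle.Unpickler):
    # Refuse to resolve any global, so no callable can run during load.
    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global {module}.{name}")


blob = pickle.dumps(Payload())

# Plain pickle.loads executes the embedded call and returns its result.
result = pickle.loads(blob)

# The restricted unpickler rejects the same bytes before anything runs.
try:
    SafeUnpickler(io.BytesIO(blob)).load()
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Dropping the fallback entirely, as this release does, is stricter than filtering: a file that is never unpickled cannot run anything.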

Breaking Medium

Auto MTP speculative decoding enabled by default for MTP GGUFs; warns on stale llama.cpp prebuilt

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

API support for OpenAI, Anthropic etc. with auto prompt caching, web search, code execution

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Connect to external inference backends: vLLM, Ollama, llama-server

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Experimental MLX inference on Mac machines

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Proper support for non-English languages (e.g., Japanese, Chinese)

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Built-in web search for OpenAI, Anthropic, OpenRouter and Kimi

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Built-in code execution for OpenAI and Anthropic (containers persist across turns)

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Prompt caching enabled for OpenAI and Anthropic models, saving 50‑90% of costs

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

API key now optional for local providers (llama.cpp / vLLM / Ollama)

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Auto-load models when adding a cloud provider

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Dark theme refactor, right sidebar redesign, time‑of‑day sloth mascot, dismissable copyable toasts, larger chat composer, code‑execution config polish, composer action pill styling, narrower Discord button

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Auto-install flash‑linear‑attention and tilelang for Qwen3.5 family

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

OpenDocument chat attachments support

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

IME composer hardening, RTL `dir="auto"`, long log-line truncation fix

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Performance Medium

~2x faster GGUF inference with automatically enabled MTP

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Performance Medium

GPU pinned at 95% headroom with warning on silent CPU fallback

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Full changelog

We've got lots of new updates:

  • ~2x faster GGUF inference with automatically enabled MTP
  • API support for OpenAI, Anthropic etc. with auto prompt caching, web search, code execution
  • Connect to external inference backends: vLLM, Ollama, llama-server
  • Experimental MLX inference
  • Proper support for non-English languages
  • Security improvements

MTP speculative decoding support: 1.4 to 2x faster inference!

  • Auto MTP speculative decoding for MTP GGUFs; warn when the bundled llama.cpp prebuilt is stale or too old for MTP
  • New pre-built llama.cpp binaries for MTP support!
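
For readers new to speculative decoding, here is a toy sketch of the greedy variant: a cheap draft proposes several tokens, the main model checks them, and only tokens the main model agrees with are kept. It shows the general technique only. It is not llama.cpp's MTP implementation, and `target_next`, `draft_next` and the other names are made up for the example.

```python
def target_next(seq):
    # Toy "main model": next token is (sum of the sequence) mod 7.
    return sum(seq) % 7


def draft_next(seq):
    # Toy "draft": agrees with the target unless the last token is 3.
    return sum(seq) % 7 if seq[-1] != 3 else 0


def speculative_step(seq, k):
    # 1) Draft k tokens cheaply.
    draft = []
    for _ in range(k):
        draft.append(draft_next(seq + draft))
    # 2) Verify: keep drafted tokens while the target agrees.
    #    (Real systems check all k positions in one batched forward pass.)
    accepted = []
    for tok in draft:
        expected = target_next(seq + accepted)
        if tok != expected:
            accepted.append(expected)  # first mismatch: use the target's token
            return accepted
        accepted.append(tok)
    # 3) Every draft was accepted: the target adds one bonus token.
    accepted.append(target_next(seq + accepted))
    return accepted


def generate(prompt, n_tokens, k=4):
    # Returns the generated sequence and the number of verification steps.
    seq, steps = list(prompt), 0
    while len(seq) - len(prompt) < n_tokens:
        seq += speculative_step(seq, k)
        steps += 1
    return seq[: len(prompt) + n_tokens], steps


def generate_plain(prompt, n_tokens):
    # Baseline: one target call per generated token.
    seq = list(prompt)
    for _ in range(n_tokens):
        seq.append(target_next(seq))
    return seq
```

The output is identical to plain greedy decoding with the target. Any speedup comes from needing fewer verification steps than tokens when the draft is usually right.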

API provider calling & external connections

  • You can now connect Unsloth to any cloud API provider (OpenAI, Anthropic, OpenRouter, etc.)
  • Built-in web search for OpenAI, Anthropic, OpenRouter and Kimi
  • Built-in code execution for OpenAI and Anthropic (Anthropic containers persist and are reused across turns)
  • Prompt caching is enabled for OpenAI and Anthropic models, saving 50 to 90% of costs.
  • API key now optional for local providers (llama.cpp / vLLM / Ollama)
  • Auto-load models when adding a cloud provider
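
As a rough illustration of what an optional API key means for a client, the sketch below builds request headers and adds `Authorization` only when a key is configured. It assumes an OpenAI-compatible endpoint that uses Bearer tokens. `build_headers` is a hypothetical helper, not part of Unsloth Studio.

```python
from typing import Optional


def build_headers(api_key: Optional[str]) -> dict:
    # Hypothetical helper: send Authorization only when a key is set.
    headers = {"Content-Type": "application/json"}
    if api_key and api_key.strip():
        headers["Authorization"] = f"Bearer {api_key.strip()}"
    return headers


local_headers = build_headers(None)          # e.g. a local server with no key
cloud_headers = build_headers("sk-example")  # placeholder key for a cloud provider
```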

MLX inference (Experimental)

  • MLX quants and models can now run locally on your Mac!
  • We'll be adding thinking, tools and web search soon!

Other Unsloth Studio updates

  • OpenDocument chat attachments
  • o3 reasoning summary payload
  • Sending/prompting non-English languages (e.g. Japanese, Chinese) now works properly
  • IME composer hardening, RTL dir="auto", long log-line truncation fix
  • Tool reasoning trace rendering in UI
  • Fully offline support: cached GGUF discovery and offline DNS auto-detect for both inference and training
  • Lots of UI/UX polish: dark theme refactor, right sidebar redesign, time-of-day sloth mascot, dismissable copyable toasts, larger chat composer, code-execution config polish, composer action pill styling, narrower Discord button

Training updates

  • Gemma attention mask fixes
  • Multi Image GRPO
  • GRPO hidden-state return experiments
  • New Continued Pretraining (CPT) training method as a first-class option
  • Gemma-4 MoE LoRA extractor registered to fix grouped_mm contraction crash
  • Opt-in fused lm_head + cross-entropy forward, with single-matmul path under UNSLOTH_RETURN_LOGITS=1
  • Pass batch size for eval
  • Eval/training paths now honour HF_DATASETS_OFFLINE alongside HF_HUB_OFFLINE
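
One way to picture the offline-flag handling: if either Hugging Face offline variable is set, treat both as set so the hub and datasets code paths agree. The helper below is a minimal sketch of that idea. `sync_offline_env` and the set of truthy values are assumptions for illustration, and the actual logic in unsloth-zoo may differ.

```python
OFFLINE_VARS = ("HF_HUB_OFFLINE", "HF_DATASETS_OFFLINE")
TRUTHY = {"1", "true", "yes", "on"}  # assumed spellings of "enabled"


def sync_offline_env(env):
    # If either flag is truthy, set both so hub and datasets agree.
    if any(env.get(name, "").strip().lower() in TRUTHY for name in OFFLINE_VARS):
        for name in OFFLINE_VARS:
            env[name] = "1"
    return env


example = sync_offline_env({"HF_HUB_OFFLINE": "1"})
untouched = sync_offline_env({})
```

In a real process the same function could be applied to `os.environ` before importing libraries that read these flags.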

Unsloth Studio security improvements

  • Authentication rate-limiting, proxy-aware so reverse proxies don't bypass it
  • Sandboxed worker with a tightened blocklist (bash, hf upload, NOFILE)
  • Path containment so workers can't escape their in-flight tmp dirs
  • Strict schema validation across the Studio API
  • Tightened CSP / security headers (only legitimate favicon hosts allowed)
  • Removed the torch.load fallback on training_args.bin so untrusted pickles can never execute on model load
  • Hardened Tauri desktop release flow
  • Frontend auth: singleflight token refresh, current-password input on changes, working logout, shared 422 helper
  • Cancel cleanup now scoped strictly to in-flight tmp dirs so it can never delete user state
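
Path containment of the kind described above is commonly done by resolving a requested path and checking that the result still sits under an allowed base directory. The following is a minimal sketch under that assumption. `is_contained` and the `job-123` directory are hypothetical and do not reflect Studio's actual implementation.

```python
import tempfile
from pathlib import Path


def is_contained(base: Path, candidate: str) -> bool:
    # Resolve symlinks and ".." first, then require the result to sit under base.
    base_resolved = base.resolve()
    target = (base_resolved / candidate).resolve()
    return target == base_resolved or base_resolved in target.parents


with tempfile.TemporaryDirectory() as tmp:
    job_dir = Path(tmp) / "job-123"
    job_dir.mkdir()
    ok = is_contained(job_dir, "outputs/model.gguf")           # stays inside
    escaped = is_contained(job_dir, "../other-job/secret.txt")  # climbs out via ".."
    absolute = is_contained(job_dir, "/etc/passwd")             # absolute path
```

Resolving before comparing is the important step: a plain string-prefix check would accept `job-123/../other-job`.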

Bug fixes and correctness

  • Layout-aware MoE LoRA merge with loud-fail on fallback (no more silent wrong saves)
  • num_logits_to_keep regression fixed on transformers >= 4.52
  • Preserve tokenizer EOS token on merged saves
  • Resume PEFT checkpoints under sentence-transformers >= 5.4
  • Restore Flash > SDPA > Flex attention priority for non-Gemma3 models
  • ORPO text-only tokenization now works with processors
  • Embedding matrix size mismatch fix
  • Vicuna chat template fix
  • fast_generate unifies legacy and new logits kwargs (fixes Mistral merge site)
  • higher_precision_softmax made idempotent
  • Patch every LOSS_MAPPING key aliased to ForCausalLMLoss (covers transformers 5.x)
  • GGUF converter sibling imports fixed
  • UTF-8 encoding added to all text-mode file operations
  • Serialise GGUF reload and inherit unsloth-run extra args
  • Fix /recommended-folders 500 on unreadable model directories under Python 3.12+
  • Cross-family GGUF projector blocked in flat local dirs (no more wrong-vision-tower loads)
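
The UTF-8 fix above addresses a common Python pitfall: text-mode `open()` without an `encoding` argument uses the platform's locale encoding, which can corrupt or reject non-English text on some systems. A small sketch of the explicit form, with a made-up file name and sample text:

```python
import tempfile
from pathlib import Path

text = "こんにちは、世界 / 你好，世界"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "prompt.txt"
    # Explicit encoding on both write and read, independent of the locale.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, "r", encoding="utf-8") as f:
        round_tripped = f.read()
```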

Installer and platform reliability

  • Custom install paths via STUDIO_HOME / UNSLOTH_STUDIO_HOME
  • CPU-only Linux x86_64 routed to ggml-org/llama.cpp prebuilts
  • Windows CUDA install fixes: paired cudart bundle and Torch NVIDIA DLL paths added to PATH
  • Skip flash-attn install on Blackwell GPUs (sm_100+)
  • Refresh Intel XPU extras for torch 2.7.1 / 2.9.1 / 2.10 / 2.11.0 / 2.12.0; torch upper cap raised to <2.13.0
  • HIP source builds on Ubuntu 24.04 now inject --gcc-install-dir
  • Linux prebuilt fixes for branch-based llama.cpp releases (mangled symlink repair, top-level dir strip)
  • New uninstallers for Linux, macOS (uninstall.sh) and Windows (uninstall.ps1)
  • Mac desktop shortcut spawning and lifecycle fixed
  • unsloth --version flag
  • Studio web update banner and release version display
  • GPU pinned at 95% headroom, with a warning on silent CPU fallback
  • Auto-install flash-linear-attention and tilelang for Qwen3.5 family
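
The "sm_100+" rule for skipping flash-attn can be read as a check on the GPU's CUDA compute capability. The helper below is a hypothetical sketch of such a check. With PyTorch, the `(major, minor)` pair would typically come from `torch.cuda.get_device_capability()`, and the installer's real detection may work differently.

```python
def should_skip_flash_attn(major: int, minor: int) -> bool:
    # sm_100+ corresponds to compute capability 10.0 and above.
    return (major, minor) >= (10, 0)


hopper = should_skip_flash_attn(9, 0)      # sm_90
blackwell = should_skip_flash_attn(10, 0)  # sm_100
```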

What's Changed in Unsloth

  • Bump installer floor to 2026.5.2 by @danielhanchen in https://github.com/unslothai/unsloth/pull/5297
  • install: support STUDIO_HOME / UNSLOTH_STUDIO_HOME for custom install paths by @danielhanchen in https://github.com/unslothai/unsloth/pull/5190
  • Route CPU-only Linux x86_64 to ggml-org/llama.cpp prebuilts by @danielhanchen in https://github.com/unslothai/unsloth/pull/5302
  • feat(studio): MLX training tab on Apple Silicon (LoRA / full FT, VLM, export) by @Manan17 in https://github.com/unslothai/unsloth/pull/5265
  • feat(studio): add Continued Pretraining (CPT) as a training method by @OnePunchMonk in https://github.com/unslothai/unsloth/pull/4677
  • Fix 14 stale tests under tests/studio/install/ that drifted from code by @danielhanchen in https://github.com/unslothai/unsloth/pull/5305
  • Add Studio PR-time CI: pin enforcement, frontend, backend, wheel smoke by @danielhanchen in https://github.com/unslothai/unsloth/pull/5298
  • Studio: restore Studio API and Help menu UI by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5310
  • [studio]: Fix tool reasoning trace in UI by @CodeMan62 in https://github.com/unslothai/unsloth/pull/5314
  • fix: 3 patch_* helpers — fast_lora import, sft_trainer Union, openenv OSError by @danielhanchen in https://github.com/unslothai/unsloth/pull/5319
  • Studio: API settings overflow with long Colab URLs by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5286
  • tests/studio/install: parallel UNSLOTH_STUDIO_HOME smoke test by @danielhanchen in https://github.com/unslothai/unsloth/pull/5306
  • Studio: Dark theme refactor, right sidebar redesign, and chat UI polish by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5150
  • fix: harden Studio IME composer sends by @Etherll in https://github.com/unslothai/unsloth/pull/5327
  • Studio: stop truncating long log lines as suspected base64 by @rolandtannous in https://github.com/unslothai/unsloth/pull/5335
  • fix(gh_client): fail fast on 401/403 auth errors instead of retrying forever (#5325) by @Anai-Guo in https://github.com/unslothai/unsloth/pull/5329
  • fix: unblock 4 tests deselected/skipped in #5312 (real bugs) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5359
  • fix(tests/sh): accept pinned tokenizers line after #5359 by @danielhanchen in https://github.com/unslothai/unsloth/pull/5361
  • CI: scope GITHUB_TOKEN permissions, add MLX CI, unblock ~60 skipped tests by @danielhanchen in https://github.com/unslothai/unsloth/pull/5312
  • studio/tests: make Playwright model-selector probe best-effort by @danielhanchen in https://github.com/unslothai/unsloth/pull/5371
  • Studio: download paired cudart bundle on Windows CUDA installs by @danielhanchen in https://github.com/unslothai/unsloth/pull/5322
  • Studio: add torch's pip nvidia DLL dirs to PATH on Windows by @danielhanchen in https://github.com/unslothai/unsloth/pull/5324
  • studio: authenticate HF downloads across Studio CI workflows by @danielhanchen in https://github.com/unslothai/unsloth/pull/5370
  • dependabot: group security updates and cover /studio/frontend npm advisories by @danielhanchen in https://github.com/unslothai/unsloth/pull/5372
  • Add Studio web update banner and release version display by @wasimysaid in https://github.com/unslothai/unsloth/pull/5308
  • ci/install: retry transient github.com 5xx on unsloth-zoo git fetches by @danielhanchen in https://github.com/unslothai/unsloth/pull/5389
  • studio/ci: pre-install lockfile supply-chain audit (npm + cargo) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5392
  • studio/ci: npm tarball content scanner (no-install, hostile-input safe) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5393
  • studio/tests: AbortSignal-bound in-page fetches and wall-clock watchdog for Playwright probes by @danielhanchen in https://github.com/unslothai/unsloth/pull/5391
  • chore: remove unused .semgrep/unsloth-rules.yml by @danielhanchen in https://github.com/unslothai/unsloth/pull/5395
  • studio/ci: sweep actions/cache v5 hardening across sibling smoke workflows by @danielhanchen in https://github.com/unslothai/unsloth/pull/5399
  • studio/ci: harden HF_HOME cache against actions/cache v5 silent restore failures by @danielhanchen in https://github.com/unslothai/unsloth/pull/5396
  • Harden Tauri release flow by @wasimysaid in https://github.com/unslothai/unsloth/pull/5341
  • Gemma attn by @Datta0 in https://github.com/unslothai/unsloth/pull/5346
  • Multi Image GRPO by @Datta0 in https://github.com/unslothai/unsloth/pull/5197
  • [GRPO] Try returning hidden statex for GRPO by @Datta0 in https://github.com/unslothai/unsloth/pull/5142
  • Studio: pin GPU at 95% headroom and warn on silent CPU fallback by @danielhanchen in https://github.com/unslothai/unsloth/pull/5323
  • Chore(deps): bump the actions group across 1 directory with 4 updates by @dependabot[bot] in https://github.com/unslothai/unsloth/pull/5394
  • security: NOT affected by Mini Shai-Hulud (May-12 wave) -- forward-looking hardening only by @danielhanchen in https://github.com/unslothai/unsloth/pull/5397
  • studio: security and hardening pass (auth rate-limit, sandbox, path containment, schema validation, headers) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5375
  • studio: fix training page regressions from the security hardening pass by @rolandtannous in https://github.com/unslothai/unsloth/pull/5409
  • Studio: parity of thinking trace icon with Think toggle icon by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5407
  • Studio: vary empty chat sloth mascot by local time of day by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5354
  • security: persist-credentials:false on every actions/checkout (org-wide sweep) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5413
  • import_fixes: stub transformers.conversion_mapping so peft 0.19.x imports on transformers 4.x by @danielhanchen in https://github.com/unslothai/unsloth/pull/5416
  • chore: trim verbose comments added in PR #5416 (commit 12295c1f) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5418
  • studio/ci: flat GGUF+mmproj cache for Mac json-images smoke, save partial caches on cancel by @danielhanchen in https://github.com/unslothai/unsloth/pull/5417
  • studio: comment out training_args.bin torch.load fallback in model_config by @danielhanchen in https://github.com/unslothai/unsloth/pull/5419
  • tests: import_fixes drift detectors (HARD GATE on Core matrix) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5414
  • tests: drift detector parity with unsloth-zoo (fix Core matrix RED on triton + vllm) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5421
  • scripts: ship deterministic comment / docstring-only diff verifier by @danielhanchen in https://github.com/unslothai/unsloth/pull/5422
  • studio: API external provider support for chat (OpenAI, Mistral, Gemini, Cohere, Anthropic, OpenRouter, DeepSeek, custom providers) by @rolandtannous in https://github.com/unslothai/unsloth/pull/4706
  • import_fixes + drift detectors: cover transformers 5.x drift (unblocks PR #5376) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5423
  • MLX training support for Studio on Apple Silicon by @mmathew23 in https://github.com/unslothai/unsloth/pull/5340
  • studio: drop unused max_grad_value schema + route plumbing by @danielhanchen in https://github.com/unslothai/unsloth/pull/5424
  • Studio: Passing batch size for eval by @uderbashi in https://github.com/unslothai/unsloth/pull/5168
  • studio: skip flash-attn install on Blackwell GPUs (sm_100+) by @rolandtannous in https://github.com/unslothai/unsloth/pull/5420
  • Fix: Add missing utf-8 encoding to text-mode file operations by @Tenith01 in https://github.com/unslothai/unsloth/pull/5356
  • Fix/issue 3667 vicuna template by @Tenith01 in https://github.com/unslothai/unsloth/pull/5357
  • tests: public-api surface drift detector (companion to test_import_fixes_drift.py) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5428
  • add UNSLOTH_ALLOW_CPU=1 path for CPU-only CI / source-inspection tests by @danielhanchen in https://github.com/unslothai/unsloth/pull/5429
  • fix(studio/mmproj): block cross-family projectors in flat local GGUF dirs (#5347) by @Anai-Guo in https://github.com/unslothai/unsloth/pull/5350
  • studio/mmproj: skip unwanted GGUF values via seek instead of read by @danielhanchen in https://github.com/unslothai/unsloth/pull/5431
  • ci: install ipython so transformers.utils.notebook imports cleanly in zoo pytest by @danielhanchen in https://github.com/unslothai/unsloth/pull/5437
  • studio/mlx: lower per-element grad clip default from 5.0 to 1.0 by @danielhanchen in https://github.com/unslothai/unsloth/pull/5440
  • studio/frontend: drop unused next dependency by @danielhanchen in https://github.com/unslothai/unsloth/pull/5438
  • Update version-compat-ci.yml by @rolandtannous in https://github.com/unslothai/unsloth/pull/5445
  • ci: merge duplicate with: keys in notebooks-ci checkout steps by @rolandtannous in https://github.com/unslothai/unsloth/pull/5447
  • studio/chat: built-in web search for OpenAI, Anthropic, OpenRouter, Kimi by @rolandtannous in https://github.com/unslothai/unsloth/pull/5443
  • ci: make compiler-cache shim test order-independent by @danielhanchen in https://github.com/unslothai/unsloth/pull/5449
  • Studio: o3 reasoning summary payload by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5426
  • ci: compiler-cache-shim must mutate live module globals + skip rerun by @danielhanchen in https://github.com/unslothai/unsloth/pull/5452
  • Polish/cloud to providers by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5450
  • ci: cap each compiler-sweep iteration with SIGALRM + log progress by @danielhanchen in https://github.com/unslothai/unsloth/pull/5456
  • ci: add tx >=5,<6 slow compile model_types to KNOWN_BROKEN_COMPILE by @danielhanchen in https://github.com/unslothai/unsloth/pull/5458
  • Restore Flash > SDPA > Flex priority for non-gemma3 models by @mmathew23 in https://github.com/unslothai/unsloth/pull/5455
  • ci: stop a partial mmproj cache from poisoning Mac Studio GGUF CI by @danielhanchen in https://github.com/unslothai/unsloth/pull/5459
  • ci: make Windows Stop Studio teardown tolerate Git Bash signal exit by @danielhanchen in https://github.com/unslothai/unsloth/pull/5460
  • Studio: make API key optional for local providers (llama.cpp/vLLM/Ollama) by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5457
  • studio/chat: built-in code execution for OpenAI + Anthropic by @rolandtannous in https://github.com/unslothai/unsloth/pull/5461
  • ci: switch Windows Stop Studio to a cmd no-op marker by @danielhanchen in https://github.com/unslothai/unsloth/pull/5462
  • tests: raise pwsh/bash subprocess timeout from 10s to 60s by @danielhanchen in https://github.com/unslothai/unsloth/pull/5463
  • studio/install: repair upstream llama.cpp prebuilt mangled symlinks by @danielhanchen in https://github.com/unslothai/unsloth/pull/5465
  • studio/chat: OpenAI container picker delete reliability by @rolandtannous in https://github.com/unslothai/unsloth/pull/5466
  • studio/install: strip top-level dir from repaired symlink target by @danielhanchen in https://github.com/unslothai/unsloth/pull/5467
  • Stop: drop Ollama API key, clean up code execution UI by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5464
  • tests/openai: patch httpx.AsyncClient ctor so delete tests hit mock by @danielhanchen in https://github.com/unslothai/unsloth/pull/5469
  • revert: stop touching DEVICE_TYPE == cuda branches for CPU CI by @danielhanchen in https://github.com/unslothai/unsloth/pull/5473
  • ci: drop cache: 'npm' from setup-node (silent abort on Windows) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5474
  • ci: bump Mac json-images timeout 30 -> 45 min (cache-miss path) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5475
  • ci: wrap hf download in xet-tuned stall-retry loop (root-cause Mac 30-min hang) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5476
  • ci: deterministic check for studio/frontend dep removals by @danielhanchen in https://github.com/unslothai/unsloth/pull/5478
  • studio/frontend: drop unused dependencies, move type pkg to devDeps by @danielhanchen in https://github.com/unslothai/unsloth/pull/5477
  • intel-gpu: refresh xpu extras (fix torch 2.10, add 2.7.1 / 2.9.1 / 2.11.0 / 2.12.0) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5484
  • Studio: auto-load models when adding a cloud provider by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5472
  • Studio: code execution config visual polish by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5471
  • disable_torchcodec_if_broken: also patch datasets and clean sys.modules by @danielhanchen in https://github.com/unslothai/unsloth/pull/5483
  • tests: pinned-symbol canary for unsloth-zoo save_pretrained_merged guards (#5410) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5433
  • intel-gpu: pin unsloth_zoo>=2026.5.2 (fixes #5494) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5499
  • fix(sentence_transformer): resume PEFT checkpoints under sentence-transformers >= 5.4 by @Etherll in https://github.com/unslothai/unsloth/pull/5454
  • Studio: serialise GGUF reload and inherit unsloth-run extra args by @danielhanchen in https://github.com/unslothai/unsloth/pull/5427
  • Studio: IME / multilingual composer regression test + RTL dir="auto" by @danielhanchen in https://github.com/unslothai/unsloth/pull/5485
  • fix: preserve tokenizer eos token on merged saves by @anmolxlight in https://github.com/unslothai/unsloth/pull/5451
  • studio/chat: reuse Anthropic code_execution container across turns by @rolandtannous in https://github.com/unslothai/unsloth/pull/5519
  • Fix Linux prebuilt installs for branch-based llama.cpp releases by @mmathew23 in https://github.com/unslothai/unsloth/pull/5493
  • Studio: stop hint, Uvicorn log rename, reachability check + Mac UI CI retry hardening by @danielhanchen in https://github.com/unslothai/unsloth/pull/5503
  • Studio composer action pill styling by @Imagineer99 in https://github.com/unslothai/unsloth/pull/5522
  • Fix /recommended-folders 500 on unreadable model directories (Python 3.12+) by @mmathew23 in https://github.com/unslothai/unsloth/pull/5523
  • studio/chat: persist Anthropic container id on first turn of new thread by @rolandtannous in https://github.com/unslothai/unsloth/pull/5526
  • studio/openai: align chat completions docstring with stream=false default (closes #5047) by @wtfashwin in https://github.com/unslothai/unsloth/pull/5524
  • Add a simple --version flag by @melroy89 in https://github.com/unslothai/unsloth/pull/5516
  • studio: load cached GGUF models when fully offline by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5505
  • studio: expose launcher capability bits on unauth /api/health by @danielhanchen in https://github.com/unslothai/unsloth/pull/5486
  • studio: tighten sandbox blocklist precision (bash, hf upload, NOFILE) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5487
  • studio: scope cancel-cleanup to in-flight tmp dirs; walk back tool_call_id by @danielhanchen in https://github.com/unslothai/unsloth/pull/5488
  • studio: proxy-aware login rate-limit; allow google favicons in CSP by @danielhanchen in https://github.com/unslothai/unsloth/pull/5489
  • studio/frontend: wire logout, singleflight refresh, shared 422 helper, current-password input by @danielhanchen in https://github.com/unslothai/unsloth/pull/5490
  • tests/studio: lock in Windows GPU detection fix (#5106) with a synthetic CI test by @danielhanchen in https://github.com/unslothai/unsloth/pull/5376
  • Studio: auto-enable MTP speculative decoding for MTP GGUFs by @danielhanchen in https://github.com/unslothai/unsloth/pull/5527
  • Studio: warn when llama.cpp prebuilt is too old for MTP by @danielhanchen in https://github.com/unslothai/unsloth/pull/5528
  • Studio: warn when llama.cpp prebuilt is at least 3 days behind by @danielhanchen in https://github.com/unslothai/unsloth/pull/5529
  • studio: extend offline DNS auto-detect to inference parent + training by @danielhanchen in https://github.com/unslothai/unsloth/pull/5512
  • Fix ORPO text-only tokenization with processors by @alkinun in https://github.com/unslothai/unsloth/pull/5501
  • fix(studio/worker): inject --gcc-install-dir for HIP source builds on Ubuntu 24.04 by @h34v3nzc0dex in https://github.com/unslothai/unsloth/pull/5517
  • studio: gate image input on effective vision capability by @Etherll in https://github.com/unslothai/unsloth/pull/5492
  • studio/install: fix mac desktop shortcut spawning and lifecycle by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5496
  • studio: add uninstall.sh and document it in README by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5497
  • Studio update CI: round-trip install -> update -> uninstall by @danielhanchen in https://github.com/unslothai/unsloth/pull/5536
  • studio: fix Connections dialog UX issues surfaced by image-gate probe by @danielhanchen in https://github.com/unslothai/unsloth/pull/5518
  • studio: add uninstall.ps1 for Windows by @danielhanchen in https://github.com/unslothai/unsloth/pull/5513
  • Fix num_logits_to_keep regression on transformers >= 4.52 by @danielhanchen in https://github.com/unslothai/unsloth/pull/5538
  • Uninstaller script by @PTFOPlayer in https://github.com/unslothai/unsloth/pull/4611
  • Add OpenDocument chat attachments by @alkinun in https://github.com/unslothai/unsloth/pull/5510
  • studio/frontend: stop showing Generating spinner on empty welcome view by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5530
  • studio/frontend: swap Hugeicons spokes spinner for CSS ring by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5531
  • studio/frontend: grow chat composer to 16 rows and inset scrollbar by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5540
  • studio/frontend: make toast and inline error text selectable and copyable by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5506
  • studio: add dismissable toasts with corner close button by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5509
  • studio: install flash-linear-attention and tilelang for Qwen3.5 family by @danielhanchen in https://github.com/unslothai/unsloth/pull/5434
  • studio/frontend: soften toast shadow and tighten vertical padding by @shimmyshimmer in https://github.com/unslothai/unsloth/pull/5511
  • fast_generate: unify legacy/new logits kwarg + fix Mistral merge site by @danielhanchen in https://github.com/unslothai/unsloth/pull/5543
  • studio/frontend: hide Current password input on first boot by @danielhanchen in https://github.com/unslothai/unsloth/pull/5545
  • tests/studio: tighten MLX smoke gates (loss + round-trip, _on_step grad_norm) by @danielhanchen in https://github.com/unslothai/unsloth/pull/5537
  • tests + CI: callback signature drift detector by @danielhanchen in https://github.com/unslothai/unsloth/pull/5498
  • images: use narrower Discord button and drop duplicate by @danielhanchen in https://github.com/unslothai/unsloth/pull/5552
  • fix(studio): handle expired OpenAI shell-tool containers without surfacing error in chat by @rolandtannous in https://github.com/unslothai/unsloth/pull/5547
  • studio/chat: release stuck IME flag when compositionend never fires by @wtfashwin in https://github.com/unslothai/unsloth/pull/5551

New Contributors

  • @Anai-Guo made their first contribution in https://github.com/unslothai/unsloth/pull/5329
  • @uderbashi made their first contribution in https://github.com/unslothai/unsloth/pull/5168
  • @Tenith01 made their first contribution in https://github.com/unslothai/unsloth/pull/5356
  • @anmolxlight made their first contribution in https://github.com/unslothai/unsloth/pull/5451
  • @wtfashwin made their first contribution in https://github.com/unslothai/unsloth/pull/5524
  • @melroy89 made their first contribution in https://github.com/unslothai/unsloth/pull/5516
  • @h34v3nzc0dex made their first contribution in https://github.com/unslothai/unsloth/pull/5517
  • @PTFOPlayer made their first contribution in https://github.com/unslothai/unsloth/pull/4611

Full Changelog: https://github.com/unslothai/unsloth/compare/v0.1.39-beta...v0.1.40-beta

What's Changed in Unsloth-Zoo

  • Register Gemma-4 MoE LoRA extractor to fix grouped_mm contraction crash by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/624
  • feat(mlx): Apple Silicon training (text + VLM, LoRA / full FT, CCE, export) by @Manan17 in https://github.com/unslothai/unsloth-zoo/pull/620
  • tests: skip MoE LoRA extractor coverage when discovery finds zero classes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/628
  • tests: pivot MoE-coverage canary to _unsloth_already_patched marker by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/630
  • fix(compiler): make higher_precision_softmax idempotent by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/631
  • fix(mlx): unblock GGUF export and LoRA reload on Apple Silicon by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/627
  • fix(compiler): unblock all model_types across transformers 4.57.6 and 5.x by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/632
  • Mask for gemma3 attn by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/635
  • Multi Image GRPO by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/613
  • [GRPO] Try returning hidden statex for GRPO by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/609
  • Refactor and consolidate moe lora extractors by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/629
  • security + CI: mirror unsloth's hardening stack onto zoo (greenfield .github/) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/637
  • remove unsloth_zoo/import_fixes.py: redundant with unsloth's by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/639
  • chore: trim verbose comments across PR #637 landing by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/640
  • scripts: ship deterministic comment / docstring-only diff verifier by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/641
  • fix mlx: Adds the MLX training path used by Studio on Apple Silicon by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/634
  • tests: drift detectors cover transformers 5.x (mirror unsloth PR #5423) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/642
  • gpt_oss: reorder helpers before patch_gpt_oss_bnb4bit_auto by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/643
  • init: lazy-load legacy MLX aliases on every host by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/644
  • tests: contain security-conftest network block; fix stale mlx paths; skip GPU import in trainer-exec-marker by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/648
  • fix CI fallout from MLX subpackage refactor (#634) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/646
  • tests: tolerate transformers 5.x source/signature drift in two zoo drift detectors by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/650
  • tests: skip _assert_params_superset when upstream forward is (*args, **kwargs) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/651
  • mlx: lower max_grad_value default from 5.0 to 1.0 by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/652
  • saving: layout-aware MoE LoRA merge + loud-fail on fallback (#5410) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/647
  • tests: follow MoE merge wrapper delegation in drift detector by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/653
  • additional import try except handling for mlx by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/654
  • Patch every LOSS_MAPPING key aliased to ForCausalLMLoss by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/656
  • deps: bump torch upper cap to <2.13.0 (allow xpu 2.11.0 / 2.12.0) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/658
  • Auto-install fused lm_head + cross_entropy forward (opt-in) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/657
  • tests: CPU regression detectors for the MoE merge / save path (#5410) by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/655
  • Fix GGUF converter sibling imports by @alkinun in https://github.com/unslothai/unsloth-zoo/pull/661
  • fix embedding matrix size mismatch bug by @CodeMan62 in https://github.com/unslothai/unsloth-zoo/pull/645
  • Honor UNSLOTH_RETURN_LOGITS in fused forward by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/665
  • init: include HF_DATASETS_OFFLINE in the offline env cross-sync by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/664
  • compiler: single-matmul opt-in for UNSLOTH_RETURN_LOGITS=1 by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/666


Related context

Earlier breaking changes

  • v0.1.43-beta Do not use `unsloth studio update`; it does not fetch latest updates.
