ypollak2/llm-router

v8.9.0 Breaking

This release includes 1 breaking change that platform teams should review before upgrading.

Published 16d ago · LLM Frameworks
✓ No known CVEs patched

Topics

ai-routing anthropic claude claude-code cost-optimization gemini litellm llm llm-router mcp-server model-router ollama openai

Affected surfaces

auth rbac breaking_upgrade

Summary

AI summary

Prompt classification is now local-only by default; set LLM_ROUTER_CLASSIFY_LOCAL_ONLY=false to enable external classifiers.

Changes in this release

Feature Low

Local-only classification by default for prompts unless opted in

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Low

Env var LLM_ROUTER_CLASSIFY_LOCAL_ONLY=false enables external classifiers

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Low

Logs which model handles each request before routing

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Low

Budget enforcement failures visible in logs

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Low

Setting LLM_ROUTER_SLIM=off restores all tools

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Low

Only write operations disable enforcement for the session

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Performance Low

Replace silent exception blocks with warning-level logging in routing safety

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: low

Bugfix Medium

Fix conflicting savings baseline: savings now consistently use Sonnet (previously mislabeled as Opus)

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Bugfix Medium

Deduplicate savings_stats vs usage table by provider

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Bugfix Medium

Fix phantom savings: failed free-provider calls (0 tokens) now count as $0 saved

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Bugfix Low

Adaptive color palette fixes white-background readability in dashboard

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Refactor Low

Default slim mode changed from "off" to "routing" (12 core tools)

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Refactor Low

Read/Glob/Grep/LS no longer mark session as "coding"

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Other Low

Fix package name in GETTING_STARTED.md and QUICKSTART_2MIN.md

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: low

Full changelog

Trust & Safety Hardening

This release addresses findings from a comprehensive churn and safety investigation.

Savings Math (Honesty)

  • Fix conflicting baseline: savings now consistently use Sonnet as baseline (was mislabeled as Opus)
  • Fix double-counting: deduplicate savings_stats vs usage table by provider
  • Fix phantom savings: 0 tokens from failed free-provider calls = $0 saved
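
The three rules above can be sketched in a few lines of Python. This is only an illustration: the `Call` record, the `source` field, the reading of "deduplicate by provider" as "prefer usage-table rows when a provider appears in both tables", and the baseline price are all assumptions made for the example, not llm-router's actual schema, logic, or pricing.

```python
from dataclasses import dataclass

# Hypothetical baseline price in USD per 1M tokens; NOT the real Sonnet rate.
BASELINE_USD_PER_MTOK = 3.00


@dataclass(frozen=True)
class Call:
    source: str      # table the row came from: "usage" or "savings_stats"
    provider: str
    tokens: int      # 0 when the call failed
    cost_usd: float  # what the call actually cost


def dedupe_by_provider(calls):
    """Keep savings_stats rows only for providers absent from the usage table."""
    in_usage = {c.provider for c in calls if c.source == "usage"}
    return [c for c in calls
            if c.source == "usage" or c.provider not in in_usage]


def saved_usd(call):
    """Savings against one fixed baseline; a call with 0 tokens saves $0."""
    if call.tokens <= 0:
        return 0.0
    baseline_cost = call.tokens / 1_000_000 * BASELINE_USD_PER_MTOK
    return max(baseline_cost - call.cost_usd, 0.0)


def total_saved(calls):
    """Sum savings after removing rows counted in both tables."""
    return sum(saved_usd(c) for c in dedupe_by_provider(calls))
```

With one successful free call recorded in both tables and one failed call, only the single successful row contributes to the total.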

Privacy (Local-First Default)

  • Local-only classification by default: prompts are no longer sent to OpenAI/Gemini for classification unless explicitly opted in
  • New env var: LLM_ROUTER_CLASSIFY_LOCAL_ONLY=false to enable external classifiers
  • Provider transparency: logs which model handles each request before routing
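
As a rough sketch of the opt-in described above, the snippet below reads `LLM_ROUTER_CLASSIFY_LOCAL_ONLY` and defaults to local-only when it is unset. The function name and the extra falsy spellings (`0`, `no`, `off`) are assumptions for illustration; the changelog only documents the value `false`, and the router's real parsing may differ.

```python
import os


def classify_local_only(env=None):
    """Return True unless LLM_ROUTER_CLASSIFY_LOCAL_ONLY is explicitly falsy.

    Hypothetical helper: only the variable name and the value "false"
    come from the release notes.
    """
    env = os.environ if env is None else env
    raw = env.get("LLM_ROUTER_CLASSIFY_LOCAL_ONLY", "true")
    return raw.strip().lower() not in {"false", "0", "no", "off"}
```

Passing a plain dict instead of `os.environ` makes the default easy to check without touching the real environment.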

Routing Safety

  • Replace 5 silent `except Exception: pass` blocks with warning-level logging
  • Budget enforcement failures are now visible in logs
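
The pattern being replaced looks roughly like the sketch below. The function names, the logger name, and the choice to return `None` when enforcement itself fails are hypothetical; the changelog states only that silent `except Exception: pass` blocks now log at warning level.

```python
import logging

logger = logging.getLogger("routing_safety_sketch")  # hypothetical logger name


def check_budget(spent_usd, cap_usd):
    """Toy budget check that raises on an invalid cap."""
    if cap_usd <= 0:
        raise ValueError("daily cap must be positive")
    return spent_usd < cap_usd


def budget_allows(spent_usd, cap_usd):
    """Return True/False for the budget check, or None if the check failed."""
    try:
        return check_budget(spent_usd, cap_usd)
    except Exception as exc:
        # Before: `except Exception: pass` hid this failure entirely.
        logger.warning("budget enforcement failed: %s", exc)
        return None  # assumption: the caller decides how to proceed
```

The failure is still caught, but it now leaves a trace that an operator can find in the logs.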

Scope Reduction

  • Default slim mode changed from "off" to "routing": 12 core tools registered instead of 60
  • Set LLM_ROUTER_SLIM=off to restore all tools
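
To make the new default concrete, here is a minimal sketch of mode-based tool selection. The tool names are placeholders (the real release registers 12 core tools, or 60 with slim mode off), and treating every value other than `off` as `routing` is an assumption, since the changelog does not list other modes.

```python
import os

# Placeholder tool names for illustration only.
CORE_TOOLS = ["route", "classify", "budget"]
EXTRA_TOOLS = ["dashboard", "cache_stats"]


def tools_to_register(env=None):
    """Pick the tool set from LLM_ROUTER_SLIM, defaulting to "routing"."""
    env = os.environ if env is None else env
    mode = env.get("LLM_ROUTER_SLIM", "routing").strip().lower()
    if mode == "off":
        return CORE_TOOLS + EXTRA_TOOLS  # slim mode off: register everything
    return list(CORE_TOOLS)              # assumed fallback: core tools only
```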

Hook Safety

  • Read/Glob/Grep/LS no longer mark session as "coding" — preserves routing enforcement
  • Only write operations (Edit/Write/MultiEdit) disable enforcement for the session
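
The rule above amounts to a simple membership test, sketched below. The tool names come from the changelog; the function name and the idea of passing the session's tool calls as a list of names are assumptions, not the hook's actual interface.

```python
# Write tools named in the changelog. Read-only tools such as
# Read, Glob, Grep and LS are deliberately absent from this set.
WRITE_TOOLS = frozenset({"Edit", "Write", "MultiEdit"})


def enforcement_active(tool_calls):
    """Return False once any write tool has been used in the session."""
    return not any(name in WRITE_TOOLS for name in tool_calls)
```

A session that has only browsed files keeps routing enforcement; the first write operation turns it off for the rest of that session.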

Dashboard

  • Adaptive color palette for terminal theme compatibility (fixes white-background readability)

Docs

  • Fix package name in GETTING_STARTED.md and QUICKSTART_2MIN.md

Upgrade

pip install --upgrade claude-code-llm-router

Breaking Changes

  • Default behavior for prompt classification changed to local‑only; external classifiers now require `LLM_ROUTER_CLASSIFY_LOCAL_ONLY=false`.

About ypollak2/llm-router

Subscription-aware LLM router for Claude Code. Routes tasks to 20+ providers (OpenAI, Gemini, Groq, Ollama, Codex) based on complexity classification, Claude subscription pressure, and cost. Free tasks stay on Claude subscription; expensive tasks fall back to the cheapest capable model. Includes 30 MCP tools, 6 auto-routing hooks, semantic dedup cache, prompt caching, daily spend cap, and a live web dashboard.

Related context

Earlier breaking changes

  • v9.2.0 Changes auto‑route directive from advisory "DO NOT SKIP" to hard constraint with explicit blocked tools list.
  • v9.2.0 Breaks permanent downgrade of enforcement after first Edit/Write; v13 now requires per‑turn routing.
