ypollak2/llm-router

v9.0.1 Breaking

This release includes breaking changes; platform teams should review them before upgrading.

Published 11d LLM Frameworks
✓ No known CVEs patched

Topics

ai-routing anthropic claude claude-code cost-optimization gemini litellm llm llm-router mcp-server model-router ollama openai

ReleasePort's take

Light signal
editorial:auto 11d

The routing chain now correctly orders Ollama first and places Claude last; the dashboard adds a LAST PROMPT ROUTING panel with tier labels.

Why it matters: Fixes incorrect free‑first routing order, restores Ollama functionality, and provides UI visibility of model tiers—critical for cost‑effective prompt handling.

Summary

AI summary

Fixed routing chain ordering, re‑enabled Ollama, and added LAST PROMPT ROUTING panel with tier labels.

Changes in this release

Feature Medium

Added LAST PROMPT ROUTING panel showing models used with tier labels.

Source: llm_adapter@2026-05-23

Confidence: low

Feature Medium

Added Claude host model tracking displaying subscription quota delta.

Source: llm_adapter@2026-05-23

Confidence: low

Feature Medium

Displays 5‑hour quota reset time next to session quota bar.

Source: llm_adapter@2026-05-23

Confidence: low

Deprecation Medium

Removed quality gates counter, baseline/actual comparison, and yearly projection from dashboard.

Source: llm_adapter@2026-05-23

Confidence: low

Bugfix Medium

Re-enabled Ollama as first in routing chain (previously silently disabled).

Source: llm_adapter@2026-05-23

Confidence: high

Bugfix Medium

Corrected free-first routing order: Ollama → Codex → paid APIs → Claude (subscription last).

Source: llm_adapter@2026-05-23

Confidence: high

Full changelog

What's New

Fixed

  • Free-first routing chain — Ollama → Codex → paid APIs → Claude (subscription last). Previously Claude Code sessions put Codex last.
  • Ollama re-enabled — was silently disabled in config; now active as first in chain.
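The corrected free-first order can be illustrated with a minimal Python sketch. This is not the project's actual implementation: the category names, the `CHAIN_RANK` table, the `order_chain` helper, and every model name other than `claude/opus-4.6` are hypothetical and exist only to show the ordering described above.

```python
# Hypothetical category ranks. The release notes give only the order:
# Ollama -> Codex -> paid APIs -> Claude (subscription last).
CHAIN_RANK = {
    "ollama": 0,
    "codex": 1,
    "paid_api": 2,
    "claude_subscription": 3,
}


def order_chain(candidates):
    """Sort (model_name, category) pairs into free-first order.

    sorted() is stable, so models within one category keep their
    original relative order.
    """
    return sorted(candidates, key=lambda pair: CHAIN_RANK[pair[1]])


# Illustrative input, deliberately out of order. All names except
# claude/opus-4.6 are placeholders, not real configuration values.
candidates = [
    ("claude/opus-4.6", "claude_subscription"),
    ("paid/example-a", "paid_api"),
    ("codex/example", "codex"),
    ("paid/example-b", "paid_api"),
    ("ollama/example", "ollama"),
]

ordered_names = [name for name, _ in order_chain(candidates)]
print(" -> ".join(ordered_names))
```

Under these assumptions the subscription-backed Claude model always ends up last, which matches the "subscription last" behaviour described in the changelog.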

Removed

  • Quality gates counter, baseline/actual comparison, yearly projection from dashboard (misleading for subscription users)

Added

  • LAST PROMPT ROUTING panel — shows all models used in the last prompt with [FREE]/[SUB]/[API] tier labels
  • Claude host model tracking — [SUB] claude/opus-4.6 row shows subscription quota delta alongside routed calls
  • 5h quota reset time — displays "resets in Xh Ym (5:15pm BST)" next to session quota bar
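As a rough illustration of the reset-time text quoted above, the following Python sketch builds a string in the same "resets in Xh Ym (5:15pm BST)" shape. The `format_reset` function and the example timestamps are assumptions made for this sketch; how the dashboard actually computes or renders the value is not stated in the release notes.

```python
from datetime import datetime, timedelta, timezone


def format_reset(now, reset_at):
    """Return a 'resets in Xh Ym (h:mmam/pm TZ)' string.

    Both arguments must be timezone-aware datetimes. A reset time in the
    past is clamped to 0h 0m rather than shown as negative.
    """
    remaining = reset_at - now
    total_minutes = max(0, int(remaining.total_seconds() // 60))
    hours, minutes = divmod(total_minutes, 60)

    # Build the 12-hour clock by hand to avoid locale-dependent strftime output.
    hour12 = reset_at.hour % 12 or 12
    suffix = "am" if reset_at.hour < 12 else "pm"
    clock = f"{hour12}:{reset_at.minute:02d}{suffix}"

    return f"resets in {hours}h {minutes}m ({clock} {reset_at.tzname()})"


# Example values chosen for illustration only.
BST = timezone(timedelta(hours=1), "BST")
now = datetime(2026, 5, 23, 14, 33, tzinfo=BST)
reset_at = datetime(2026, 5, 23, 17, 15, tzinfo=BST)

message = format_reset(now, reset_at)
print(message)
```

Clamping at zero is a design choice for this sketch, so a stale reset timestamp never produces a negative countdown.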

Upgrade

pip install --upgrade llm-routing

About ypollak2/llm-router

Subscription-aware LLM router for Claude Code. Routes tasks to 20+ providers (OpenAI, Gemini, Groq, Ollama, Codex) based on complexity classification, Claude subscription pressure, and cost. Free tasks stay on Claude subscription; expensive tasks fall back to the cheapest capable model. Includes 30 MCP tools, 6 auto-routing hooks, semantic dedup cache, prompt caching, daily spend cap, and a live web dashboard.
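The "cheapest capable model" fallback mentioned above can be sketched as a simple filter-then-minimize step. This is a hedged illustration only: the `Model` type, the 1 to 3 capability scale, the cost figures, and the model names are invented for the example and do not reflect the router's real classification logic or any provider's pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    capability: int       # hypothetical scale: 1 = simple, 3 = complex
    cost_per_call: float  # illustrative number, not real pricing


def cheapest_capable(models, required_capability):
    """Pick the lowest-cost model whose capability meets the requirement."""
    capable = [m for m in models if m.capability >= required_capability]
    if not capable:
        raise ValueError("no model meets the required capability")
    return min(capable, key=lambda m: m.cost_per_call)


# Placeholder catalogue for the sketch.
catalogue = [
    Model("example/small", capability=1, cost_per_call=0.0),
    Model("example/medium", capability=2, cost_per_call=0.002),
    Model("example/large", capability=3, cost_per_call=0.03),
]

choice = cheapest_capable(catalogue, required_capability=2)
print(choice.name)
```

A real router would also need to weigh subscription pressure and spend caps, which this sketch leaves out.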

Related context

Earlier breaking changes

  • v9.2.0 Changes auto‑route directive from advisory "DO NOT SKIP" to hard constraint with explicit blocked tools list.
  • v9.2.0 Breaks permanent downgrade of enforcement after first Edit/Write; v13 now requires per‑turn routing.
