ypollak2/llm-router

v9.0.1 Breaking

This release includes breaking changes. Platform teams should review them before upgrading.

Published 2mo LLM Frameworks

✓ No known CVEs patched

Topics

ai-routing anthropic claude claude-code cost-optimization gemini litellm llm llm-router mcp-server model-router ollama openai

ReleasePort's take

Light signal

editorial:auto 2mo

The routing chain now correctly orders Ollama first and places Claude last; the dashboard adds a LAST PROMPT ROUTING panel with tier labels.

Why it matters: The release fixes the free-first routing order, restores Ollama as an active provider, and shows model tiers in the dashboard. All three affect how cheaply prompts are handled.

Summary

AI summary

Fixed routing chain ordering, re‑enabled Ollama, and added LAST PROMPT ROUTING panel with tier labels.

Changes in this release

Type	Severity	Summary	Source	Confidence	CVE
Feature	Medium	Added LAST PROMPT ROUTING panel showing models used with tier labels.	llm_adapter@2026-05-23	low	None
Feature	Medium	Added Claude host model tracking displaying subscription quota delta.	llm_adapter@2026-05-23	low	None
Feature	Medium	Displays 5-hour quota reset time next to session quota bar.	llm_adapter@2026-05-23	low	None
Deprecation	Medium	Removed quality gates counter, baseline/actual comparison, and yearly projection from dashboard.	llm_adapter@2026-05-23	low	None
Bugfix	Medium	Re-enabled Ollama as first in routing chain (previously silently disabled).	llm_adapter@2026-05-23	high	None
Bugfix	Medium	Corrected free-first routing order: Ollama → Codex → paid APIs → Claude (subscription last).	llm_adapter@2026-05-23	high	None

Full changelog

What's New

Fixed

Free-first routing chain — Ollama → Codex → paid APIs → Claude (subscription last). Previously Claude Code sessions put Codex last.
Ollama re-enabled — was silently disabled in config; now active as first in chain.
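
The corrected ordering can be pictured with a small sketch. This is not the router's actual code: the `Provider` class, the `TIER_RANK` table, and the tier names are hypothetical and exist only to illustrate "free first, subscription last" with disabled providers skipped.

```python
from dataclasses import dataclass

# Hypothetical ranks for illustration; a lower rank is tried earlier.
TIER_RANK = {"local": 0, "codex": 1, "api": 2, "subscription": 3}


@dataclass(frozen=True)
class Provider:
    name: str
    tier: str
    enabled: bool = True


def free_first_chain(providers):
    """Return enabled providers ordered free-first, subscription last."""
    active = [p for p in providers if p.enabled]
    # sorted() is stable, so providers in the same tier keep their input order.
    return sorted(active, key=lambda p: TIER_RANK[p.tier])


providers = [
    Provider("claude", "subscription"),
    Provider("openai", "api"),
    Provider("codex", "codex"),
    Provider("ollama", "local"),
]

chain = [p.name for p in free_first_chain(providers)]
print(" -> ".join(chain))  # ollama -> codex -> openai -> claude
```

In this sketch, a provider with `enabled=False` simply drops out of the chain without any warning, which is the kind of silent gap the Ollama fix above addresses.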

Removed

Quality gates counter, baseline/actual comparison, yearly projection from dashboard (misleading for subscription users)

Added

LAST PROMPT ROUTING panel — shows all models used in the last prompt with [FREE]/[SUB]/[API] tier labels
Claude host model tracking — [SUB] claude/opus-4.6 row shows subscription quota delta alongside routed calls
5h quota reset time — displays "resets in Xh Ym (5:15pm BST)" next to session quota bar
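
As a rough illustration of the two display formats described above, the sketch below builds a tier-labelled row and a "resets in Xh Ym" string. The function names, the fixed-offset `BST` timezone, the example date, and the placeholder model names (other than `claude/opus-4.6`) are assumptions made for the example, not the dashboard's implementation.

```python
from datetime import datetime, timedelta, timezone


def format_row(tier: str, model: str) -> str:
    """Prefix a model name with its tier label, e.g. '[SUB] claude/opus-4.6'."""
    return f"[{tier}] {model}"


def format_reset(now: datetime, reset_at: datetime) -> str:
    """Render the time left until reset_at plus its 12-hour wall-clock time."""
    remaining = max(reset_at - now, timedelta(0))  # never show a negative countdown
    hours, minutes = divmod(int(remaining.total_seconds()) // 60, 60)
    hour12 = reset_at.hour % 12 or 12
    suffix = "am" if reset_at.hour < 12 else "pm"
    zone = reset_at.tzname() or ""
    clock = f"{hour12}:{reset_at.minute:02d}{suffix} {zone}".strip()
    return f"resets in {hours}h {minutes}m ({clock})"


# Fixed UTC+1 offset labelled "BST" so the example has no tz database dependency.
BST = timezone(timedelta(hours=1), "BST")
now = datetime(2026, 6, 1, 14, 0, tzinfo=BST)  # arbitrary example date
reset_at = now + timedelta(hours=3, minutes=15)

rows = [
    format_row("FREE", "ollama/example-model"),  # placeholder model name
    format_row("SUB", "claude/opus-4.6"),
    format_row("API", "openai/example-model"),   # placeholder model name
]
print("\n".join(rows))
print(format_reset(now, reset_at))  # resets in 3h 15m (5:15pm BST)
```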

Upgrade

pip install --upgrade llm-routing
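
After upgrading, one way to confirm which version is installed is the standard-library `importlib.metadata` module. The sketch below assumes the distribution name matches the `pip` command above (`llm-routing`) and that the version string is plain dotted numbers; the helper names are made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_at_least(installed: str, required: str) -> bool:
    """Compare dotted numeric versions; assumes no pre-release suffixes."""
    def parse(v: str):
        return tuple(int(part) for part in v.split("."))
    return parse(installed) >= parse(required)


found = installed_version("llm-routing")  # name taken from the pip command above
if found is None:
    print("llm-routing is not installed in this environment")
elif is_at_least(found, "9.0.1"):
    print(f"llm-routing {found} includes the v9.0.1 changes")
else:
    print(f"llm-routing {found} is older than v9.0.1")
```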

About ypollak2/llm-router

Subscription-aware LLM router for Claude Code. Routes tasks to 20+ providers (OpenAI, Gemini, Groq, Ollama, Codex) based on complexity classification, Claude subscription pressure, and cost. Free tasks stay on Claude subscription; expensive tasks fall back to the cheapest capable model. Includes 30 MCP tools, 6 auto-routing hooks, semantic dedup cache, prompt caching, daily spend cap, and a live web dashboard.

Earlier breaking changes

v9.2.0 Changes auto‑route directive from advisory "DO NOT SKIP" to hard constraint with explicit blocked tools list.
v9.2.0 Breaks permanent downgrade of enforcement after first Edit/Write; v13 now requires per‑turn routing.