This release adds 10 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
AI summary: Dynamic model selection based on real-time budget, quality, latency, and user feedback replaces the static routing table.
Full changelog
What's new — Adaptive Universal Router v5.0
The router now dynamically selects models based on real-time budget pressure, benchmark quality scores, measured latency, and user feedback — replacing the static routing table.
Five new pillars
| Pillar | Module | What it does |
|--------|--------|-------------|
| Budget Oracle | budget.py | Pressure [0.0–1.0] across all provider types. Claude quota uses highest_pressure (max of session/weekly/sonnet dims) |
| Universal Discovery | discover.py | Scans Ollama, HuggingFace, API keys, Codex CLI. 16-family alias registry for local models |
| Live Benchmark Registry | benchmarks.py | get_quality_score() for API + Ollama models. Weekly background refresh from session-start |
| Unified Scorer | scorer.py | Weighted formula: simple tasks = 45% budget + 20% quality; complex = 55% quality + 20% budget |
| Dynamic Chain Builder | chain_builder.py | Assembles best-first chain from discovered models. Free-first invariant preserved |
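To make the scoring and chain-building rows above concrete, here is a minimal sketch and not the project's actual code. The budget and quality weights come from the table; the notes do not say where the remaining weight goes, so this sketch assigns it to a hypothetical latency term. The `Candidate` fields, the function names, the normalisation of each term to [0.0, 1.0], and the reading of "free-first" as "free models sort ahead of paid ones" are all assumptions.

```python
from dataclasses import dataclass

# Budget and quality weights are taken from the release notes. The remainder
# (35% simple, 25% complex) is NOT specified there; giving it to latency is
# an assumption made only for this sketch.
WEIGHTS = {
    "simple": {"budget": 0.45, "quality": 0.20, "latency": 0.35},
    "complex": {"budget": 0.20, "quality": 0.55, "latency": 0.25},
}


def highest_pressure(session: float, weekly: float, sonnet: float) -> float:
    """Max of the three quota dimensions, clamped to [0.0, 1.0]."""
    return min(1.0, max(0.0, session, weekly, sonnet))


@dataclass
class Candidate:  # hypothetical shape, not the router's real data model
    name: str
    pressure: float  # 0.0 = budget untouched, 1.0 = budget exhausted
    quality: float   # benchmark score normalised to 0.0-1.0
    latency: float   # normalised so that higher means faster
    free: bool = False


def score(c: Candidate, complexity: str) -> float:
    """Weighted sum; low budget pressure contributes a high budget term."""
    w = WEIGHTS[complexity]
    return (
        w["budget"] * (1.0 - c.pressure)
        + w["quality"] * c.quality
        + w["latency"] * c.latency
    )


def build_chain(candidates, complexity):
    """Best-first chain with free models ahead of paid ones (assumed rule)."""
    return sorted(candidates, key=lambda c: (not c.free, -score(c, complexity)))


candidates = [
    Candidate("claude", highest_pressure(0.2, 0.9, 0.4), quality=0.9, latency=0.7),
    Candidate("cheap-api", pressure=0.1, quality=0.4, latency=0.9),
    Candidate("ollama-local", pressure=0.0, quality=0.5, latency=0.5, free=True),
]
simple_chain = [c.name for c in build_chain(candidates, "simple")]
complex_chain = [c.name for c in build_chain(candidates, "complex")]
```

With these made-up numbers the free local model leads both chains, the low-pressure API model comes second for simple tasks, and the higher-quality model moves ahead of it for complex tasks despite its 0.9 quota pressure.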
New MCP tool: llm_budget
Shows real-time Budget Oracle pressure bars for all providers.
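The release notes do not show the tool's output, so the following is only an illustrative sketch of how a pressure value in [0.0, 1.0] could be drawn as a text bar. The function name `pressure_bar`, the bar characters, and the width are hypothetical and not taken from llm-router.

```python
def pressure_bar(pressure: float, width: int = 20) -> str:
    """Render a pressure value as a fixed-width text bar (hypothetical format)."""
    p = min(1.0, max(0.0, pressure))  # clamp out-of-range inputs
    filled = round(p * width)
    return "[" + "#" * filled + "." * (width - filled) + f"] {p:.0%}"


# Provider names and values below are made up for illustration.
for provider, value in {"claude": 0.5, "openai": 0.0, "gemini": 1.0}.items():
    print(f"{provider:<8} {pressure_bar(value, width=10)}")
```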
Feature flag
Setting LLM_ROUTER_DYNAMIC=true activates dynamic routing. The default is false, which preserves full v4.x backward compatibility.
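A rough sketch of how an opt-in flag like this is typically read. Only the variable name and its false default come from the release notes; the helper name `dynamic_routing_enabled` and the case-insensitive parsing are assumptions, and the router's own parsing may differ.

```python
import os


def dynamic_routing_enabled(env=None) -> bool:
    """Return True only when LLM_ROUTER_DYNAMIC is explicitly set to 'true'."""
    env = os.environ if env is None else env
    # Unset or any other value falls back to the static (v4.x-style) routing.
    return env.get("LLM_ROUTER_DYNAMIC", "false").strip().lower() == "true"


mode = "dynamic" if dynamic_routing_enabled() else "static"
```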
Hook improvements
- enforce-route.py: session-type tracking. Once Claude edits a file, enforcement downgrades to soft for the session
- auto-route.py: _is_build_task() fast-path skips routing for implementation prompts
- Pending route TTL: 300s → 60s
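The hook changes above amount to two small pieces of state, sketched here under stated assumptions. Only the 60-second TTL and the "soft" level come from the notes; the class names, the "strict" starting level, and the field names are hypothetical and do not mirror the actual hook scripts.

```python
import time
from dataclasses import dataclass, field

PENDING_ROUTE_TTL_S = 60  # per the release notes; 300 in earlier versions


@dataclass
class PendingRoute:  # hypothetical record of a routing hint awaiting use
    target: str
    created_at: float = field(default_factory=time.monotonic)

    def expired(self, now=None) -> bool:
        """A pending route is ignored once it is older than the TTL."""
        now = time.monotonic() if now is None else now
        return now - self.created_at > PENDING_ROUTE_TTL_S


@dataclass
class SessionState:  # hypothetical per-session enforcement tracker
    enforcement: str = "strict"  # assumed name for the non-soft level

    def on_file_edit(self) -> None:
        """After the first file edit, stay soft for the rest of the session."""
        self.enforcement = "soft"


route = PendingRoute("example-route", created_at=100.0)
session = SessionState()
session.on_file_edit()
```

Passing `now` explicitly keeps the expiry check deterministic, which makes the TTL boundary easy to verify without sleeping.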
Upgrade
pip install --upgrade claude-code-llm-router && llm-router install
About ypollak2/llm-router
Subscription-aware LLM router for Claude Code. Routes tasks to 20+ providers (OpenAI, Gemini, Groq, Ollama, Codex) based on complexity classification, Claude subscription pressure, and cost. Free tasks stay on Claude subscription; expensive tasks fall back to the cheapest capable model. Includes 30 MCP tools, 6 auto-routing hooks, semantic dedup cache, prompt caching, daily spend cap, and a live web dashboard.