This release adds 2 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
AI summary: Added a dry-run CLI, a savings dashboard tool, and a DB schema migration for cost tracking.
Full changelog
Added
- `llm-router test "<prompt>"` dry-run CLI (`src/llm_router/cli.py`): Simulates a routing decision for any prompt without making an API call. Uses the existing 5-layer classifier to determine task type, complexity, and confidence, then maps to the cheapest appropriate model and shows an estimated cost vs. the Sonnet baseline.

      llm-router test "refactor the auth module to use JWT"
      → Task: analyze / moderate / 85% confidence (via gemini-flash-lite)
      → Chosen: claude-sonnet-4-6   Baseline: claude-sonnet-4-5
      → Saved: $0.00465 (100% cheaper)

- `llm_savings` MCP tool (`src/llm_router/tools/admin.py`): Text-based savings dashboard with time-bucketed aggregates: today / this week / this month / all-time. Shows actual spend, Sonnet baseline, savings, and the efficiency multiplier (Nx), the "wow" metric that makes routing value tangible.
- DB schema migration v2.1 (`src/llm_router/cost.py`): Four new columns added to the `usage` table via idempotent `ALTER TABLE` (safe for existing DBs): `baseline_model`, `potential_cost_usd`, `saved_usd`, `is_simulated`.
- `get_savings_by_period()` (`src/llm_router/cost.py`): Async savings query used by the status bar and `llm_savings`. Falls back to Sonnet estimation for pre-v2.1 rows.
- Enhanced status bar v3 (`src/llm_router/hooks/status-bar.py`): D/W savings, provider health icons (auto-hidden until `health.json` is active), enforcement mode badge, Nx efficiency multiplier. Full mode via `LLM_ROUTER_STATUS_MODE=full`.
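
The v2.1 migration above is described as an idempotent `ALTER TABLE`. The notes do not show how `cost.py` does this, so the following is only a minimal sketch of one common approach, assuming a SQLite database: check `PRAGMA table_info` first and add only the columns that are missing. The four column names come from the notes; the column types, the pre-v2.1 table layout, and the helper name `migrate_usage_table` are assumptions for illustration.

```python
import sqlite3

# Column names are from the release notes; the SQLite types and the
# default value are assumptions made for this sketch.
NEW_COLUMNS = {
    "baseline_model": "TEXT",
    "potential_cost_usd": "REAL",
    "saved_usd": "REAL",
    "is_simulated": "INTEGER DEFAULT 0",
}


def migrate_usage_table(conn: sqlite3.Connection) -> list[str]:
    """Add any missing columns to `usage`; return the names that were added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(usage)")}
    added = []
    for name, sql_type in NEW_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE usage ADD COLUMN {name} {sql_type}")
            added.append(name)
    conn.commit()
    return added


conn = sqlite3.connect(":memory:")
# Hypothetical pre-v2.1 schema, present only so the sketch can run.
conn.execute("CREATE TABLE usage (id INTEGER PRIMARY KEY, model TEXT, cost_usd REAL)")

first_run = migrate_usage_table(conn)   # adds all four columns
second_run = migrate_usage_table(conn)  # finds nothing left to add
print(first_run)
print(second_run)
```

Because the second call sees the columns already present, it issues no `ALTER TABLE` statements, which is what makes re-running the migration on an existing database safe.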
Fixed
- Ruff F541 (`src/llm_router/cli.py:1275`): spurious `f` prefix on a string with no placeholders; broke CI lint on both Python 3.11 and 3.12.
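
For readers unfamiliar with the rule, F541 flags an f-string that contains no `{}` placeholders. The actual string at `cli.py:1275` is not shown in these notes, so the strings and the `model` variable below are made up purely to illustrate the pattern and its fix.

```python
# An f-string with no placeholders: valid Python, but Ruff reports F541.
flagged = f"Routing complete"

# The fix is to drop the prefix; the resulting value is identical.
fixed = "Routing complete"

# An f-string that actually interpolates something is not flagged.
model = "gemini-flash-lite"  # hypothetical value for illustration
legit = f"Routed to {model}"

print(flagged == fixed)
print(legit)
```

The change has no runtime effect, which is why this kind of issue tends to surface only in CI lint rather than in tests.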
About ypollak2/llm-router
Subscription-aware LLM router for Claude Code. Routes tasks to 20+ providers (OpenAI, Gemini, Groq, Ollama, Codex) based on complexity classification, Claude subscription pressure, and cost. Free tasks stay on Claude subscription; expensive tasks fall back to the cheapest capable model. Includes 30 MCP tools, 6 auto-routing hooks, semantic dedup cache, prompt caching, daily spend cap, and a live web dashboard.