This release includes two security fixes of interest to security teams reviewing exposed deployments.
Summary
AI summary: TOCTOU budget enforcement prevents concurrent calls from slipping under spend limits.
Full changelog
What's new
Bug Fixes
- TOCTOU budget enforcement: concurrent calls can no longer both slip under daily/monthly spend limits; a provisional _pending_spend reservation is held inside _budget_lock until the call completes
- Claude quota staleness guard: usage.json older than 24h now returns pressure=0.5 instead of 0.0, preventing unlimited routing when the session hook is absent (set LLM_ROUTER_STALE_PRESSURE_FLOOR to tune)
- Event-loop blocking read: _claude_subscription_state() now uses asyncio.to_thread() for filesystem access
- File handle leak: auto-route.py hook now uses Path.read_text() instead of bare open()
- Duplicate models in chain: Ollama/Codex injection no longer re-adds models already in the static chain
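The TOCTOU fix can be pictured with a minimal sketch. Only the names _pending_spend and _budget_lock come from the changelog; the BudgetGuard class, its methods, and the limit values are hypothetical, and the router's real code may hold the lock or structure the reservation differently. The idea shown is that the check and the reservation happen under one lock, so a second concurrent caller sees the first caller's pending spend.

```python
import asyncio


class BudgetGuard:
    """Hypothetical spend guard for illustration; not the router's actual class."""

    def __init__(self, daily_limit: float) -> None:
        self.daily_limit = daily_limit
        self.spent = 0.0            # settled spend
        self._pending_spend = 0.0   # provisional reservations for in-flight calls
        self._budget_lock = asyncio.Lock()

    async def reserve(self, estimate: float) -> bool:
        # Check and reserve under one lock so two callers cannot both
        # observe the same remaining headroom.
        async with self._budget_lock:
            if self.spent + self._pending_spend + estimate > self.daily_limit:
                return False
            self._pending_spend += estimate
            return True

    async def settle(self, estimate: float, actual: float) -> None:
        # Swap the provisional reservation for the real cost once the call ends.
        async with self._budget_lock:
            self._pending_spend -= estimate
            self.spent += actual


async def guarded_call(guard: BudgetGuard, estimate: float) -> bool:
    if not await guard.reserve(estimate):
        return False
    actual = 0.0
    try:
        await asyncio.sleep(0)  # stand-in for the provider call
        actual = estimate
        return True
    finally:
        # Release the reservation even if the call raised.
        await guard.settle(estimate, actual)


async def demo() -> list:
    guard = BudgetGuard(daily_limit=1.0)
    # Two concurrent calls of 0.6 each: only one fits under the 1.0 limit.
    return await asyncio.gather(guarded_call(guard, 0.6), guarded_call(guard, 0.6))
```

Running asyncio.run(demo()) admits exactly one of the two calls. Without the pending reservation, both would pass the check against the settled spend of 0.0.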
Added
- Correlation ID tracing: every routed call gets a uuid4().hex[:8] ID written to both usage and routing_decisions tables for log↔DB joins
- DB query indices: four new indices on high-cardinality columns for dashboard and analytics queries
- Dashboard token auth: aiohttp middleware validates X-Dashboard-Token on all API routes
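As a rough illustration of the correlation ID item, the sketch below writes one short ID to two tables and joins on it. The table names usage and routing_decisions and the uuid4().hex[:8] expression come from the changelog; the column layouts, the record_call helper, and the sample values are assumptions made up for this example.

```python
import sqlite3
import uuid


def new_correlation_id() -> str:
    # First 8 hex characters of a random UUID, as the changelog describes.
    return uuid.uuid4().hex[:8]


def record_call(conn: sqlite3.Connection, model: str, cost_usd: float, reason: str) -> str:
    cid = new_correlation_id()
    with conn:  # commit both rows together, or roll both back on error
        conn.execute(
            "INSERT INTO usage (correlation_id, model, cost_usd) VALUES (?, ?, ?)",
            (cid, model, cost_usd),
        )
        conn.execute(
            "INSERT INTO routing_decisions (correlation_id, reason) VALUES (?, ?)",
            (cid, reason),
        )
    return cid


conn = sqlite3.connect(":memory:")
# Column layouts are assumptions for illustration only.
conn.execute("CREATE TABLE usage (correlation_id TEXT, model TEXT, cost_usd REAL)")
conn.execute("CREATE TABLE routing_decisions (correlation_id TEXT, reason TEXT)")

cid = record_call(conn, "example-model", 0.002, "low complexity")

# Join the two tables on the shared ID, as one would when tracing a log line.
row = conn.execute(
    "SELECT u.model, d.reason FROM usage u "
    "JOIN routing_decisions d ON u.correlation_id = d.correlation_id "
    "WHERE u.correlation_id = ?",
    (cid,),
).fetchone()
```

An 8-character hex ID carries 32 bits of randomness, so it is best read as a convenient join key for tracing rather than a globally unique identifier.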
Refactor
- route_and_call() reduced from ~960 to ~527 lines via _resolve_profile() and _build_and_filter_chain() extractions
Upgrade
pip install --upgrade claude-code-llm-router && llm-router install
Security Fixes
- Claude quota staleness guard now returns pressure=0.5 when usage.json >24h old (configurable via LLM_ROUTER_STALE_PRESSURE_FLOOR)
- Event-loop blocking read offloaded to asyncio.to_thread() in _claude_subscription_state()
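The two fixes above can be combined in one small sketch: a blocking reader that falls back to a pressure floor when usage.json is stale, wrapped in asyncio.to_thread() so the event loop is not blocked. The environment variable name, the 24h threshold, and the 0.5 default come from the changelog. The function names, the "pressure" JSON key, the use of file mtime to measure age, and the handling of a missing file are all assumptions for illustration, not the router's actual implementation.

```python
import asyncio
import json
import os
import tempfile
import time
from pathlib import Path
from typing import Optional

STALE_AFTER_SECONDS = 24 * 60 * 60


def stale_pressure_floor() -> float:
    # LLM_ROUTER_STALE_PRESSURE_FLOOR is named in the changelog; 0.5 is the stated default.
    return float(os.environ.get("LLM_ROUTER_STALE_PRESSURE_FLOOR", "0.5"))


def read_pressure(path: Path, now: Optional[float] = None) -> float:
    """Blocking read: return the recorded pressure, or the floor if the file is stale."""
    now = time.time() if now is None else now
    floor = stale_pressure_floor()
    try:
        age = now - path.stat().st_mtime  # assumption: staleness is judged by mtime
        data = json.loads(path.read_text())
    except (OSError, ValueError):
        return floor  # assumption: a missing or unreadable file is treated as stale
    if age > STALE_AFTER_SECONDS:
        return floor
    return float(data.get("pressure", floor))  # "pressure" key is hypothetical


async def claude_pressure(path: Path) -> float:
    # asyncio.to_thread (Python 3.9+) runs the blocking read in a worker thread,
    # so the event loop stays free while the filesystem is touched.
    return await asyncio.to_thread(read_pressure, path)


with tempfile.TemporaryDirectory() as tmp:
    usage = Path(tmp) / "usage.json"
    usage.write_text(json.dumps({"pressure": 0.1}))
    fresh = asyncio.run(claude_pressure(usage))

    two_days_ago = time.time() - 2 * STALE_AFTER_SECONDS
    os.utime(usage, (two_days_ago, two_days_ago))  # backdate the file's mtime
    stale = asyncio.run(claude_pressure(usage))
```

Here fresh reflects the value stored in the file, while stale is replaced by the floor. Returning a non-zero floor means a stale file no longer looks like zero quota pressure.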
About ypollak2/llm-router
Subscription-aware LLM router for Claude Code. Routes tasks to 20+ providers (OpenAI, Gemini, Groq, Ollama, Codex) based on complexity classification, Claude subscription pressure, and cost. Free tasks stay on Claude subscription; expensive tasks fall back to the cheapest capable model. Includes 30 MCP tools, 6 auto-routing hooks, semantic dedup cache, prompt caching, daily spend cap, and a live web dashboard.