This release includes breaking changes for platform teams planning a safe upgrade.
✓ No known CVEs patched in this version
Summary
AI summary: A mixed release dated 2026-06-05 with three bugfixes and one refactor, plus updated tests and verification notes.
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Bugfix | Medium | Dashboard cumulative savings now persist across sessions in SQLite usage.db. (Source: llm_adapter@2026-06-05; confidence: high) | None |
| Bugfix | Medium | Enforce-route deadlock recovery auto-pivots after 4 blocks and corrects threshold messaging. (Source: llm_adapter@2026-06-05; confidence: low) | None |
| Bugfix | Low | Coordination scoring no longer hijacks long substantive prompts; length gate and regex trimmed. (Source: llm_adapter@2026-06-05; confidence: high) | None |
| Refactor | Low | Removed extraneous f-string prefixes in escalation messages with no interpolation. (Source: llm_adapter@2026-06-05; confidence: high) | None |
Full changelog
v10.1.2 — Dashboard persistence + enforce-route deadlock recovery + coordination length-gate (2026-06-05)
Three correctness fixes in the routing/enforcement pipeline. None change shipping APIs; all are surgical hook + session_spend edits.
Fixed
- Dashboard cumulative savings now persist across sessions. `SessionSpend.record_reclaimed()` previously only updated the in-memory `session_spend.json`, so subscription-funded savings (Claude Code Haiku/Sonnet routed via the `subscription` provider) showed up in the per-session "Net preserved" panel and vanished the moment the session ended. The fix appends one row per routed call to the `claude_usage` SQLite table (`~/.llm-router/usage.db`), and `_query_cumulative_savings` in `session-end.py` now UNIONs that table alongside `usage` and `savings_stats` for the today/week/month/lifetime rollup. The query uses `date(timestamp, 'localtime')` on both sides of the WHERE clause so the rollup is correct in the midnight-local-but-not-yet-midnight-UTC window. The write is best-effort: if `usage.db` doesn't exist yet (first run before `cost.py` initializes it), the write is silently skipped, so tracking never crashes the router.
- `enforce-route.py` deadlock recovery: auto-pivot + corrected threshold messaging. When the same MCP tool was blocked 3+ times within 2 minutes, the hook now releases the route-lock and clears the pending tool, breaking would-be infinite loops where the model retried the same blocked call. The block message previously said `/2` while the actual auto-pivot threshold was `/4`; both are now consistent at `/4`, and the message documents the escape valves (`LLM_ROUTER_ENFORCE=off`, the auto-pivot itself). In smart mode, read-only Bash patterns (`ls`, `find`, `git log`, `gh pr view`, …) now pass through for code tasks so the model can investigate before routing, matching the existing Read/Glob/Grep/LS pass-through.
- Coordination scoring no longer hijacks long substantive prompts. The heuristic classifier was scoring `coordination` for multi-sentence prompts that happened to contain common English words like "continue", "run", "test", "verify", "check". In one real-world misfire, a RouterArena optimization prompt was routed to `qwen2.5:7b`, which hallucinated a numpy/cProfile answer unrelated to the input. Two surgical changes: (1) `COORDINATION_MAX_LEN = 150` forces the coordination score to zero for any prompt over 150 characters in `score_categories`. Coordination prompts are short by nature (`"y"`, `"yes proceed"`, `"push to main"`); long prompts cannot be coordination regardless of which short coordination words they contain. (2) The coordination/intent regex was trimmed to strong git/deploy verbs (`push`, `pull`, `deploy`, `release`, `publish`, `commit`, `merge`, `sync`, `fetch`, `rebase`) plus short ack tokens (`yes`, `ok`, `y`, `n`, `go ahead`), removing the false-firing common words. The cache layer was cleared as a suspect during diagnosis: it already SHA-256s the full prompt and is keyed correctly; the misfire was fresh Ollama inference, not stale cache.
- Lint cleanup. Removed extraneous f-string prefixes in escalation messages that had no interpolation.
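The auto-pivot described in the enforce-route fix can be pictured as a per-tool sliding-window counter. The following is a minimal sketch under stated assumptions, not the hook's actual code: `BlockTracker`, `record_block`, the tool name `llm_query`, and the message wording are hypothetical. The threshold of 4 is taken from the `/4` in the corrected block message, and the 120-second window from "within 2 minutes".

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 120   # "within 2 minutes", per the release notes
PIVOT_THRESHOLD = 4    # assumed from the "/4" shown in the block message


class BlockTracker:
    """Hypothetical stand-in for the hook's block bookkeeping."""

    def __init__(self, threshold=PIVOT_THRESHOLD, window=WINDOW_SECONDS):
        self.threshold = threshold
        self.window = window
        self._blocks = defaultdict(deque)  # tool name -> timestamps of recent blocks
        self.route_locked = True
        self.pending_tool = None

    def record_block(self, tool, now=None):
        """Record one blocked call and return the message shown to the model."""
        now = time.monotonic() if now is None else now
        stamps = self._blocks[tool]
        stamps.append(now)
        # Forget blocks that fell out of the sliding window.
        while stamps and now - stamps[0] > self.window:
            stamps.popleft()
        if len(stamps) >= self.threshold:
            # Auto-pivot: release the lock and clear the pending tool so the
            # model stops retrying the same blocked call.
            self.route_locked = False
            self.pending_tool = None
            stamps.clear()
            return "auto-pivot: route lock released"
        self.pending_tool = tool
        return (
            f"blocked {len(stamps)}/{self.threshold} "
            "(set LLM_ROUTER_ENFORCE=off to disable enforcement)"
        )


tracker = BlockTracker()
for t in (0, 10, 20, 30):
    print(tracker.record_block("llm_query", now=t))
```

Keeping the counter and the message on the same `threshold` value is what prevents the kind of `/2` versus `/4` mismatch the fix corrected.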
Tests
- `tests/test_auto_route_signals.py` (19 tests, all passing): length-gate behavior, previously-misfired prompts no longer score coordination, legitimate short git prompts still win coordination, substantive prompts still classify as code/analyze/generate, end-to-end `classify_prompt` with LLM classifiers disabled.
- 35 cost tests pass; 52 enforce-route tests pass.
- Full suite: 2287 / 2287 pass.
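The length-gate behavior these tests cover can be illustrated with a small sketch. It is an assumption-laden approximation rather than the project's classifier: `COORDINATION_MAX_LEN` and the verb/ack list come from the changelog, while `coordination_score`, the exact regex, and the count-based scoring are hypothetical.

```python
import re

COORDINATION_MAX_LEN = 150  # value stated in the changelog

# Strong git/deploy verbs plus short ack tokens, as listed in the changelog.
# The exact pattern used by the project is not shown there; this one is assumed.
COORDINATION_RE = re.compile(
    r"\b(push|pull|deploy|release|publish|commit|merge|sync|fetch|rebase"
    r"|yes|ok|y|n|go ahead)\b",
    re.IGNORECASE,
)


def coordination_score(prompt: str) -> int:
    """Return a hypothetical coordination score: 0 past the length gate,
    otherwise the number of coordination tokens found."""
    if len(prompt) > COORDINATION_MAX_LEN:
        return 0  # long prompts cannot be coordination
    return len(COORDINATION_RE.findall(prompt))


short_prompts = ["y", "yes proceed", "push to main", "continue and run the tests"]
long_prompt = (
    "Continue optimizing the RouterArena benchmark, then run, test, verify and "
    "check each stage before we push or deploy anything. "
) * 2

for p in short_prompts + [long_prompt]:
    print(len(p), coordination_score(p))
```

The gate runs before the regex, so a long prompt scores zero even when it contains strong verbs such as `push` or `deploy`.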
Verification
- Dashboard end-to-end verified by direct INSERT into `claude_usage` then re-querying `_query_cumulative_savings`: the new row surfaces in today/week/month/lifetime totals with correct localtime handling.
- Enforce-route deadlock recovery verified against 3-blocks-in-2-min trace (auto-pivot fires, lock releases, pending cleared).
- Coordination misfire verified against the original RouterArena prompt: pre-fix `coordination: 13` (winner) → post-fix `coordination: 0` (length gate) → `code: 2` wins.
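The dashboard check above (insert a row, then re-query the rollup) can be reproduced in miniature with Python's `sqlite3` module. This is a sketch under assumptions: the column names (`timestamp`, `saved_usd`), the helpers `init_db`, `record_reclaimed`, and `savings_today`, and the use of `UNION ALL` are illustrative choices, not the project's actual schema or query. Only the table names, the best-effort write, and the `date(timestamp, 'localtime')` comparison come from the changelog.

```python
import os
import sqlite3
from contextlib import closing

TABLES = ("usage", "savings_stats", "claude_usage")  # table names from the changelog


def init_db(db_path: str) -> None:
    """Hypothetical stand-in for the initialization the changelog attributes to cost.py."""
    with closing(sqlite3.connect(db_path)) as conn:
        for table in TABLES:
            conn.execute(
                f"CREATE TABLE IF NOT EXISTS {table} (timestamp TEXT, saved_usd REAL)"
            )
        conn.commit()


def record_reclaimed(db_path: str, saved_usd: float) -> bool:
    """Best-effort append: skip silently if the DB is missing, never raise."""
    if not os.path.exists(db_path):
        return False
    try:
        with closing(sqlite3.connect(db_path)) as conn:
            conn.execute(
                "INSERT INTO claude_usage (timestamp, saved_usd) "
                "VALUES (datetime('now'), ?)",  # stored in UTC
                (saved_usd,),
            )
            conn.commit()
        return True
    except sqlite3.Error:
        return False


def savings_today(db_path: str) -> float:
    """Sum today's savings across all three tables, comparing local dates."""
    query = """
        SELECT COALESCE(SUM(saved_usd), 0.0) FROM (
            SELECT timestamp, saved_usd FROM usage
            UNION ALL
            SELECT timestamp, saved_usd FROM savings_stats
            UNION ALL
            SELECT timestamp, saved_usd FROM claude_usage
        )
        WHERE date(timestamp, 'localtime') = date('now', 'localtime')
    """
    with closing(sqlite3.connect(db_path)) as conn:
        (total,) = conn.execute(query).fetchone()
    return total
```

Converting both sides to local dates matters because timestamps are stored in UTC: comparing a UTC date against a local "today" would drop or double-count rows in the hours where the two calendars disagree.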
About ypollak2/llm-router
Subscription-aware LLM router for Claude Code. Routes tasks to 20+ providers (OpenAI, Gemini, Groq, Ollama, Codex) based on complexity classification, Claude subscription pressure, and cost. Free tasks stay on Claude subscription; expensive tasks fall back to the cheapest capable model. Includes 30 MCP tools, 6 auto-routing hooks, semantic dedup cache, prompt caching, daily spend cap, and a live web dashboard.