
ypollak2/llm-router

v9.4.0 Breaking

This release includes 1 breaking change for platform teams planning a safe upgrade.

Published 4d LLM Frameworks
✓ No known CVEs patched

Topics

ai-routing anthropic claude claude-code cost-optimization gemini litellm llm llm-router mcp-server model-router ollama openai

ReleasePort's take

Light signal
editorial:auto 4d

The release introduces several bugfixes and enhancements to usage tracking, adds new documentation and sponsorship features, and expands the test suite.

Why it matters: The bugfixes make savings logging accurate across routed calls and in the statusline and dashboard; deprecating the orphan `llm_usage.db` leaves `usage.db` as the single usage store. The README updates improve discoverability, and 24 new tests bring the full suite to 1977 passing.

Summary

AI summary

A mixed release: four savings-tracking bugfixes, README and sponsorship additions, one manual migration step, and 24 new tests.

Changes in this release

Feature Low

README now hoists install CTA above the fold with tagline and collapsible Table of Contents covering 17 sections.

Source: llm_adapter@2026-05-31

Confidence: high

Feature Low

`.github/FUNDING.yml` added to display GitHub sponsor button for `ypollak2`.

Source: llm_adapter@2026-05-31

Confidence: high

Deprecation Low

Orphan `~/.llm-router/llm_usage.db` file is deprecated and can be safely deleted.

Source: llm_adapter@2026-05-31

Confidence: high

Bugfix Medium

`cc-usage-track.py` redirects writes from orphan `llm_usage.db` to canonical `usage.db`.

Source: llm_adapter@2026-05-31

Confidence: high

Bugfix Medium

Claude Code statusline shows live savings instead of always displaying `$0.00`.

Source: llm_adapter@2026-05-31

Confidence: high

Bugfix Medium

DIRECT routing savings now persist live via new llm_router.hooks.savings_logger module.

Source: llm_adapter@2026-05-31

Confidence: low

Bugfix Medium

`usage` table now records baseline_model, potential_cost_usd, and saved_usd columns on each routed call.

Source: llm_adapter@2026-05-31

Confidence: low

Refactor Low

+24 new tests added across multiple test modules; full suite passes 1977 tests.

Source: llm_adapter@2026-05-31

Confidence: high

Full changelog

Fixed

  • DIRECT routing savings now persist live: auto-route.py was answering prompts via direct_executor (Ollama / Gemini / OpenAI) without writing any record. session-end.py's _sync_import_savings_log() had nothing to flush, so any session that relied entirely on DIRECT routing showed $0.00 saved in the dashboard. New llm_router.hooks.savings_logger module appends one JSONL record per successful DIRECT execution; auto-route.py calls it fire-and-forget after DIRECT SUCCESS.
  • usage table now records baseline_model / potential_cost_usd / saved_usd — these columns existed since v9.2.2 but log_usage's INSERT never populated them. Every routed call appeared to save nothing. The savings math (_claude_cost, _get_baseline_for_task) was already in cost.py — this release wires it into the write path with the cache-aware 4-component formula.
  • cc-usage-track.py redirected from orphan llm_usage.db to canonical usage.db — this hook was the only remaining writer of a stub DB that nothing else read. Every Agent subagent call landed in the orphan, invisible to the dashboard. Now writes to the full schema with baseline + savings columns populated. Baseline picker: Explore / general-purpose → Haiku, everything else → Sonnet.
  • Claude Code statusline shows live savings instead of $0.00: statusline-command.sh only read the usage table with a hardcoded Opus baseline, so sessions driven by DIRECT routing showed nothing (DIRECT writes land in savings_log.jsonl and don't reach usage until session END). Now prefers the new saved_usd column when populated, falls back to the legacy Opus math for upgrader rows, and adds today's un-flushed savings_log.jsonl records to the live total.
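
The fixes above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's actual code: the function names (`append_savings_record`, `pick_baseline`, `live_total`), the JSONL record fields, and the `legacy_saving` callback are all assumptions made for the example. Only the behaviour (one JSONL line per DIRECT success, the Haiku/Sonnet baseline picker, and the prefer-`saved_usd`-then-fall-back total) comes from the changelog. For brevity it sums every record in the log rather than only today's.

```python
import json
import time
from pathlib import Path


def append_savings_record(log_path: Path, model: str, saved_usd: float) -> None:
    """Append one JSON object per line; swallow I/O errors (fire-and-forget)."""
    # Record fields are hypothetical; the real schema is not given in the notes.
    record = {"ts": time.time(), "model": model, "saved_usd": saved_usd}
    try:
        log_path.parent.mkdir(parents=True, exist_ok=True)
        with log_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
    except OSError:
        pass  # logging must never block or fail the routed response


def pick_baseline(agent_type: str) -> str:
    """Explore / general-purpose map to Haiku, everything else to Sonnet."""
    return "haiku" if agent_type in {"Explore", "general-purpose"} else "sonnet"


def live_total(usage_rows, log_path: Path, legacy_saving) -> float:
    """Sum savings from usage rows plus un-flushed JSONL records."""
    total = 0.0
    for row in usage_rows:
        saved = row.get("saved_usd") or 0.0
        # Prefer the populated column; otherwise use the legacy estimate
        # (hypothetical callback standing in for the old Opus-baseline math).
        total += saved if saved > 0 else legacy_saving(row)
    if log_path.exists():
        for line in log_path.read_text(encoding="utf-8").splitlines():
            try:
                total += float(json.loads(line).get("saved_usd", 0.0))
            except (ValueError, AttributeError, TypeError):
                continue  # skip malformed lines rather than break the display
    return total
```

Swallowing `OSError` in the writer reflects the "fire-and-forget" wording: a failed log write costs one savings record, never the user's answer.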

Added

  • README: install CTA hoisted above the fold (pip install llm-routing block + "Works with Claude Code, Codex, Gemini CLI — no API keys required on Claude Pro/Max" tagline), collapsible Table of Contents covering 17 sections, Star History chart, Activity section embedding the Repobeats weekly contribution heatmap, GitHub Discussions badge in the header and footer.
  • .github/FUNDING.yml so the GitHub sponsor button shows on the repo page (sponsor: ypollak2).

Migration

  • Users with an existing ~/.llm-router/llm_usage.db file can safely delete it manually — nothing reads or writes to it anymore:

    rm ~/.llm-router/llm_usage.db
    
  • Historical usage table rows keep potential_cost_usd = saved_usd = 0.0 (no retroactive backfill). Only INSERTs after upgrading benefit from the new accurate baseline math.
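
For scripted upgrades, both migration notes can be checked from Python. This is a minimal sketch with hypothetical helper names (`remove_orphan_db`, `count_unbackfilled_rows`). The `usage` table and `saved_usd` column come from the changelog, but the rest of the schema and the location of `usage.db` are not given here, so the caller supplies the SQLite connection.

```python
import sqlite3
from pathlib import Path


def remove_orphan_db(home: Path) -> bool:
    """Delete <home>/.llm-router/llm_usage.db if present; True if removed."""
    orphan = home / ".llm-router" / "llm_usage.db"
    if orphan.is_file():
        orphan.unlink()
        return True
    return False


def count_unbackfilled_rows(conn: sqlite3.Connection) -> int:
    """Count usage rows whose saved_usd is still 0.0 (no retroactive backfill)."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM usage WHERE saved_usd = 0.0"
    ).fetchone()
    return count
```

Calling `remove_orphan_db(Path.home())` is equivalent to the `rm` command above, except that it does nothing when the file is already gone. A non-zero result from `count_unbackfilled_rows` after upgrading is expected for historical rows and does not indicate a bug.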

Tests

  • +24 tests across tests/test_savings_logger.py, tests/test_cost.py, tests/test_cc_usage_track.py, tests/test_statusline_savings.py.
  • Full suite: 1977 passed, 0 failed.

Breaking Changes

  • The orphan `~/.llm-router/llm_usage.db` file is no longer read or written by any component; users should delete it manually.

About ypollak2/llm-router

Subscription-aware LLM router for Claude Code. Routes tasks to 20+ providers (OpenAI, Gemini, Groq, Ollama, Codex) based on complexity classification, Claude subscription pressure, and cost. Free tasks stay on Claude subscription; expensive tasks fall back to the cheapest capable model. Includes 30 MCP tools, 6 auto-routing hooks, semantic dedup cache, prompt caching, daily spend cap, and a live web dashboard.

Related context

Earlier breaking changes

  • v9.2.0 Changes auto‑route directive from advisory "DO NOT SKIP" to hard constraint with explicit blocked tools list.
  • v9.2.0 Breaks permanent downgrade of enforcement after first Edit/Write; v13 now requires per‑turn routing.
