ypollak2/llm-router

v9.0.3 Feature

This release adds 3 notable features for engineering teams evaluating rollout.

Published 2mo · LLM Frameworks

✓ No known CVEs patched

Topics

ai-routing anthropic claude claude-code cost-optimization gemini litellm llm llm-router mcp-server model-router ollama openai

Summary

AI summary

Added Direct Execution hook, Zero-Token Routing to Claude, and an Ollama Agent Loop with file tools.

Changes in this release

Type	Severity	Summary	Source	Confidence	CVE
Feature
Feature	Medium	Adds a Direct Execution hook that routes queries directly to Ollama/Gemini/OpenAI.	llm_adapter@2026-05-25	high	None
Feature	Medium	Adds Zero-Token Routing: simple prompts return a block decision to Claude, consuming 0 subscription tokens.	llm_adapter@2026-05-25	high	None
Feature	Medium	Adds an Ollama Agent Loop that provides file read/write/edit/search tools locally.	llm_adapter@2026-05-25	high	None
Feature	Medium	Adds 5-zone pressure-aware monitoring for more granular downshifting.	llm_adapter@2026-05-25	low	None
Bugfix
Bugfix	Medium	Fixes Gemini models being hidden even when available via CLI.	llm_adapter@2026-05-25	high	None
Bugfix	Medium	Ensures Codex is tried first in the chain during extreme quota pressure.	llm_adapter@2026-05-25	high	None
Bugfix	Medium	Improves test isolation by clearing subscription flags and using sys.executable for subprocess calls.	llm_adapter@2026-05-25	low	None
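
The test-isolation fix in the last row can be illustrated with a minimal sketch. It is not the project's actual test code: the flag names in `SUBSCRIPTION_FLAGS` and the helpers `clean_env` and `run_python` are hypothetical, since the release notes do not name the real subscription flags. Only `sys.executable` and `subprocess.run` come from the standard library.

```python
import os
import subprocess
import sys

# Hypothetical flag names; the release notes do not name the real ones.
SUBSCRIPTION_FLAGS = ("EXAMPLE_CLAUDE_SUBSCRIPTION", "EXAMPLE_GEMINI_SUBSCRIPTION")


def clean_env(base=None):
    """Return a copy of the environment with subscription flags removed."""
    env = dict(os.environ if base is None else base)
    for name in SUBSCRIPTION_FLAGS:
        env.pop(name, None)
    return env


def run_python(code, env=None):
    """Run a snippet with the same interpreter that is running the tests."""
    result = subprocess.run(
        [sys.executable, "-c", code],  # avoids picking up a different "python" on PATH
        env=clean_env(env),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Using `sys.executable` keeps the child process on the interpreter (and virtual environment) that runs the test suite, and stripping the flags stops a developer's local subscription settings from leaking into test results.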

Full changelog

v9.0.3 — Zero-Claude Direct Execution & Mini-Agent Loop (2026-05-24)

Added

Direct Execution — UserPromptSubmit hook can now route queries directly to Ollama/Gemini/OpenAI.
Zero-Token Routing — simple prompts return {"decision": "block"} to Claude, consuming 0 subscription tokens.
Ollama Agent Loop — Ollama now has access to basic file tools (read, write, edit, search) via a local agent loop for simple file-op tasks.
Pressure-Aware Chains — new 5-zone pressure monitoring (green to critical) for more granular downshifting.
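
The Direct Execution and Zero-Token Routing entries above can be sketched as a small hook handler. This is an illustration under stated assumptions, not llm-router's implementation: it assumes the hook receives a JSON event with a `prompt` field and may reply with a `reason` alongside the `{"decision": "block"}` named in the changelog. The functions `is_simple`, `answer_locally`, and `handle` are hypothetical, and the provider call is a stub.

```python
import json
import sys


def is_simple(prompt: str) -> bool:
    """Hypothetical heuristic: short, single-line prompts count as simple."""
    text = prompt.strip()
    return 0 < len(text) <= 200 and "\n" not in text


def answer_locally(prompt: str) -> str:
    """Stub standing in for a call to Ollama, Gemini, or OpenAI."""
    return f"[local model answer for: {prompt.strip()}]"


def handle(event: dict) -> dict:
    """Build the hook response; an empty dict lets the prompt continue to Claude."""
    prompt = event.get("prompt", "")  # field name is an assumption
    if is_simple(prompt):
        # Blocking the prompt means Claude never processes it,
        # so no subscription tokens are consumed.
        return {"decision": "block", "reason": answer_locally(prompt)}
    return {}


def main(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Read one hook event as JSON and write the response, if any."""
    response = handle(json.load(stdin))
    if response:
        json.dump(response, stdout)
```

A hook script built this way would call `main()` as its last line. Complex prompts produce no output, so they pass through to Claude unchanged.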

Fixed

Gemini Subscription Mode — fixed Gemini models being hidden even when available via CLI.
Test Isolation — tests now explicitly clear subscription flags and use sys.executable for subprocess calls, improving reliability across different developer environments.
Codex Priority — ensured Codex is tried at the absolute front of the chain during extreme quota pressure.
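
How the 5-zone monitoring and the Codex Priority fix might interact can be sketched as follows. The changelog only names the end zones (green and critical), so the intermediate zone names, the thresholds, and the functions `zone_for` and `order_chain` are hypothetical. Treating "extreme quota pressure" as the critical zone is also an assumption.

```python
from enum import IntEnum


class Zone(IntEnum):
    GREEN = 0
    YELLOW = 1    # hypothetical name
    ORANGE = 2    # hypothetical name
    RED = 3       # hypothetical name
    CRITICAL = 4


# Hypothetical lower bounds on the fraction of quota used, highest zone first.
THRESHOLDS = [
    (0.95, Zone.CRITICAL),
    (0.85, Zone.RED),
    (0.70, Zone.ORANGE),
    (0.50, Zone.YELLOW),
]


def zone_for(used_fraction: float) -> Zone:
    """Map quota usage to the highest zone whose lower bound it reaches."""
    for lower_bound, zone in THRESHOLDS:
        if used_fraction >= lower_bound:
            return zone
    return Zone.GREEN


def order_chain(chain: list[str], zone: Zone) -> list[str]:
    """Move Codex to the front when pressure is critical; otherwise keep the order."""
    if zone is Zone.CRITICAL and "codex" in chain:
        return ["codex"] + [p for p in chain if p != "codex"]
    return list(chain)
```

For example, `order_chain(["claude", "gemini", "codex"], zone_for(0.99))` puts `"codex"` first, while the same chain at 60% usage is returned unchanged.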

About ypollak2/llm-router

Subscription-aware LLM router for Claude Code. Routes tasks to 20+ providers (OpenAI, Gemini, Groq, Ollama, Codex) based on complexity classification, Claude subscription pressure, and cost. Free tasks stay on Claude subscription; expensive tasks fall back to the cheapest capable model. Includes 30 MCP tools, 6 auto-routing hooks, semantic dedup cache, prompt caching, daily spend cap, and a live web dashboard.
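
The routing rule described above (stay on the subscription while possible, otherwise fall back to the cheapest capable model) can be sketched like this. The catalogue entries, capability levels, cost figures, and the `route` function are placeholders invented for illustration; they are not llm-router's model list, pricing, or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    capability: int        # highest complexity level the model can handle
    cost_per_task: float   # illustrative cost units, not real pricing


# Hypothetical catalogue; names and numbers are placeholders.
CATALOGUE = [
    Model("local-small", capability=1, cost_per_task=0.0),
    Model("hosted-medium", capability=2, cost_per_task=0.4),
    Model("hosted-large", capability=3, cost_per_task=2.0),
]


def route(complexity: int, subscription_has_headroom: bool) -> str:
    """Stay on the subscription while it has headroom, else pick the cheapest capable model."""
    if subscription_has_headroom:
        return "claude-subscription"
    capable = [m for m in CATALOGUE if m.capability >= complexity]
    if not capable:
        raise ValueError(f"no model can handle complexity {complexity}")
    return min(capable, key=lambda m: m.cost_per_task).name
```

Filtering by capability before minimising cost keeps a cheap but underpowered model from being chosen for a task it cannot handle.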

Related context

Earlier breaking changes

v9.2.0 Changes the auto-route directive from an advisory "DO NOT SKIP" to a hard constraint with an explicit blocked-tools list.
v9.2.0 Breaks permanent downgrade of enforcement after first Edit/Write; v13 now requires per‑turn routing.