
ypollak2/llm-router

v9.0.3 Feature

This release adds four features and three fixes that matter to engineering teams evaluating rollout.

Published 10d LLM Frameworks
✓ No known CVEs patched

Topics

ai-routing, anthropic, claude, claude-code, cost-optimization, gemini, litellm, llm, llm-router, mcp-server, model-router, ollama, openai

Summary

AI summary

Adds a Direct Execution hook, Zero-Token Routing for simple prompts, an Ollama Agent Loop with file tools, and 5-zone pressure monitoring.

Changes in this release

Feature Medium

Adds Direct Execution hook to route queries to Ollama/Gemini/OpenAI.


Source: llm_adapter@2026-05-25

Confidence: high

Feature Medium

Adds Zero-Token Routing: simple prompts return a block decision to Claude, consuming 0 subscription tokens.


Source: llm_adapter@2026-05-25

Confidence: high

Feature Medium

Adds Ollama Agent Loop providing file read/write/edit/search tools locally.


Source: llm_adapter@2026-05-25

Confidence: high
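The agent loop described in this entry can be pictured as a dispatch table of file tools driven by a model that requests one tool call per step. The sketch below is a guess at the general shape, not the project's code: the tool names follow the read/write/edit/search list in the release notes, while `run_agent`, `next_call`, the call format, and the step budget are hypothetical. A stub callable stands in for the local model.

```python
from pathlib import Path
from typing import Callable, Optional


def read_file(path: str) -> str:
    return Path(path).read_text(encoding="utf-8")


def write_file(path: str, content: str) -> str:
    Path(path).write_text(content, encoding="utf-8")
    return f"wrote {len(content)} characters"


def edit_file(path: str, old: str, new: str) -> str:
    text = read_file(path)
    if old not in text:
        return "no match"
    write_file(path, text.replace(old, new, 1))  # replace first occurrence only
    return "edited"


def search_file(path: str, needle: str) -> str:
    lines = read_file(path).splitlines()
    hits = [f"{n}: {line}" for n, line in enumerate(lines, 1) if needle in line]
    return "\n".join(hits) or "no match"


# Hypothetical tool registry keyed by the names the model is allowed to request.
TOOLS: dict[str, Callable[..., str]] = {
    "read": read_file,
    "write": write_file,
    "edit": edit_file,
    "search": search_file,
}


def run_agent(next_call: Callable[[list[str]], Optional[dict]],
              max_steps: int = 8) -> list[str]:
    """Request tool calls until the model returns None or the budget runs out."""
    results: list[str] = []
    for _ in range(max_steps):
        call = next_call(results)  # stand-in for a chat request to a local model
        if call is None:
            break
        tool = TOOLS.get(call["tool"])
        if tool is None:
            results.append(f"unknown tool: {call['tool']}")
        else:
            results.append(tool(**call["args"]))
    return results
```

Capping the number of steps keeps a confused model from looping forever, and an unknown tool name is reported back as a result instead of raising, so the model can correct itself on the next step.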

Feature Medium

Adds 5-zone pressure‑aware monitoring for granular downshifting.


Source: llm_adapter@2026-05-25

Confidence: low
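One way to read "5-zone" monitoring is as a mapping from quota usage to a named zone. The sketch below assumes that reading. The changelog names only the end zones (green and critical); the three middle names, the thresholds, and the `pressure_zone` function are illustrative assumptions.

```python
from bisect import bisect_right

# Only "green" and "critical" come from the changelog; the middle zone
# names and all thresholds below are assumptions made for illustration.
ZONES = ["green", "yellow", "orange", "red", "critical"]
UPPER_BOUNDS = [0.50, 0.70, 0.85, 0.95]  # exclusive upper bound of the first four zones


def pressure_zone(usage: float) -> str:
    """Map a quota usage fraction (0.0 to 1.0) to one of five zones."""
    clamped = min(max(usage, 0.0), 1.0)
    return ZONES[bisect_right(UPPER_BOUNDS, clamped)]


for sample in (0.10, 0.60, 0.80, 0.90, 0.99):
    print(sample, pressure_zone(sample))
```

With five zones instead of a single threshold, a router can step down one tier at a time as usage climbs, which is presumably what "granular downshifting" refers to.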

Bugfix Medium

Fixes Gemini models being hidden when available via CLI.


Source: llm_adapter@2026-05-25

Confidence: high

Bugfix Medium

Ensures Codex is tried first in the chain during extreme quota pressure.


Source: llm_adapter@2026-05-25

Confidence: high
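The fix amounts to reordering the fallback chain when pressure is at its highest. A minimal sketch, assuming that "extreme quota pressure" corresponds to a zone called critical and that providers are identified by plain strings; `order_chain` and the provider names are hypothetical:

```python
def order_chain(chain: list[str], zone: str) -> list[str]:
    """Return the provider chain, with Codex moved to the front under critical pressure."""
    if zone == "critical" and "codex" in chain:
        # Keep the relative order of every other provider unchanged.
        return ["codex"] + [name for name in chain if name != "codex"]
    return list(chain)


print(order_chain(["claude", "gemini", "codex", "ollama"], "critical"))
```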

Bugfix Medium

Improves test isolation by clearing subscription flags and using sys.executable for subprocess calls.


Source: llm_adapter@2026-05-25

Confidence: low

Full changelog

v9.0.3 — Zero-Claude Direct Execution & Mini-Agent Loop (2026-05-24)

Added

  • Direct Execution — UserPromptSubmit hook can now route queries directly to Ollama/Gemini/OpenAI.
  • Zero-Token Routing — simple prompts return {"decision": "block"} to Claude, consuming 0 subscription tokens.
  • Ollama Agent Loop — Ollama now has access to basic file tools (read, write, edit, search) via a local agent loop for simple file-op tasks.
  • Pressure-Aware Chains — new 5-zone pressure monitoring (green to critical) for more granular downshifting.
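The first two entries above can be sketched together as a UserPromptSubmit hook handler. This is an illustration under stated assumptions, not the project's implementation: only the `{"decision": "block"}` payload comes from the changelog. The `prompt` input field, the use of `reason` to carry the answer, the 12-word threshold, and the helpers `is_simple_prompt` and `answer_locally` are all hypothetical.

```python
import json


def is_simple_prompt(prompt: str, max_words: int = 12) -> bool:
    """Hypothetical complexity check: short, single-line prompts count as simple."""
    return "\n" not in prompt.strip() and len(prompt.split()) <= max_words


def answer_locally(prompt: str) -> str:
    """Placeholder for a call to Ollama/Gemini/OpenAI; returns a canned string here."""
    return f"[local model answer to: {prompt.strip()}]"


def handle_hook(payload: dict) -> dict:
    """Build the hook's JSON response for one submitted prompt."""
    prompt = payload.get("prompt", "")
    if prompt.strip() and is_simple_prompt(prompt):
        # Blocking keeps the prompt from reaching Claude, so no subscription
        # tokens are spent; the locally produced answer travels in "reason".
        return {"decision": "block", "reason": answer_locally(prompt)}
    # An empty response lets the prompt continue to Claude unchanged.
    return {}


print(json.dumps(handle_hook({"prompt": "what is a mutex?"})))
```

In a real hook the payload would be read from stdin and the response written to stdout; the demo call at the end passes a dictionary directly so the routing decision is easy to inspect.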

Fixed

  • Gemini Subscription Mode — fixed Gemini models being hidden even when available via CLI.
  • Test Isolation — tests now explicitly clear subscription flags and use sys.executable for subprocess calls, improving reliability across different developer environments.
  • Codex Priority — ensured Codex is tried at the absolute front of the chain during extreme quota pressure.
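The Test Isolation fix combines two habits: strip subscription flags from the environment, and launch child processes with `sys.executable` so they use the interpreter that is running the tests, not whatever `python` resolves to on PATH. A minimal sketch; the flag names are hypothetical because the changelog does not list the real variables.

```python
import os
import subprocess
import sys

# Hypothetical flag names; the real variables are not named in the changelog.
SUBSCRIPTION_FLAGS = (
    "LLM_ROUTER_CLAUDE_SUBSCRIPTION",
    "LLM_ROUTER_GEMINI_SUBSCRIPTION",
)


def isolated_env() -> dict[str, str]:
    """Copy the current environment with the subscription flags removed."""
    return {k: v for k, v in os.environ.items() if k not in SUBSCRIPTION_FLAGS}


def run_python(code: str) -> str:
    """Run a snippet in a child process using the current interpreter."""
    # sys.executable avoids picking up a different "python" from PATH,
    # for example a system interpreter outside the active virtualenv.
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=isolated_env(),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

A test can then call `run_python(...)` and get the same result whether or not the developer's shell happens to export one of the flags.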


About ypollak2/llm-router

Subscription-aware LLM router for Claude Code. Routes tasks to 20+ providers (OpenAI, Gemini, Groq, Ollama, Codex) based on complexity classification, Claude subscription pressure, and cost. Free tasks stay on Claude subscription; expensive tasks fall back to the cheapest capable model. Includes 30 MCP tools, 6 auto-routing hooks, semantic dedup cache, prompt caching, daily spend cap, and a live web dashboard.


Related context

Earlier breaking changes

  • v9.2.0 Changes auto‑route directive from advisory "DO NOT SKIP" to hard constraint with explicit blocked tools list.
  • v9.2.0 Breaks permanent downgrade of enforcement after first Edit/Write; v13 now requires per‑turn routing.
