This release adds 3 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Topics
+7 more
Summary
AI summaryAdded Direct Execution hook, Zero-Token Routing to Claude, and an Ollama Agent Loop with file tools.
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Feature | Medium |
Adds Direct Execution hook to route queries to Ollama/Gemini/OpenAI. Adds Direct Execution hook to route queries to Ollama/Gemini/OpenAI. Source: llm_adapter@2026-05-25 Confidence: high |
— |
| Feature | Medium |
Adds Zero-Token Routing returning block decision to Claude with 0 token consumption. Adds Zero-Token Routing returning block decision to Claude with 0 token consumption. Source: llm_adapter@2026-05-25 Confidence: high |
— |
| Feature | Medium |
Adds Ollama Agent Loop providing file read/write/edit/search tools locally. Adds Ollama Agent Loop providing file read/write/edit/search tools locally. Source: llm_adapter@2026-05-25 Confidence: high |
— |
| Feature | Medium |
Adds 5-zone pressure‑aware monitoring for granular downshifting. Adds 5-zone pressure‑aware monitoring for granular downshifting. Source: llm_adapter@2026-05-25 Confidence: low |
— |
| Bugfix | Medium |
Fixes Gemini models being hidden when available via CLI. Fixes Gemini models being hidden when available via CLI. Source: llm_adapter@2026-05-25 Confidence: high |
— |
| Bugfix | Medium |
Ensures Codex is tried first in the chain during extreme quota pressure. Ensures Codex is tried first in the chain during extreme quota pressure. Source: llm_adapter@2026-05-25 Confidence: high |
— |
| Bugfix | Medium |
Improves test isolation by clearing subscription flags and using sys.executable for subprocess calls. Improves test isolation by clearing subscription flags and using sys.executable for subprocess calls. Source: llm_adapter@2026-05-25 Confidence: low |
— |
Full changelog
v9.0.3 — Zero-Claude Direct Execution & Mini-Agent Loop (2026-05-24)
Added
- Direct Execution —
UserPromptSubmithook can now route queries directly to Ollama/Gemini/OpenAI. - Zero-Token Routing — simple prompts return
{"decision": "block"}to Claude, consuming 0 subscription tokens. - Ollama Agent Loop — Ollama now has access to basic file tools (read, write, edit, search) via a local agent loop for simple file-op tasks.
- Pressure-Aware Chains — new 5-zone pressure monitoring (green to critical) for more granular downshifting.
Fixed
- Gemini Subscription Mode — fixed Gemini models being hidden even when available via CLI.
- Test Isolation — tests now explicitly clear subscription flags and use
sys.executablefor subprocess calls, improving reliability across different developer environments. - Codex Priority — ensured Codex is tried at the absolute front of the chain during extreme quota pressure.
Weekly OSS security release digest.
The CVE patches and breaking changes that affected production tools this week. One email, every Sunday.
No spam, unsubscribe anytime.
Share this release
About ypollak2/llm-router
Subscription-aware LLM router for Claude Code. Routes tasks to 20+ providers (OpenAI, Gemini, Groq, Ollama, Codex) based on complexity classification, Claude subscription pressure, and cost. Free tasks stay on Claude subscription; expensive tasks fall back to the cheapest capable model. Includes 30 MCP tools, 6 auto-routing hooks, semantic dedup cache, prompt caching, daily spend cap, and a live web dashboard.
Related context
Related tools
Beta — feedback welcome: [email protected]