This release adds 3 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
ReleasePort's take
Light signal: Release v0.7.8 introduces a configurable proxy with per-route timeouts and a ModelRouter skeleton.
Why it matters: Test the new configurable proxy and per‑route timeout settings in development before deploying to production.
Summary
AI summary: Added a configurable proxy with per-route timeouts and a ModelRouter skeleton.
Changes in this release
| Type | Severity | Summary | CVE | Confidence |
|---|---|---|---|---|
| Feature | Medium | configurable proxy + per-route/backend timeouts | None | high |
| Feature | Medium | external provider URL defaults + cluster-wide LiteLLM URL | None | high |
| Feature | Medium | Helm packaging, sample manifest, and concept doc for ModelRouter | None | high |
| Feature | Medium | ModelRouterReconciler skeleton with spec validation | None | high |
| Feature | Medium | reconcile router-proxy Deployment, Service, and ConfigMap | None | high |
| Feature | Medium | router-proxy binary with OpenAI streaming passthrough | None | high |
| Feature | Medium | router-proxy cluster e2e + runtime fail-closed 503 | None | high |
| Feature | Medium | scaffold ModelRouter CRD types and deepcopy | None | high |
| Bugfix | Medium | close cloud-tier conns + drop local idle timeout | None | high |
| Bugfix | Medium | don't quarantine backends on per-attempt context deadline | None | high |
| Bugfix | Medium | unblock MicroShift SCC diagnostics + bump bootstrap timeout | None | high |
| Bugfix | Medium | half-open circuit breaker on proxy + scale-to-zero status | None | high |
| Bugfix | Medium | preserve external annotations on reconciler Deployment updates | None | high |
| Other | Medium | add consumer-hardware model matrix guide | None | low |
| Other | Medium | land ModelRouter prominently in README for the 0.7.8 release | None | low |
| Other | Medium | air-gapped, OpenShift, macOS Metal guides + architecture refresh (Tier 1) | None | low |
| Other | Medium | drop stale "fifteen lines" claim in openshift-install Reference | None | low |

Source for all entries: llm_adapter@2026-05-21
Full changelog
0.7.8 (2026-05-14)
Features
- configurable proxy + per-route/backend timeouts (closes #457, #458) (#461) (03d222a)
- external provider URL defaults + cluster-wide LiteLLM URL (closes #438) (#451) (26cd5ae)
- Helm packaging, sample manifest, and concept doc for ModelRouter (#448) (a513fdc)
- ModelRouterReconciler skeleton with spec validation (#445) (9b1a259)
- reconcile router-proxy Deployment, Service, and ConfigMap (#447) (856ecc3)
- router-proxy binary with OpenAI streaming passthrough (#446) (942d09a)
- router-proxy cluster e2e + runtime fail-closed 503 (closes #430) (#450) (75151fa)
- scaffold ModelRouter CRD types and deepcopy (#442) (e6c60b3)
Bug Fixes
- close cloud-tier conns + drop local idle timeout (closes #459) (#460) (173c26a)
- don't quarantine backends on per-attempt context deadline (closes #462) (#463) (80ef9c8)
- e2e: unblock MicroShift SCC diagnostics + bump bootstrap timeout (#466) (0c793b7)
- half-open circuit breaker on proxy + scale-to-zero status (closes #452, #453) (#454) (ac9302c)
- preserve external annotations on reconciler Deployment updates (#468) (de580c1)
Documentation
- add consumer-hardware model matrix guide (#444) (dd07397)
- readme: land ModelRouter prominently for the 0.7.8 release (#464) (deb24bb)
- site: air-gapped, OpenShift, macOS Metal guides + architecture refresh (Tier 1) (#465) (5996a1e)
- site: drop stale "fifteen lines" claim in openshift-install Reference (#467) (ec52ca8)
About LLMKube
Kubernetes operator for llama.cpp-native LLM inference with GPU scheduling, Apple Silicon Metal support, and OpenAI-compatible API.