This release includes one breaking change that platform teams should review before upgrading.
✓ No known CVEs patched in this version
ReleasePort's take
Moderate signal: In foreman v0.8.1, the requestTimeoutSeconds field now governs a loop-wide budget, and its default shifts from 600 to 3600 seconds.
Why it matters: The change alters timeout behavior for all agents. Operators must review their configurations because the field no longer bounds a single request and its new default is six times the old one.
Summary
AI summary: Updates Bug Fixes, ⚠ BREAKING CHANGES, and Miscellaneous across a mixed release.
Changes in this release
| Type | Severity | Summary | Source | Confidence | CVE |
|---|---|---|---|---|---|
| Breaking | High | foreman: requestTimeoutSeconds now sets loop-wide budget, default changes from 600 to 3600. | llm_adapter@2026-06-01 | low | None |
| Feature | Medium | inferenceservice: adds typed spec.ropeScaling for RoPE/YaRN context extension. | llm_adapter@2026-06-01 | high | None |
| Bugfix | Medium | foreman: recovers orphaned phase=Running tasks on agent restart. | llm_adapter@2026-06-01 | high | None |
| Bugfix | Medium | foreman: splits per-turn timeout from loop-wide budget. | llm_adapter@2026-06-01 | high | None |
| Bugfix | Medium | foreman: adds warm-path reviewer scheduling on macOS. | llm_adapter@2026-06-01 | high | None |
| Bugfix | Medium | metal-agent: prefers routable interface for host-IP auto-detect. | llm_adapter@2026-06-01 | high | None |
Full changelog
0.8.1 (2026-06-01)
⚠ BREAKING CHANGES
- foreman: Agent.spec.requestTimeoutSeconds changes meaning from a per-request HTTP timeout to a loop-wide wall-clock budget, and its default moves from 600 to 3600. The former per-request bound is now the new Agent.spec.requestTurnTimeoutSeconds (default 120). Re-apply your Agent CRs after upgrade so existing Agents pick up explicit values.
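One way to prepare explicit values before re-applying Agent CRs is sketched below. It is a minimal illustration, not part of LLMKube: the function name `explicit_spec_for_0_8_1` and the choice to carry the old per-request value over into `requestTurnTimeoutSeconds` are assumptions, and the spec is modeled as a plain dictionary rather than a real Agent CR. Only the two field names and the defaults (600, 3600, 120) come from the changelog.

```python
from typing import Any, Dict

# Defaults taken from the 0.8.1 changelog entry.
OLD_DEFAULT_REQUEST_TIMEOUT = 600  # pre-0.8.1: per-request HTTP timeout
NEW_DEFAULT_LOOP_BUDGET = 3600     # 0.8.1: loop-wide wall-clock budget


def explicit_spec_for_0_8_1(
    old_spec: Dict[str, Any],
    loop_budget: int = NEW_DEFAULT_LOOP_BUDGET,
) -> Dict[str, Any]:
    """Hypothetical helper: return a copy of a pre-0.8.1 Agent spec with
    both timeout fields set explicitly under the 0.8.1 meanings."""
    new_spec = dict(old_spec)
    # Before 0.8.1 this field bounded a single request, so its value
    # (or the old default) is carried over to the per-turn field.
    old_per_request = old_spec.get(
        "requestTimeoutSeconds", OLD_DEFAULT_REQUEST_TIMEOUT
    )
    new_spec["requestTurnTimeoutSeconds"] = old_per_request
    # Assumption: a loop budget shorter than one turn makes no sense,
    # so the budget is never set below the per-turn value.
    new_spec["requestTimeoutSeconds"] = max(loop_budget, old_per_request)
    return new_spec
```

Under these assumptions, an Agent that never set the field keeps its old 600-second per-request bound and gains a 3600-second loop budget. Teams that prefer the new 120-second per-turn default can set `requestTurnTimeoutSeconds` to that value instead.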
Features
Bug Fixes
- foreman: recover orphaned phase=Running tasks on agent restart (#542) (#598) (6dd2c44)
- foreman: split per-turn timeout from loop-wide budget (#532) (#602) (41e7663)
- foreman: warm-path reviewer scheduling on macOS (#578, #579) (#597) (a94d1ef)
- metal-agent: prefer routable interface for host-IP auto-detect (#526) (#599) (c780795)
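To make the host-IP fix above more concrete, here is a rough sketch of what "prefer a routable interface" could mean. It is not metal-agent's actual logic: the function `pick_host_ip` and the definition of "routable" (not loopback, link-local, unspecified, or multicast) are assumptions chosen for illustration.

```python
import ipaddress
from typing import List, Optional


def pick_host_ip(candidates: List[str]) -> Optional[str]:
    """Hypothetical selection rule: return the first candidate address
    that looks routable, falling back to the first candidate if none does."""

    def is_routable(addr: str) -> bool:
        ip = ipaddress.ip_address(addr)
        # Assumed definition of "routable" for this sketch.
        return not (
            ip.is_loopback
            or ip.is_link_local
            or ip.is_unspecified
            or ip.is_multicast
        )

    for addr in candidates:
        if is_routable(addr):
            return addr
    return candidates[0] if candidates else None
```

Given the candidates `127.0.0.1`, `169.254.10.2`, and `192.168.1.20`, this rule skips the loopback and link-local addresses and returns `192.168.1.20`.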
Documentation
- foreman: absolute paths in overview README cross-refs (fix llmkube-web prerender) (#596) (b5f6f94)
- foreman: move docs/foreman to docs/site/foreman + register in site nav (#594) (9fd85bb)
Miscellaneous
Breaking Changes
- Agent.spec.requestTimeoutSeconds now represents a loop-wide wall-clock budget (default 3600) instead of per-request HTTP timeout; the former behavior is moved to Agent.spec.requestTurnTimeoutSeconds (default 120). Re‑apply Agent CRs after upgrade.
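The relationship between the two fields can be sketched as follows. This is an assumed model of how a per-turn bound and a loop-wide budget typically interact, not foreman's implementation: the function `next_turn_timeout` is hypothetical, and whether foreman caps the final turn by the remaining budget is not stated in the changelog.

```python
def next_turn_timeout(
    elapsed_s: float,
    loop_budget_s: float = 3600.0,  # Agent.spec.requestTimeoutSeconds default
    turn_timeout_s: float = 120.0,  # Agent.spec.requestTurnTimeoutSeconds default
) -> float:
    """Hypothetical rule: each turn may run for the per-turn bound,
    capped by whatever remains of the loop-wide wall-clock budget."""
    remaining = loop_budget_s - elapsed_s
    if remaining <= 0:
        raise TimeoutError("loop-wide budget exhausted")
    return min(turn_timeout_s, remaining)
```

With the 0.8.1 defaults, a turn starting at 0 seconds would be allowed 120 seconds, a turn starting at 3550 seconds would be allowed only the remaining 50, and no turn would start once 3600 seconds have elapsed.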
About LLMKube
Kubernetes operator for llama.cpp-native LLM inference with GPU scheduling, Apple Silicon Metal support, and OpenAI-compatible API.