This release adds 3 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
ReleasePort's take
Light signal
v0.7.10 introduces the --llama-server-port flag for a fixed runtime port and adds several Foreman-related features (capability-aware scheduler, native agent loop) while fixing macOS-Metal setup bugs.
Why it matters: The new --llama-server-port option lets operators lock the server to a static port. For macOS-Metal users, the curl port is now derived from Endpoints, and a host-localhost curl replaces the broken port-forward step. All changes land in v0.7.10, released 2026-05-23.
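Pinning the server to a static port only helps if that port is actually available on the host. The following is a minimal sketch of a pre-flight check an operator might run before choosing a value for --llama-server-port. It is not part of LLMKube: the helper name `port_is_free` and the default host are assumptions made for illustration.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to (host, port).

    Hypothetical helper, not an LLMKube API: it simply tries to bind
    a TCP socket and reports whether the bind succeeded.
    """
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Bind failed, so the port is in use or not permitted.
            return False
    return True
```

Calling `port_is_free(8080)` (8080 is an arbitrary example value) before deployment gives an early warning of a conflict. The check is inherently racy, since another process can claim the port afterwards, so it is a convenience rather than a guarantee.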
Summary
AI summary: Version 0.7.10 (2026-05-23) is a mixed release of features and bug fixes.
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Feature | Medium | Adds --llama-server-port option for fixed runtime port. Source: llm_adapter@2026-05-23. Confidence: high. | None |
| Feature | Medium | Adds make lint-all target for cross-architecture linting. Source: llm_adapter@2026-05-23. Confidence: high. | None |
| Feature | Medium | Introduces capability-aware scheduler, AgenticTaskWatcher, and stub executor (Foreman v0.1 M2). Source: llm_adapter@2026-05-23. Confidence: high. | None |
| Feature | Medium | Gates Agent role on a verifier node in Foreman (M4). Source: llm_adapter@2026-05-23. Confidence: high. | None |
| Feature | Medium | Adds native agent loop, Agent CRD, and coder role on M5 Max (Foreman M3). Source: llm_adapter@2026-05-23. Confidence: high. | None |
| Feature | Medium | Scaffolds Foreman as an opt-in add-on (M0 + M1). Source: llm_adapter@2026-05-23. Confidence: high. | None |
| Feature | Medium | Adds AGENTS.md documentation file. Source: llm_adapter@2026-05-23. Confidence: low. | None |
| Bugfix | Medium | Reports Stopped phase when InferenceService.spec.replicas=0 on Metal path. Source: llm_adapter@2026-05-23. Confidence: high. | None |
| Bugfix | Medium | Updates broken bartowski phi-4-mini URL to renamed repository. Source: llm_adapter@2026-05-23. Confidence: low. | None |
| Bugfix | Medium | Derives curl port from Endpoints for macOS-Metal (follow-up to #513). Source: llm_adapter@2026-05-23. Confidence: low. | None |
| Bugfix | Medium | Replaces broken port-forward step with host-localhost curl for macOS-Metal. Source: llm_adapter@2026-05-23. Confidence: low. | None |
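Two of the macOS-Metal fixes above involve deriving the curl port from Endpoints and curling the host's localhost directly. The release notes do not show how this is done, so the following is only a sketch of the general idea. It assumes the Endpoints object has already been fetched as JSON in the standard Kubernetes core/v1 shape (`subsets[].ports[].port`). The function names, the sample port 8081, and the `/v1/models` path are illustrative assumptions, not LLMKube's implementation.

```python
from typing import Any, Optional


def port_from_endpoints(
    endpoints: dict[str, Any], port_name: Optional[str] = None
) -> Optional[int]:
    """Return the first port listed in an Endpoints object.

    If port_name is given, only a port with that name matches.
    Returns None when no subset advertises a suitable port.
    """
    for subset in endpoints.get("subsets") or []:
        for port in subset.get("ports") or []:
            if port_name is None or port.get("name") == port_name:
                return int(port["port"])
    return None


def localhost_url(port: int, path: str = "/v1/models") -> str:
    """Build a host-localhost URL to curl (the path is an assumed example)."""
    return f"http://localhost:{port}{path}"


# Hypothetical Endpoints payload, trimmed to the fields used here.
sample = {
    "kind": "Endpoints",
    "subsets": [
        {
            "addresses": [{"ip": "192.168.1.10"}],
            "ports": [{"name": "http", "port": 8081, "protocol": "TCP"}],
        }
    ],
}

port = port_from_endpoints(sample)
if port is not None:
    print(localhost_url(port))
```

Reading the port from the Endpoints object rather than hard-coding it keeps the curl command valid when the runtime port changes, for example when a fixed port is set with --llama-server-port.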
Full changelog
0.7.10 (2026-05-23)
Features
- add --llama-server-port for a fixed llama-server runtime port (#499) (cc30b0d)
- add make lint-all target for cross-arch linting (#508) (f57dd5b)
- capability-aware scheduler + AgenticTaskWatcher + stub executor (Foreman v0.1 M2) (#504) (74b3d6e)
- foreman: gate-role Agent on a verifier node (M4) (#518) (40a340e)
- foreman: native agent loop + Agent CRD + coder role on M5 Max (M3) (#509) (6661343)
- scaffold Foreman as an opt-in add-on (M0 + M1) (#501) (cd40491)
Bug Fixes
Documentation
About LLMKube
Kubernetes operator for llama.cpp-native LLM inference with GPU scheduling, Apple Silicon Metal support, and OpenAI-compatible API.