LLMKube

v0.8.0 Feature

This release ships 15 features and 10 bug fixes, headlined by the debut of Foreman and new Intel GPU (oneAPI/SYCL) support.

✓ No known CVEs patched

Topics

ai apple-silicon autoscaling edge-computing gguf gpu self-hosted inference kubernetes llama-cpp llm local-llm metal mlx multi-gpu nvidia tgi vllm

Summary

AI summary

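Release 0.8.0 debuts Foreman, adds Intel GPU (oneAPI/SYCL) support and a metal-agent InferenceService name allowlist, and ships ten Foreman bug fixes plus documentation updates.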

Changes in this release

Feature Medium

Adds Intel GPU (oneAPI/SYCL) support across controller, CLI, docs, and e2e.

Source: llm_adapter@2026-05-28

Confidence: high

Feature Medium

Adds structured AgenticTaskFailureReason taxonomy to foreman API.

Source: llm_adapter@2026-05-28

Confidence: high

Feature Medium

Adds observation masking for context-window management in foreman loop.

Source: llm_adapter@2026-05-28

Confidence: high
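
The release notes do not describe how observation masking works. As a rough sketch of the general technique, older tool outputs can be swapped for short placeholders while the most recent ones stay intact, which keeps the conversation inside the context window. The `Message` type, the `"tool"` role name, the `keepRecent` parameter, and the placeholder text are all assumptions made for illustration and are not taken from Foreman.

```go
package main

import "fmt"

// Message is a hypothetical chat-history entry; it is not Foreman's type.
type Message struct {
	Role    string
	Content string
}

// maskObservations returns a copy of history in which every tool observation
// except the keepRecent most recent ones is replaced by a short placeholder.
// The input slice is left unmodified.
func maskObservations(history []Message, keepRecent int) []Message {
	out := make([]Message, len(history))
	copy(out, history)
	seen := 0
	// Walk backwards so the newest observations are counted first.
	for i := len(out) - 1; i >= 0; i-- {
		if out[i].Role != "tool" {
			continue
		}
		seen++
		if seen > keepRecent {
			out[i].Content = fmt.Sprintf("[observation masked: %d bytes]", len(out[i].Content))
		}
	}
	return out
}

func main() {
	history := []Message{
		{"user", "fix the failing test"},
		{"tool", "...long build log..."},
		{"assistant", "running the tests again"},
		{"tool", "...long test log..."},
	}
	for _, m := range maskObservations(history, 1) {
		fmt.Printf("%-9s %s\n", m.Role, m.Content)
	}
}
```

Masking only tool observations, and not user or assistant turns, preserves the model's own reasoning trail while dropping the bulkiest content first.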

Feature Medium

Adds stuck-loop detector with nudge‑then‑force protocol to foreman loop.

Source: llm_adapter@2026-05-28

Confidence: high
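
A nudge-then-force protocol can be pictured as a two-threshold counter: after a few identical consecutive tool calls the loop injects a warning (the nudge), and if the repetition continues it terminates the task (the force). The sketch below is only one plausible reading of that idea. `StuckDetector`, its thresholds, and the string-equality comparison are hypothetical and not taken from the Foreman source.

```go
package main

import "fmt"

// Action is what the loop should do after observing a tool call.
type Action int

const (
	Continue Action = iota // keep going
	Nudge                  // inject a "you appear to be stuck" message
	Force                  // terminate the loop
)

func (a Action) String() string {
	return [...]string{"continue", "nudge", "force"}[a]
}

// StuckDetector is a hypothetical detector that counts identical consecutive calls.
type StuckDetector struct {
	NudgeAfter int // repeats that trigger a nudge
	ForceAfter int // repeats that trigger forced termination
	last       string
	repeats    int
}

// Observe records one tool call and reports the action to take.
func (d *StuckDetector) Observe(call string) Action {
	if call != d.last {
		// A different call breaks the streak.
		d.last, d.repeats = call, 1
		return Continue
	}
	d.repeats++
	switch {
	case d.repeats >= d.ForceAfter:
		return Force
	case d.repeats == d.NudgeAfter:
		return Nudge
	default:
		return Continue
	}
}

func main() {
	d := &StuckDetector{NudgeAfter: 3, ForceAfter: 5}
	for i := 1; i <= 5; i++ {
		fmt.Println(i, d.Observe("bash: go test ./..."))
	}
}
```

Nudging once before forcing gives the model a chance to change course, so a short run of legitimate retries does not end the task outright.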

Feature Medium

Adds fetch_issue tool replacing gh issue view subshell in foreman reviewer.

Source: llm_adapter@2026-05-28

Confidence: high

Feature Medium

Adds distinction between whitelist‑excluded and unknown tool calls in foreman tools.

Source: llm_adapter@2026-05-28

Confidence: high
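
The value of this distinction is in the feedback the model receives: a tool that exists but is not enabled for the agent calls for a different correction than a tool name the model invented. A minimal sketch of that three-way classification follows. The `classify` function, the two maps, and the message wording are hypothetical and do not mirror Foreman's actual API.

```go
package main

import "fmt"

// CallStatus classifies a requested tool name.
type CallStatus int

const (
	Allowed  CallStatus = iota // registered and on the agent's whitelist
	Excluded                   // registered, but not on the agent's whitelist
	Unknown                    // not registered at all
)

// classify checks the name against the full registry first, then the whitelist.
func classify(name string, registered, whitelist map[string]bool) CallStatus {
	switch {
	case !registered[name]:
		return Unknown
	case !whitelist[name]:
		return Excluded
	default:
		return Allowed
	}
}

// feedback turns a status into a message the model can act on.
func feedback(name string, s CallStatus) string {
	switch s {
	case Unknown:
		return fmt.Sprintf("tool %q does not exist", name)
	case Excluded:
		return fmt.Sprintf("tool %q exists but is not enabled for this agent", name)
	default:
		return ""
	}
}

func main() {
	registered := map[string]bool{"bash": true, "read_file": true, "fetch_issue": true}
	whitelist := map[string]bool{"read_file": true, "fetch_issue": true}
	for _, name := range []string{"read_file", "bash", "delete_repo"} {
		fmt.Printf("%-12s %s\n", name, feedback(name, classify(name, registered, whitelist)))
	}
}
```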

Feature Medium

Adds hybrid cloud reviewer Agent with sovereignty toggles in foreman v0.2.

Source: llm_adapter@2026-05-28

Confidence: high

Feature Medium

Adds WorkloadSpec.reviewerAgentRefs (plural) and third pipeline stage in foreman v0.2.

Source: llm_adapter@2026-05-28

Confidence: high

Feature Medium

Adds repo‑map localization for coder Agents in foreman.

Source: llm_adapter@2026-05-28

Confidence: high

Feature Low

AgenticTask branches now include workload name.

Source: granite4.1:30b@2026-05-28-audit

Confidence: low

Feature Low

Executor fetches GitHub issue body when payload prompt is empty.

Source: granite4.1:30b@2026-05-28-audit

Confidence: low

Bugfix Medium

Fixes executor to resolve InferenceService port from live Endpoints, not stale install‑time override.

Source: llm_adapter@2026-05-28

Confidence: high

Bugfix Medium

Includes gate_job_template.yaml in Docker context for foreman build.

Source: granite4.1:30b@2026-05-28-audit

Confidence: low

Bugfix Medium

Routes reviewer-role GO through modelDecidedResult in executor.

Source: granite4.1:30b@2026-05-28-audit

Confidence: low

Bugfix Medium

Force‑terminate returns clean Terminal envelope in foreman loop.

Source: granite4.1:30b@2026-05-28-audit

Confidence: low

Bugfix Medium

Adds ground‑truth filesTouched, bumps qwen maxTurns, and tightens confabulation defenses in reviewer.

Source: granite4.1:30b@2026-05-28-audit

Confidence: low

Bugfix Medium

Adds ground‑truth issueAsk and caps qwen Section D in reviewer.

Source: granite4.1:30b@2026-05-28-audit

Confidence: low

Bugfix Medium

Implements role‑aware stuck‑loop detector and ensures non‑empty reviewer user prompt in reviewer.

Source: granite4.1:30b@2026-05-28-audit

Confidence: low

Bugfix Medium

Cascade and Workload rollup now gate on phase AND verdict, not phase alone.

Source: granite4.1:30b@2026-05-28-audit

Confidence: low
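
Gating on phase alone lets a task that finished cleanly but was rejected by the reviewer count as done. The sketch below shows the stricter check under stated assumptions: the `TaskStatus` type and the `"Succeeded"` and `"NO-GO"` strings are hypothetical, and only the `GO` verdict appears in these release notes.

```go
package main

import "fmt"

// TaskStatus is a hypothetical summary of one task in a workload.
type TaskStatus struct {
	Phase   string // lifecycle phase, e.g. "Succeeded" or "Failed" (assumed values)
	Verdict string // reviewer verdict, e.g. "GO" or "NO-GO" ("NO-GO" is assumed)
}

// gateOpen reports whether every task both completed and received a GO verdict.
// An empty workload never opens the gate.
func gateOpen(tasks []TaskStatus) bool {
	if len(tasks) == 0 {
		return false
	}
	for _, t := range tasks {
		if t.Phase != "Succeeded" || t.Verdict != "GO" {
			return false
		}
	}
	return true
}

func main() {
	tasks := []TaskStatus{
		{Phase: "Succeeded", Verdict: "GO"},
		{Phase: "Succeeded", Verdict: "NO-GO"}, // finished, but rejected by the reviewer
	}
	fmt.Println("gate open:", gateOpen(tasks))
}
```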

Bugfix Low

Always emits content on non‑assistant messages in OAI handler.

Source: granite4.1:30b@2026-05-28-audit

Confidence: low

Bugfix Low

Adds cmd.WaitDelay and process‑group kill to prevent BashTool deadlocks on grandchild‑held pipes.

Source: granite4.1:30b@2026-05-28-audit

Confidence: low

Full changelog

0.8.0 (2026-05-28)

Features

  • foreman/api: structured AgenticTaskFailureReason taxonomy (#565) (6e72e85)
  • foreman/loop: observation masking for context-window management (#563) (d17c3e0)
  • foreman/loop: stuck-loop detector with nudge-then-force protocol (#544) (#569) (2172ece)
  • foreman/reviewer: fetch_issue tool replaces gh issue view subshell (#581) (0253e43)
  • foreman/tools: distinguish whitelist-excluded from unknown tool calls (#564) (089e9ca)
  • foreman/v0.2: hybrid cloud reviewer Agent + sovereignty toggles (#553) (65a7cb8)
  • foreman/v0.2: WorkloadSpec.reviewerAgentRefs (plural) + third pipeline stage (#551) (831ae8c)
  • foreman: add repo-map localization for coder Agents (#560) (#566) (f6bf8c0)
  • foreman: AgenticTask branches include workload name (#573) (#574) (2986906)
  • foreman: executor fetches GitHub issue body when payload prompt is empty (#571) (#572) (2b5bd31)
  • foreman: post-M4 stability follow-ups for v5-batch readiness (#535) (a841612)
  • foreman: v0.4 reviewer agent — tool-using reviewer with sharpened prompt + structured findings (#575) (#576) (06091a9)
  • foreman: workspace-scoped bash + WORKSPACE_ROOT contract (#567) (#568) (061eb41)
  • gpu: add Intel GPU (oneAPI/SYCL) support across controller, CLI, docs, and e2e (#557) (741ef5d)
  • metal-agent: InferenceService name allowlist for multi-Mac fleets (#555) (67361f3)

Bug Fixes

  • foreman/build: include gate_job_template.yaml in Docker context (#554) (def535d)
  • foreman/executor: resolve InferenceService port from live Endpoints, not stale install-time override (#550) (4351608)
  • foreman/executor: route reviewer-role GO through modelDecidedResult (#545) (16943a5)
  • foreman/loop: force-terminate returns clean Terminal envelope (#544 follow-up) (#570) (19500c9)
  • foreman/oai: always emit content on non-assistant messages (#562) (f6dc8e1)
  • foreman/reviewer: ground-truth filesTouched + bump qwen maxTurns + tighten confabulation defenses (#584) (b3e21f0)
  • foreman/reviewer: ground-truth issueAsk + cap qwen Section D (#587) (b66006c)
  • foreman/reviewer: role-aware stuck-loop detector + non-empty reviewer user prompt (rerun-7 follow-up) (#577) (19590f6)
  • foreman/tools: cmd.WaitDelay + process-group kill so BashTool can't deadlock on grandchild-held pipes (#547) (c12f6f8)
  • foreman: cascade + Workload rollup gate on phase AND verdict, not phase alone (#548) (1b72a7c)

Documentation

  • foreman: v0.8.0 release-prep docs + README Foreman section (#591) (5e41df1)
  • mention make lint-all in AGENTS.md and CONTRIBUTING.md (#588) (39da983)

Miscellaneous

  • release Foreman debut as 0.8.0 (take 2) (#593) (a8f0368)

About LLMKube

Kubernetes operator for llama.cpp-native LLM inference with GPU scheduling, Apple Silicon Metal support, and OpenAI-compatible API.

Related context

Breaking changes in later releases

  • v0.8.1 foreman: requestTimeoutSeconds now sets a loop-wide budget, and its default changes from 600 to 3600.
