LLMKube

vforeman-0.8.0 (scope: foreman; type: Feature)

This release adds 2 notable features for engineering teams evaluating rollout.

Published: 1mo ago. Category: Containers & Orchestration

✓ No known CVEs patched

Topics

ai apple-silicon autoscaling edge-computing gguf gpu self-hosted inference kubernetes llama-cpp llm local-llm metal mlx multi-gpu nvidia tgi vllm

Summary

AI summary

Foreman adds opt-in agentic workload scheduling to LLMKube.

Full changelog

Foreman is an opt-in add-on for LLMKube that schedules agentic workloads (Workload, AgenticTask) across a fleet of nodes (FleetNode). Installing LLMKube alone neither installs nor requires Foreman. Foreman ships as a sibling chart to llmkube, not a subchart: install llmkube first (helm install llmkube defilantech/llmkube), then install foreman alongside it. The two charts have no Helm relationship at packaging or install time; the only coupling is that the foreman-operator's RBAC reads the inference.llmkube.dev CRDs that llmkube installs.
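
The install order described above could be scripted roughly as follows. This is a minimal sketch, not the project's documented procedure: the foreman chart reference (defilantech/foreman), the FOREMAN_CHART and DRY_RUN variables, and the run and install_plan helpers are all hypothetical names introduced for illustration. Only the llmkube install command comes from the changelog. By default the script prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the sibling-chart install order.
# ASSUMPTION: "defilantech/foreman" is a hypothetical chart reference; the
# changelog names only the llmkube chart. Override FOREMAN_CHART as needed.
set -eu

FOREMAN_CHART="${FOREMAN_CHART:-defilantech/foreman}"

# Print each command by default; set DRY_RUN=0 to execute for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

install_plan() {
  # 1. llmkube goes first: it installs the inference.llmkube.dev CRDs.
  run helm install llmkube defilantech/llmkube
  # 2. Sanity check: list the resources served under that API group.
  run kubectl api-resources --api-group=inference.llmkube.dev
  # 3. foreman is a separate release, not a subchart of llmkube.
  run helm install foreman "$FOREMAN_CHART"
}

install_plan
```

Keeping the two helm install calls as separate releases mirrors the sibling-chart design: either release can be upgraded or removed on its own, and the CRD check in between makes the single point of coupling explicit.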

About LLMKube

Kubernetes operator for llama.cpp-native LLM inference with GPU scheduling, Apple Silicon Metal support, and OpenAI-compatible API.

Related context

Earlier breaking changes

v0.8.1 foreman: requestTimeoutSeconds now sets a loop-wide budget, and its default changes from 600 to 3600.