Containarium

v0.16.0 Breaking

This release includes breaking changes. Platform teams should review the upgrade notes before upgrading.

Published 2mo Virtualization

✓ No known CVEs patched

Topics

agent-native agent-runtime agent-sandbox agentic-ai ai-agents claude code-sandbox cursor ebpf gpu kubernetes llm lxc mcp model-context-protocol multi-tenant sandbox self-hosted ssh

Affected surfaces

auth breaking_upgrade

Summary

AI summary

Multi-pool architecture enables one sentinel to front multiple Containarium clusters with SNI routing.

Full changelog

Highlights

Multi-pool architecture — One sentinel can now front multiple independent Containarium clusters, each with its own primary VM, peers, core stack, and subdomain. SNI-based routing on the sentinel transparently dispatches inbound TLS to the right pool. See docs/MULTI-POOL.md.
GPU passthrough by PCI address — Containers are now pinned by stable PCI ID instead of DRM card minor index, surviving kernel-upgrade renumbering (which broke fts-5900x on 6.8.0-110 → 6.8.0-111).
Postgres restart policy on auto-detect — Closes a 12-day silent-outage path on prod where an OOM kill of containarium-core-postgres left it down indefinitely (Grafana went dark).
Lab pool SSH access — Install script now provisions the containarium-shell wrapper + /etc/motd; daemon stops writing ~/.hushlogin. SSH into containers on tunneled-primary pools (e.g. lab) now works end-to-end.
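The SNI dispatch from the multi-pool highlight can be pictured as a hostname lookup with a fallback. The following is a minimal sketch, not Containarium's implementation: the Router type, its methods, and the addresses are hypothetical, and the real sentinel also has to peek the TLS ClientHello to obtain the server name, which is omitted here.

```go
package main

import (
	"fmt"
	"strings"
)

// Router is a hypothetical stand-in for the sentinel's SNI dispatch table.
type Router struct {
	byHost map[string]string // public hostname or alias -> primary address
	legacy string            // single backend used when nothing matches
}

// NewRouter creates a router that falls back to the given legacy backend.
func NewRouter(legacy string) *Router {
	return &Router{byHost: map[string]string{}, legacy: legacy}
}

// Register maps a primary's public hostname and any aliases to its address.
func (r *Router) Register(addr, hostname string, aliases ...string) {
	for _, h := range append([]string{hostname}, aliases...) {
		r.byHost[strings.ToLower(h)] = addr
	}
}

// Route returns the primary registered for the SNI name, or the legacy
// backend when there is no SNI or the hostname is not registered.
func (r *Router) Route(sni string) string {
	if sni != "" {
		if addr, ok := r.byHost[strings.ToLower(sni)]; ok {
			return addr
		}
	}
	return r.legacy
}

func main() {
	// Addresses are illustrative only.
	r := NewRouter("10.0.0.1:443")
	r.Register("10.0.1.1:443", "lab.example", "foo.example")

	fmt.Println(r.Route("foo.example")) // alias resolves to the lab primary
	fmt.Println(r.Route(""))            // no SNI: legacy backend
}
```

Keeping aliases in the same map as the primary hostname means an app domain and the pool's own subdomain resolve through one lookup.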

Added

Multi-pool architecture (PR #97 — slices 1-8):
- Pool tag on peers (--pool=<name> propagates through tunnel handshake to TunnelSpot / Backend; GET /sentinel/peers?pool=<name> filters).
- Pool-scoped peer discovery (PeerPool.discover() appends ?pool=).
- Primary self-registration (POST /sentinel/primaries at startup, 30s heartbeat, DELETE on shutdown; sentinel evicts a primary after 90s without a heartbeat).
- SNI-based routing on the sentinel (peeks ClientHello, looks up primary in registry, falls back to legacy single-backend on no-SNI / unregistered hostname).
- Hostname aliases for app domains (--public-aliases foo.example,bar.example).
- Primary registration via tunnel handshake (containarium tunnel --public-hostname=… --public-aliases=… --public-port=443) — lets a primary behind NAT/Tailscale register itself without direct HTTP access to the sentinel.
- Token-bound pool authorization (--tunnel-token-policy <token>=<pool1>,<pool2>, repeatable; * = wildcard; legacy --tunnel-token keeps wildcard semantics).
- SNI router uses yamux for tunneled primaries (avoids loopback-alias listener conflicts on the sentinel).
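The token-bound pool authorization above takes values of the form <token>=<pool1>,<pool2>, with * as a wildcard. A minimal sketch of how such values could be parsed and checked follows. The TokenPolicy type and its methods are hypothetical names for illustration, and treating an unknown token as unauthorized is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// TokenPolicy maps a tunnel token to the set of pools it may register for.
// Hypothetical type; not taken from the Containarium source.
type TokenPolicy map[string]map[string]bool

// Add parses one "<token>=<pool1>,<pool2>" value. Repeating a token merges
// its pools, mirroring a repeatable flag.
func (p TokenPolicy) Add(spec string) error {
	token, pools, ok := strings.Cut(spec, "=")
	if !ok || token == "" || pools == "" {
		return fmt.Errorf("invalid policy %q, want <token>=<pool>[,<pool>...]", spec)
	}
	if p[token] == nil {
		p[token] = map[string]bool{}
	}
	for _, pool := range strings.Split(pools, ",") {
		pool = strings.TrimSpace(pool)
		if pool == "" {
			return fmt.Errorf("invalid policy %q: empty pool name", spec)
		}
		p[token][pool] = true
	}
	return nil
}

// Allowed reports whether token may act for pool. "*" matches every pool.
// Assumption: tokens with no policy entry are rejected.
func (p TokenPolicy) Allowed(token, pool string) bool {
	pools, ok := p[token]
	if !ok {
		return false
	}
	return pools["*"] || pools[pool]
}

func main() {
	p := TokenPolicy{}
	for _, spec := range []string{"tok-a=prod,lab", "tok-b=*"} {
		if err := p.Add(spec); err != nil {
			panic(err)
		}
	}
	fmt.Println(p.Allowed("tok-a", "lab"))     // true
	fmt.Println(p.Allowed("tok-a", "staging")) // false
	fmt.Println(p.Allowed("tok-b", "staging")) // true
}
```

Under this model, the legacy --tunnel-token wildcard semantics correspond to a single entry whose pool list is *.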

Fixed

Tunnel handshake over-read corrupts yamux — json.NewDecoder over-read swallowed bytes that arrived in the same TCP packet as the JSON handshake (notably the yamux SYN). Fix: line-delimited JSON read, leaving subsequent bytes for yamux. Latent since the original tunnel implementation; surfaced under load by the slice 8 deploy.
Tunnel-promoted primaries decay after 90s TTL — PrimaryRegistry.All() evicted entries with BackendID set even though their lifetime is tied to the yamux session. Fix: skip TTL eviction when BackendID != "".
Lab pool bring-up — 5 corner cases caught while standing up the first tunneled primary:
- Subnet drift between --network-subnet flag and actual incusbr0 (Incus' EnsureNetwork is idempotent — won't change a pre-existing bridge's subnet). Daemon now queries GetNetworkSubnet("incusbr0") after init and uses that as authoritative.
- Port forwarder missing OUTPUT-chain DNAT (PREROUTING alone doesn't catch local-origin packets that tunneled primaries generate when forwarding to 127.0.0.1:443).
- route_localnet=1 not enabled (kernel default refuses to route 127.0.0.0/8 out a non-loopback interface). Now set at runtime + persisted via /etc/sysctl.d/99-containarium-route-localnet.conf.
- Caddy TLS app missing on first install (apps.tls is null on a fresh Caddy → PATCH returns 400). ProvisionTLS now calls ensureTLSApp first.
- Port forwarder ran before Caddy spawned. Re-run PortForwarder.SetupPortForwarding after EnsureCaddy succeeds.
Postgres restart policy missing on auto-detected containers (internal/server/dual_server.go) — auto-detect path skipped ensurePostgresRestartPolicy(). Re-applied; idempotent.
GPU device passthrough breaks across kernel upgrades (internal/incus/client.go, internal/container/manager.go) — new Client.ResolveGPUInputToPCI() resolves --gpu N into a stable PCI address at container creation time. Existing containers with id-based config aren't auto-migrated; manual fix: incus config device set <name> gpu pci=<addr>.
Lab pool SSH lands at host nologin instead of inside the container (scripts/install-lab-phase-b.sh, internal/container/jump_server.go) — install script now installs /usr/local/bin/containarium-shell and writes /etc/motd; daemon stops writing ~/.hushlogin. Existing per-container host users keep their stale .hushlogin until manually removed.

Upgrade notes

Existing GPU containers with gpu: { id: "0" } config aren't auto-migrated. After upgrade run: sudo incus config device set <container> gpu pci=$(sudo containarium gpu list | awk '/<vendor>/ {print $1}') (or pick the PCI address manually from lspci -nn | grep -i nvidia).
Existing per-container host users on backends still have ~/.hushlogin. To get the host MOTD back: sudo rm /home/<user>/.hushlogin.
Lab-style pools standing up on this release will auto-install containarium-shell + /etc/motd via scripts/install-lab-phase-b.sh. No action needed on existing pools that already have the wrapper.

Known follow-ups

The "## [0.15.0]" heading was lost from CHANGELOG.md between the v0.15.0 bump commit and HEAD, and some v0.15.0 content was duplicated under [0.16.0]. Both will be cleaned up in a follow-up.
See docs/MULTI-POOL.md "What's still ahead" for: pool-namespaced SSH usernames (silent collision fix), sshpiper restart not refreshing state, defensive nologin filter on /authorized-keys.
