noonghunna/club-3090

v0.8.2 Feature

This release adds 4 notable features for engineering teams evaluating rollout.

Published 16d ago · Model Serving & MLOps

✓ No known CVEs patched

Summary

AI summary

Broad release touching 📝 Documentation, ✨ Features, and 🐛 Bug fixes, centred on the pull tooling.

Full changelog

What this gives you

An honest "will it fit?" answer for any safetensors model: `scripts/pull.sh <model> --profile-like <variant> --recommend` returns a plain verdict (FITS / FITS-but-needs-acceptance / DOES-NOT-FIT) that tracks the real gate, carries the boot-fit≠runtime caveat, and never silently passes.
Failed pulls become a fixable signal, not dead ends: every hard-block leaves a redacted, path-scrubbed diagnostic, and one consented command (`scripts/pull.sh --submit-last`, works with or without `gh`) sends it back so the gap can be closed. No telemetry, no auto-send.
~12 more model architectures recognised out of the box: Phi, Gemma/Gemma3, Starcoder2, Cohere/Command-R, Mixtral, Qwen2/3-MoE and more no longer need `--experimental-arch`; genuine native models reach a clean serve verdict instead of a blanket refusal (per-repo remote code still fails closed, so zero false passes).
Optional non-NVIDIA hardware detection for the eval path (AMD ROCm / Apple) that degrades gracefully; no change for NVIDIA users.
Bundled: N-GPU NVLink auto-detection (multi-4 + Gemma-4-26B dual composes) and a documentation restructure (quick-start README, GETTING_STARTED.md, FAQ).
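
The "redacted, path-scrubbed diagnostic" idea can be illustrated with a minimal sketch. This is not the project's redaction code: the function name `scrub_paths`, the `<path>` placeholder, and the regular expression are all assumptions made for illustration, and the sketch only handles POSIX-style paths without spaces.

```python
import re
from pathlib import PurePosixPath

# Hypothetical placeholder; the real diagnostic format is not shown in these notes.
PLACEHOLDER = "<path>"

# An absolute POSIX path with at least two components and no spaces.
_ABS_PATH = re.compile(r"(?<![\w~.])/(?:[\w.\-]+/)+[\w.\-]+")


def scrub_paths(text: str, home: str) -> str:
    """Hide the home directory and shorten other absolute paths to their file name."""
    home = home.rstrip("/")
    if home:
        # Replace the home prefix first so the username never appears.
        text = text.replace(home, "~")

    def _keep_name(match: re.Match) -> str:
        return f"{PLACEHOLDER}/{PurePosixPath(match.group(0)).name}"

    return _ABS_PATH.sub(_keep_name, text)


print(scrub_paths("cannot read /home/alice/models/foo/config.json", "/home/alice"))
print(scrub_paths("cache at /mnt/data/hf/hub/model.safetensors", "/home/alice"))
```

A production scrubber would also need to cover Windows paths, paths containing spaces, and URLs, which this sketch does not attempt.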

Know before you use it

Generated composes are a minimal, known-safe starting point: capacity (context/KV/mem-util) is the reference profile's, not auto-tuned to your GPU. Run `--recommend` (or `tools/kv-calc.py --solve-max-ctx`) for the real fit and tune `${MAX_MODEL_LEN}`. An opt-in capacity optimiser is planned for a later release.
Advanced features (MTP / TurboQuant / DFlash) stay in the pre-baked Genesis composes; generated composes are non-Genesis by design.
GGUF / cross-engine generation is deferred to a separate §9 design-unlock and is not in this release.
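
To see why a reference profile's context length may not fit a different GPU, here is a rough back-of-the-envelope sketch of KV-cache sizing. It is not the logic of `tools/kv-calc.py`: the function names, the flat overhead term, and the example model shape are assumptions, and the formula covers only plain full-attention layers (no sliding-window, hybrid, or quantised-KV variants).

```python
GIB = 1024 ** 3


def kv_bytes_per_token(num_layers: int, num_kv_heads: int, head_dim: int,
                       bytes_per_elem: int = 2) -> int:
    """KV-cache bytes for one token: one key and one value vector per layer per KV head."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem


def max_context_tokens(vram_gib: float, mem_util: float, weights_gib: float,
                       overhead_gib: float, per_token_bytes: int) -> int:
    """Tokens that fit in what is left after weights and a flat overhead allowance."""
    budget_gib = vram_gib * mem_util - weights_gib - overhead_gib
    if budget_gib <= 0:
        return 0  # the weights alone do not fit under this memory utilisation
    return int(budget_gib * GIB) // per_token_bytes


# Hypothetical model shape: 32 layers, 8 KV heads, head_dim 128, fp16 cache.
per_token = kv_bytes_per_token(num_layers=32, num_kv_heads=8, head_dim=128)
print(per_token)
print(max_context_tokens(vram_gib=24, mem_util=0.875, weights_gib=16.0,
                         overhead_gib=1.0, per_token_bytes=per_token))
```

Shrinking the VRAM or growing the weights in the second call shows how quickly the usable context drops, which is the reason to re-check the fit before trusting a generated compose's `${MAX_MODEL_LEN}`.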

feat(pull): v0.8.2 STEP V5 — recommend UX + report-a-failed-pull doc + §9-reconciliation (c5b5e9b)
feat(nvlink): auto-detect NVLink on N-GPU topologies; add detection to multi4 + gemma-4-26b dual (8f8ec1c)
feat(pull): v0.8.2 STEP V4 — optional whichllm hw-detect subprocess (CONTRACT-3, hw-detect-only) (3917728)
feat(switch): v0.8.2 STEP V3 — switch.sh ↔ compose_registry parity (CONTRACT-2b-ii) (e6503bc)
feat(pull): v0.8.2 STEP V3 — arch-registry expansion + chat-template attribution/drift_guard (999c93f)
feat(pull): v0.8.2 STEP V2 — surface pointer + --submit-last/--submit (gh + gh-less, consented, F5 reuse) (e1cdcb5)
feat(pull): v0.8.2 STEP V1 — capture-on-hard-block pt1-gate emitter + BaseCaptureBundle protocol lift (20f1557)
feat(report): lspci PCIe/P2P diagnostics subsection (LnkSta/ACS/topology) (#148) (af2e45a)

fix(pull): v0.8.2 STEP V5 — recommend must not label a fits-clean model "DOES NOT FIT" (26949d7)
fix(pull): v0.8.2 STEP V3 — deliver CONTRACT-2's engine-supported broadening (TRC two-class) (d78b9a9)
fix(pull): v0.8.2 STEP V2 — gh-less issue body must not carry the absolute capture path (52451ca)
fix(launch): force LC_NUMERIC=C so the VRAM-budget printf survives comma-decimal locales (#159) (186dc93)
fix(deriver): correct stale "GGUF not supported until v0.8.1" message — now misleading post-v0.8.1-ship (344ab87)
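
The `LC_NUMERIC=C` fix addresses a general locale pitfall: under a comma-decimal locale, a `printf` float format can reject or re-render a value such as `21.6`. The sketch below shows one way to pin numeric formatting when launching a child process from Python. It is not the project's launch code; the helper name `c_numeric_env` is hypothetical, and dropping `LC_ALL` is this sketch's own choice, made because `LC_ALL` takes precedence over the individual `LC_*` variables.

```python
import os


def c_numeric_env(base_env=None):
    """Return a copy of the environment with numeric formatting pinned to the C locale."""
    env = dict(os.environ if base_env is None else base_env)
    # LC_ALL overrides every LC_* category, so remove it or LC_NUMERIC is ignored.
    env.pop("LC_ALL", None)
    env["LC_NUMERIC"] = "C"
    return env


print(c_numeric_env({"LANG": "de_DE.UTF-8", "LC_ALL": "de_DE.UTF-8"}))
```

The returned mapping can be passed as `env=` to `subprocess.run`, leaving the parent process's locale untouched.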

docs(architecture): bring current-state docs up to v0.8.2 (recommend / submit on-ramp / arch-registry / hwdetect) (c5c8f46)
docs(generator): state plainly that generated-compose capacity is the reference profile's, NOT fit-adapted (247b1dc)
docs(pull): v0.8.2 STEP V6 — correct §9/headline to the true bundled release scope (b791271)
docs: fix duplicate MULTI_CARD.md entry in docs index (966a8d1)
docs: reorder docsindex (GSD first), add FAQ TOC + promote troubleshooting ladder, add tool-calling example (a891b39)
docs: add GETTING_STARTED.md, Gemma 4 model READMEs, restructure main README with quick start first (6368bae)
docs: fix stale NVLINK_MODE comment, INTERNALS.md cliff status, and dead companion repo link (28bd0e8)
docs(container-runtimes): Proxmox passthrough — NVLink is the fragile path, not Proxmox (#161) (3f066a0)
docs(benchmarks): add @hlo-world dual-3090 PCIe x4 dual-dflash-noviz row (#158) (135f2c4)
docs(upstream): froggeric v19 re-eval PASSED — ADOPTED (#150) (ec1fd65)

chore(chat-template): re-vendor latest froggeric Qwen3.6 template for re-eval (#150) (8a9ea6c)

Pin this release: `git checkout v0.8.2`

About noonghunna/club-3090

v0.8.7 Genesis vLLM composes deprecated; default to `vllm/minimal`.
v0.8.6 Compose paths moved to `models/<model>/<engine>/compose/<topology>/<quant>/<serving>.yml`.