Release history

noonghunna/club-3090 releases

All releases

28 shown

Config change
v0.8.7 Breaking risk
Breaking upgrade Dependencies

beellama DFlash, Qwen3.6‑35B‑A3B prod, vLLM v0.22.0, switch.sh overhaul

v0.8.6 Breaking risk
⚠ Upgrade required
  • `launch.sh` and `switch.sh` now derive their launcher tables from the registry (single source of truth) and accept `<engine>/default` and `<engine>/<topology>/default` variants.
  • Documentation and scripts across the repository have been migrated to the new compose path format.
Breaking changes
  • Compose paths moved to `models/<model>/<engine>/compose/<topology>/<quant>/<serving>.yml`; raw pre-v0.8.6 paths (e.g., `dual/turbo.yml`, `docker-compose.yml`) no longer resolve. Use the `--variant` keys, for example `bash scripts/launch.sh vllm/default` (topology autodetect), `vllm/dual`, or `ik-llama/iq4ks-mtp`, or pass the new `<quant>/` paths directly.
Notable features
  • Added ik-llama PRISM-PRO-DQ and APEX-MTP presets that conform to the new `<quant>/` layout.
Full changelog

⚠️ Breaking: compose paths moved. Composes now live at models/<model>/<engine>/compose/<topology>/<quant>/<serving>.yml (e.g. dual/autoround-int4/fp8-mtp.yml). Raw pre-v0.8.6 paths (dual/turbo.yml, docker-compose.yml, …) no longer resolve. Use the --variant keys, e.g. bash scripts/launch.sh vllm/default (topology-autodetect), vllm/dual, ik-llama/iq4ks-mtp, …, or the new <quant>/ paths directly. launch.sh/switch.sh now derive from the registry (single source of truth) and accept <engine>/default + <engine>/<topology>/default.
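
The path template above can be sketched as a small shell helper. This is an illustration only and not part of the repository: the function names `compose_path` and `is_new_layout` and the `example-model` directory are hypothetical, and only the template itself and the `dual/autoround-int4/fp8-mtp.yml` example come from the release notes.

```bash
#!/usr/bin/env bash
# Sketch only: assemble a compose path following the v0.8.6 template
#   models/<model>/<engine>/compose/<topology>/<quant>/<serving>.yml
# compose_path, is_new_layout and "example-model" are hypothetical names.

compose_path() {
  local model="$1" engine="$2" topology="$3" quant="$4" serving="$5"
  printf 'models/%s/%s/compose/%s/%s/%s.yml\n' \
    "$model" "$engine" "$topology" "$quant" "$serving"
}

# Returns 0 when a path has the shape of the new layout: two segments under
# models/, a literal compose/ segment, then <topology>/<quant>/<serving>.yml.
# This checks the shape of the string only, not whether the file exists.
is_new_layout() {
  local re='^models/[^/]+/[^/]+/compose/[^/]+/[^/]+/[^/]+\.yml$'
  [[ "$1" =~ $re ]]
}

compose_path example-model vllm dual autoround-int4 fp8-mtp
# prints: models/example-model/vllm/compose/dual/autoround-int4/fp8-mtp.yml
```

A check like `is_new_layout "dual/turbo.yml"` fails, which matches the note that raw pre-v0.8.6 paths no longer resolve. In practice the `--variant` keys are probably the simpler route, since the launcher resolves them from the registry.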


v0.8.6 — 2026-05-26

✨ Features

  • feat(ik-llama): PRISM-PRO-DQ + APEX-MTP presets, conformed to the <quant>/ layout (458c473)

🐛 Bug fixes

  • fix(profiles): cover the new ik-llama PRISM/APEX presets in the compat catalog (aa2a965)
  • fix: post-PR-A compose-path fixes for gpu-mode.sh + 2 patch READMEs (b23846e)
  • fix(registry+bench): sync vision defaults to the 2026-05-25 re-tune (#438) (b116750)
  • fix(vision): re-tune single-card vision defaults to measured-safe (1M-px + 160K/150K) (c9b7dd3)
  • fix(vllm/dual): pin to stable v0.21.0, drop all source overlays (#407 pin-drift) (cf1f14f)

📝 Documentation

  • docs+scripts: finish <quant>/ path migration across full repo sweep (eaa7a8c)
  • docs(switch): correct ik-llama/iq4ks-mtp usage comment 262K -> 200K (2135230)

🧹 Maintenance

  • refactor(launch): derive launcher tables from the registry + <engine>/default resolver (a0520e2)
  • refactor(compose): insert <quant>/ layer + make the registry the single source of truth (9821c94)

[Pin: git checkout v0.8.6] · Full diff

No immediate action
v0.8.5 New feature

kv-calc breakdown, ik-llama ctx increase, docs updates

No immediate action
v0.8.4 Mixed

Features + bug fixes

No immediate action
v0.8.3 Maintenance

Documentation + benchmarks + llama.cpp update

No immediate action
v0.8.2 New feature

Fit check, fixable pulls, more models, NVLink, docs

No immediate action
v0.8.1 Bug fix

Argparse fix + docs corrections

Review required
v0.8.0 Breaking risk
Dependencies

Universal pull for safetensors models

No immediate action
v0.7.4 Bug fix

Overlay tolerance fix + PCIe docs

No immediate action
v0.7.3 New feature

kv-calc, AWQ/MTP, estate boot, MoE composes, profiles

No immediate action
v0.7.2 New feature

Hardware topology advisor

No immediate action
v0.7.1 New feature

Throughput metrics + tuning knobs + CI simplification

No immediate action
v0.7.0 New feature

Diagnose triage + estate planner + docs

No immediate action
v0.6.3 Mixed

NVLink unification + script fix + docs

No immediate action
v0.6.2 Bug fix

TP fix

No immediate action
v0.6.1 New feature

Hardware-aware launcher + kv-calc.py upgrade

No immediate action
v0.6.0 New feature

Hardware-aware setup picker

No immediate action
v0.5.4 Maintenance

Routine maintenance and dependency updates.

No immediate action
v0.5.3 New feature

submit-bench flow

No immediate action
v0.5.2 New feature

Hardware-aware compose preflight

No immediate action
v0.5.1 Bug fix

Qwen fix + docs clarification

No immediate action
v0.5.0 New feature

Chat-template + vllm overlay + TQ3 filter

No immediate action
v0.4.0 New feature

Rebench report enhancements + faster soak

No immediate action
v0.3.3 Maintenance

Routine maintenance and dependency updates.

No immediate action
v0.3.2 Mixed

Auto‑resolve localhost + changelog automation

No immediate action
v0.3.1 Bug fix

Capture delta.reasoning

Review required
v0.3.0 New feature

Git SHA stamp + MODEL_DIR prompt

No immediate action
v2026.05.09 Breaking risk

Qwen, Gemma, benchmarks, tooling, docs
