Find the best local LLM for your hardware, ranked by benchmarks

v0.5.4 Bugfix

This bugfix release corrects hardware detection and fit checks on AMD Strix Halo / Ryzen AI MAX systems.

Published 2mo LLM Frameworks

✓ No known CVEs patched in this version

Topics

ai apple-silicon benchmarks cli gguf gpu huggingface inference llm local-llm ollama python vram

Summary

AI summary

Fixed false CPU-only recommendations for Strix Halo / Ryzen AI MAX APUs by running fit checks against the shared system-memory pool.

Type	Severity	Summary	Source	Confidence	CVE
Feature	Medium	Detect and model STRXLGEN, Radeon 8050S, and Radeon 8060S with a 256 GB/s bandwidth estimate.	llm_adapter@2026-05-21	high	—
Bugfix	Medium	Fix Strix Halo / Ryzen AI MAX shared-memory APU handling.	llm_adapter@2026-05-21	high	—
Bugfix	Medium	Use shared system-memory pool for fit checks to avoid false recommendations.	llm_adapter@2026-05-21	high	—

Full changelog

Fix Strix Halo / Ryzen AI MAX shared-memory APU handling.
Detect and model STRXLGEN, Radeon 8050S, Radeon 8060S, and related names with a 256 GB/s bandwidth estimate.
Use the shared system-memory pool for fit checks to avoid false CPU-only, 99%-offload, and 0 tok/s recommendations on these systems.
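
The changelog entries above can be illustrated with a minimal sketch. This is not whichllm's actual code: the `Gpu` class, the function names, the substring matching, and the `reserve_gb` headroom are all hypothetical, and the tokens-per-second formula is the common rough rule for memory-bound decoding (bandwidth divided by model size), used here as an assumption. Only the device names and the 256 GB/s figure come from the release notes.

```python
from dataclasses import dataclass
from typing import Optional

# Device-name markers and bandwidth figure taken from the changelog.
# Matching by lowercase substring is an assumption for this sketch.
SHARED_MEMORY_APU_MARKERS = ("strxlgen", "radeon 8050s", "radeon 8060s")
STRIX_HALO_BANDWIDTH_GBPS = 256.0


@dataclass
class Gpu:
    """Hypothetical description of a detected GPU."""
    name: str
    dedicated_vram_gb: float
    bandwidth_gbps: Optional[float] = None


def is_shared_memory_apu(gpu: Gpu) -> bool:
    """Return True if the device name matches a known shared-memory APU."""
    name = gpu.name.lower()
    return any(marker in name for marker in SHARED_MEMORY_APU_MARKERS)


def usable_memory_gb(gpu: Gpu, system_ram_gb: float, reserve_gb: float = 4.0) -> float:
    """Memory budget for the fit check.

    For shared-memory APUs the budget is the system-memory pool minus an
    assumed reserve for the OS; otherwise it is the dedicated VRAM.
    """
    if is_shared_memory_apu(gpu):
        return max(system_ram_gb - reserve_gb, 0.0)
    return gpu.dedicated_vram_gb


def fits(model_gb: float, gpu: Gpu, system_ram_gb: float) -> bool:
    """Check whether a model of the given size fits in the usable budget."""
    return model_gb <= usable_memory_gb(gpu, system_ram_gb)


def estimate_decode_tok_s(model_gb: float, gpu: Gpu) -> float:
    """Rough decode speed: memory bandwidth divided by model size."""
    if is_shared_memory_apu(gpu):
        bandwidth = STRIX_HALO_BANDWIDTH_GBPS
    else:
        bandwidth = gpu.bandwidth_gbps or 0.0
    return bandwidth / model_gb if model_gb > 0 else 0.0
```

With this sketch, a 40 GB model on a `Gpu("AMD Radeon 8060S", dedicated_vram_gb=0.5)` with 128 GB of system RAM passes the fit check, whereas comparing against the small dedicated carve-out alone would reject it and fall back to a CPU-only recommendation, which is the class of false result the release describes.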

CI green: lint, test (3.11), test (3.12), test (3.13).
Local verification: ruff check, ruff format --check, pytest, and whichllm --version.
