This release includes breaking changes for platform teams planning a safe upgrade.
✓ No known CVEs patched in this version
ReleasePort's take
Light signal. Release v0.5.3 fixes a `KeyError` crash in the `whichllm run` transformers chat path and adds new GPU detection and fallback paths.
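Why it matters: The fix removes a crash that occurred when the transformers chat path was invoked. The detection changes mean Linux systems with Intel integrated GPUs are no longer treated as CPU-only, NVIDIA systems without a working pynvml can still be detected through `nvidia-smi`, and Apple Silicon names are accepted with or without the `Apple` prefix.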
Summary
AI summary: Fixed the transformers chat crash by passing tokenizer mappings to `model.generate`, preventing `KeyError: 'shape'`.
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Feature | Medium | Linux Intel integrated GPU detection via `/sys/class/drm`. (Source: llm_adapter@2026-05-21, confidence: high) | None |
| Feature | Medium | NVIDIA `nvidia-smi` fallback detection when pynvml is missing or NVML fails. (Source: llm_adapter@2026-05-21, confidence: high) | None |
| Feature | Medium | Support for Apple-prefixed Apple Silicon simulator aliases. (Source: llm_adapter@2026-05-21, confidence: low) | None |
| Bugfix | Medium | Fixed the `whichllm run` transformers chat path to avoid `KeyError: 'shape'`. (Source: llm_adapter@2026-05-21, confidence: high) | None |
| Bugfix | Medium | RTX 5060 Ti bandwidth now reports 448 GB/s. (Source: llm_adapter@2026-05-21, confidence: low) | None |
Full changelog
What's Changed
Added
- Linux Intel integrated GPU detection via `/sys/class/drm`, so Intel iGPU systems are no longer treated as CPU-only by default.
- NVIDIA `nvidia-smi` fallback detection when pynvml is missing, NVML init fails, or NVML reports no devices.
- Apple-prefixed Apple Silicon simulator aliases, so `--gpu "Apple M3 Max"` works like `--gpu "M3 Max"`.
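
The two Linux-side detection paths above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not whichllm's actual implementation: the function names (`parse_nvidia_smi_csv`, `query_nvidia_smi`, `find_intel_drm_cards`) are hypothetical, and matching the PCI vendor ID `0x8086` identifies any Intel GPU, so on its own it does not distinguish integrated from discrete parts.

```python
import re
import shutil
import subprocess
from pathlib import Path

INTEL_VENDOR_ID = "0x8086"  # PCI vendor ID assigned to Intel
_CARD_NAME = re.compile(r"^card\d+$")  # skips connector entries such as card0-eDP-1


def parse_nvidia_smi_csv(text):
    """Parse 'name, memory.total' CSV rows into (name, MiB) tuples."""
    gpus = []
    for line in text.splitlines():
        name, sep, mem = line.rpartition(",")
        # Ignore blank lines and rows whose memory field is not a plain integer.
        if sep and mem.strip().isdigit():
            gpus.append((name.strip(), int(mem.strip())))
    return gpus


def query_nvidia_smi(binary="nvidia-smi"):
    """Return GPUs reported by nvidia-smi, or [] if it is absent or fails."""
    if shutil.which(binary) is None:
        return []
    try:
        result = subprocess.run(
            [binary, "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        return []
    return parse_nvidia_smi_csv(result.stdout)


def find_intel_drm_cards(drm_root="/sys/class/drm"):
    """Return DRM card names (e.g. 'card0') whose PCI vendor is Intel."""
    root = Path(drm_root)
    if not root.is_dir():
        return []
    found = []
    for entry in sorted(root.iterdir()):
        if not _CARD_NAME.match(entry.name):
            continue
        try:
            vendor = (entry / "device" / "vendor").read_text().strip().lower()
        except OSError:
            continue  # unreadable or missing vendor file: skip this card
        if vendor == INTEL_VENDOR_ID:
            found.append(entry.name)
    return found
```

A caller could try pynvml first, then `query_nvidia_smi()`, then `find_intel_drm_cards()`, and fall back to CPU-only only when all of them return nothing. Taking the binary name and sysfs root as parameters keeps both helpers testable on machines without the hardware.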
Fixed
- Fixed the `whichllm run` transformers chat path by passing tokenizer mappings into `model.generate(**inputs)`, avoiding the `KeyError: 'shape'` crash.
- RTX 5060 Ti bandwidth lookup now reports 448 GB/s instead of `N/A`.
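
The calling pattern named in the first fix, unpacking the tokenizer's mapping into `model.generate(**inputs)`, might look like the sketch below. It is an assumption-laden illustration rather than the project's code: the helper name `generate_reply_ids` is hypothetical, and `model` and `tokenizer` are assumed to be a Hugging Face causal language model and its chat-capable tokenizer.

```python
from typing import Any, Mapping, Sequence


def generate_reply_ids(
    model: Any,
    tokenizer: Any,
    messages: Sequence[Mapping[str, str]],
    max_new_tokens: int = 128,
) -> Any:
    """Run one chat turn and return the raw output of model.generate."""
    # return_dict=True asks for a mapping (input_ids, attention_mask, ...)
    # instead of a bare tensor of token IDs.
    inputs = tokenizer.apply_chat_template(
        list(messages),
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    )
    # Unpack the mapping as keyword arguments; passing it positionally would
    # make generate() treat the whole mapping as if it were a tensor.
    return model.generate(**inputs, max_new_tokens=max_new_tokens)
```

Because the helper only relies on `apply_chat_template` and `generate`, it can be exercised with lightweight stand-ins in unit tests before loading a real model.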
Docs and maintenance
- Updated install guidance toward `uvx` / `uv tool install`.
- Removed the old marketing note and added sponsor metadata.
Verification
- `uv run pytest`: 138 passed
- `uv run --with ruff ruff check .`: passed
- `uv run --with ruff ruff format --check .`: passed
- `uv run whichllm --version`: 0.5.3
- `uv run --with build python -m build`: built wheel and sdist