This release adds three notable features: one-command chat (whichllm run), Python snippet generation (whichllm snippet), and smarter model search.

Published 2mo LLM Frameworks
✓ No known CVEs patched
Topics

ai apple-silicon benchmarks cli gguf gpu huggingface inference llm local-llm ollama python vram

Summary

AI summary

Added one-command chat (whichllm run) and Python snippet generation (whichllm snippet).

Full changelog

What's New

whichllm run — One-command chat

Download and chat with any model in one step. The command creates an isolated environment, installs dependencies, and starts an interactive session, with no manual setup.

whichllm run "qwen 2.5 1.5b gguf"
whichllm run  # auto-picks the best model for your hardware

Supported formats: GGUF, AWQ, GPTQ, FP16/BF16.
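
The changelog does not show how the isolated environment is built. As a rough illustration of the general idea only, the sketch below uses Python's standard venv module to create an environment and install packages into it. The helper names (pip_install_command, create_isolated_env) and the example package are assumptions for illustration, not whichllm's actual code.

```python
import subprocess
import sys
import venv
from pathlib import Path


def pip_install_command(env_dir: Path, packages: list[str]) -> list[str]:
    """Build the pip command that installs `packages` inside `env_dir`.

    Hypothetical helper: whichllm's real internals are not shown in the changelog.
    """
    # Virtual environments keep executables in Scripts/ on Windows and bin/ elsewhere.
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    exe = "python.exe" if sys.platform == "win32" else "python"
    env_python = env_dir / bin_dir / exe
    return [str(env_python), "-m", "pip", "install", *packages]


def create_isolated_env(env_dir: Path, packages: list[str]) -> None:
    """Create a virtual environment with pip, then install `packages` into it."""
    venv.EnvBuilder(with_pip=True).create(env_dir)
    # check=True raises CalledProcessError if the install fails.
    subprocess.run(pip_install_command(env_dir, packages), check=True)


# Example call (not executed here because it downloads packages):
# create_isolated_env(Path.home() / ".cache" / "demo-env", ["llama-cpp-python"])
```

Running pip through the environment's own interpreter keeps the installed packages out of the system Python, which is the property an isolated environment is meant to provide.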

whichllm snippet — Ready-to-run Python code

Print a copy-paste Python script for any model.

whichllm snippet "qwen 7b"
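
The output of whichllm snippet is not reproduced in this changelog. The sketch below shows one way snippet generation could work in principle: a string template filled in with a model identifier. The template body, the render_snippet helper, and the model identifier in the usage example are assumptions. The generated script uses the Hugging Face transformers API and may differ from what whichllm actually prints.

```python
from string import Template

# Hypothetical template; the script whichllm emits may look different.
# device_map="auto" in the generated script requires the accelerate package.
SNIPPET_TEMPLATE = Template('''\
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "$model_id"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto"
)

inputs = tokenizer("Hello!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=$max_new_tokens)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
''')


def render_snippet(model_id: str, max_new_tokens: int = 64) -> str:
    """Return a copy-paste Python script for the given model identifier."""
    return SNIPPET_TEMPLATE.substitute(
        model_id=model_id, max_new_tokens=max_new_tokens
    )


# Example: print(render_snippet("Qwen/Qwen2.5-7B-Instruct"))
```

Rendering from a template keeps the generated script readable and makes it easy to swap in a different loader for other formats such as GGUF.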

Improvements

  • Smarter model search: auto-picks top match by downloads instead of erroring on ambiguous queries
  • Shared helpers for model loading and search across commands
  • Refactored plan command to use shared search logic
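
The "top match by downloads" behavior can be sketched as follows. This is a minimal illustration under assumptions: the ModelHit type, the pick_top_match helper, and the token-matching rule are hypothetical, and the changelog does not describe how whichllm matches a query against model names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelHit:
    """A search result; field names are illustrative, not whichllm's."""
    model_id: str
    downloads: int


def pick_top_match(query: str, hits: list[ModelHit]) -> Optional[ModelHit]:
    """Return the most-downloaded hit whose id contains every query token.

    Returns None when nothing matches, rather than raising on ambiguity.
    """
    tokens = query.lower().split()
    matches = [
        hit for hit in hits
        if all(token in hit.model_id.lower() for token in tokens)
    ]
    if not matches:
        return None
    # Several matches are no longer an error: the download count breaks the tie.
    return max(matches, key=lambda hit: hit.downloads)
```

With this rule, a query such as "qwen 7b" that matches several repositories resolves to the most-downloaded one instead of stopping with an ambiguity error.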
