This release adds 4 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Topics
Summary
AI summaryAdded Claude Opus 4.8 as default model with new pricing and introduced native Ollama API beta.
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Feature | Medium |
Adds `/model browse` command to list curated local coding models. Adds `/model browse` command to list curated local coding models. Source: llm_adapter@2026-05-31 Confidence: high |
— |
| Feature | Medium |
Adds `/model rm <name>` command to remove locally installed models. Adds `/model rm <name>` command to remove locally installed models. Source: llm_adapter@2026-05-31 Confidence: high |
— |
| Feature | Medium |
Shows on-disk size for each model in the `/model` picker UI. Shows on-disk size for each model in the `/model` picker UI. Source: llm_adapter@2026-05-31 Confidence: high |
— |
| Feature | Medium |
Adds native Ollama API (beta, opt‑in) in `/settings` for direct `/api/chat` routing. Adds native Ollama API (beta, opt‑in) in `/settings` for direct `/api/chat` routing. Source: llm_adapter@2026-05-31 Confidence: high |
— |
| Feature | Medium |
Adds Claude Opus 4.8 as the new default model in Anthropic provider (pricing $5/$25 per 1M tokens). Adds Claude Opus 4.8 as the new default model in Anthropic provider (pricing $5/$25 per 1M tokens). Source: llm_adapter@2026-05-31 Confidence: high |
— |
| Feature | Medium |
Adds Gemini 3.5 Flash (`gemini-3.5-flash`) replacing preview Flash in Google provider (pricing $1.50/$9 per 1M tokens). Adds Gemini 3.5 Flash (`gemini-3.5-flash`) replacing preview Flash in Google provider (pricing $1.50/$9 per 1M tokens). Source: llm_adapter@2026-05-31 Confidence: high |
— |
| Feature | Medium |
Updates pricing tables (`/cost`, dashboard) for new models while keeping legacy models for backward compatibility. Updates pricing tables (`/cost`, dashboard) for new models while keeping legacy models for backward compatibility. Source: llm_adapter@2026-05-31 Confidence: high |
— |
| Feature | Low |
Configurable `ollamaKeepAlive` (default 30 minutes) and `ollamaNumCtx` settings in native Ollama API integration. Configurable `ollamaKeepAlive` (default 30 minutes) and `ollamaNumCtx` settings in native Ollama API integration. Source: granite4.1:30b@2026-05-31-audit Confidence: low |
— |
Full changelog
New models (Claude Opus 4.8, Gemini 3.5 Flash) plus a better local-model experience: browse a curated catalog of coding models, remove models, and see on-disk sizes — all from
/model.
Added
- Claude Opus 4.8 — added to the Anthropic provider and set as the new
default model. Pricing: $5 / $25 per 1M tokens (input / output), 1M context. - Gemini 3.5 Flash (
gemini-3.5-flash) — replaces the preview Flash in the
Google provider list. Pricing: $1.50 / $9.00 per 1M tokens, 1M context. /model browse(Ollama) — a curated catalog of recommended local coding
models (Qwen2.5 Coder, DeepSeek Coder V2, Llama 3.1, DeepSeek R1, …) with
parameter sizes, rough VRAM, and an agent-mode suitability hint. Pick one to
pull it. Mirrors the MCP/skills catalog pattern./model rm <name>(Ollama) — remove a locally-installed model to reclaim
disk, without leaving Codeep. Remote-server guard like/model pull.- On-disk size in
/modelpicker — the Ollama model list now shows each
model's size on disk alongside the agent-mode hint. - Native Ollama API (beta, opt-in) — set Ollama Native API (beta) → On
in/settingsto route Ollama through its native/api/chatendpoint instead
of the OpenAI-compatible/v1shim. Honorsnum_ctx(the model uses its
full context window instead of Ollama's small default) andkeep_alive
(keeps the model resident, avoiding reload latency every turn). Tunable via
ollamaKeepAlive(default30m) andollamaNumCtx(0= auto-detect via
/api/show). Off by default — existing transport unchanged unless you opt
in. Verified against Ollama 0.24 (chat, streaming, usage, native tool calls);
marked beta while it gets coverage across more models and longer sessions.
Please report issues at https://github.com/VladoIvankovic/Codeep/issues —
feedback decides when it becomes the default.
Notes
- Pricing tables (
/cost, dashboard) updated so the new models bill at the
right rates. Previous models (Opus 4.7 / 4.6, Flash preview) stay listed for
back-compat. VS Code/Zed inherit the new catalog automatically over ACP; the
native macOS / iOS apps get the same update via the shared CodeepCore catalog. /model browseand/model rmshell out to the localollamabinary, so
they only run when Ollama is local (remote servers get an SSH hint instead).
Weekly OSS security release digest.
The CVE patches and breaking changes that affected production tools this week. One email, every Sunday.
No spam, unsubscribe anytime.
Share this release
About Codeep
All releases →Related context
Related tools
Beta — feedback welcome: [email protected]