Codeep

v2.4.0 Feature

This release adds 4 notable features for engineering teams evaluating rollout.

Published 1mo AI Agents & Assistants

View tool

✓ No known CVEs patched

Read the diff → Tool health → What is this tool? →

✓ No known CVEs patched in this version

Topics

ai ai-agent ai-agents ai-tools cli-app

Summary

AI summary

Added Claude Opus 4.8 as default model with new pricing and introduced native Ollama API beta.

Changes in this release

Type	Severity	Summary	CVE
Feature
Feature	Medium	Adds `/model browse` command to list curated local coding models. Adds `/model browse` command to list curated local coding models. Source: llm_adapter@2026-05-31 Confidence: high	—
Feature	Medium	Adds `/model rm <name>` command to remove locally installed models. Adds `/model rm <name>` command to remove locally installed models. Source: llm_adapter@2026-05-31 Confidence: high	—
Feature	Medium	Shows on-disk size for each model in the `/model` picker UI. Shows on-disk size for each model in the `/model` picker UI. Source: llm_adapter@2026-05-31 Confidence: high	—
Feature	Medium	Adds native Ollama API (beta, opt‑in) in `/settings` for direct `/api/chat` routing. Adds native Ollama API (beta, opt‑in) in `/settings` for direct `/api/chat` routing. Source: llm_adapter@2026-05-31 Confidence: high	—
Feature	Medium	Adds Claude Opus 4.8 as the new default model in Anthropic provider (pricing $5/$25 per 1M tokens). Adds Claude Opus 4.8 as the new default model in Anthropic provider (pricing $5/$25 per 1M tokens). Source: llm_adapter@2026-05-31 Confidence: high	—
Feature	Medium	Adds Gemini 3.5 Flash (`gemini-3.5-flash`) replacing preview Flash in Google provider (pricing $1.50/$9 per 1M tokens). Adds Gemini 3.5 Flash (`gemini-3.5-flash`) replacing preview Flash in Google provider (pricing $1.50/$9 per 1M tokens). Source: llm_adapter@2026-05-31 Confidence: high	—
Feature	Medium	Updates pricing tables (`/cost`, dashboard) for new models while keeping legacy models for backward compatibility. Updates pricing tables (`/cost`, dashboard) for new models while keeping legacy models for backward compatibility. Source: llm_adapter@2026-05-31 Confidence: high	—
Feature	Low	Configurable `ollamaKeepAlive` (default 30 minutes) and `ollamaNumCtx` settings in native Ollama API integration. Configurable `ollamaKeepAlive` (default 30 minutes) and `ollamaNumCtx` settings in native Ollama API integration. Source: granite4.1:30b@2026-05-31-audit Confidence: low	—

Full changelog

New models (Claude Opus 4.8, Gemini 3.5 Flash) plus a better local-model experience: browse a curated catalog of coding models, remove models, and see on-disk sizes — all from /model.

Added

Claude Opus 4.8 — added to the Anthropic provider and set as the new
default model. Pricing: $5 / $25 per 1M tokens (input / output), 1M context.
Gemini 3.5 Flash (gemini-3.5-flash) — replaces the preview Flash in the
Google provider list. Pricing: $1.50 / $9.00 per 1M tokens, 1M context.
/model browse (Ollama) — a curated catalog of recommended local coding
models (Qwen2.5 Coder, DeepSeek Coder V2, Llama 3.1, DeepSeek R1, …) with
parameter sizes, rough VRAM, and an agent-mode suitability hint. Pick one to
pull it. Mirrors the MCP/skills catalog pattern.
/model rm <name> (Ollama) — remove a locally-installed model to reclaim
disk, without leaving Codeep. Remote-server guard like /model pull.
On-disk size in /model picker — the Ollama model list now shows each
model's size on disk alongside the agent-mode hint.
Native Ollama API (beta, opt-in) — set Ollama Native API (beta) → On
in /settings to route Ollama through its native /api/chat endpoint instead
of the OpenAI-compatible /v1 shim. Honors num_ctx (the model uses its
full context window instead of Ollama's small default) and keep_alive
(keeps the model resident, avoiding reload latency every turn). Tunable via
ollamaKeepAlive (default 30m) and ollamaNumCtx (0 = auto-detect via
/api/show). Off by default — existing transport unchanged unless you opt
in. Verified against Ollama 0.24 (chat, streaming, usage, native tool calls);
marked beta while it gets coverage across more models and longer sessions.
Please report issues at https://github.com/VladoIvankovic/Codeep/issues —
feedback decides when it becomes the default.

Notes

Pricing tables (/cost, dashboard) updated so the new models bill at the
right rates. Previous models (Opus 4.7 / 4.6, Flash preview) stay listed for
back-compat. VS Code/Zed inherit the new catalog automatically over ACP; the
native macOS / iOS apps get the same update via the shared CodeepCore catalog.
/model browse and /model rm shell out to the local ollama binary, so
they only run when Ollama is local (remote servers get an SSH hint instead).

View diff on GitHub

Weekly OSS security release digest.

The CVE patches and breaking changes that affected production tools this week. One email, every Sunday.

No spam, unsubscribe anytime.

Share this release

Share on X Share on Bluesky

Track Codeep

Get notified when new releases ship.

About Codeep

All releases →

Related context

Related tools

Earlier breaking changes

v2.8.0 `codeep account push` and `account sync` no longer transfer API keys unless `/keysync on` is enabled
v2.4.1 MiniMax M3 replaces MiniMax-M2.7 as default model across all providers.
v2.0.0 McpServer protocol now optional fields `command`, `args`, plus new `url` and `headers`; version bumped to 2.0.0.