Skip to content

Codeep

v2.4.0 Feature

This release adds 4 notable features for engineering teams evaluating rollout.

✓ No known CVEs patched
Read the diff → Tool health → What is this tool? →

✓ No known CVEs patched in this version

Topics

ai ai-agent ai-agents ai-tools cli-app

Summary

AI summary

Added Claude Opus 4.8 as default model with new pricing and introduced native Ollama API beta.

Changes in this release

Feature Medium

Adds `/model browse` command to list curated local coding models.

Adds `/model browse` command to list curated local coding models.

Source: llm_adapter@2026-05-31

Confidence: high

Feature Medium

Adds `/model rm <name>` command to remove locally installed models.

Adds `/model rm <name>` command to remove locally installed models.

Source: llm_adapter@2026-05-31

Confidence: high

Feature Medium

Shows on-disk size for each model in the `/model` picker UI.

Shows on-disk size for each model in the `/model` picker UI.

Source: llm_adapter@2026-05-31

Confidence: high

Feature Medium

Adds native Ollama API (beta, opt‑in) in `/settings` for direct `/api/chat` routing.

Adds native Ollama API (beta, opt‑in) in `/settings` for direct `/api/chat` routing.

Source: llm_adapter@2026-05-31

Confidence: high

Feature Medium

Adds Claude Opus 4.8 as the new default model in Anthropic provider (pricing $5/$25 per 1M tokens).

Adds Claude Opus 4.8 as the new default model in Anthropic provider (pricing $5/$25 per 1M tokens).

Source: llm_adapter@2026-05-31

Confidence: high

Feature Medium

Adds Gemini 3.5 Flash (`gemini-3.5-flash`) replacing preview Flash in Google provider (pricing $1.50/$9 per 1M tokens).

Adds Gemini 3.5 Flash (`gemini-3.5-flash`) replacing preview Flash in Google provider (pricing $1.50/$9 per 1M tokens).

Source: llm_adapter@2026-05-31

Confidence: high

Feature Medium

Updates pricing tables (`/cost`, dashboard) for new models while keeping legacy models for backward compatibility.

Updates pricing tables (`/cost`, dashboard) for new models while keeping legacy models for backward compatibility.

Source: llm_adapter@2026-05-31

Confidence: high

Feature Low

Configurable `ollamaKeepAlive` (default 30 minutes) and `ollamaNumCtx` settings in native Ollama API integration.

Configurable `ollamaKeepAlive` (default 30 minutes) and `ollamaNumCtx` settings in native Ollama API integration.

Source: granite4.1:30b@2026-05-31-audit

Confidence: low

Full changelog

New models (Claude Opus 4.8, Gemini 3.5 Flash) plus a better local-model experience: browse a curated catalog of coding models, remove models, and see on-disk sizes — all from /model.

Added

  • Claude Opus 4.8 — added to the Anthropic provider and set as the new
    default model. Pricing: $5 / $25 per 1M tokens (input / output), 1M context.
  • Gemini 3.5 Flash (gemini-3.5-flash) — replaces the preview Flash in the
    Google provider list. Pricing: $1.50 / $9.00 per 1M tokens, 1M context.
  • /model browse (Ollama) — a curated catalog of recommended local coding
    models (Qwen2.5 Coder, DeepSeek Coder V2, Llama 3.1, DeepSeek R1, …) with
    parameter sizes, rough VRAM, and an agent-mode suitability hint. Pick one to
    pull it. Mirrors the MCP/skills catalog pattern.
  • /model rm <name> (Ollama) — remove a locally-installed model to reclaim
    disk, without leaving Codeep. Remote-server guard like /model pull.
  • On-disk size in /model picker — the Ollama model list now shows each
    model's size on disk alongside the agent-mode hint.
  • Native Ollama API (beta, opt-in) — set Ollama Native API (beta) → On
    in /settings to route Ollama through its native /api/chat endpoint instead
    of the OpenAI-compatible /v1 shim. Honors num_ctx (the model uses its
    full context window instead of Ollama's small default) and keep_alive
    (keeps the model resident, avoiding reload latency every turn). Tunable via
    ollamaKeepAlive (default 30m) and ollamaNumCtx (0 = auto-detect via
    /api/show). Off by default — existing transport unchanged unless you opt
    in. Verified against Ollama 0.24 (chat, streaming, usage, native tool calls);
    marked beta while it gets coverage across more models and longer sessions.
    Please report issues at https://github.com/VladoIvankovic/Codeep/issues —
    feedback decides when it becomes the default.

Notes

  • Pricing tables (/cost, dashboard) updated so the new models bill at the
    right rates. Previous models (Opus 4.7 / 4.6, Flash preview) stay listed for
    back-compat. VS Code/Zed inherit the new catalog automatically over ACP; the
    native macOS / iOS apps get the same update via the shared CodeepCore catalog.
  • /model browse and /model rm shell out to the local ollama binary, so
    they only run when Ollama is local (remote servers get an SSH hint instead).

Weekly OSS security release digest.

The CVE patches and breaking changes that affected production tools this week. One email, every Sunday.

No spam, unsubscribe anytime.

Share this release

Track Codeep

Get notified when new releases ship.

Sign up free

About Codeep

All releases →

Related context

Earlier breaking changes

  • v2.4.1 MiniMax M3 replaces MiniMax-M2.7 as default model across all providers.
  • v2.0.0 McpServer protocol now optional fields `command`, `args`, plus new `url` and `headers`; version bumped to 2.0.0.

Beta — feedback welcome: [email protected]