ollama releases - releaseport

No immediate action

v0.32.4 Bug fix 2d

Qwen3 MoE fix + performance boost

Open

No immediate action

v0.32.3 Bug fix 4d

Stalled model downloads

Open

No immediate action

v0.32.1 Breaking risk 11d

MLX cache leak fix

Open

No immediate action

v0.32.0 Breaking risk 16d

Interactive agent experience

Open

No immediate action

v0.31.2 Bug fix 20d

Non‑UTF‑8 path fix

Open

No immediate action

v0.31.1 Maintenance 26d

Routine maintenance and dependency updates.

Open

No immediate action

v0.30.11 Mixed 1mo

sm_86 support + speculative decoding + Vulkan fix

Open

No immediate action

v0.30.10 Maintenance 1mo

Routine maintenance and dependency updates.

Open

No immediate action

v0.30.9 Breaking risk 1mo

Context‑window size error

Open

No immediate action

v0.30.8 Bug fix 1mo

`ollama launch` fix

Open

No immediate action

v0.30.7 New feature 1mo

Hermes Desktop

Open

No immediate action

v0.30.6 New feature 1mo

NVFP4 quantization improvement

Open

No immediate action

v0.30.5 Bug fix 1mo

Gemma crash fix

Open

No immediate action

v0.30.4 Bug fix 1mo

Gemma crash

Open

No immediate action

v0.30.3 New feature 1mo

Gemma 4‑12B support

Open

No immediate action

v0.30.2 Mixed 1mo

Cline CLI, Qwen integration, LLM improvements

Open

No immediate action

v0.23.3 Bug fix 2mo

macOS 26 leak fix

Open

v0.23.2 Breaking risk 2mo

⚠ Upgrade required

Use `ollama launch claude-desktop --restore` to re-enable Claude Desktop after upgrade.

Breaking changes

`ollama launch` no longer includes Claude Desktop

Notable features

/api/show responses are now cached, improving median latency by ~6.7x
Improved backup workflow when managing launch integrations
Cleaner image generation layout in the MLX runner

Full changelog

What's Changed

ollama launch no longer includes Claude Desktop due to the third-party integration being limited to Anthropic models.
Use ollama launch claude-desktop --restore to restore Claude Desktop to its normal state.
/api/show responses are now cached, improving median latency by ~6.7x which will increase load speed for integrations like VS Code.
Improved backup workflow when managing launch integrations
Cleaner image generation layout in the MLX runner

Full Changelog: https://github.com/ollama/ollama/compare/v0.23.1...v0.23.2

View release on GitHub

v0.23.1 New feature 2mo

Notable features

Gemma 4 MTP speculative decoding support on Macs (up to 2x speed increase for Gemma 4 31B coding tasks)
MLX and MLX-C updated with threading fixes
Go runtime bumped to version 1.26

Full changelog

Gemma 4 MTP (Multi-token Processing) for the MLX runner

Gemma 4 MTP speculative decoding is now supported on Macs. This can give over a 2x speed increase for the Gemma 4 31B model on coding tasks.

ollama run gemma4:31b-coding-mtp-bf16

What's Changed

Update MLX and MLX-C with threading fixes by @dhiltgen in https://github.com/ollama/ollama/pull/15845
go: bump to 1.26 by @ParthSareen in https://github.com/ollama/ollama/pull/15904
Add Gemma 4 MTP speculative decoding by @pdevine in https://github.com/ollama/ollama/pull/15980

Full Changelog: https://github.com/ollama/ollama/compare/v0.23.0...v0.23.1-rc0

View release on GitHub

v0.23.0 New feature 2mo

Notable features

Claude Desktop now supported via `ollama launch claude`
Ollama app surfaces featured models from server-driven recommendations
Claude Cowork and Claude Code integrated within the Claude Desktop App

Full changelog

Claude Desktop

Claude Desktop is now supported with Ollama Launch.

Claude Cowork and Claude Code are supported within the Claude Desktop App.

ollama launch claude-desktop

Claude Cowork

Claude Code

Claude Code on the terminal can still be accessed through the CLI with:

ollama launch claude

Not supported yet

Web Search (coming soon)
Extensions

What's Changed

Launch Claude Desktop with ollama launch claude
The Ollama app now surfaces featured models from server-driven recommendations
Fixed OpenClaw gateway timeout on Windows by enforcing IPv4 loopback (thanks @UniquePratham)
Hardened Metal initialization to gracefully handle ggml kernel compilation failures

New Contributors

@UniquePratham made their first contribution in https://github.com/ollama/ollama/pull/15726

Full Changelog: https://github.com/ollama/ollama/compare/v0.22.1...v0.23.0

View release on GitHub

v0.22.1 Bug fix 2mo

Notable features

Model recommendations updated without Ollama update
Desktop app launch page aligned with `ollama launch` integrations
Gemma 4 renderer improvements for thinking and tool calling

Full changelog

What's Changed

Updated the Gemma 4 renderer for thinking and tool calling improvements
Model recommendations are now updated without updating Ollama
Aligned the desktop app's launch page with ollama launch integrations
Fixed the Poolside integration title in ollama launch

Full Changelog: https://github.com/ollama/ollama/compare/v0.22.0...v0.22.1

View release on GitHub

v0.22.0 Feature 2mo

Notable features

NVIDIA's Nemotron 3 Omni model
Poolside's Laguna XS.2 open-weight coding model

Full changelog

New models

NVIDIA's Nemotron 3 Omni
Poolside's first open-weight coding model - Laguna XS.2

Full Changelog: https://github.com/ollama/ollama/compare/v0.21.2...v0.22.0

View release on GitHub

v0.21.3-rc0 New feature 3mo

Notable features

API accepts "max" as a think value
OpenAI responses map reasoning effort to think

Full changelog

What's Changed

api: accept "max" as a think value by @ParthSareen in https://github.com/ollama/ollama/pull/15787
openai: map responses reasoning effort to think by @ParthSareen in https://github.com/ollama/ollama/pull/15789

Full Changelog: https://github.com/ollama/ollama/compare/v0.21.2...v0.21.3-rc0

View release on GitHub

v0.21.2 Maintenance 3mo

Notable features

Improved OpenClaw onboarding flow
Canonical ordering of recommended models
Web search plugin bundling

Full changelog

What's Changed

Improved reliability of the OpenClaw onboarding flow in ollama launch
Recommended models in ollama launch now appear in a fixed, canonical order
OpenClaw integration now bundles Ollama's web search plugin in OpenClaw

New Contributors

@madflow made their first contribution in https://github.com/ollama/ollama/pull/15733

Full Changelog: https://github.com/ollama/ollama/compare/v0.21.1...v0.21.2

View release on GitHub

v0.21.1 New feature 3mo

Notable features

Kimi CLI integration
MLX logprobs support
Faster MLX sampling with fused top-P/top-K

Full changelog

What's Changed

Kimi CLI

You can now install and run the Kimi CLI through Ollama.

ollama launch kimi --model kimi-k2.6:cloud

Kimi CLI with Kimi K2.6 excels at long horizon agentic execution tasks through a multi-agent system.

MLX runner adds logprobs support for compatible models
Faster MLX sampling with fused top-P and top-K in a single sort pass, plus repeat penalties applied in the sampler
Improved MLX prompt tokenization by moving tokenization into request handler goroutines
Better MLX thread safety for array management
GLM4 MoE Lite performance improvement with a fused sigmoid router head
Fixed model picker showing stale model after switching chats in the macOS app
Fixed structured outputs for Gemma 4 when think=false

Full Changelog: https://github.com/ollama/ollama/compare/v0.21.0...v0.21.1

View release on GitHub

v0.21.0 Mixed 3mo

Notable features

Copilot CLI integration
Hermes integration
OpenCode inline configuration support

Full changelog

What's Changed

launch: skip unchanged integration rewrite configration by @hoyyeva in https://github.com/ollama/ollama/pull/15491
launch/openclaw: fix --yes flag behaviour to skip channels configuration by @hoyyeva in https://github.com/ollama/ollama/pull/15589
launch: OpenCode inline config by @hoyyeva in https://github.com/ollama/ollama/pull/15586
launch: add hermes by @ParthSareen in https://github.com/ollama/ollama/pull/15569
launch: always list cloud recommendations first by @hoyyeva in https://github.com/ollama/ollama/pull/15593
cmd/launch: add Copilot CLI integration by @scaryrawr in https://github.com/ollama/ollama/pull/15583

New Contributors

@scaryrawr made their first contribution in https://github.com/ollama/ollama/pull/15583

Full Changelog: https://github.com/ollama/ollama/compare/v0.20.8-rc0...v0.21.0

View release on GitHub

v0.20.7 Bug fix 3mo

Minor fixes and improvements.

View release on GitHub

v0.20.6 Maintenance 3mo

Minor fixes and improvements.

View release on GitHub

v0.20.5 New feature 3mo

Notable features

OpenClaw channel setup for WhatsApp, Telegram, Discord, and other messaging platforms
Flash attention support for Gemma 4 on compatible GPUs
Improved OpenCode install detection

View release on GitHub

v0.20.4 New feature 3mo

Notable features

mlx: Improved M5 performance using NAX
gemma4: Flash attention enabled

View release on GitHub

v0.20.3 Mixed 3mo

Notable features

Added latest models to Ollama App
Gemma 4 tool calling improvements
OpenClaw fixes for launching TUI

Full changelog

What's Changed

Gemma 4 Tool Calling improvements
Added latest models to Ollama App
OpenClaw fixes for launching TUI

Full Changelog: https://github.com/ollama/ollama/compare/v0.20.2...v0.20.3

View release on GitHub

v0.20.2 Maintenance 3mo

Minor fixes and improvements.

Full changelog

What's Changed

app: default app home view to new chat instead of launch by @jmorganca in https://github.com/ollama/ollama/pull/15312

Full Changelog: https://github.com/ollama/ollama/compare/v0.20.1...v0.20.2

View release on GitHub

v0.20.0 New feature 3mo

Notable features

Gemma 4 model family: E2B, E4B, 26B (MoE), and 31B (Dense) variants now available
MLX pipeline now respects tokenizer add_bos_token setting
SentencePiece-style BPE tokenizer support

Full changelog

Gemma 4

Effective 2B (E2B)

ollama run gemma4:e2b

Effective 4B (E4B)

ollama run gemma4:e4b

26B (Mixture of Experts model with 4B active parameters)

ollama run gemma4:26b

31B (Dense)

ollama run gemma4:31b

What's Changed

docs: update pi docs by @ParthSareen in https://github.com/ollama/ollama/pull/15152
mlx: respect tokenizer add_bos_token setting in pipeline by @dhiltgen in https://github.com/ollama/ollama/pull/15185
tokenizer: add SentencePiece-style BPE support by @dhiltgen in https://github.com/ollama/ollama/pull/15162

Full Changelog: https://github.com/ollama/ollama/compare/v0.19.0...v0.20.0-rc0

View release on GitHub

v0.19.0 New feature 4mo

Notable features

Apple Silicon builds now use the MLX framework for unified memory performance
`ollama launch pi` includes a web search plugin that leverages Ollama's web search
KV cache hit rates for the Anthropic-compatible API were improved

View release on GitHub

v0.18.4-rc0 Bug fix 4mo

Flash attention disabled for grok, KV cache memory leak fixed, periodic snapshot scheduling added, and VSCode documentation updated to improve inference reliability and developer tooling.

View release on GitHub

v0.18.3 New feature 4mo

Notable features

Ollama models (local and cloud) now available in Visual Studio Code via GitHub Copilot
GLM parser improvements for tool calls
OpenClaw integration improvements for gateway checks

Full changelog

Visual Studio Code

Microsoft Visual Studio Code now directly integrates with Ollama via GitHub Copilot.

If you have Ollama installed, any local or cloud model from Ollama can be selected for use within visual studio code.

What's Changed

GLM parser improvements for tool calls
OpenClaw integration improvements for gateway checks

Full Changelog: https://github.com/ollama/ollama/compare/v0.18.2...v0.18.3

View release on GitHub

v0.18.2 Bug fix 4mo

Ensured OpenClaw requires npm and git, fixed CLI model-flag handling, corrected websearch package registration, and resolved cache breakages that slowed Claude Code locally.

View release on GitHub

v0.18.1 New feature 4mo

⚠ Upgrade required

Web search in local models requires `ollama signin` authentication
Headless mode via `ollama launch` requires `--model` flag; use `--yes` to auto-pull model and skip selectors
`ollama launch openclaw` now uses official Ollama auth and model provider

Notable features

Web search and web fetch plugins for OpenClaw — models can search the web and fetch readable content (no JavaScript execution)
Non-interactive (headless) mode for `ollama launch` with `--model` and `--yes` flags for Docker, CI/CD, and script automation
Official Ollama auth and model provider integration for OpenClaw

Full changelog

Web Search and Fetch in OpenClaw

Ollama now ships with web search and web fetch plugin for OpenClaw. This allows Ollama's models (local or cloud) to search the web for the latest content and news. This also allows OpenClaw with Ollama to be able to fetch the web and extract readable content for processing. This feature does not execute JavaScript.

When using local models with web search in OpenClaw, ensure you are signed into Ollama with ollama signin

ollama launch openclaw

You can install web search directly into OpenClaw as a plugin if you already have OpenClaw configured and working:

Ollama web search plugin

openclaw plugins install @ollama/openclaw-web-search

Non-interactive (headless) mode for ollama launch

ollama launch can now run in non-interactive mode.

Perfect for:

Docker/containers: spin up an integration as a pipeline step to run evals, test prompts, or validate model behavior as part of your build. Tear it down when the job ends.
CI/CD: Generate code reviews, security checks, and other tasks within your CI
Scripts/automation: Kick off automated tasks with Ollama and claude code
--model must be specified to run in headless mode
--yes flag will auto-pull the model and skip any selectors

Try with: ollama launch claude --model kimi-k2.5:cloud --yes -- -p "how does this repository work?"

Use non-interactive mode in OpenClaw

You can ask your OpenClaw to run tasks using claude with subagents:

ollama launch claude --model kimi-k2.5:cloud --yes -- -p "how does this repository work?" using a subagent

What's Changed

ollama launch openclaw will now use the official Ollama auth and model provider for OpenClaw
Improvements to Ollama's benchmarking tool in ./cmd/bench
ollama launch openclaw will now skip --install-daemon when systemd is unavailable

Full Changelog: https://github.com/ollama/ollama/compare/v0.18.0...v0.18.1

View release on GitHub

v0.18.0 New feature 4mo

Notable features

2x faster Kimi-K2.5 performance
Nemotron-3-Super 122B model
Non-interactive task support

Full changelog

Ollama 0.18 includes improved performance for OpenClaw and Ollama’s cloud models, including the new Nemotron-3-Super model by NVIDIA designed for high-performance agentic reasoning tasks.

Improved OpenClaw performance with Kimi-K2.5

This release of Ollama improves performance of cloud models and their reliability.

Up to 2x faster speeds with Kimi-K2.5
Tool calling accuracy has been improved

ollama launch openclaw --model kimi-k2.5

Ollama is now a provider in OpenClaw

Ollama can now be selected as an authentication and model provider during OpenClaw onboarding (thanks @BruceMacD for contributing and @steipete for reviewing!)

openclaw onboard --auth-choice ollama

More information: https://docs.openclaw.ai/providers/ollama

Nemotron-3-Super

Nemotron-3-Super: is a new 122B parameter model with strong reasoning and tool calling capability, while having top performance when run on modern hardware:

ollama run nemotron-3-super:cloud
ollama run nemotron-3-super to run locally (requires 96GB+ of VRAM)

Nemotron-3-Super scores highest of any open model on PinchBench, a benchmark suite that measures how successful models are at completing tasks when used with OpenClaw.

ollama launch openclaw --model nemotron-3-super:cloud

Or using OpenClaw’s onboarding:

openclaw onboard \
	--auth-choice ollama \
	--custom-model-id nemotron-3-super:cloud

Non-interactive task support

ollama launch now supports non-interactive tasks by passing in --yes. This enables using Claude, Codex, Pi and more in scripts, GitHub Actions, and other non-interactive environments.

ollama launch claude \
	--model glm-5:cloud \
	--yes \
	-- "Do a quick code review of this pull request and respond on GitHub with a comment summarizing your feedback."

Lower latency on MiniMax-M2.5 and Qwen3.5 on Ollama’s cloud

For customers in North America, MiniMax-M2.5 and Qwen3.5 on Ollama’s cloud now respond much faster, up to 10x and up to 2x faster respectively, and often in less than a second. This is ideal for tasks that require a fast Time To First Token (TTFT) when needing quick answers from OpenClaw or quick back-to-back coding tasks.

ollama launch claude --model minimax-m2.5

Driver updates required for ROCm 7

This version of Ollama ships with ROCm 7, and requires updating drivers to the latest version for continued support.

What's Changed

Ollama's cloud models no longer require downloading via ollama pull. Setting :cloud as a tag will now automatically connect to cloud models.
New --yes flag for ollama launch that skips all prompts, making it possible to run AI assistants and other tools in non-interactive environments
Fixed issue where "Reset to Defaults" in Ollama's app would disable downloading automatic updates.
Ollama will now ensure context compaction occurs at the correct context length for each model when using ollama launch claude

New Contributors

@flipbit03 made their first contribution in https://github.com/ollama/ollama/pull/14821
@shivamtiwari3 made their first contribution in https://github.com/ollama/ollama/pull/14825

Full Changelog: https://github.com/ollama/ollama/compare/v0.17.7...v0.18.0

View release on GitHub

v0.17.8-rc4 Bug fix 4mo

Unclosed argument tags in GLM calls were repaired, cloud model stub handling restored, localhost connections fixed, Docker builds accelerated, and int4 groupsize 64 added, boosting reliability and performance in deployments.

View release on GitHub

v0.17.7 Mixed 4mo

Notable features

Context length support for compaction when using ollama launch
Thinking level values now correctly interpreted in the API for all thinking models

Full changelog

What's Changed

Allow thinking levels such as "medium" to correctly interpreted in Ollama's API for all thinking models
Add context length to support compaction when using ollama launch

Full Changelog: https://github.com/ollama/ollama/compare/v0.17.6...v0.17.7

View release on GitHub

v0.17.6 Bug fix 4mo

The update corrects OCR prompt rendering and fixes parsing issues for Qwen 3.5 models, restoring expected functionality for OCR and tool integration.

View release on GitHub

v0.17.5 Bug fix 4mo

Qwen 3.5 models now support multiple sizes, with fixes for GPU/CPU splitting crashes, repetition errors, and memory/MLX issues, improving reliability and performance for developers deploying mixed-hardware inference.

View release on GitHub

v0.17.4 New feature 5mo

Notable features

Qwen 3.5 multimodal models
LFM 2 hybrid on-device models
Tool call indices in parallel calls

Full changelog

New models

Qwen 3.5: a family of open-source multimodal models that delivers exceptional utility and performance.
LFM 2: LFM2 is a family of hybrid models designed for on-device deployment. LFM2-24B-A2B is the largest model in the family, scaling the architecture to 24 billion parameters while keeping inference efficient.

Note: for users on 0.17.1, this version will not automatically update. Re-downloading is required to receive the latest version of Ollama.

What's Changed

Tool call indices will now be included in parallel tool calls

Full Changelog: https://github.com/ollama/ollama/compare/v0.17.3...v0.17.4

View release on GitHub

v0.17.3 Bug fix 5mo

The update fixes a parsing error that prevented tool calls in Qwen 3 and Qwen 3.5 models from being recognized when generated during the model's reasoning phase.

View release on GitHub

v0.17.2 Bug fix 5mo

Fixed a Windows crash that occurred after downloading updates, improving stability for local installations.

View release on GitHub

v0.17.1 Breaking risk 5mo

⚠ Upgrade required

MLX engine users: ollama create with unquantized models will no longer apply affine quantization by default

Breaking changes

ollama create command no longer defaults to affine quantization for unquantized models when using the MLX engine

Notable features

Nemotron architecture support in engine
Web search capabilities for models that support tools
Configuration option to disable automatic update downloading

Full changelog

What's Changed

Nemotron architecture support in Ollama's engine
MLX engine now has improved memory usage
Ollama's app will now allow models that support tools to use web search capabilities
Improved LFM2 and LFM2.5 models in Ollama's engine
ollama create will no longer default to affine quantization for unquantized models when using the MLX engine
Added configuration for disabling automatic update downloading

Full Changelog: https://github.com/ollama/ollama/compare/v0.17.0...v0.17.1

View release on GitHub

v0.17.0 New feature 5mo

Notable features

OpenClaw integration
Web search in cloud models
Improved tokenizer performance

Full changelog

OpenClaw

OpenClaw can now be installed and configured automatically via Ollama, making it the easiest way to get up and running with OpenClaw with open models like Kimi-K2.5, GLM-5, and Minimax-M2.5.

Get started

ollama launch openclaw

Web search in OpenClaw

When using cloud models, websearch is enabled - allowing OpenClaw to search the internet.

What's Changed

Improved tokenizer performance
Ollama's macOS and Windows apps will now default to a context length based on available VRAM

New Contributors

@natl-set made their first contribution in https://github.com/ollama/ollama/pull/14322

Full Changelog: https://github.com/ollama/ollama/compare/v0.16.3...v0.17.0

View release on GitHub

v0.16.3 New feature 5mo

Notable features

ollama launch cline subcommand added for Cline CLI integration
ollama launch now always displays the model picker
MLX runner now supports Gemma 3, Llama, and Qwen 3 architectures

Full changelog

What's Changed

New ollama launch cline added for the Cline CLI
ollama launch <integration> will now always show the model picker
Added Gemma 3, Llama and Qwen 3 architectures to MLX runner

New Contributors

@hellosaumil made their first contribution in https://github.com/ollama/ollama/pull/14271

Full Changelog: https://github.com/ollama/ollama/compare/v0.16.2...v0.16.3

View release on GitHub

v0.16.2 Mixed 5mo

Notable features

New `OLLAMA_NO_CLOUD` environment variable and app setting to disable cloud models for sensitive tasks

Full changelog

What's Changed

ollama launch claude now supports searching the web when using :cloud models
Fixed rendering issue when running ollama in PowerShell
New setting in Ollama's app makes it easier to disable cloud models for sensitive and private tasks where data cannot leave your computer. For Linux or when running ollama serve manually, set OLLAMA_NO_CLOUD=1.
Fixed issue where experimental image generation models would not run in 0.16.0 and 0.16.1

Full Changelog: https://github.com/ollama/ollama/compare/v0.16.1...v0.16.2-rc0

View release on GitHub

v0.16.1 Bug fix 5mo

Installation scripts on macOS and Windows now provide smoother password prompts and progress feedback, and image generation models honor the load timeout setting.

View release on GitHub

v0.16.0 New feature 5mo

Notable features

GLM-5 reasoning model (744B total parameters, 40B active)
MiniMax-M2.5 model for productivity and coding
Text prompt editing in editor via Ctrl+G

Full changelog

New models

GLM-5: A strong reasoning and agentic model from Z.ai with 744B total parameters (40B active), built for complex systems engineering and long-horizon tasks.
MiniMax-M2.5: a new state-of-the-art large language model designed for real-world productivity and coding tasks.

New `ollama`

The new ollama command makes it easy to launch your favorite apps with models using Ollama

What's Changed

Launch Pi with ollama launch pi
Improvements to Ollama's MLX runner to support GLM-4.7-Flash
Ctrl+G will now allow for editing text prompts in a text editor when running a model

Full Changelog: https://github.com/ollama/ollama/compare/v0.15.6...v0.16.0

View release on GitHub

v0.15.6 Bug fix 5mo

Fixes context limit crashes for droid launches, corrects image handling bugs, and automatically downloads missing models to prevent errors.

View release on GitHub

v0.15.5 New feature 5mo

Notable features

Qwen3-Coder-Next coding model
GLM-OCR document understanding
ollama launch subagent support

Full changelog

New models

Qwen3-Coder-Next: a coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development.
GLM-OCR: GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.

Improvements to `ollama launch`

ollama launch can now be provided arguments, for example ollama launch claude -- --resume
ollama launch will now work run subagents when using ollama launch claude
Ollama will now set context limits for a set of models when using ollama launch opencode

What's Changed

Sub-agent support for ollama launch for planning, deep research, and similar tasks
ollama signin will now open a browser window to make signing in easier
Ollama will now default to the following context lengths based on VRAM:
- < 24 GiB VRAM: 4,096 context
- 24-48 GiB VRAM: 32,768 context
- >= 48 GiB VRAM: 262,144 context
GLM-4.7-Flash support on Ollama's experimental MLX engine
ollama signin will now open the browser to the connect page
Fixed off by one error when using num_predict in the API
Fixed issue where tokens from a previous sequence would be returned when hitting num_predict

New Contributors

@avukmirovich made their first contribution in https://github.com/ollama/ollama/pull/13934

Full Changelog: https://github.com/ollama/ollama/compare/v0.15.4...v0.15.5

View release on GitHub

v0.15.4 New feature 5mo

Notable features

ollama launch openclaw now enters standard onboarding flow if previously incomplete

Full changelog

What's Changed

ollama launch openclaw will now enter the standard OpenClaw onboarding flow if this has not yet been completed.

Full Changelog: https://github.com/ollama/ollama/compare/v0.15.3...v0.15.4

View release on GitHub

v0.15.3 Breaking risk 5mo

Breaking changes

Command `ollama launch clawdbot` renamed to `ollama launch openclaw`

Notable features

Improved tool calling for Ministral models
ollama launch now respects the OLLAMA_HOST environment variable

Full changelog

What's Changed

Renamed ollama launch clawdbot to ollama launch openclaw to reflect the project's new name
Improved tool calling for Ministral models
docs: add clawdbot by @ParthSareen in https://github.com/ollama/ollama/pull/13925
cmd/config: Use envconfig.Host() for base API in launch config packages by @gabe-l-hart in https://github.com/ollama/ollama/pull/13937
ollama launch will now use the value of OLLAMA_HOST when running it

New Contributors

@MBerguer made their first contribution in https://github.com/ollama/ollama/pull/13971
@taronsung made their first contribution in https://github.com/ollama/ollama/pull/13965
@noureldin-azzab made their first contribution in https://github.com/ollama/ollama/pull/13961
@dhirajlochib made their first contribution in https://github.com/ollama/ollama/pull/13645
@ThanhNguyxn made their first contribution in https://github.com/ollama/ollama/pull/13979

Full Changelog: https://github.com/ollama/ollama/compare/v0.15.2...v0.15.3

View release on GitHub

v0.15.2 New feature 6mo

Notable features

New `ollama launch clawdbot` command for launching Clawdbot using Ollama models

Full changelog

What's Changed

New ollama launch clawdbot command for launching Clawdbot using Ollama models

Full Changelog: https://github.com/ollama/ollama/compare/v0.15.1...v0.15.2

View release on GitHub

v0.15.1 Bug fix 6mo

Improved performance and correctness of GLM-4.7-Flash, resolved macOS and arm64 Linux slowdowns, and corrected launch detection for Claude, preventing configuration errors.

View release on GitHub

v0.15.0 Mixed 6mo

Notable features

New `ollama launch` command for Claude Code, Codex, OpenCode, and Droid integration
Multi-line strings with `"""` now work in `ollama run`
Ctrl + J and Shift + Enter support for inserting newlines in `ollama run`

Full changelog

`ollama launch`

A new ollama launch command to use Ollama's models with Claude Code, Codex, OpenCode, and Droid without separate configuration.

What's Changed

New ollama launch command for Claude Code, Codex, OpenCode, and Droid
Fixed issue where creating multi-line strings with """ would not work when using ollama run
Ctrl+J and Shift+Enter now work for inserting newlines in ollama run
Reduced memory usage for GLM-4.7-Flash models

View release on GitHub

v0.14.3 New feature 6mo

Notable features

Z-Image Turbo (Alibaba): 6B text-to-image model for photorealistic image generation
Flux.2 Klein (Black Forest Labs): fastest image generation model to date
/api/generate endpoint now supports image generation

Full changelog

Z-Image Turbo: 6 billion parameter text-to-image model from Alibaba’s Tongyi Lab. It generates high-quality photorealistic images.
Flux.2 Klein: Black Forest Labs’ fastest image-generation models to date.

New models

GLM-4.7-Flash: As the strongest model in the 30B class, GLM-4.7-Flash offers a new option for lightweight deployment that balances performance and efficiency.
LFM2.5-1.2B-Thinking: LFM2.5 is a new family of hybrid models designed for on-device deployment.

What's Changed

Fixed issue where Ollama's macOS app would interrupt system shutdown
Fixed ollama create and ollama show commands for experimental models
The /api/generate API can now be used for image generation
Fixed minor issues in Nemotron-3-Nano tool parsing
Fixed issue where removing an image generation model would cause it to first load
Fixed issue where ollama rm would only stop the first model in the list if it were running

Full Changelog: https://github.com/ollama/ollama/compare/v0.14.2...v0.14.3

View release on GitHub

v0.14.2 New feature 6mo

Notable features

TranslateGemma: new open translation model collection supporting 55 languages, built on Gemma 3
CLI: Shift + Enter (or Ctrl + j) now enters newlines
Improved `/v1/responses` API conformance to OpenResponses specification

Full changelog

New models

TranslateGemma: A new collection of open translation models built on Gemma 3, helping people communicate across 55 languages.

What's Changed

Shift + Enter (or Ctrl + j) will now enter a newline in Ollama's CLI
Improve /v1/responses API to better confirm to OpenResponses specification

New Contributors

@yuhongsun96 made their first contribution in https://github.com/ollama/ollama/pull/13135
@koaning made their first contribution in https://github.com/ollama/ollama/pull/13326

Full Changelog: https://github.com/ollama/ollama/compare/v0.14.1...v0.14.2

View release on GitHub

v0.14.1 New feature 6mo

Notable features

Experimental image generation models (Z-Image-Turbo)
More models in development (Qwen-Image, GLM-Image)

Full changelog

Image generation models (experimental)

Experimental image generation models are available for macOS and Linux (CUDA) in Ollama:

Available models

Z-Image-Turbo

ollama run x/z-image-turbo

Note: x is a username on ollama.com where experimental models are uploaded

More models coming soon:

Qwen-Image-2512
Qwen-Image-Edit-2511
GLM-Image

What's Changed

fix macOS auto-update signature verification failure

New Contributors

@joshxfi made their first contribution in https://github.com/ollama/ollama/pull/13711
@maternion made their first contribution in https://github.com/ollama/ollama/pull/13709

Full Changelog: https://github.com/ollama/ollama/compare/v0.14.0...v0.14.1

View release on GitHub

v0.14.0 New feature 6mo

⚠ Upgrade required

Linux install bundles now use zst compression — ensure your tooling supports zst decompression.
Modelfiles can now declare a minimum Ollama version via the REQUIRES command; models using this feature will require v0.14.0 or later.

Notable features

Experimental agent loop with bash tool via `ollama run --experimental`
Anthropic API compatibility: /v1/messages endpoint support
New Modelfile REQUIRES command for declaring minimum Ollama version

View release on GitHub

All releases

What's Changed

Gemma 4 MTP (Multi-token Processing) for the MLX runner

What's Changed

Claude Desktop

Claude Cowork

Claude Code

Not supported yet

What's Changed

New Contributors

What's Changed

New models

What's Changed

What's Changed

New Contributors

What's Changed

Kimi CLI

What's Changed

New Contributors

What's Changed

What's Changed

Gemma 4

What's Changed

Visual Studio Code

What's Changed

Web Search and Fetch in OpenClaw

Ollama web search plugin

Non-interactive (headless) mode for ollama launch

Use non-interactive mode in OpenClaw

What's Changed

Improved OpenClaw performance with Kimi-K2.5

Ollama is now a provider in OpenClaw

Nemotron-3-Super

Non-interactive task support

Lower latency on MiniMax-M2.5 and Qwen3.5 on Ollama’s cloud

Driver updates required for ROCm 7

What's Changed

New Contributors

What's Changed

New models

What's Changed

What's Changed

OpenClaw

Get started

Web search in OpenClaw

What's Changed

New Contributors

What's Changed

New Contributors

What's Changed

New models

New ollama

What's Changed

New models

Improvements to ollama launch

What's Changed

New Contributors

What's Changed

What's Changed

New Contributors

What's Changed

ollama launch

What's Changed

New models

What's Changed

New models

What's Changed

New Contributors

Image generation models (experimental)

Available models

What's Changed

New Contributors

New `ollama`

Improvements to `ollama launch`

`ollama launch`