Skip to content

Release history

ollama releases

Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.

All releases

50 shown

No immediate action
v0.30.4 Bug fix

Gemma crash

No immediate action
v0.30.3 New feature

Gemma 4‑12B support

No immediate action
v0.30.2 Mixed

Cline CLI, Qwen integration, LLM improvements

No immediate action
v0.23.3 Bug fix

macOS 26 leak fix

v0.23.2 Breaking risk
⚠ Upgrade required
  • Use `ollama launch claude-desktop --restore` to re-enable Claude Desktop after upgrade.
Breaking changes
  • `ollama launch` no longer includes Claude Desktop
Notable features
  • /api/show responses are now cached, improving median latency by ~6.7x
  • Improved backup workflow when managing launch integrations
  • Cleaner image generation layout in the MLX runner
Full changelog

What's Changed

  • ollama launch no longer includes Claude Desktop due to the third-party integration being limited to Anthropic models.
  • Use ollama launch claude-desktop --restore to restore Claude Desktop to its normal state.
  • /api/show responses are now cached, improving median latency by ~6.7x which will increase load speed for integrations like VS Code.
  • Improved backup workflow when managing launch integrations
  • Cleaner image generation layout in the MLX runner

Full Changelog: https://github.com/ollama/ollama/compare/v0.23.1...v0.23.2

v0.23.1 New feature
Notable features
  • Gemma 4 MTP speculative decoding support on Macs (up to 2x speed increase for Gemma 4 31B coding tasks)
  • MLX and MLX-C updated with threading fixes
  • Go runtime bumped to version 1.26
Full changelog

Gemma 4 MTP (Multi-token Processing) for the MLX runner

Gemma 4 MTP speculative decoding is now supported on Macs. This can give over a 2x speed increase for the Gemma 4 31B model on coding tasks.

ollama run gemma4:31b-coding-mtp-bf16

What's Changed

  • Update MLX and MLX-C with threading fixes by @dhiltgen in https://github.com/ollama/ollama/pull/15845
  • go: bump to 1.26 by @ParthSareen in https://github.com/ollama/ollama/pull/15904
  • Add Gemma 4 MTP speculative decoding by @pdevine in https://github.com/ollama/ollama/pull/15980

Full Changelog: https://github.com/ollama/ollama/compare/v0.23.0...v0.23.1-rc0

v0.23.0 New feature
Notable features
  • Claude Desktop now supported via `ollama launch claude`
  • Ollama app surfaces featured models from server-driven recommendations
  • Claude Cowork and Claude Code integrated within the Claude Desktop App
Full changelog

Claude Desktop

Claude Desktop is now supported with Ollama Launch.

Claude Cowork and Claude Code are supported within the Claude Desktop App.

ollama launch claude-desktop

Claude Cowork

Claude Code

Claude Code on the terminal can still be accessed through the CLI with:

ollama launch claude

Not supported yet

  • Web Search (coming soon)
  • Extensions

What's Changed

  • Launch Claude Desktop with ollama launch claude
  • The Ollama app now surfaces featured models from server-driven recommendations
  • Fixed OpenClaw gateway timeout on Windows by enforcing IPv4 loopback (thanks @UniquePratham)
  • Hardened Metal initialization to gracefully handle ggml kernel compilation failures

New Contributors

  • @UniquePratham made their first contribution in https://github.com/ollama/ollama/pull/15726

Full Changelog: https://github.com/ollama/ollama/compare/v0.22.1...v0.23.0

v0.22.1 Bug fix
Notable features
  • Model recommendations updated without Ollama update
  • Desktop app launch page aligned with `ollama launch` integrations
  • Gemma 4 renderer improvements for thinking and tool calling
Full changelog

What's Changed

  • Updated the Gemma 4 renderer for thinking and tool calling improvements
  • Model recommendations are now updated without updating Ollama
  • Aligned the desktop app's launch page with ollama launch integrations
  • Fixed the Poolside integration title in ollama launch

Full Changelog: https://github.com/ollama/ollama/compare/v0.22.0...v0.22.1

v0.22.0 Feature
Notable features
  • NVIDIA's Nemotron 3 Omni model
  • Poolside's Laguna XS.2 open-weight coding model
Full changelog

New models

Full Changelog: https://github.com/ollama/ollama/compare/v0.21.2...v0.22.0

v0.21.3-rc0 New feature
Notable features
  • API accepts "max" as a think value
  • OpenAI responses map reasoning effort to think
Full changelog

What's Changed

  • api: accept "max" as a think value by @ParthSareen in https://github.com/ollama/ollama/pull/15787
  • openai: map responses reasoning effort to think by @ParthSareen in https://github.com/ollama/ollama/pull/15789

Full Changelog: https://github.com/ollama/ollama/compare/v0.21.2...v0.21.3-rc0

v0.21.2 Maintenance
Notable features
  • Improved OpenClaw onboarding flow
  • Canonical ordering of recommended models
  • Web search plugin bundling
Full changelog

What's Changed

  • Improved reliability of the OpenClaw onboarding flow in ollama launch
  • Recommended models in ollama launch now appear in a fixed, canonical order
  • OpenClaw integration now bundles Ollama's web search plugin in OpenClaw

New Contributors

  • @madflow made their first contribution in https://github.com/ollama/ollama/pull/15733

Full Changelog: https://github.com/ollama/ollama/compare/v0.21.1...v0.21.2

v0.21.1 New feature
Notable features
  • Kimi CLI integration
  • MLX logprobs support
  • Faster MLX sampling with fused top-P/top-K
Full changelog

What's Changed

Kimi CLI

You can now install and run the Kimi CLI through Ollama.

ollama launch kimi --model kimi-k2.6:cloud

Kimi CLI with Kimi K2.6 excels at long horizon agentic execution tasks through a multi-agent system.

  • MLX runner adds logprobs support for compatible models
  • Faster MLX sampling with fused top-P and top-K in a single sort pass, plus repeat penalties applied in the sampler
  • Improved MLX prompt tokenization by moving tokenization into request handler goroutines
  • Better MLX thread safety for array management
  • GLM4 MoE Lite performance improvement with a fused sigmoid router head
  • Fixed model picker showing stale model after switching chats in the macOS app
  • Fixed structured outputs for Gemma 4 when think=false

Full Changelog: https://github.com/ollama/ollama/compare/v0.21.0...v0.21.1

v0.21.0 Mixed
Notable features
  • Copilot CLI integration
  • Hermes integration
  • OpenCode inline configuration support
Full changelog

What's Changed

  • launch: skip unchanged integration rewrite configration by @hoyyeva in https://github.com/ollama/ollama/pull/15491
  • launch/openclaw: fix --yes flag behaviour to skip channels configuration by @hoyyeva in https://github.com/ollama/ollama/pull/15589
  • launch: OpenCode inline config by @hoyyeva in https://github.com/ollama/ollama/pull/15586
  • launch: add hermes by @ParthSareen in https://github.com/ollama/ollama/pull/15569
  • launch: always list cloud recommendations first by @hoyyeva in https://github.com/ollama/ollama/pull/15593
  • cmd/launch: add Copilot CLI integration by @scaryrawr in https://github.com/ollama/ollama/pull/15583

New Contributors

  • @scaryrawr made their first contribution in https://github.com/ollama/ollama/pull/15583

Full Changelog: https://github.com/ollama/ollama/compare/v0.20.8-rc0...v0.21.0

v0.20.5 New feature
Notable features
  • OpenClaw channel setup for WhatsApp, Telegram, Discord, and other messaging platforms
  • Flash attention support for Gemma 4 on compatible GPUs
  • Improved OpenCode install detection
v0.20.4 New feature
Notable features
  • mlx: Improved M5 performance using NAX
  • gemma4: Flash attention enabled
v0.20.3 Mixed
Notable features
  • Added latest models to Ollama App
  • Gemma 4 tool calling improvements
  • OpenClaw fixes for launching TUI
Full changelog

What's Changed

  • Gemma 4 Tool Calling improvements
  • Added latest models to Ollama App
  • OpenClaw fixes for launching TUI

Full Changelog: https://github.com/ollama/ollama/compare/v0.20.2...v0.20.3

v0.20.2 Maintenance

Minor fixes and improvements.

Full changelog

What's Changed

  • app: default app home view to new chat instead of launch by @jmorganca in https://github.com/ollama/ollama/pull/15312

Full Changelog: https://github.com/ollama/ollama/compare/v0.20.1...v0.20.2

v0.20.0 New feature
Notable features
  • Gemma 4 model family: E2B, E4B, 26B (MoE), and 31B (Dense) variants now available
  • MLX pipeline now respects tokenizer add_bos_token setting
  • SentencePiece-style BPE tokenizer support
Full changelog

Gemma 4

Effective 2B (E2B)

ollama run gemma4:e2b

Effective 4B (E4B)

ollama run gemma4:e4b

26B (Mixture of Experts model with 4B active parameters)

ollama run gemma4:26b

31B (Dense)

ollama run gemma4:31b

What's Changed

  • docs: update pi docs by @ParthSareen in https://github.com/ollama/ollama/pull/15152
  • mlx: respect tokenizer add_bos_token setting in pipeline by @dhiltgen in https://github.com/ollama/ollama/pull/15185
  • tokenizer: add SentencePiece-style BPE support by @dhiltgen in https://github.com/ollama/ollama/pull/15162

Full Changelog: https://github.com/ollama/ollama/compare/v0.19.0...v0.20.0-rc0

v0.19.0 New feature
Notable features
  • Apple Silicon builds now use the MLX framework for unified memory performance
  • `ollama launch pi` includes a web search plugin that leverages Ollama's web search
  • KV cache hit rates for the Anthropic-compatible API were improved
v0.18.4-rc0 Bug fix

Flash attention disabled for grok, KV cache memory leak fixed, periodic snapshot scheduling added, and VSCode documentation updated to improve inference reliability and developer tooling.

v0.18.3 New feature
Notable features
  • Ollama models (local and cloud) now available in Visual Studio Code via GitHub Copilot
  • GLM parser improvements for tool calls
  • OpenClaw integration improvements for gateway checks
Full changelog

Visual Studio Code

Microsoft Visual Studio Code now directly integrates with Ollama via GitHub Copilot.

If you have Ollama installed, any local or cloud model from Ollama can be selected for use within visual studio code.

What's Changed

  • GLM parser improvements for tool calls
  • OpenClaw integration improvements for gateway checks

Full Changelog: https://github.com/ollama/ollama/compare/v0.18.2...v0.18.3

v0.18.2 Bug fix

Ensured OpenClaw requires npm and git, fixed CLI model-flag handling, corrected websearch package registration, and resolved cache breakages that slowed Claude Code locally.

v0.18.1 New feature
⚠ Upgrade required
  • Web search in local models requires `ollama signin` authentication
  • Headless mode via `ollama launch` requires `--model` flag; use `--yes` to auto-pull model and skip selectors
  • `ollama launch openclaw` now uses official Ollama auth and model provider
Notable features
  • Web search and web fetch plugins for OpenClaw — models can search the web and fetch readable content (no JavaScript execution)
  • Non-interactive (headless) mode for `ollama launch` with `--model` and `--yes` flags for Docker, CI/CD, and script automation
  • Official Ollama auth and model provider integration for OpenClaw
Full changelog

Web Search and Fetch in OpenClaw

Ollama now ships with web search and web fetch plugin for OpenClaw. This allows Ollama's models (local or cloud) to search the web for the latest content and news. This also allows OpenClaw with Ollama to be able to fetch the web and extract readable content for processing. This feature does not execute JavaScript.

When using local models with web search in OpenClaw, ensure you are signed into Ollama with ollama signin

ollama launch openclaw

You can install web search directly into OpenClaw as a plugin if you already have OpenClaw configured and working:

Ollama web search plugin

openclaw plugins install @ollama/openclaw-web-search

Non-interactive (headless) mode for ollama launch

ollama launch can now run in non-interactive mode.

Perfect for:

  • Docker/containers: spin up an integration as a pipeline step to run evals, test prompts, or validate model behavior as part of your build. Tear it down when the job ends.

  • CI/CD: Generate code reviews, security checks, and other tasks within your CI

  • Scripts/automation: Kick off automated tasks with Ollama and claude code

  • --model must be specified to run in headless mode

  • --yes flag will auto-pull the model and skip any selectors

Try with: ollama launch claude --model kimi-k2.5:cloud --yes -- -p "how does this repository work?"

Use non-interactive mode in OpenClaw

You can ask your OpenClaw to run tasks using claude with subagents:

ollama launch claude --model kimi-k2.5:cloud --yes -- -p "how does this repository work?" using a subagent

What's Changed

  • ollama launch openclaw will now use the official Ollama auth and model provider for OpenClaw
  • Improvements to Ollama's benchmarking tool in ./cmd/bench
  • ollama launch openclaw will now skip --install-daemon when systemd is unavailable

Full Changelog: https://github.com/ollama/ollama/compare/v0.18.0...v0.18.1

v0.18.0 New feature
Notable features
  • 2x faster Kimi-K2.5 performance
  • Nemotron-3-Super 122B model
  • Non-interactive task support
Full changelog

Ollama 0.18 includes improved performance for OpenClaw and Ollama’s cloud models, including the new Nemotron-3-Super model by NVIDIA designed for high-performance agentic reasoning tasks.

Improved OpenClaw performance with Kimi-K2.5

This release of Ollama improves performance of cloud models and their reliability.

  • Up to 2x faster speeds with Kimi-K2.5
  • Tool calling accuracy has been improved
ollama launch openclaw --model kimi-k2.5

Ollama is now a provider in OpenClaw

Ollama can now be selected as an authentication and model provider during OpenClaw onboarding (thanks @BruceMacD for contributing and @steipete for reviewing!)

openclaw onboard --auth-choice ollama

More information: https://docs.openclaw.ai/providers/ollama

Nemotron-3-Super

Nemotron-3-Super: is a new 122B parameter model with strong reasoning and tool calling capability, while having top performance when run on modern hardware:

  • ollama run nemotron-3-super:cloud
  • ollama run nemotron-3-super to run locally (requires 96GB+ of VRAM)

Nemotron-3-Super scores highest of any open model on PinchBench, a benchmark suite that measures how successful models are at completing tasks when used with OpenClaw.

ollama launch openclaw --model nemotron-3-super:cloud

Or using OpenClaw’s onboarding:

openclaw onboard \
	--auth-choice ollama \
	--custom-model-id nemotron-3-super:cloud

Non-interactive task support

ollama launch now supports non-interactive tasks by passing in --yes. This enables using Claude, Codex, Pi and more in scripts, GitHub Actions, and other non-interactive environments.

ollama launch claude \
	--model glm-5:cloud \
	--yes \
	-- "Do a quick code review of this pull request and respond on GitHub with a comment summarizing your feedback."

Lower latency on MiniMax-M2.5 and Qwen3.5 on Ollama’s cloud

For customers in North America, MiniMax-M2.5 and Qwen3.5 on Ollama’s cloud now respond much faster, up to 10x and up to 2x faster respectively, and often in less than a second. This is ideal for tasks that require a fast Time To First Token (TTFT) when needing quick answers from OpenClaw or quick back-to-back coding tasks.

ollama launch claude --model minimax-m2.5

Driver updates required for ROCm 7

This version of Ollama ships with ROCm 7, and requires updating drivers to the latest version for continued support.

What's Changed

  • Ollama's cloud models no longer require downloading via ollama pull. Setting :cloud as a tag will now automatically connect to cloud models.
  • New --yes flag for ollama launch that skips all prompts, making it possible to run AI assistants and other tools in non-interactive environments
  • Fixed issue where "Reset to Defaults" in Ollama's app would disable downloading automatic updates.
  • Ollama will now ensure context compaction occurs at the correct context length for each model when using ollama launch claude

New Contributors

  • @flipbit03 made their first contribution in https://github.com/ollama/ollama/pull/14821
  • @shivamtiwari3 made their first contribution in https://github.com/ollama/ollama/pull/14825

Full Changelog: https://github.com/ollama/ollama/compare/v0.17.7...v0.18.0

v0.17.8-rc4 Bug fix

Unclosed argument tags in GLM calls were repaired, cloud model stub handling restored, localhost connections fixed, Docker builds accelerated, and int4 groupsize 64 added, boosting reliability and performance in deployments.

v0.17.7 Mixed
Notable features
  • Context length support for compaction when using ollama launch
  • Thinking level values now correctly interpreted in the API for all thinking models
Full changelog

What's Changed

  • Allow thinking levels such as "medium" to correctly interpreted in Ollama's API for all thinking models
  • Add context length to support compaction when using ollama launch

Full Changelog: https://github.com/ollama/ollama/compare/v0.17.6...v0.17.7

v0.17.6 Bug fix

The update corrects OCR prompt rendering and fixes parsing issues for Qwen 3.5 models, restoring expected functionality for OCR and tool integration.

v0.17.5 Bug fix

Qwen 3.5 models now support multiple sizes, with fixes for GPU/CPU splitting crashes, repetition errors, and memory/MLX issues, improving reliability and performance for developers deploying mixed-hardware inference.

v0.17.4 New feature
Notable features
  • Qwen 3.5 multimodal models
  • LFM 2 hybrid on-device models
  • Tool call indices in parallel calls
Full changelog

New models

  • Qwen 3.5: a family of open-source multimodal models that delivers exceptional utility and performance.
  • LFM 2: LFM2 is a family of hybrid models designed for on-device deployment. LFM2-24B-A2B is the largest model in the family, scaling the architecture to 24 billion parameters while keeping inference efficient.

Note: for users on 0.17.1, this version will not automatically update. Re-downloading is required to receive the latest version of Ollama.

What's Changed

  • Tool call indices will now be included in parallel tool calls

Full Changelog: https://github.com/ollama/ollama/compare/v0.17.3...v0.17.4

v0.17.3 Bug fix

The update fixes a parsing error that prevented tool calls in Qwen 3 and Qwen 3.5 models from being recognized when generated during the model's reasoning phase.

v0.17.2 Bug fix

Fixed a Windows crash that occurred after downloading updates, improving stability for local installations.

v0.17.1 Breaking risk
⚠ Upgrade required
  • MLX engine users: ollama create with unquantized models will no longer apply affine quantization by default
Breaking changes
  • ollama create command no longer defaults to affine quantization for unquantized models when using the MLX engine
Notable features
  • Nemotron architecture support in engine
  • Web search capabilities for models that support tools
  • Configuration option to disable automatic update downloading
Full changelog

What's Changed

  • Nemotron architecture support in Ollama's engine
  • MLX engine now has improved memory usage
  • Ollama's app will now allow models that support tools to use web search capabilities
  • Improved LFM2 and LFM2.5 models in Ollama's engine
  • ollama create will no longer default to affine quantization for unquantized models when using the MLX engine
  • Added configuration for disabling automatic update downloading

Full Changelog: https://github.com/ollama/ollama/compare/v0.17.0...v0.17.1

v0.17.0 New feature
Notable features
  • OpenClaw integration
  • Web search in cloud models
  • Improved tokenizer performance
Full changelog

OpenClaw

OpenClaw can now be installed and configured automatically via Ollama, making it the easiest way to get up and running with OpenClaw with open models like Kimi-K2.5, GLM-5, and Minimax-M2.5.

Get started

ollama launch openclaw

Web search in OpenClaw

When using cloud models, websearch is enabled - allowing OpenClaw to search the internet.

What's Changed

  • Improved tokenizer performance
  • Ollama's macOS and Windows apps will now default to a context length based on available VRAM

New Contributors

  • @natl-set made their first contribution in https://github.com/ollama/ollama/pull/14322

Full Changelog: https://github.com/ollama/ollama/compare/v0.16.3...v0.17.0

v0.16.3 New feature
Notable features
  • ollama launch cline subcommand added for Cline CLI integration
  • ollama launch now always displays the model picker
  • MLX runner now supports Gemma 3, Llama, and Qwen 3 architectures
Full changelog

What's Changed

  • New ollama launch cline added for the Cline CLI
  • ollama launch <integration> will now always show the model picker
  • Added Gemma 3, Llama and Qwen 3 architectures to MLX runner

New Contributors

  • @hellosaumil made their first contribution in https://github.com/ollama/ollama/pull/14271

Full Changelog: https://github.com/ollama/ollama/compare/v0.16.2...v0.16.3

v0.16.2 Mixed
Notable features
  • New `OLLAMA_NO_CLOUD` environment variable and app setting to disable cloud models for sensitive tasks
Full changelog

What's Changed

  • ollama launch claude now supports searching the web when using :cloud models
  • Fixed rendering issue when running ollama in PowerShell
  • New setting in Ollama's app makes it easier to disable cloud models for sensitive and private tasks where data cannot leave your computer. For Linux or when running ollama serve manually, set OLLAMA_NO_CLOUD=1.
  • Fixed issue where experimental image generation models would not run in 0.16.0 and 0.16.1

Full Changelog: https://github.com/ollama/ollama/compare/v0.16.1...v0.16.2-rc0

v0.16.1 Bug fix

Installation scripts on macOS and Windows now provide smoother password prompts and progress feedback, and image generation models honor the load timeout setting.

v0.16.0 New feature
Notable features
  • GLM-5 reasoning model (744B total parameters, 40B active)
  • MiniMax-M2.5 model for productivity and coding
  • Text prompt editing in editor via Ctrl+G
Full changelog

New models

  • GLM-5: A strong reasoning and agentic model from Z.ai with 744B total parameters (40B active), built for complex systems engineering and long-horizon tasks.
  • MiniMax-M2.5: a new state-of-the-art large language model designed for real-world productivity and coding tasks.

New ollama

The new ollama command makes it easy to launch your favorite apps with models using Ollama

What's Changed

  • Launch Pi with ollama launch pi
  • Improvements to Ollama's MLX runner to support GLM-4.7-Flash
  • Ctrl+G will now allow for editing text prompts in a text editor when running a model

Full Changelog: https://github.com/ollama/ollama/compare/v0.15.6...v0.16.0

v0.15.6 Bug fix

Fixes context limit crashes for droid launches, corrects image handling bugs, and automatically downloads missing models to prevent errors.

v0.15.5 New feature
Notable features
  • Qwen3-Coder-Next coding model
  • GLM-OCR document understanding
  • ollama launch subagent support
Full changelog

New models

  • Qwen3-Coder-Next: a coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development.
  • GLM-OCR: GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.

Improvements to ollama launch

  • ollama launch can now be provided arguments, for example ollama launch claude -- --resume
  • ollama launch will now work run subagents when using ollama launch claude
  • Ollama will now set context limits for a set of models when using ollama launch opencode

What's Changed

  • Sub-agent support for ollama launch for planning, deep research, and similar tasks
  • ollama signin will now open a browser window to make signing in easier
  • Ollama will now default to the following context lengths based on VRAM:
    • < 24 GiB VRAM: 4,096 context
    • 24-48 GiB VRAM: 32,768 context
    • >= 48 GiB VRAM: 262,144 context
  • GLM-4.7-Flash support on Ollama's experimental MLX engine
  • ollama signin will now open the browser to the connect page
  • Fixed off by one error when using num_predict in the API
  • Fixed issue where tokens from a previous sequence would be returned when hitting num_predict

New Contributors

  • @avukmirovich made their first contribution in https://github.com/ollama/ollama/pull/13934

Full Changelog: https://github.com/ollama/ollama/compare/v0.15.4...v0.15.5

v0.15.4 New feature
Notable features
  • ollama launch openclaw now enters standard onboarding flow if previously incomplete
Full changelog

What's Changed

  • ollama launch openclaw will now enter the standard OpenClaw onboarding flow if this has not yet been completed.

Full Changelog: https://github.com/ollama/ollama/compare/v0.15.3...v0.15.4

v0.15.3 Breaking risk
Breaking changes
  • Command `ollama launch clawdbot` renamed to `ollama launch openclaw`
Notable features
  • Improved tool calling for Ministral models
  • ollama launch now respects the OLLAMA_HOST environment variable
Full changelog

What's Changed

  • Renamed ollama launch clawdbot to ollama launch openclaw to reflect the project's new name
  • Improved tool calling for Ministral models
  • docs: add clawdbot by @ParthSareen in https://github.com/ollama/ollama/pull/13925
  • cmd/config: Use envconfig.Host() for base API in launch config packages by @gabe-l-hart in https://github.com/ollama/ollama/pull/13937
  • ollama launch will now use the value of OLLAMA_HOST when running it

New Contributors

  • @MBerguer made their first contribution in https://github.com/ollama/ollama/pull/13971
  • @taronsung made their first contribution in https://github.com/ollama/ollama/pull/13965
  • @noureldin-azzab made their first contribution in https://github.com/ollama/ollama/pull/13961
  • @dhirajlochib made their first contribution in https://github.com/ollama/ollama/pull/13645
  • @ThanhNguyxn made their first contribution in https://github.com/ollama/ollama/pull/13979

Full Changelog: https://github.com/ollama/ollama/compare/v0.15.2...v0.15.3

v0.15.2 New feature
Notable features
  • New `ollama launch clawdbot` command for launching Clawdbot using Ollama models
Full changelog

What's Changed

  • New ollama launch clawdbot command for launching Clawdbot using Ollama models

Full Changelog: https://github.com/ollama/ollama/compare/v0.15.1...v0.15.2

v0.15.1 Bug fix

Improved performance and correctness of GLM-4.7-Flash, resolved macOS and arm64 Linux slowdowns, and corrected launch detection for Claude, preventing configuration errors.

v0.15.0 Mixed
Notable features
  • New `ollama launch` command for Claude Code, Codex, OpenCode, and Droid integration
  • Multi-line strings with `"""` now work in `ollama run`
  • Ctrl + J and Shift + Enter support for inserting newlines in `ollama run`
Full changelog

ollama launch

A new ollama launch command to use Ollama's models with Claude Code, Codex, OpenCode, and Droid without separate configuration.

What's Changed

  • New ollama launch command for Claude Code, Codex, OpenCode, and Droid
  • Fixed issue where creating multi-line strings with """ would not work when using ollama run
  • Ctrl+J and Shift+Enter now work for inserting newlines in ollama run
  • Reduced memory usage for GLM-4.7-Flash models
v0.14.3 New feature
Notable features
  • Z-Image Turbo (Alibaba): 6B text-to-image model for photorealistic image generation
  • Flux.2 Klein (Black Forest Labs): fastest image generation model to date
  • /api/generate endpoint now supports image generation
Full changelog
  • Z-Image Turbo: 6 billion parameter text-to-image model from Alibaba’s Tongyi Lab. It generates high-quality photorealistic images.
  • Flux.2 Klein: Black Forest Labs’ fastest image-generation models to date.

New models

  • GLM-4.7-Flash: As the strongest model in the 30B class, GLM-4.7-Flash offers a new option for lightweight deployment that balances performance and efficiency.
  • LFM2.5-1.2B-Thinking: LFM2.5 is a new family of hybrid models designed for on-device deployment.

What's Changed

  • Fixed issue where Ollama's macOS app would interrupt system shutdown
  • Fixed ollama create and ollama show commands for experimental models
  • The /api/generate API can now be used for image generation
  • Fixed minor issues in Nemotron-3-Nano tool parsing
  • Fixed issue where removing an image generation model would cause it to first load
  • Fixed issue where ollama rm would only stop the first model in the list if it were running

Full Changelog: https://github.com/ollama/ollama/compare/v0.14.2...v0.14.3

v0.14.2 New feature
Notable features
  • TranslateGemma: new open translation model collection supporting 55 languages, built on Gemma 3
  • CLI: Shift + Enter (or Ctrl + j) now enters newlines
  • Improved `/v1/responses` API conformance to OpenResponses specification
Full changelog

New models

  • TranslateGemma: A new collection of open translation models built on Gemma 3, helping people communicate across 55 languages.

What's Changed

  • Shift + Enter (or Ctrl + j) will now enter a newline in Ollama's CLI
  • Improve /v1/responses API to better confirm to OpenResponses specification

New Contributors

  • @yuhongsun96 made their first contribution in https://github.com/ollama/ollama/pull/13135
  • @koaning made their first contribution in https://github.com/ollama/ollama/pull/13326

Full Changelog: https://github.com/ollama/ollama/compare/v0.14.1...v0.14.2

v0.14.1 New feature
Notable features
  • Experimental image generation models (Z-Image-Turbo)
  • More models in development (Qwen-Image, GLM-Image)
Full changelog

Image generation models (experimental)

Experimental image generation models are available for macOS and Linux (CUDA) in Ollama:

Available models

ollama run x/z-image-turbo

Note: x is a username on ollama.com where experimental models are uploaded

More models coming soon:

  1. Qwen-Image-2512
  2. Qwen-Image-Edit-2511
  3. GLM-Image

What's Changed

  • fix macOS auto-update signature verification failure

New Contributors

  • @joshxfi made their first contribution in https://github.com/ollama/ollama/pull/13711
  • @maternion made their first contribution in https://github.com/ollama/ollama/pull/13709

Full Changelog: https://github.com/ollama/ollama/compare/v0.14.0...v0.14.1

v0.14.0 New feature
⚠ Upgrade required
  • Linux install bundles now use zst compression — ensure your tooling supports zst decompression.
  • Modelfiles can now declare a minimum Ollama version via the REQUIRES command; models using this feature will require v0.14.0 or later.
Notable features
  • Experimental agent loop with bash tool via `ollama run --experimental`
  • Anthropic API compatibility: /v1/messages endpoint support
  • New Modelfile REQUIRES command for declaring minimum Ollama version

Beta — feedback welcome: [email protected]