Release history
ollama releases
Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.
All releases
50 shown
- Use `ollama launch claude-desktop --restore` to re-enable Claude Desktop after upgrade.
- `ollama launch` no longer includes Claude Desktop
- /api/show responses are now cached, improving median latency by ~6.7x
- Improved backup workflow when managing launch integrations
- Cleaner image generation layout in the MLX runner
Full changelog
What's Changed
ollama launchno longer includes Claude Desktop due to the third-party integration being limited to Anthropic models.- Use
ollama launch claude-desktop --restoreto restore Claude Desktop to its normal state. /api/showresponses are now cached, improving median latency by ~6.7x which will increase load speed for integrations like VS Code.- Improved backup workflow when managing launch integrations
- Cleaner image generation layout in the MLX runner
Full Changelog: https://github.com/ollama/ollama/compare/v0.23.1...v0.23.2
- Gemma 4 MTP speculative decoding support on Macs (up to 2x speed increase for Gemma 4 31B coding tasks)
- MLX and MLX-C updated with threading fixes
- Go runtime bumped to version 1.26
Full changelog
Gemma 4 MTP (Multi-token Processing) for the MLX runner
Gemma 4 MTP speculative decoding is now supported on Macs. This can give over a 2x speed increase for the Gemma 4 31B model on coding tasks.
ollama run gemma4:31b-coding-mtp-bf16
What's Changed
- Update MLX and MLX-C with threading fixes by @dhiltgen in https://github.com/ollama/ollama/pull/15845
- go: bump to 1.26 by @ParthSareen in https://github.com/ollama/ollama/pull/15904
- Add Gemma 4 MTP speculative decoding by @pdevine in https://github.com/ollama/ollama/pull/15980
Full Changelog: https://github.com/ollama/ollama/compare/v0.23.0...v0.23.1-rc0
- Claude Desktop now supported via `ollama launch claude`
- Ollama app surfaces featured models from server-driven recommendations
- Claude Cowork and Claude Code integrated within the Claude Desktop App
Full changelog
Claude Desktop
Claude Desktop is now supported with Ollama Launch.
Claude Cowork and Claude Code are supported within the Claude Desktop App.
ollama launch claude-desktop
Claude Cowork
Claude Code
Claude Code on the terminal can still be accessed through the CLI with:
ollama launch claude
Not supported yet
- Web Search (coming soon)
- Extensions
What's Changed
- Launch Claude Desktop with
ollama launch claude - The Ollama app now surfaces featured models from server-driven recommendations
- Fixed OpenClaw gateway timeout on Windows by enforcing IPv4 loopback (thanks @UniquePratham)
- Hardened Metal initialization to gracefully handle ggml kernel compilation failures
New Contributors
- @UniquePratham made their first contribution in https://github.com/ollama/ollama/pull/15726
Full Changelog: https://github.com/ollama/ollama/compare/v0.22.1...v0.23.0
- Model recommendations updated without Ollama update
- Desktop app launch page aligned with `ollama launch` integrations
- Gemma 4 renderer improvements for thinking and tool calling
Full changelog
What's Changed
- Updated the Gemma 4 renderer for thinking and tool calling improvements
- Model recommendations are now updated without updating Ollama
- Aligned the desktop app's launch page with
ollama launchintegrations - Fixed the Poolside integration title in
ollama launch
Full Changelog: https://github.com/ollama/ollama/compare/v0.22.0...v0.22.1
- NVIDIA's Nemotron 3 Omni model
- Poolside's Laguna XS.2 open-weight coding model
Full changelog
New models
- NVIDIA's Nemotron 3 Omni
- Poolside's first open-weight coding model - Laguna XS.2
Full Changelog: https://github.com/ollama/ollama/compare/v0.21.2...v0.22.0
- API accepts "max" as a think value
- OpenAI responses map reasoning effort to think
Full changelog
What's Changed
- api: accept "max" as a think value by @ParthSareen in https://github.com/ollama/ollama/pull/15787
- openai: map responses reasoning effort to think by @ParthSareen in https://github.com/ollama/ollama/pull/15789
Full Changelog: https://github.com/ollama/ollama/compare/v0.21.2...v0.21.3-rc0
- Improved OpenClaw onboarding flow
- Canonical ordering of recommended models
- Web search plugin bundling
Full changelog
What's Changed
- Improved reliability of the OpenClaw onboarding flow in
ollama launch - Recommended models in
ollama launchnow appear in a fixed, canonical order - OpenClaw integration now bundles Ollama's web search plugin in OpenClaw
New Contributors
- @madflow made their first contribution in https://github.com/ollama/ollama/pull/15733
Full Changelog: https://github.com/ollama/ollama/compare/v0.21.1...v0.21.2
- Kimi CLI integration
- MLX logprobs support
- Faster MLX sampling with fused top-P/top-K
Full changelog
What's Changed
Kimi CLI
You can now install and run the Kimi CLI through Ollama.
ollama launch kimi --model kimi-k2.6:cloud
Kimi CLI with Kimi K2.6 excels at long horizon agentic execution tasks through a multi-agent system.
- MLX runner adds logprobs support for compatible models
- Faster MLX sampling with fused top-P and top-K in a single sort pass, plus repeat penalties applied in the sampler
- Improved MLX prompt tokenization by moving tokenization into request handler goroutines
- Better MLX thread safety for array management
- GLM4 MoE Lite performance improvement with a fused sigmoid router head
- Fixed model picker showing stale model after switching chats in the macOS app
- Fixed structured outputs for Gemma 4 when
think=false
Full Changelog: https://github.com/ollama/ollama/compare/v0.21.0...v0.21.1
- Copilot CLI integration
- Hermes integration
- OpenCode inline configuration support
Full changelog
What's Changed
- launch: skip unchanged integration rewrite configration by @hoyyeva in https://github.com/ollama/ollama/pull/15491
- launch/openclaw: fix --yes flag behaviour to skip channels configuration by @hoyyeva in https://github.com/ollama/ollama/pull/15589
- launch: OpenCode inline config by @hoyyeva in https://github.com/ollama/ollama/pull/15586
- launch: add hermes by @ParthSareen in https://github.com/ollama/ollama/pull/15569
- launch: always list cloud recommendations first by @hoyyeva in https://github.com/ollama/ollama/pull/15593
- cmd/launch: add Copilot CLI integration by @scaryrawr in https://github.com/ollama/ollama/pull/15583
New Contributors
- @scaryrawr made their first contribution in https://github.com/ollama/ollama/pull/15583
Full Changelog: https://github.com/ollama/ollama/compare/v0.20.8-rc0...v0.21.0
- OpenClaw channel setup for WhatsApp, Telegram, Discord, and other messaging platforms
- Flash attention support for Gemma 4 on compatible GPUs
- Improved OpenCode install detection
- mlx: Improved M5 performance using NAX
- gemma4: Flash attention enabled
- Added latest models to Ollama App
- Gemma 4 tool calling improvements
- OpenClaw fixes for launching TUI
Full changelog
What's Changed
- Gemma 4 Tool Calling improvements
- Added latest models to Ollama App
- OpenClaw fixes for launching TUI
Full Changelog: https://github.com/ollama/ollama/compare/v0.20.2...v0.20.3
Minor fixes and improvements.
Full changelog
What's Changed
- app: default app home view to new chat instead of launch by @jmorganca in https://github.com/ollama/ollama/pull/15312
Full Changelog: https://github.com/ollama/ollama/compare/v0.20.1...v0.20.2
- Gemma 4 model family: E2B, E4B, 26B (MoE), and 31B (Dense) variants now available
- MLX pipeline now respects tokenizer add_bos_token setting
- SentencePiece-style BPE tokenizer support
Full changelog
Gemma 4
Effective 2B (E2B)
ollama run gemma4:e2b
Effective 4B (E4B)
ollama run gemma4:e4b
26B (Mixture of Experts model with 4B active parameters)
ollama run gemma4:26b
31B (Dense)
ollama run gemma4:31b
What's Changed
- docs: update pi docs by @ParthSareen in https://github.com/ollama/ollama/pull/15152
- mlx: respect tokenizer add_bos_token setting in pipeline by @dhiltgen in https://github.com/ollama/ollama/pull/15185
- tokenizer: add SentencePiece-style BPE support by @dhiltgen in https://github.com/ollama/ollama/pull/15162
Full Changelog: https://github.com/ollama/ollama/compare/v0.19.0...v0.20.0-rc0
- Apple Silicon builds now use the MLX framework for unified memory performance
- `ollama launch pi` includes a web search plugin that leverages Ollama's web search
- KV cache hit rates for the Anthropic-compatible API were improved
Flash attention disabled for grok, KV cache memory leak fixed, periodic snapshot scheduling added, and VSCode documentation updated to improve inference reliability and developer tooling.
- Ollama models (local and cloud) now available in Visual Studio Code via GitHub Copilot
- GLM parser improvements for tool calls
- OpenClaw integration improvements for gateway checks
Full changelog
Visual Studio Code
Microsoft Visual Studio Code now directly integrates with Ollama via GitHub Copilot.
If you have Ollama installed, any local or cloud model from Ollama can be selected for use within visual studio code.
What's Changed
- GLM parser improvements for tool calls
- OpenClaw integration improvements for gateway checks
Full Changelog: https://github.com/ollama/ollama/compare/v0.18.2...v0.18.3
Ensured OpenClaw requires npm and git, fixed CLI model-flag handling, corrected websearch package registration, and resolved cache breakages that slowed Claude Code locally.
- Web search in local models requires `ollama signin` authentication
- Headless mode via `ollama launch` requires `--model` flag; use `--yes` to auto-pull model and skip selectors
- `ollama launch openclaw` now uses official Ollama auth and model provider
- Web search and web fetch plugins for OpenClaw — models can search the web and fetch readable content (no JavaScript execution)
- Non-interactive (headless) mode for `ollama launch` with `--model` and `--yes` flags for Docker, CI/CD, and script automation
- Official Ollama auth and model provider integration for OpenClaw
Full changelog
Web Search and Fetch in OpenClaw
Ollama now ships with web search and web fetch plugin for OpenClaw. This allows Ollama's models (local or cloud) to search the web for the latest content and news. This also allows OpenClaw with Ollama to be able to fetch the web and extract readable content for processing. This feature does not execute JavaScript.
When using local models with web search in OpenClaw, ensure you are signed into Ollama with ollama signin
ollama launch openclaw
You can install web search directly into OpenClaw as a plugin if you already have OpenClaw configured and working:
Ollama web search plugin
openclaw plugins install @ollama/openclaw-web-search
Non-interactive (headless) mode for ollama launch
ollama launch can now run in non-interactive mode.
Perfect for:
-
Docker/containers: spin up an integration as a pipeline step to run evals, test prompts, or validate model behavior as part of your build. Tear it down when the job ends.
-
CI/CD: Generate code reviews, security checks, and other tasks within your CI
-
Scripts/automation: Kick off automated tasks with Ollama and claude code
-
--modelmust be specified to run in headless mode -
--yesflag will auto-pull the model and skip any selectors
Try with: ollama launch claude --model kimi-k2.5:cloud --yes -- -p "how does this repository work?"
Use non-interactive mode in OpenClaw
You can ask your OpenClaw to run tasks using claude with subagents:
ollama launch claude --model kimi-k2.5:cloud --yes -- -p "how does this repository work?" using a subagent
What's Changed
ollama launch openclawwill now use the official Ollama auth and model provider for OpenClaw- Improvements to Ollama's benchmarking tool in
./cmd/bench ollama launch openclawwill now skip--install-daemonwhen systemd is unavailable
Full Changelog: https://github.com/ollama/ollama/compare/v0.18.0...v0.18.1
- 2x faster Kimi-K2.5 performance
- Nemotron-3-Super 122B model
- Non-interactive task support
Full changelog
Ollama 0.18 includes improved performance for OpenClaw and Ollama’s cloud models, including the new Nemotron-3-Super model by NVIDIA designed for high-performance agentic reasoning tasks.
Improved OpenClaw performance with Kimi-K2.5
This release of Ollama improves performance of cloud models and their reliability.
- Up to 2x faster speeds with Kimi-K2.5
- Tool calling accuracy has been improved
ollama launch openclaw --model kimi-k2.5
Ollama is now a provider in OpenClaw
Ollama can now be selected as an authentication and model provider during OpenClaw onboarding (thanks @BruceMacD for contributing and @steipete for reviewing!)
openclaw onboard --auth-choice ollama
More information: https://docs.openclaw.ai/providers/ollama
Nemotron-3-Super
Nemotron-3-Super: is a new 122B parameter model with strong reasoning and tool calling capability, while having top performance when run on modern hardware:
ollama run nemotron-3-super:cloudollama run nemotron-3-superto run locally (requires 96GB+ of VRAM)
Nemotron-3-Super scores highest of any open model on PinchBench, a benchmark suite that measures how successful models are at completing tasks when used with OpenClaw.
ollama launch openclaw --model nemotron-3-super:cloud
Or using OpenClaw’s onboarding:
openclaw onboard \
--auth-choice ollama \
--custom-model-id nemotron-3-super:cloud
Non-interactive task support
ollama launch now supports non-interactive tasks by passing in --yes. This enables using Claude, Codex, Pi and more in scripts, GitHub Actions, and other non-interactive environments.
ollama launch claude \
--model glm-5:cloud \
--yes \
-- "Do a quick code review of this pull request and respond on GitHub with a comment summarizing your feedback."
Lower latency on MiniMax-M2.5 and Qwen3.5 on Ollama’s cloud
For customers in North America, MiniMax-M2.5 and Qwen3.5 on Ollama’s cloud now respond much faster, up to 10x and up to 2x faster respectively, and often in less than a second. This is ideal for tasks that require a fast Time To First Token (TTFT) when needing quick answers from OpenClaw or quick back-to-back coding tasks.
ollama launch claude --model minimax-m2.5
Driver updates required for ROCm 7
This version of Ollama ships with ROCm 7, and requires updating drivers to the latest version for continued support.
What's Changed
- Ollama's cloud models no longer require downloading via
ollama pull. Setting:cloudas a tag will now automatically connect to cloud models. - New
--yesflag forollama launchthat skips all prompts, making it possible to run AI assistants and other tools in non-interactive environments - Fixed issue where "Reset to Defaults" in Ollama's app would disable downloading automatic updates.
- Ollama will now ensure context compaction occurs at the correct context length for each model when using
ollama launch claude
New Contributors
- @flipbit03 made their first contribution in https://github.com/ollama/ollama/pull/14821
- @shivamtiwari3 made their first contribution in https://github.com/ollama/ollama/pull/14825
Full Changelog: https://github.com/ollama/ollama/compare/v0.17.7...v0.18.0
Unclosed argument tags in GLM calls were repaired, cloud model stub handling restored, localhost connections fixed, Docker builds accelerated, and int4 groupsize 64 added, boosting reliability and performance in deployments.
- Context length support for compaction when using ollama launch
- Thinking level values now correctly interpreted in the API for all thinking models
Full changelog
What's Changed
- Allow thinking levels such as
"medium"to correctly interpreted in Ollama's API for all thinking models - Add context length to support compaction when using
ollama launch
Full Changelog: https://github.com/ollama/ollama/compare/v0.17.6...v0.17.7
The update corrects OCR prompt rendering and fixes parsing issues for Qwen 3.5 models, restoring expected functionality for OCR and tool integration.
Qwen 3.5 models now support multiple sizes, with fixes for GPU/CPU splitting crashes, repetition errors, and memory/MLX issues, improving reliability and performance for developers deploying mixed-hardware inference.
- Qwen 3.5 multimodal models
- LFM 2 hybrid on-device models
- Tool call indices in parallel calls
Full changelog
New models
- Qwen 3.5: a family of open-source multimodal models that delivers exceptional utility and performance.
- LFM 2: LFM2 is a family of hybrid models designed for on-device deployment. LFM2-24B-A2B is the largest model in the family, scaling the architecture to 24 billion parameters while keeping inference efficient.
Note: for users on 0.17.1, this version will not automatically update. Re-downloading is required to receive the latest version of Ollama.
What's Changed
- Tool call indices will now be included in parallel tool calls
Full Changelog: https://github.com/ollama/ollama/compare/v0.17.3...v0.17.4
The update fixes a parsing error that prevented tool calls in Qwen 3 and Qwen 3.5 models from being recognized when generated during the model's reasoning phase.
Fixed a Windows crash that occurred after downloading updates, improving stability for local installations.
- MLX engine users: ollama create with unquantized models will no longer apply affine quantization by default
- ollama create command no longer defaults to affine quantization for unquantized models when using the MLX engine
- Nemotron architecture support in engine
- Web search capabilities for models that support tools
- Configuration option to disable automatic update downloading
Full changelog
What's Changed
- Nemotron architecture support in Ollama's engine
- MLX engine now has improved memory usage
- Ollama's app will now allow models that support tools to use web search capabilities
- Improved LFM2 and LFM2.5 models in Ollama's engine
ollama createwill no longer default to affine quantization for unquantized models when using the MLX engine- Added configuration for disabling automatic update downloading
Full Changelog: https://github.com/ollama/ollama/compare/v0.17.0...v0.17.1
- OpenClaw integration
- Web search in cloud models
- Improved tokenizer performance
Full changelog
OpenClaw
OpenClaw can now be installed and configured automatically via Ollama, making it the easiest way to get up and running with OpenClaw with open models like Kimi-K2.5, GLM-5, and Minimax-M2.5.
Get started
ollama launch openclaw
Web search in OpenClaw
When using cloud models, websearch is enabled - allowing OpenClaw to search the internet.
What's Changed
- Improved tokenizer performance
- Ollama's macOS and Windows apps will now default to a context length based on available VRAM
New Contributors
- @natl-set made their first contribution in https://github.com/ollama/ollama/pull/14322
Full Changelog: https://github.com/ollama/ollama/compare/v0.16.3...v0.17.0
- ollama launch cline subcommand added for Cline CLI integration
- ollama launch now always displays the model picker
- MLX runner now supports Gemma 3, Llama, and Qwen 3 architectures
Full changelog
What's Changed
- New
ollama launch clineadded for the Cline CLI ollama launch <integration>will now always show the model picker- Added Gemma 3, Llama and Qwen 3 architectures to MLX runner
New Contributors
- @hellosaumil made their first contribution in https://github.com/ollama/ollama/pull/14271
Full Changelog: https://github.com/ollama/ollama/compare/v0.16.2...v0.16.3
- New `OLLAMA_NO_CLOUD` environment variable and app setting to disable cloud models for sensitive tasks
Full changelog
What's Changed
ollama launch claudenow supports searching the web when using:cloudmodels- Fixed rendering issue when running
ollamain PowerShell - New setting in Ollama's app makes it easier to disable cloud models for sensitive and private tasks where data cannot leave your computer. For Linux or when running
ollama servemanually, setOLLAMA_NO_CLOUD=1. - Fixed issue where experimental image generation models would not run in 0.16.0 and 0.16.1
Full Changelog: https://github.com/ollama/ollama/compare/v0.16.1...v0.16.2-rc0
Installation scripts on macOS and Windows now provide smoother password prompts and progress feedback, and image generation models honor the load timeout setting.
- GLM-5 reasoning model (744B total parameters, 40B active)
- MiniMax-M2.5 model for productivity and coding
- Text prompt editing in editor via Ctrl+G
Full changelog
New models
- GLM-5: A strong reasoning and agentic model from Z.ai with 744B total parameters (40B active), built for complex systems engineering and long-horizon tasks.
- MiniMax-M2.5: a new state-of-the-art large language model designed for real-world productivity and coding tasks.
New ollama
The new ollama command makes it easy to launch your favorite apps with models using Ollama
What's Changed
- Launch Pi with
ollama launch pi - Improvements to Ollama's MLX runner to support GLM-4.7-Flash
- Ctrl+G will now allow for editing text prompts in a text editor when running a model
Full Changelog: https://github.com/ollama/ollama/compare/v0.15.6...v0.16.0
Fixes context limit crashes for droid launches, corrects image handling bugs, and automatically downloads missing models to prevent errors.
- Qwen3-Coder-Next coding model
- GLM-OCR document understanding
- ollama launch subagent support
Full changelog
New models
- Qwen3-Coder-Next: a coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development.
- GLM-OCR: GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.
Improvements to ollama launch
ollama launchcan now be provided arguments, for exampleollama launch claude -- --resumeollama launchwill now work run subagents when usingollama launch claude- Ollama will now set context limits for a set of models when using
ollama launch opencode
What's Changed
- Sub-agent support for
ollama launchfor planning, deep research, and similar tasks ollama signinwill now open a browser window to make signing in easier- Ollama will now default to the following context lengths based on VRAM:
- < 24 GiB VRAM: 4,096 context
- 24-48 GiB VRAM: 32,768 context
- >= 48 GiB VRAM: 262,144 context
- GLM-4.7-Flash support on Ollama's experimental MLX engine
ollama signinwill now open the browser to the connect page- Fixed off by one error when using
num_predictin the API - Fixed issue where tokens from a previous sequence would be returned when hitting
num_predict
New Contributors
- @avukmirovich made their first contribution in https://github.com/ollama/ollama/pull/13934
Full Changelog: https://github.com/ollama/ollama/compare/v0.15.4...v0.15.5
- ollama launch openclaw now enters standard onboarding flow if previously incomplete
Full changelog
What's Changed
ollama launch openclawwill now enter the standard OpenClaw onboarding flow if this has not yet been completed.
Full Changelog: https://github.com/ollama/ollama/compare/v0.15.3...v0.15.4
- Command `ollama launch clawdbot` renamed to `ollama launch openclaw`
- Improved tool calling for Ministral models
- ollama launch now respects the OLLAMA_HOST environment variable
Full changelog
What's Changed
- Renamed
ollama launch clawdbottoollama launch openclawto reflect the project's new name - Improved tool calling for Ministral models
- docs: add clawdbot by @ParthSareen in https://github.com/ollama/ollama/pull/13925
- cmd/config: Use envconfig.Host() for base API in launch config packages by @gabe-l-hart in https://github.com/ollama/ollama/pull/13937
ollama launchwill now use the value ofOLLAMA_HOSTwhen running it
New Contributors
- @MBerguer made their first contribution in https://github.com/ollama/ollama/pull/13971
- @taronsung made their first contribution in https://github.com/ollama/ollama/pull/13965
- @noureldin-azzab made their first contribution in https://github.com/ollama/ollama/pull/13961
- @dhirajlochib made their first contribution in https://github.com/ollama/ollama/pull/13645
- @ThanhNguyxn made their first contribution in https://github.com/ollama/ollama/pull/13979
Full Changelog: https://github.com/ollama/ollama/compare/v0.15.2...v0.15.3
- New `ollama launch clawdbot` command for launching Clawdbot using Ollama models
Full changelog
What's Changed
- New
ollama launch clawdbotcommand for launching Clawdbot using Ollama models
Full Changelog: https://github.com/ollama/ollama/compare/v0.15.1...v0.15.2
Improved performance and correctness of GLM-4.7-Flash, resolved macOS and arm64 Linux slowdowns, and corrected launch detection for Claude, preventing configuration errors.
- New `ollama launch` command for Claude Code, Codex, OpenCode, and Droid integration
- Multi-line strings with `"""` now work in `ollama run`
- Ctrl + J and Shift + Enter support for inserting newlines in `ollama run`
Full changelog
ollama launch
A new ollama launch command to use Ollama's models with Claude Code, Codex, OpenCode, and Droid without separate configuration.
What's Changed
- New
ollama launchcommand for Claude Code, Codex, OpenCode, and Droid - Fixed issue where creating multi-line strings with
"""would not work when usingollama run - Ctrl+J and Shift+Enter now work for inserting newlines in
ollama run - Reduced memory usage for GLM-4.7-Flash models
- Z-Image Turbo (Alibaba): 6B text-to-image model for photorealistic image generation
- Flux.2 Klein (Black Forest Labs): fastest image generation model to date
- /api/generate endpoint now supports image generation
Full changelog
- Z-Image Turbo: 6 billion parameter text-to-image model from Alibaba’s Tongyi Lab. It generates high-quality photorealistic images.
- Flux.2 Klein: Black Forest Labs’ fastest image-generation models to date.
New models
- GLM-4.7-Flash: As the strongest model in the 30B class, GLM-4.7-Flash offers a new option for lightweight deployment that balances performance and efficiency.
- LFM2.5-1.2B-Thinking: LFM2.5 is a new family of hybrid models designed for on-device deployment.
What's Changed
- Fixed issue where Ollama's macOS app would interrupt system shutdown
- Fixed
ollama createandollama showcommands for experimental models - The
/api/generateAPI can now be used for image generation - Fixed minor issues in Nemotron-3-Nano tool parsing
- Fixed issue where removing an image generation model would cause it to first load
- Fixed issue where
ollama rmwould only stop the first model in the list if it were running
Full Changelog: https://github.com/ollama/ollama/compare/v0.14.2...v0.14.3
- TranslateGemma: new open translation model collection supporting 55 languages, built on Gemma 3
- CLI: Shift + Enter (or Ctrl + j) now enters newlines
- Improved `/v1/responses` API conformance to OpenResponses specification
Full changelog
New models
- TranslateGemma: A new collection of open translation models built on Gemma 3, helping people communicate across 55 languages.
What's Changed
- Shift + Enter (or Ctrl + j) will now enter a newline in Ollama's CLI
- Improve
/v1/responsesAPI to better confirm to OpenResponses specification
New Contributors
- @yuhongsun96 made their first contribution in https://github.com/ollama/ollama/pull/13135
- @koaning made their first contribution in https://github.com/ollama/ollama/pull/13326
Full Changelog: https://github.com/ollama/ollama/compare/v0.14.1...v0.14.2
- Experimental image generation models (Z-Image-Turbo)
- More models in development (Qwen-Image, GLM-Image)
Full changelog
Image generation models (experimental)
Experimental image generation models are available for macOS and Linux (CUDA) in Ollama:
Available models
ollama run x/z-image-turbo
Note:
xis a username on ollama.com where experimental models are uploaded
More models coming soon:
- Qwen-Image-2512
- Qwen-Image-Edit-2511
- GLM-Image
What's Changed
- fix macOS auto-update signature verification failure
New Contributors
- @joshxfi made their first contribution in https://github.com/ollama/ollama/pull/13711
- @maternion made their first contribution in https://github.com/ollama/ollama/pull/13709
Full Changelog: https://github.com/ollama/ollama/compare/v0.14.0...v0.14.1
- Linux install bundles now use zst compression — ensure your tooling supports zst decompression.
- Modelfiles can now declare a minimum Ollama version via the REQUIRES command; models using this feature will require v0.14.0 or later.
- Experimental agent loop with bash tool via `ollama run --experimental`
- Anthropic API compatibility: /v1/messages endpoint support
- New Modelfile REQUIRES command for declaring minimum Ollama version