ollama

v0.23.1 Feature

This release contains three changes: Gemma 4 MTP speculative decoding for the MLX runner, MLX and MLX-C threading fixes, and a Go toolchain bump to 1.26.

Published 29d Model Serving & MLOps
✓ No known CVEs patched

Topics

deepseek gemma gemma3 glm go gpt-oss llama llama3 llm llms minimax mistral ollama qwen

Summary

AI summary

Gemma 4 MTP speculative decoding is now supported on Macs through the MLX runner, with over a 2x speedup reported for the Gemma 4 31B model on coding tasks.

Full changelog

Gemma 4 MTP (Multi-Token Prediction) for the MLX runner

Gemma 4 MTP speculative decoding is now supported on Macs. This can give over a 2x speed increase for the Gemma 4 31B model on coding tasks.

ollama run gemma4:31b-coding-mtp-bf16
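
For teams that want to check the reported speedup on their own hardware, one option is to read the generation statistics that Ollama returns with each response. The following is a minimal sketch, not an official benchmark. It assumes an Ollama server is running locally on the default port 11434, and that a non-streaming /api/generate response carries the eval_count (generated tokens) and eval_duration (nanoseconds) fields. The helper names and the prompt are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama's default local endpoint; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(response: dict) -> float:
    """Compute decode throughput from a non-streaming /api/generate response.

    eval_count is the number of generated tokens and eval_duration is the
    time spent generating them, in nanoseconds.
    """
    duration_ns = response["eval_duration"]
    if duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    return response["eval_count"] / (duration_ns / 1e9)


def generate(model: str, prompt: str) -> dict:
    """Send one non-streaming generation request and return the parsed JSON."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


if __name__ == "__main__":
    # Hypothetical coding prompt; use prompts that resemble your real workload.
    prompt = "Write a Python function that merges two sorted lists."
    result = generate("gemma4:31b-coding-mtp-bf16", prompt)
    print(f"{tokens_per_second(result):.1f} tokens/s")
```

Running the same prompt against a Gemma 4 31B tag without MTP that you have already pulled, and comparing the two tokens/s figures, gives a rough local estimate of the speedup. Results will vary with the prompt, the machine, and whether the model was already loaded.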

What's Changed

  • Update MLX and MLX-C with threading fixes by @dhiltgen in https://github.com/ollama/ollama/pull/15845
  • go: bump to 1.26 by @ParthSareen in https://github.com/ollama/ollama/pull/15904
  • Add Gemma 4 MTP speculative decoding by @pdevine in https://github.com/ollama/ollama/pull/15980

Full Changelog: https://github.com/ollama/ollama/compare/v0.23.0...v0.23.1-rc0


About ollama

Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.
