This release adds 3 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
AI summary: v0.12.0 is a mixed release of new features and bug fixes. It also points to the community flutter_gemma package (https://github.com/DenisovAV/flutter_gemma).
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Feature | Low | Swift APIs natively integrate LiteRT-LM into iOS apps with Metal GPU acceleration. | None |
| Feature | Low | Web JavaScript APIs run models in browsers with high performance via WebGPU/CPU. | None |
| Feature | Low | LiteRT-LM CLI now supports the NPU backend across Linux, macOS, and Windows. | None |
| Feature | Low | Community-maintained Flutter APIs enable cross-platform Flutter applications using the flutter_gemma package. | None |
| Feature | Low | NPU support for Intel OpenVINO added with the --backend=npu flag in the CLI and in the Python API. | None |
| Feature | Low | Add --max-num-tokens option to the benchmark command in the CLI. | None |
| Feature | Low | New Python API function to construct a Message object. | None |
| Bugfix | Low | Pin the CLI version to the corresponding API version (0.12.0 CLI uses 0.12.0 API). | None |
| Bugfix | Low | Correct the GPU activation type in the Python API, restoring prefill speed to normal. | None |
| Bugfix | Low | Propagate the cache_dir setting to the vision and audio backends in the Python API. | None |
Full changelog
🔥 What's New (v0.12.0)
- 🚀 Swift APIs: Natively integrate LiteRT-LM into iOS applications with Metal GPU acceleration.
- 🚀 Web JavaScript APIs: Run models inside web browsers with high performance via web GPU/CPU.
- LiteRT-LM CLI Update: The command-line interface now supports the NPU backend, in addition to CPU and GPU, across Linux, macOS, and Windows.
- 🚀 Community-Maintained Flutter APIs: Build cross-platform Flutter applications using the community flutter_gemma package.
Features and bug fixes:
CLI
- [feature] NPU support for Intel OpenVINO with --backend=npu.
- [feature] Add --max-num-tokens (context length) to the benchmark command.
- [bugfix] Pin CLI version with API version. (0.12.0 CLI uses 0.12.0 API)
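The CLI items above can be sketched as a small Python helper that assembles a benchmark invocation and checks the version pin. This is a minimal sketch under stated assumptions, not the project's implementation: the executable name `litert-lm`, the positional model path, the `cpu` and `gpu` flag values, and both helper names are hypothetical. Only the `benchmark` command, `--backend=npu`, and `--max-num-tokens` come from the changelog.

```python
from typing import List


def build_benchmark_command(model_path: str, backend: str = "npu",
                            max_num_tokens: int = 4096) -> List[str]:
    """Assemble an argv list for a benchmark run.

    Assumptions: the executable is called "litert-lm", it takes the model
    path as a positional argument, and "cpu"/"gpu" are valid backend values.
    Only `benchmark`, `--backend=npu` and `--max-num-tokens` are named in
    the release notes.
    """
    if backend not in {"cpu", "gpu", "npu"}:
        raise ValueError(f"unsupported backend: {backend}")
    if max_num_tokens <= 0:
        raise ValueError("max_num_tokens must be positive")
    return [
        "litert-lm",   # hypothetical executable name
        "benchmark",
        model_path,    # hypothetical positional argument
        f"--backend={backend}",
        f"--max-num-tokens={max_num_tokens}",
    ]


def versions_pinned(cli_version: str, api_version: str) -> bool:
    """Return True when the CLI and API versions match exactly."""
    return cli_version.strip() == api_version.strip()
```

Returning an argv list instead of a shell string keeps quoting out of the picture if the list is later passed to `subprocess.run`. The exact-match check mirrors the note that the 0.12.0 CLI uses the 0.12.0 API.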
Python API
- [feature] NPU support for Intel OpenVINO.
- [feature] New API to construct Message object.
- [bugfix] Correct the GPU activation type. Prefill speed is back to normal (it was limited to 50%).
- [bugfix] Propagate cache_dir to the vision and audio backends.
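To illustrate what propagating `cache_dir` to the vision and audio backends could mean, here is a minimal sketch using plain dataclasses. Every name in it (`BackendSettings`, `EngineSettings`, `propagate_cache_dir`) is hypothetical and does not reflect the real LiteRT-LM Python API. It also assumes that a backend with its own `cache_dir` keeps it, which the changelog does not state.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass(frozen=True)
class BackendSettings:
    """Hypothetical per-backend settings; not the library's real type."""
    name: str
    cache_dir: Optional[str] = None


@dataclass(frozen=True)
class EngineSettings:
    """Hypothetical top-level settings with main, vision and audio backends."""
    cache_dir: Optional[str] = None
    main: BackendSettings = field(default_factory=lambda: BackendSettings("gpu"))
    vision: BackendSettings = field(default_factory=lambda: BackendSettings("gpu"))
    audio: BackendSettings = field(default_factory=lambda: BackendSettings("cpu"))


def propagate_cache_dir(settings: EngineSettings) -> EngineSettings:
    """Copy the top-level cache_dir to every backend that has none set."""

    def fill(backend: BackendSettings) -> BackendSettings:
        # Assumption: an explicit per-backend cache_dir wins over the default.
        if backend.cache_dir is None:
            return replace(backend, cache_dir=settings.cache_dir)
        return backend

    return replace(
        settings,
        main=fill(settings.main),
        vision=fill(settings.vision),
        audio=fill(settings.audio),
    )
```

The bug described above corresponds to the case where only `main` would receive the directory. In this sketch all three backends go through the same `fill` step, so they cannot drift apart.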