This release includes one breaking change that platform teams should review before upgrading.
✓ No known CVEs patched in this version
ReleasePort's take
Light signal. The IAI_MCP_EMBED_QUANTIZE environment variable now accepts only int8 (lowercase) or must be left unset; any other value crashes the daemon at startup.
Why it matters: If you set IAI_MCP_EMBED_QUANTIZE to an unsupported value, your deployment will fail on launch. Update configuration before upgrade.
Summary
AI summary: IAI_MCP_EMBED_QUANTIZE now accepts only int8 or unset; any other value crashes the daemon at startup.
Changes in this release
| Type | Severity | Summary | Source | Confidence | CVE |
|---|---|---|---|---|---|
| Breaking | Medium | `IAI_MCP_EMBED_QUANTIZE` accepts only `int8` (lowercase) or unset; other values crash the daemon at startup. | llm_adapter@2026-05-21 | high | None |
| Feature | Medium | Opt-in int8 embedding quantization via `IAI_MCP_EMBED_QUANTIZE=int8`. | llm_adapter@2026-05-21 | high | None |
| Feature | Medium | Contradiction-aware temporal validity adds `valid_from` and `valid_to` to `memory_recall` hits. | llm_adapter@2026-05-21 | high | None |
| Feature | Medium | Deterministic `overnight_digest` produces consistent shapes, with a structured zeroed default when no REM cycle runs. | llm_adapter@2026-05-21 | high | None |
| Feature | Medium | Glama TDQS lifted from C to B; MCP tools now declare annotations and a structured `outputSchema`. | llm_adapter@2026-05-21 | high | None |
| Feature | Medium | Codex CLI can be a capture target with `iai-mcp capture-hooks install --target codex\|claude\|all`. | llm_adapter@2026-05-21 | high | None |
| Feature | Medium | BENCHMARKS.md added, covering eight project benchmarks from M-01 token budget to M-08 LongMemEval-S. | llm_adapter@2026-05-21 | low | None |
| Feature | Low | `overnight_digest` key is always present in `memory_recall` responses, with a zeroed default when no REM cycle has run. | granite4.1:30b@2026-05-23-audit | low | None |
| Feature | Low | Ambient capture now works in both Claude Code and Codex sessions. | granite4.1:30b@2026-05-23-audit | low | None |
| Deprecation | Medium | `camouflaging_status` outputSchema field names changed: `formality_trend` → `trajectory_slope`, `anomaly_score` → `current_mean`, new `sample_count`. | llm_adapter@2026-05-21 | high | None |
| Bugfix | Low | Fixes `camouflaging_status` outputSchema field-name mismatch. | granite4.1:30b@2026-05-23-audit | low | None |
| Bugfix | Low | New `valid_from` / `valid_to` keys in recall hits are additive with default `None`; strict JSON-Schema consumers need to widen. | granite4.1:30b@2026-05-23-audit | low | None |
Full changelog
What's new
Opt-in int8 embedding quantization — IAI_MCP_EMBED_QUANTIZE=int8. Default fp32 path unchanged. Round-trip cos ≥ 0.99 on bge-small-en-v1.5.
Contradiction-aware temporal validity — memory_recall hits and anti-hits now carry derived valid_from / valid_to. Records contradicted by newer records are downweighted (not hidden) at recall time.
Deterministic overnight_digest — same inputs produce the same shape. The overnight_digest key is now always present in memory_recall responses with a structured zeroed default when no REM cycle has run.
Glama TDQS lifted from C to B — every MCP tool now declares annotations (readOnlyHint, destructiveHint, idempotentHint, openWorldHint, title) and a structured outputSchema. Fixes the camouflaging_status outputSchema field-name mismatch.
Codex CLI as a capture target — iai-mcp capture-hooks install --target codex|claude|all. Ambient capture now works in both Claude Code and Codex sessions.
BENCHMARKS.md — public methodology covering the eight project benchmarks (M-01 token budget through M-08 LongMemEval-S).
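The "round-trip cos ≥ 0.99" figure for int8 quantization can be illustrated with a minimal sketch. It assumes symmetric per-vector scaling, which the release notes do not confirm as the project's actual scheme, and it uses random 384-dimensional data in place of real bge-small-en-v1.5 embeddings. The helper names `quantize_int8`, `dequantize_int8`, and `cosine` are hypothetical.

```python
import math
import random


def quantize_int8(vec):
    """Symmetric per-vector int8 quantization: returns (codes, scale)."""
    max_abs = max(abs(x) for x in vec)
    # Guard against an all-zero vector, which would otherwise give scale 0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(x / scale))) for x in vec]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Stand-in data: random values, not real model embeddings.
rng = random.Random(0)
embedding = [rng.gauss(0.0, 1.0) for _ in range(384)]

codes, scale = quantize_int8(embedding)
restored = dequantize_int8(codes, scale)
round_trip_cos = cosine(embedding, restored)
```

Comparing `round_trip_cos` against 0.99 mirrors the kind of check the notes describe; the threshold itself comes from the release, the rest is illustrative.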
Heads up
- `IAI_MCP_EMBED_QUANTIZE` accepts only `int8` (lowercase) or unset. Any other value crashes the daemon at startup. Intentional, no silent fallback.
- New `valid_from` / `valid_to` keys in recall hits are additive (default `None`). Strict JSON-Schema consumers with `additionalProperties: false` need to widen.
- `camouflaging_status` outputSchema field names changed: `formality_trend` → `trajectory_slope`, `anomaly_score` → `current_mean`, new `sample_count`. Permissive consumers were already tolerant.
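For consumers that parse these payloads, a tolerant reader might look like the sketch below. The payload shapes are assumptions (plain dicts with top-level keys), and `normalize_camouflaging_status` and `validity_window` are hypothetical helpers, not part of iai-mcp. Only the field names and the `None` default come from the notes above.

```python
# Old -> new field names, as listed in the release notes.
RENAMED_FIELDS = {
    "formality_trend": "trajectory_slope",
    "anomaly_score": "current_mean",
}


def normalize_camouflaging_status(payload):
    """Return a copy of the payload keyed by the new field names."""
    out = dict(payload)
    for old, new in RENAMED_FIELDS.items():
        if old in out:
            value = out.pop(old)
            # If both names are present, the new name wins.
            out.setdefault(new, value)
    # sample_count is new in this release; older payloads will not carry it.
    out.setdefault("sample_count", None)
    return out


def validity_window(hit):
    """Read the additive keys without assuming they are present."""
    return hit.get("valid_from"), hit.get("valid_to")
```

Reading with `dict.get` keeps the same code working against responses from before and after the upgrade, which is the permissive behaviour the notes say was already tolerant.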
Thanks
Reddit user u/BeginningReflection4 — feedback and testing that shaped this release.
Full release notes: CHANGELOG.md
Breaking Changes
- `IAI_MCP_EMBED_QUANTIZE` now accepts only `int8` (lowercase) or unset; any other value causes the daemon to crash at startup.
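A pre-upgrade check along these lines could catch a bad value before the daemon does. This is a sketch, not the daemon's own validation: `check_embed_quantize` is a hypothetical helper, and because the notes do not say whether an empty string counts as unset, the sketch conservatively flags it.

```python
import os

ENV_VAR = "IAI_MCP_EMBED_QUANTIZE"
ALLOWED_VALUES = {"int8"}  # lowercase only, per the release notes


def check_embed_quantize(environ):
    """Return an error message if the value would be rejected, else None."""
    value = environ.get(ENV_VAR)
    if value is None:
        return None  # unset: the default fp32 path applies
    if value in ALLOWED_VALUES:
        return None
    # Assumption: anything else, including "" and "INT8", is treated as invalid.
    return f"{ENV_VAR}={value!r} is not supported; use 'int8' or unset it"


if __name__ == "__main__":
    problem = check_embed_quantize(os.environ)
    if problem:
        raise SystemExit(problem)
```

Run in a deployment pipeline before the upgrade step, the script exits non-zero with a readable message instead of leaving the failure to surface as a startup crash.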
About CodeAbra/iai-mcp