This release includes 3 breaking changes that platform teams should review before upgrading.
✓ No known CVEs patched in this version
Summary
AI summary: Migrated embedding model to Gemma2, reducing memory footprint from ~3 GB to ~545 MB.
Full changelog
Release v0.8.0: The Gemma Migration & Extreme Memory Optimization 🚀
This release marks a fundamental shift in the architecture of memory-mcp, solving the most critical issue reported by users: out-of-memory (OOM) crashes during codebase indexing.
What's New:
- Model Migration (Qwen → Gemma2): We migrated the default embedding model from `Qwen3-1.5B` to `unsloth/embeddinggemma-300m-qat-q4_0-unquantized`. This drops the memory footprint of the model from ~3GB down to just ~545MB while maintaining top-tier retrieval performance!
- Zero Configuration: The new Gemma model is fully open. You no longer need a HuggingFace account, HF_TOKEN, or any license agreements to run `memory-mcp`. It just works out of the box.
- Mimalloc Allocator: Replaced the system allocator with `mimalloc`. This drastically reduces memory fragmentation (especially on Alpine/Musl) and significantly boosts multi-threaded processing speeds.
- SurrealDB Stability (Throttling): We implemented smart batch-throttling during indexation. The indexer now pauses for 100-150ms after inserting vectors, completely eliminating `Transaction write conflict` (OCC Retries) inside SurrealDB.
- 2048d Vectors: The model natively generates and searches against 2048-dimensional vectors with `last_token_pooling` for immense accuracy. The database schema dynamically rebuilds its `HNSW` indices to accommodate the new dimension.
- Hardware Acceleration: Native release builds now enable `x86-64-v3` target optimizations, speeding up the underlying tensor math via AVX2.
- Cleanup: Removed the broken `accelerate` feature from Cargo to ensure proper compilation on Linux.
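The batch-throttling idea from the list above can be illustrated with a minimal sketch. This is not the project's actual indexer code: `throttle_delay`, `index_in_batches`, and the `insert_batch` callback are hypothetical names, and the deterministic spread of pauses stands in for whatever jitter the real implementation may use. It only shows the general pattern of pausing 100-150 ms after each batch write.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical helper: picks a pause inside the 100..=150 ms window.
/// The batch index spreads the pauses across the window deterministically;
/// a real indexer might use random jitter instead.
fn throttle_delay(batch_index: u64) -> Duration {
    const MIN_MS: u64 = 100;
    const SPAN_MS: u64 = 51; // 51 distinct offsets cover 100..=150 inclusive
    Duration::from_millis(MIN_MS + batch_index.wrapping_mul(37) % SPAN_MS)
}

/// Writes vectors in batches and sleeps after every batch so that
/// consecutive write transactions are spaced out in time.
/// `insert_batch` is a placeholder for the real database write.
/// Returns the number of batches written.
fn index_in_batches<F>(vectors: &[Vec<f32>], batch_size: usize, mut insert_batch: F) -> usize
where
    F: FnMut(&[Vec<f32>]),
{
    let mut batches = 0;
    // `max(1)` guards against a zero batch size, which `chunks` would reject.
    for (i, batch) in vectors.chunks(batch_size.max(1)).enumerate() {
        insert_batch(batch);
        thread::sleep(throttle_delay(i as u64));
        batches += 1;
    }
    batches
}
```

The trade-off in this pattern is throughput for stability: each batch costs up to 150 ms of extra wall-clock time, in exchange for fewer overlapping write transactions.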
Performance:
On a standard system, the container now sits comfortably at ~350MB of RAM usage (down from ~4GB!) during massive codebase indexation, keeping your system fast and responsive.
Breaking Changes
- Removed `accelerate` feature from Cargo
- Changed default embedding model from `Qwen3-1.5B` to `unsloth/embeddinggemma-300m-qat-q4_0-unquantized` (model memory footprint drops from ~3 GB to ~545 MB)
- Database schema dynamically rebuilds HNSW indices for 2048‑dimensional vectors
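To make the vector-dimension change more concrete, here is a rough sketch of last-token pooling followed by a dimension check. It is an illustration under assumptions, not code from memory-mcp: `last_token_pool`, `check_dim`, and `EXPECTED_DIM` are hypothetical names, and the attention-mask convention (1 for real tokens, 0 for padding) is assumed rather than taken from the release notes.

```rust
/// Dimension named in the release notes; the constant itself is hypothetical.
const EXPECTED_DIM: usize = 2048;

/// Last-token pooling: returns the hidden state of the last non-padding token.
/// `hidden` is laid out as [seq_len][dim]; `attention_mask` holds 1 for real
/// tokens and 0 for padding. Returns `None` if every position is padding.
fn last_token_pool<'a>(hidden: &'a [Vec<f32>], attention_mask: &[u8]) -> Option<&'a [f32]> {
    let last = attention_mask.iter().rposition(|&m| m != 0)?;
    hidden.get(last).map(|v| v.as_slice())
}

/// Rejects vectors whose length does not match the index dimension,
/// for example embeddings left over from a different model.
fn check_dim(vector: &[f32]) -> Result<(), String> {
    if vector.len() == EXPECTED_DIM {
        Ok(())
    } else {
        Err(format!(
            "expected {} dimensions, got {}",
            EXPECTED_DIM,
            vector.len()
        ))
    }
}
```

A check like `check_dim` shows why the index rebuild is listed as breaking: vectors of any other length cannot be searched against a 2048-dimensional index.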
About pomazanbohdan/memory-mcp-1file
A self-contained Memory server with single-binary architecture (embedded DB & models, no dependencies). Provides persistent semantic and graph-based memory for AI agents.