This release fixes issues for SREs watching stability and regressions.
✓ No known CVEs patched in this version
Summary
AI summary: Fixed an OOM crash during Dart/Flutter project indexing by truncating BERT input to 512 tokens and reducing the batch size.
Full changelog
🛡️ Fix OOM Crash During Code Indexing
Fixed a server crash that occurred about 23% of the way through indexing a Dart/Flutter project (140 of 605 files), caused by uncontrolled memory growth in the BERT embedding pipeline.
Root Cause
BERT self-attention is O(n²) in sequence length. Without token truncation, chunks of ~1500 tokens produced ~12.4 GB of attention tensors per batch of 32 on CPU, which guaranteed an OOM.
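The scaling argument can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it compares relative costs rather than absolute bytes, since the number of heads and the element size cancel out in a ratio. It assumes attention tensors grow with `batch_size * seq_len²` and per-token tensors with `batch_size * seq_len`.

```rust
/// Relative size of the attention score tensors, which grow with
/// batch_size * seq_len^2 (heads and bytes per element cancel in a ratio).
fn attention_cost(batch_size: u64, seq_len: u64) -> u64 {
    batch_size * seq_len * seq_len
}

/// Relative size of per-token tensors, which grow with batch_size * seq_len.
fn token_cost(batch_size: u64, seq_len: u64) -> u64 {
    batch_size * seq_len
}

/// How many times smaller the "after" configuration is than the "before" one.
fn reduction_factor(before: u64, after: u64) -> f64 {
    before as f64 / after as f64
}
```

Going from a batch of 32 sequences of ~1500 tokens to a batch of 8 sequences of 512 tokens gives `reduction_factor(attention_cost(32, 1500), attention_cost(8, 512))` of roughly 34 and `reduction_factor(token_cost(32, 1500), token_cost(8, 512))` of roughly 12, which is in line with the before/after memory figures reported for this release.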
What is Fixed
| Parameter | Before | After |
|-----------|--------|-------|
| Token truncation | ❌ None (unbounded) | ✅ 512 (BERT max) |
| Batch size | 32 (GPU-oriented) | 8 (CPU-optimized) |
| Peak attention RAM | ~12.4 GB | ~360 MB |
| Peak tensor RAM | ~141 MB | ~12 MB |
| Worker panic handling | ❌ Silently dropped | ✅ Logged |
| Queue capacity | 5000 | 1000 |
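The last two rows, panic logging and a bounded queue, can be sketched with the standard library alone. This is not the project's code: the release describes nested `tokio::spawn` in main.rs, whereas the sketch uses `std::thread` and `std::sync::mpsc::sync_channel` as dependency-free stand-ins, and `run_worker` and `fill_bounded_queue` are hypothetical names. The idea carries over, since awaiting a tokio `JoinHandle` likewise yields an `Err` when the task panicked.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};
use std::thread;

/// Queue capacity after this release, per the table above.
const QUEUE_CAPACITY: usize = 1000;

/// Run a job on its own thread and surface a panic as a logged error
/// instead of letting it disappear silently.
fn run_worker<F>(job: F) -> Result<(), String>
where
    F: FnOnce() + Send + 'static,
{
    thread::spawn(job).join().map_err(|payload| {
        // Panic payloads are usually a &str or a String.
        let msg = payload
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .or_else(|| payload.downcast_ref::<String>().cloned())
            .unwrap_or_else(|| "unknown panic".to_string());
        eprintln!("worker panicked: {}", msg);
        msg
    })
}

/// Offer `n` items to a bounded channel that nobody is draining and
/// return how many were accepted before the queue reported it was full.
fn fill_bounded_queue(n: usize) -> usize {
    // `_rx` keeps the receiver alive so sends fail with Full, not Disconnected.
    let (tx, _rx) = sync_channel::<usize>(QUEUE_CAPACITY);
    let mut accepted = 0;
    for i in 0..n {
        match tx.try_send(i) {
            Ok(()) => accepted += 1,
            Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}
```

With a bounded channel, a stalled consumer caps queue memory at `QUEUE_CAPACITY` items; producers then have to wait or drop work rather than grow the queue without limit.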
Changes
- engine.rs: Add `MAX_SEQ_LEN=512` truncation to both `embed()` and `embed_batch()` to prevent O(n²) attention memory explosion
- worker.rs: Reduce embedding batch size from 32 to 8 (CPU-optimal for BERT models)
- main.rs: Catch worker panics via nested `tokio::spawn` to prevent silent connection hangs
- adaptive_queue.rs: Reduce channel capacity from 5000 to 1000 to bound queue memory
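The truncation and batching changes can be sketched as follows. This is a minimal illustration, not the code in engine.rs or worker.rs: `truncate_tokens` and `make_batches` are hypothetical helpers, token IDs are assumed to be `u32`, and a real BERT pipeline would typically truncate through its tokenizer so that special tokens such as the trailing separator are preserved.

```rust
/// Limits named in this release's notes.
const MAX_SEQ_LEN: usize = 512;
const BATCH_SIZE: usize = 8;

/// Cap a token-ID sequence at MAX_SEQ_LEN. Shorter sequences are unchanged.
/// (Simplification: this drops the tail without re-adding special tokens.)
fn truncate_tokens(mut ids: Vec<u32>) -> Vec<u32> {
    ids.truncate(MAX_SEQ_LEN);
    ids
}

/// Truncate every chunk, then group the chunks into batches of at most
/// BATCH_SIZE sequences each.
fn make_batches(chunks: Vec<Vec<u32>>) -> Vec<Vec<Vec<u32>>> {
    let truncated: Vec<Vec<u32>> = chunks.into_iter().map(truncate_tokens).collect();
    truncated
        .chunks(BATCH_SIZE)
        .map(|batch| batch.to_vec())
        .collect()
}
```

For example, 20 chunks of 1500 tokens each would come out as three batches of 8, 8, and 4 sequences, with every sequence capped at 512 tokens, so no single forward pass sees more than 8 × 512 tokens.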
Install
npx memory-mcp-1file
Full Changelog: https://github.com/pomazanbohdan/memory-mcp-1file/compare/v0.2.5...v0.2.6
About pomazanbohdan/memory-mcp-1file
A self-contained Memory server with single-binary architecture (embedded DB & models, no dependencies). Provides persistent semantic and graph-based memory for AI agents.