pomazanbohdan/memory-mcp-1file

v0.2.6 Bugfix

This release fixes a stability issue relevant to SREs tracking crashes and regressions.

Published 5mo MCP Data & Storage

✓ No known CVEs patched

Summary

AI summary

Fixed an out-of-memory (OOM) crash during Dart/Flutter project indexing by truncating BERT input sequences to 512 tokens and reducing the embedding batch size.

Full changelog

🛡️ Fix OOM Crash During Code Indexing

Fixed a server crash that occurred about 23% of the way through indexing a Dart/Flutter project (140 of 605 files), caused by uncontrolled memory growth in the BERT embedding pipeline.

Root Cause

BERT self-attention is O(n²) in sequence length. Without token truncation, chunks of ~1500 tokens produced ~12.4 GB of attention tensors per batch of 32 on CPU, which made an OOM crash a certainty.
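
The quadratic growth can be checked with a rough back-of-the-envelope sketch. The function names below are hypothetical, and the head count (12) and element size (4 bytes, f32) used in the usage note are assumptions typical of BERT-base, not values stated in this release. Because the absolute figures depend on how many intermediate buffers the runtime keeps alive, the sketch is most useful for comparing the before and after configurations by ratio.

```rust
/// Bytes for one attention score matrix:
/// batch * heads * seq_len * seq_len * bytes_per_elem.
/// `heads` and `bytes_per_elem` are assumptions supplied by the caller.
fn attention_score_bytes(batch: u64, heads: u64, seq_len: u64, bytes_per_elem: u64) -> u64 {
    batch * heads * seq_len * seq_len * bytes_per_elem
}

/// Ratio of attention score memory between two (batch, seq_len) settings.
/// Head count and element size cancel out, so they are not needed here.
fn attention_growth_ratio(before: (u64, u64), after: (u64, u64)) -> f64 {
    let (b0, n0) = before;
    let (b1, n1) = after;
    (b0 * n0 * n0) as f64 / (b1 * n1 * n1) as f64
}
```

Calling `attention_growth_ratio((32, 1500), (8, 512))` gives a factor of roughly 34, which is in line with the drop from ~12.4 GB to ~360 MB reported in this release. Under the assumed 12 heads and f32 elements, `attention_score_bytes(8, 12, 512, 4)` is about 100 MB for a single score matrix.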

What is Fixed

| Parameter | Before | After |
|-----------|--------|-------|
| Token truncation | ❌ None (unbounded) | ✅ 512 (BERT max) |
| Batch size | 32 (GPU-oriented) | 8 (CPU-optimized) |
| Peak attention RAM | ~12.4 GB | ~360 MB |
| Peak tensor RAM | ~141 MB | ~12 MB |
| Worker panic handling | ❌ Silently dropped | ✅ Logged |
| Queue capacity | 5000 | 1000 |
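
The first two rows of the table can be illustrated with a minimal sketch, assuming token IDs are plain `u32` values. The constants mirror the values in this release, but `truncate_tokens` and `make_batches` are hypothetical helpers, not the project's actual `embed()` or `embed_batch()` code. A real tokenizer would typically also keep its special tokens (such as a trailing separator) in place when truncating, which this sketch ignores.

```rust
const MAX_SEQ_LEN: usize = 512; // BERT maximum, per the release notes
const BATCH_SIZE: usize = 8;    // reduced from 32, per the release notes

/// Cap a token sequence at MAX_SEQ_LEN; shorter inputs are unchanged.
fn truncate_tokens(mut ids: Vec<u32>) -> Vec<u32> {
    ids.truncate(MAX_SEQ_LEN);
    ids
}

/// Truncate every chunk, then group the chunks into batches of at most BATCH_SIZE.
fn make_batches(chunks: Vec<Vec<u32>>) -> Vec<Vec<Vec<u32>>> {
    let truncated: Vec<Vec<u32>> = chunks.into_iter().map(truncate_tokens).collect();
    truncated
        .chunks(BATCH_SIZE)
        .map(|batch| batch.to_vec())
        .collect()
}
```

With both limits in place, the largest batch holds 8 × 512 tokens, so per-batch memory has a fixed ceiling regardless of how long the source chunks are.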

Changes

- `engine.rs`: Add `MAX_SEQ_LEN=512` truncation to both `embed()` and `embed_batch()`, preventing the O(n²) attention memory explosion
- `worker.rs`: Reduce embedding batch size from 32 to 8 (CPU-optimal for BERT models)
- `main.rs`: Catch worker panics via nested `tokio::spawn`, preventing silent connection hangs
- `adaptive_queue.rs`: Reduce channel capacity from 5000 to 1000 to bound queue memory
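
The last two changes follow a general pattern: run the worker in its own task so a panic surfaces as an error on the join handle, and use a bounded channel so producers cannot grow the queue without limit. The sketch below shows that pattern with standard-library threads and `std::sync::mpsc::sync_channel` rather than the `tokio::spawn` tasks the release mentions, so it is an analogy under that substitution and not the project's code. `run_supervised` and `enqueue_all` are hypothetical names.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};
use std::thread;

const QUEUE_CAPACITY: usize = 1000; // reduced from 5000, per the release notes

/// Run `work` on its own thread and log a panic instead of dropping it silently.
fn run_supervised<F>(name: &str, work: F) -> Result<(), String>
where
    F: FnOnce() + Send + 'static,
{
    match thread::spawn(work).join() {
        Ok(()) => Ok(()),
        Err(payload) => {
            // Panic payloads are usually &str or String; fall back otherwise.
            let msg = payload
                .downcast_ref::<&str>()
                .map(|s| s.to_string())
                .or_else(|| payload.downcast_ref::<String>().cloned())
                .unwrap_or_else(|| "unknown panic payload".to_string());
            eprintln!("worker {} panicked: {}", name, msg);
            Err(msg)
        }
    }
}

/// Try to enqueue `items` jobs without blocking; returns (accepted, rejected).
fn enqueue_all(items: usize) -> (usize, usize) {
    // Keep the receiver alive so sends fail with Full, not Disconnected.
    let (tx, _rx) = sync_channel::<usize>(QUEUE_CAPACITY);
    let (mut accepted, mut rejected) = (0, 0);
    for i in 0..items {
        match tx.try_send(i) {
            Ok(()) => accepted += 1,
            Err(TrySendError::Full(_)) => rejected += 1,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    (accepted, rejected)
}
```

In this sketch a full queue rejects new jobs outright. A real indexer would more likely block or apply backpressure at that point, and which of those the project does is not stated in the release.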

Install

npx memory-mcp-1file

Full Changelog: https://github.com/pomazanbohdan/memory-mcp-1file/compare/v0.2.5...v0.2.6

About pomazanbohdan/memory-mcp-1file

A self-contained Memory server with single-binary architecture (embedded DB & models, no dependencies). Provides persistent semantic and graph-based memory for AI agents.