Release history
memvid releases
- Added `stale_index_skips` counter to `SearchResponse`
- Exposed `stale_index_skips` in Python and Node SDK responses
Full changelog
v2.0.139
Release Date: March 13, 2026
Overview
This release completes the fix for MV005 (Time index track is invalid: frame id out of range) by extending graceful stale frame_id handling to all search paths — find(), ask(), Tantivy, lex fallback, vec search, and temporal metadata lookups.
🐛 Bug Fixes
MV005 crash in find() and ask() search paths — Issue #196 (continued)
- v2.0.138 fixed the `timeline.rs` path only; stale frame_ids in Tantivy evaluation, snippet assembly, lex fallback, and temporal metadata lookups still caused hard crashes
- All 5 search-path locations now gracefully skip stale frame_ids with `tracing::warn!` instead of returning `MV005` errors
- Added `stale_index_skips` counter on `SearchResponse` so callers can detect index degradation
- Exposed `stale_index_skips` in both Python and Node SDK responses (only present when > 0)
- CLI search command updated to include the new field
Root cause: Search indexes (Tantivy/lex) can hold frame_ids that no longer exist in toc.frames due to instant_index writing WAL sequence numbers, stale on-disk lex segments, or reopening files where the index loaded from a prior state.
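The skip-and-count behavior described above can be illustrated with a minimal Python sketch. It is not the actual Rust implementation: `resolve_hits`, the `hits` field, and the list-of-strings frame table are hypothetical stand-ins, and only the `stale_index_skips` counter name comes from the release notes.

```python
import logging
from dataclasses import dataclass, field

log = logging.getLogger("memvid.sketch")


@dataclass
class SearchResponse:
    # Hypothetical response shape; only `stale_index_skips` is named in the notes.
    hits: list = field(default_factory=list)
    stale_index_skips: int = 0


def resolve_hits(index_hits, frames):
    """Map (frame_id, score) index hits onto frames, skipping stale ids."""
    response = SearchResponse()
    for frame_id, score in index_hits:
        if not 0 <= frame_id < len(frames):
            # Stale entry: warn and count it instead of raising a hard error.
            log.warning("skipping stale frame_id %d (%d frames known)",
                        frame_id, len(frames))
            response.stale_index_skips += 1
            continue
        response.hits.append((frames[frame_id], score))
    return response


frames = ["intro", "setup", "usage"]
# The index still references frame 7, which no longer exists.
response = resolve_hits([(0, 0.9), (7, 0.8), (2, 0.5)], frames)
print(response.hits, response.stale_index_skips)
```

A caller could treat a non-zero `stale_index_skips` as a hint that the index has drifted from the frame table and may need rebuilding.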
📚 Related Issues
- #196 - `ask()` fails with "Time index track is invalid: frame id out of range"
- #204 - `commit()` raises `AttributeError` on `_MemvidCore` (tracked, not addressed in this release)
Fixed `lex_enabled`/`vec_enabled` state loss on file reopen and prevented `ask()` crashes from out‑of‑range frame IDs.
Full changelog
v2.0.138
Release Date: March 3, 2026
Overview
This release fixes two SDK issues: lex_enabled/vec_enabled state not persisting after re-opening .mv2 files, and ask() crashing with "frame id out of range" on files with inconsistent time indexes.
🐛 Bug Fixes
lex_enabled/vec_enabled reset to None on re-open — Issue #194
- Added `lex_enabled` and `vec_enabled` fields to the core `Stats` struct so SDKs can read runtime search engine state
- Python SDK now auto-detects `vec_available` from disk state when opening a file (matches the pattern already used in `ask()`)
- `stats()` in both Python and Node SDKs now includes `lex_enabled` and `vec_enabled` keys
- Previously, `use()` with default `enable_vec=False` would report `vec_available=false` even when the file contained a vec index, causing `find(mode="semantic")` to error immediately
ask() "frame id out of range" on fresh .mv2 files — Issue #196
- `build_timeline()` now gracefully skips time index entries that reference out-of-range frame IDs instead of returning a hard error
- Logs a `tracing::warn!` for observability when skipping invalid entries
- This matches the existing graceful-skip pattern used in `ask.rs` fallback timeline responses
📚 Related Issues
- #194 - `lex_enabled`/`vec_enabled` reset to `None` on re-open
- #196 - `ask()` "frame id out of range" on fresh .mv2 files
- Removed vulnerable SheetJS `xlsx` dependency from `@memvid/sdk` (CVE-2024-22363, CVE-2023-30533).
- Structured XLSX extraction pipeline with table detection, OOXML metadata parsing, and semantic chunking.
- New `XlsxReader::extract_structured()` API providing high‑accuracy spreadsheet extraction.
Full changelog
v2.0.137
Release Date: February 15, 2026
Overview
This release adds a structured XLSX extraction pipeline with table detection, OOXML metadata parsing, and semantic chunking. It also removes a vulnerable xlsx (SheetJS) dependency from the Node SDK, fixes the CLI deploy pipeline for proprietary crate handling, and includes clippy/lint fixes and documentation updates.
🚀 New Features
Structured XLSX Extraction Pipeline (memvid-core)
- New `XlsxReader::extract_structured()` API for high-accuracy spreadsheet extraction
- Automatic table boundary and header detection via heuristics and OOXML table definitions
- Row-aligned semantic chunking that never splits rows across chunk boundaries
- Formats rows as `Header: Value | Header: Value` pairs for optimal search accuracy
- OOXML metadata parsing: number formats (dates, currency, percentages), merged cell regions, named table definitions
- Column type inference (text, integer, float, date, currency, percentage, boolean)
- Backward-compatible flat text output alongside structured chunks
- New modules: `xlsx_chunker`, `xlsx_ooxml`, `xlsx_table_detect`
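To make the row formatting and row-aligned chunking concrete, here is a small Python sketch. It only mirrors the idea from the list above; `format_row`, `chunk_rows`, and the `max_chars` budget are hypothetical names and do not reflect the actual `xlsx_chunker` API.

```python
def format_row(headers, row):
    """Render one row as 'Header: Value | Header: Value'."""
    return " | ".join(f"{h}: {v}" for h, v in zip(headers, row))


def chunk_rows(headers, rows, max_chars=100):
    """Group formatted rows into chunks; a row is never split across chunks."""
    chunks, current, size = [], [], 0
    for row in rows:
        line = format_row(headers, row)
        # Close the current chunk if this whole row would exceed the budget.
        if current and size + len(line) + 1 > max_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line) + 1  # +1 for the newline joining rows
    if current:
        chunks.append("\n".join(current))
    return chunks


headers = ["Region", "Quarter", "Revenue"]
rows = [["EMEA", "Q1", 1200], ["APAC", "Q1", 950], ["AMER", "Q1", 2100]]
chunks = chunk_rows(headers, rows, max_chars=100)
for chunk in chunks:
    print(chunk, end="\n---\n")
```

Because the size check happens before a row is appended, a chunk boundary can only fall between rows; a single oversized row simply becomes its own chunk.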
Remove Vulnerable xlsx Dependency — Issue #198
- Removed SheetJS `xlsx` from `@memvid/sdk` (CVE-2024-22363, CVE-2023-30533)
- Production code already used ExcelJS; only example files were updated
- Downstream users no longer receive Dependabot security alerts from `@memvid/sdk`
CLI Deploy Fix: Proprietary Crate Handling
- Made `memvid-ghostpack` optional in `memvid-ask-model` and removed from workspace members
- CI builds no longer fail when proprietary crates are absent (`.gitignore`'d)
- Ghost model kind returns a clean error when the runtime is unavailable
🐛 Bug Fixes
- Fixed clippy pedantic lints (`implicit_clone`, `cast_possible_truncation`)
- Fixed `dead_code` warning for `propagate_merged_cells`
- Resolved VecIndexManifest model field lint
- `xlsx_structured` tests now gracefully skip on CI when fixture file is absent
📝 Documentation
- Chinese (Simplified) README translation (#193 by @nightire)
- README updates (@mo-omar-0197)
📚 Related Issues & PRs
- #198 — Remove vulnerable xlsx (SheetJS) dependency (@intergrado-cg report, @Olow304 fix)
- #193 — Chinese README translation (@nightire, merged by @sharafdin)
🙏 Contributors
Thank you to all contributors who made this release possible:
- @Olow304 — Structured XLSX pipeline, xlsx vulnerability fix, CLI deploy fix, clippy/lint cleanup
- @nightire — Chinese (Simplified) README translation
- @sharafdin — PR review and merge
- @mo-omar-0197 — README updates
- @intergrado-cg — Reported xlsx security vulnerability (#198)
- Frame-level ACL enforcement across search, ask, and replay paths (opt‑in)
- Strict vector index model consistency to prevent silent mismatches
- OpenAI API added as an embedding provider option
Full changelog
Release Date: February 6, 2026
Overview
This release adds frame-level ACL (Access Control Lists), vector index model consistency enforcement, symspell data corruption fixes, and several CI/build improvements. It also includes README documentation updates and ONNX Runtime noise suppression on macOS.
🚀 New Features
Frame-Level ACL Enforcement
- Added ACL (Access Control List) plumbing across search, ask, and replay paths
- Per-frame access control enables fine-grained permission enforcement on chunks
- Robustness fixes for ACL boundary conditions
- New tests and benchmark/example updates for ACL workflows
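As an illustration of per-frame access control, the following Python sketch filters frames by a caller scope. Everything here is an assumption made for the example: the `Frame` type, the `allowed_scopes` field, and the rule that a missing scope disables filtering are hypothetical and not taken from the memvid ACL types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    frame_id: int
    text: str
    # Hypothetical field: an empty set means the frame is unrestricted.
    allowed_scopes: frozenset = frozenset()


def filter_by_acl(frames, caller_scope: Optional[str]):
    """Return only the frames the caller may see."""
    if caller_scope is None:
        # Assumed opt-in behavior: no ACL scope means no per-frame checks.
        return list(frames)
    return [f for f in frames
            if not f.allowed_scopes or caller_scope in f.allowed_scopes]


frames = [
    Frame(0, "public onboarding notes"),
    Frame(1, "quarterly revenue figures", frozenset({"finance"})),
]
print([f.frame_id for f in filter_by_acl(frames, "support")])
```

Applying the same filter in every read path (search, ask, replay) is what keeps a restricted frame from leaking through one of them.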
Vector Index Model Consistency (PR #188)
- Enforces strict binding between vector index and embedding model
- Prevents silent model mismatch corruption when switching embedding providers
- Ensures vector search results are always consistent with the model used at index time
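The binding between an index and its embedding model can be sketched as a check performed on every write. This Python example is illustrative only; `VecIndex`, `ModelMismatchError`, and the model names are hypothetical and do not describe the actual memvid-core types.

```python
from dataclasses import dataclass, field


class ModelMismatchError(ValueError):
    """Raised when a vector does not belong to the index's bound model."""


@dataclass
class VecIndex:
    model: str        # embedding model fixed when the index is created
    dimension: int    # vector length produced by that model
    vectors: list = field(default_factory=list)

    def check(self, model, vector):
        if model != self.model:
            raise ModelMismatchError(
                f"index is bound to {self.model!r}, got {model!r}")
        if len(vector) != self.dimension:
            raise ModelMismatchError(
                f"expected {self.dimension} dimensions, got {len(vector)}")

    def add(self, model, vector):
        self.check(model, vector)
        self.vectors.append(list(vector))


index = VecIndex(model="model-a", dimension=3)
index.add("model-a", [0.1, 0.2, 0.3])
try:
    index.add("model-b", [0.1, 0.2, 0.3])  # different provider, same size
    rejected = False
except ModelMismatchError as err:
    rejected = True
    print("rejected:", err)
```

Failing loudly matters because vectors from two models can share a dimension while living in unrelated spaces, so a size check alone would not catch the mismatch.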
SymSpell Cleanup Fix & Dictionary Tooling (PR #187)
- Fixed `symspell_cleanup` data corruption bug
- Added dictionary download tooling for easier setup
- More reliable spell-correction preprocessing for search queries
OpenAI API Embedding Provider (PR #173)
- Added OpenAI API as an embedding provider option
- Enables using OpenAI embeddings alongside local ONNX models
- Flexible embedding backend selection
🐛 Bug Fixes
ONNX Runtime Stderr Suppression (macOS)
- Suppressed noisy ONNX Runtime warnings on macOS stderr
- Cleaner console output during normal operation
CI Build Fixes
- Added missing `#[cfg(feature = "lex")]` guards for tantivy-dependent code
- Fixed CI cache key to use `Cargo.toml` hash instead of missing `Cargo.lock`
- Committed `Cargo.lock` for reproducible CI builds
- Moved target-specific deps section after main dependencies
- Ran `cargo fmt` on `clip.rs` and `text_embed.rs`
Lint Fixes
- Resolved redundant closure lints in `tantivy.rs` and `search/mod.rs`
- General lint formatting cleanup
📝 Documentation
- Added Memvid v1 deprecation warning to README (@sharafdin)
- README updates and improvements (@mo-omar-0197)
📊 Performance & Reliability
- ACL enforcement: Zero-overhead when no ACL policy is set
- Model consistency: Prevents silent search quality degradation from model mismatch
- SymSpell fix: Eliminates data corruption in spell-correction preprocessing
📚 Related Pull Requests
- #188 — feat: enforce vector index model consistency (@0x-pankaj)
- #187 — feat: fix symspell_cleanup data corruption and add dictionary tooling (@0x-pankaj)
- #173 — feat: add OpenAI API embedding provider (@0x-pankaj)
- Direct push — Frame-level ACL enforcement across search/ask/replay (@Olow304)
🎯 Migration Notes
For Users
- No breaking changes: all existing `.mv2` files remain compatible
- ACL is opt-in; existing memories work without any ACL configuration
- Vector model consistency is enforced automatically on new indexes
For Developers
- New `aclScope` field available on API keys (nullable, no migration needed)
- ACL types available in `types/acl.rs`
- Embedding model is now strictly bound to vector index at creation time
🙏 Contributors
Thank you to all contributors who made this release possible:
- @Olow304 — ACL enforcement, CI fixes, lint cleanup
- @0x-pankaj — Vector model consistency, symspell fix, OpenAI embeddings
- @sharafdin — Documentation (deprecation notice)
- @mo-omar-0197 — README updates
- Multi‑word query behavior now uses implicit AND; use explicit OR for previous semantics.
- Windows test stability improved with file handle release delay.
- HNSW vector search implementation for fast approximate nearest neighbor queries
- SIMD acceleration of L2 distance calculations improving CPU performance
- LRU eviction added to extraction cache preventing unbounded memory growth
Full changelog
Release Date: January 25, 2026
Overview
This release includes significant performance improvements, bug fixes, and feature enhancements. Highlights include HNSW vector search implementation, SIMD acceleration, Windows test fixes, and improved query precision with implicit AND operators.
🚀 New Features
HNSW Vector Search Implementation (PR #185)
- Applied HNSW (Hierarchical Navigable Small World) implementation patch
- Enables fast approximate nearest neighbor search for large vector indexes
- Significant performance improvement for vector search operations
- Better scalability for memories with many vector embeddings
SIMD Acceleration (PR #176)
- Added SIMD acceleration for vector distance calculations
- Optimized L2 distance computations using SIMD instructions
- Faster vector similarity searches
- Improved performance on modern CPUs with SIMD support
Extraction Cache Improvements (PR #175)
- Added LRU (Least Recently Used) eviction to extraction cache
- Better memory management for document extraction
- Prevents cache from growing unbounded
- Improved performance for repeated document processing
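LRU eviction can be sketched in a few lines of Python. This is a generic illustration of the technique, not the memvid extraction cache; the `ExtractionCache` class and its `capacity` parameter are hypothetical.

```python
from collections import OrderedDict


class ExtractionCache:
    """Bounded cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the oldest entry

    def __len__(self):
        return len(self._entries)


cache = ExtractionCache(capacity=2)
cache.put("a.pdf", "text of a")
cache.put("b.pdf", "text of b")
cache.get("a.pdf")               # touch a.pdf so b.pdf becomes the oldest
cache.put("c.pdf", "text of c")  # exceeds capacity and evicts b.pdf
print(len(cache), cache.get("b.pdf"))
```

The capacity bound is what prevents unbounded growth: memory use is capped at `capacity` extracted documents regardless of how many files are processed.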
Query Precision Enhancement (PR #178)
- Changed implicit query operator from OR to AND for precision
- Multi-word queries now require all terms to match (implicit AND)
- More precise search results
- Explicit OR operator still available when needed
- Better user experience for targeted searches
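The difference between implicit AND and explicit OR can be shown with a deliberately simplified Python matcher. It is not the Tantivy query parser and handles only bare terms and the `OR` keyword; `parse_query` and `matches` are hypothetical helpers.

```python
def parse_query(query):
    """Split a query into OR-groups; terms inside a group are ANDed."""
    groups, current = [], []
    for token in query.split():
        if token == "OR":
            if current:
                groups.append(current)
                current = []
        else:
            current.append(token.lower())
    if current:
        groups.append(current)
    return groups


def matches(query, text):
    """True if any OR-group has all of its terms present in the text."""
    words = set(text.lower().split())
    return any(all(term in words for term in group)
               for group in parse_query(query))


doc = "machine translation systems"
print(matches("machine learning", doc))     # implicit AND: both terms needed
print(matches("machine OR learning", doc))  # explicit OR: either term suffices
```

With implicit AND, a document mentioning only one of the terms no longer matches, which is why the explicit `OR` form is needed to recover the earlier recall-oriented behavior.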
🐛 Bug Fixes
Windows Test Fixes (PR #186)
- Fixed Windows test failures by adding delay for Tantivy file handle release
- Resolved file locking issues on Windows during test cleanup
- Tests now pass reliably on Windows platforms
- Improved cross-platform test stability
Clippy Safety Overhaul (PR #180)
- Comprehensive safety improvements based on Clippy linter recommendations
- Fixed potential safety issues across the codebase
- Improved code quality and maintainability
- Enhanced memory safety guarantees
📝 Documentation
Internationalization
- Added Bengali (bn) README translation (PR #182)
- Added Japanese README translation (PR #177)
- Improved accessibility for non-English speakers
- Expanded documentation coverage
Documentation Improvements (PR #181)
- Added HTML markers to all README files to make updates easier
- Improved documentation maintenance workflow
- Better structure for automated documentation updates
🔧 Developer Experience
Build & Development Tools (PR #184)
- Created script to add flags for easier development workflow
- Streamlined feature flag management
- Improved developer productivity
📊 Performance Improvements
- HNSW Implementation: Faster vector search for large indexes
- SIMD Acceleration: Optimized distance calculations
- LRU Cache: Better memory utilization
- Query Precision: More accurate search results
🙏 Contributors
Thank you to all contributors who made this release possible:
- @sharafdin - Release manager
- @0x-pankaj - HNSW implementation, SIMD acceleration, extraction cache, Windows fixes, Clippy safety
- @Abhisheklearn12 - Query operator precision fix
- @Adam-Elmi - Documentation improvements, build scripts
- @krishnaK-D-Bair - Bengali translation
- @yukaty - Japanese translation
📚 Related Pull Requests
- #186 - fix(tests): add Windows delay for Tantivy file handle release
- #185 - feat: apply HNSW implementation patch
- #184 - Created a script to add flags
- #182 - docs: add Bengali (bn) README translation
- #181 - Added HTML markers to all README files to make updates easier
- #180 - Fix/clippy safety overhaul
- #178 - Fix: Change implicit query operator from OR to AND for precision
- #177 - docs(i18n): add Japanese README translation
- #176 - feat: add SIMD acceleration for vector distance calculations
- #175 - feat(extract): add LRU eviction to extraction cache
🎯 Migration Notes
For Users
- No breaking changes in this release
- All existing `.mv2` files remain compatible
- Query behavior change: Multi-word queries now use implicit AND (more precise)
  - Use explicit `OR` operator if you need the old behavior
  - Example: `"machine learning"` now requires both words (was: either word)
  - Example: `"machine OR learning"` still works for either word
For Developers
- Windows developers: Test stability improved
- Performance: Vector search is significantly faster with HNSW
- Memory: Extraction cache now has bounded memory usage
- Removed extractous from Python SDK default features
- .mv2e encryption capsule with AES-256-GCM and Argon2id key derivation
- Multi-word search queries now default to OR logic for improved recall
Full changelog
Highlights
Encryption Capsule (.mv2e)
- Introduced secure file encryption with .mv2e format
- AES-256-GCM encryption with Argon2id key derivation
- Lock/unlock files with password protection via lock_file() and unlock_file() APIs
- Header contains KDF parameters, salt, and nonce for secure decryption
Search Improvements
- Multi-word queries now default to OR logic for better recall (e.g., "machine learning" finds documents with either term)
- Fixed parallel segment indexing to properly use search_text field when no_raw=true
SDK & CLI Compatibility
- Full cross-compatibility between CLI and SDK created .mv2 files
- Removed extractous from Python SDK default features to avoid native library dependencies
Bug Fixes
- Fixed lexical search indexing for documents ingested via SDK putMany
- Resolved Python SDK import error related to missing Tika native library
Contributors
Thank you to our contributors for this release:
- @sharafdin
- @akash-R-A-J
- @0x-pankaj
- @Sjondepon
- @karamokoisrael
- @shiinedev
- @DavidReque
- @GinoGreen
- @omartood
- @krishnaK-D-Bair
- @reneleonhardt
- @soomin.lee
- @qool
- DoctorQuietMode: quiet option in DoctorOptions to suppress debug logs
- Comprehensive streaming encryption tests for large files and password validation
Full changelog
v2.0.133
Features
- Doctor Quiet Mode: Added quiet option to DoctorOptions to suppress debug logs during doctor operations. Useful for SDK integrations where verbose output is unwanted.
Improvements
- Streaming Encryption Tests: Added comprehensive tests for the streaming encryption feature (PR #117):
- streaming_encryption_large_file - Tests encrypt/decrypt roundtrip for files >1MB
- wrong_password_fails_streaming - Verifies password validation in streaming format
- Confirms reserved[0] == 0x01 marker for streaming format detection
Internal
- Replaced println! with doctor_log! macro for conditional logging
- Added thread-local DOCTOR_QUIET flag for clean log suppression
- Updated all test files to use quiet: true in DoctorOptions
Compatibility
- Fully backward compatible with existing .mv2 and .mv2e files
- Streaming encryption (PR #117) auto-detects format via header byte
- Added Ed25519 cryptographic signature verification for dashboard-issued capacity tickets
Full changelog
What's New
Ed25519 Ticket Signature Verification
Added cryptographic signature verification for dashboard-issued capacity tickets using Ed25519.
Changes:
- New `signature.rs` module for Ed25519 verification
- Ticket validation in lifecycle management
- Dashboard public key verification for capacity tickets
- New ticket types in `types/ticket.rs`
- Single‑file architecture storing everything in a .mv2 file
- Sub‑5 ms local retrieval latency
- Multi‑modal support (text, PDF, DOCX, images via CLIP, audio via Whisper)
Full changelog
🚀 Memvid 2.0 - Complete Rust Rewrite
Give your AI agents memory in one file.
This release marks a complete rewrite of Memvid from Python to Rust, delivering 10-100x performance improvements and a truly portable single-file memory system.
Highlights
- Single-file architecture - Everything in one .mv2 file, no databases or sidecars
- Sub-5ms retrieval - Blazing fast local memory access
- Multi-modal support - Text, PDF, DOCX, images (CLIP), and audio (Whisper)
- Hybrid search - BM25 full-text + HNSW vector similarity
- Time-travel - Query any point in memory history
- Encryption - Optional password-protected capsules (.mv2e)
Installation
Rust:
[dependencies]
memvid-core = "2.0"
CLI:
npm install -g memvid-cli
SDKs:
- Node.js: npm install @memvid/sdk
- Python: pip install memvid-sdk
Links
- https://docs.memvid.com
- https://sandbox.memvid.com
- https://crates.io/crates/memvid-core