This release includes 2 breaking changes for platform teams planning a safe upgrade.
✓ No known CVEs patched in this version
ReleasePort's take
Light signal: v0.6.3 moves embedding storage to SQLite, dramatically accelerating YAML-backed search ingestion via batched APIs and one-roundtrip save/load. Datasource name validation now rejects dots and slashes; existing names containing '.' must be migrated before upgrade.
Why it matters: Operators running YAML storage should test the SQLite sidecar migration in a dev environment first. Breaking change: datasource names containing '.' will fail validation, so audit and rename affected sources before upgrading.
Summary
AI summary: Embedding storage moved to a SQLite sidecar for faster YAML-backed searches, and search gained an optional datasource filter.
Changes in this release
| Type | Severity | Summary | Source | Confidence | CVE |
|---|---|---|---|---|---|
| Breaking | Medium | Datasource names now reject dots, slashes, nulls, empty/whitespace; existing names containing '.' will fail validation on upgrade. | llm_adapter@2026-05-21 | high | None |
| Feature | Medium | search supports optional datasource filter across all retrieval channels. | llm_adapter@2026-05-21 | high | None |
| Feature | Medium | Batched embedding save and get APIs added to StorageBackend ABC. | llm_adapter@2026-05-21 | high | None |
| Performance | Medium | Embeddings storage moved to SQLite sidecar in YAMLStorage improves ingestion speed dramatically. | llm_adapter@2026-05-21 | high | None |
| Performance | Medium | New batched embedding APIs reduce number of round-trips per call to exactly one read and one write. | llm_adapter@2026-05-21 | low | None |
| Performance | Medium | EmbeddingService._apply_pending performs exactly one batched read and one batched write per call. | granite4.1:30b@2026-05-22-audit | low | None |
| Deprecation | Medium | counters.yaml and id_counters table removed; memory IDs derived directly from corpus. | llm_adapter@2026-05-21 | low | None |
| Bugfix | Medium | Cascade-delete now matches exact canonical IDs or strict dotted-path descendants, fixing prefix mismatches. | llm_adapter@2026-05-21 | high | None |
| Bugfix | Medium | Concurrent save_memory race resolved by single transactional INSERT in SQLiteStorage. | llm_adapter@2026-05-21 | high | None |
Full changelog
SLayer 0.6.3
Release theme: embedding-search hot paths run faster on YAML-backed stores, plus a new datasource filter on search and several correctness fixes around memory id allocation and cascade-delete that affected the embedding-search subsystem introduced in 0.6.0.
Highlights
New: optional datasource filter on search
search(...) and all four surfaces (MCP tool, REST POST /search, CLI slayer search, SlayerClient.search) gain an optional datasource: Optional[str] = None argument. When set, all three retrieval channels pre-filter their corpora to that one datasource:
- Entity hits (tantivy + embedding-cosine channels) include only docs rooted at the requested datasource: exact name match or strict dotted-path descendant (`<ds>.<model>`, `<ds>.<model>.<leaf>`). Character-prefix matches do not qualify, so `datasource="prod"` excludes a sibling datasource named `prod_v2`.
- Memory hits include any memory whose `entities` list contains at least one entity rooted at the requested datasource. Memories spanning multiple datasources surface from each.
- BM25 / IDF statistics in channels 1 and 2 and the cosine matrix in channel 3 reflect only the filtered subset (pre-filter, not post-filter).
Unknown datasource names raise ValueError (HTTP 400 on REST). Validation runs before any corpus walk so typos surface fast.
This builds on the canonical-id namespace rule shipped in this same release: dotted datasource names are rejected, so the "rooted at" prefix match is unambiguous. Helper slayer.memories.resolver.canonical_id_rooted_at encodes the same dotted-namespace rule used by the embedding cascade-delete.
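The "rooted at" rule described above can be sketched in a few lines. This is a minimal illustration, not the body of `canonical_id_rooted_at`; the function names `rooted_at`, `filter_entities`, and `memory_matches` are hypothetical, and canonical ids are assumed to be plain dotted strings.

```python
from typing import Iterable, List


def rooted_at(canonical_id: str, datasource: str) -> bool:
    """True for the datasource itself or a strict dotted-path descendant."""
    # A bare startswith(datasource) would also accept "prod_v2.orders" for
    # datasource "prod"; requiring the "." separator rules that out.
    return canonical_id == datasource or canonical_id.startswith(datasource + ".")


def filter_entities(canonical_ids: Iterable[str], datasource: str) -> List[str]:
    """Pre-filter an entity corpus to one datasource."""
    return [cid for cid in canonical_ids if rooted_at(cid, datasource)]


def memory_matches(entities: Iterable[str], datasource: str) -> bool:
    """A memory qualifies if at least one of its entities is rooted at the datasource."""
    return any(rooted_at(entity, datasource) for entity in entities)


corpus = ["prod", "prod.orders", "prod.orders.total", "prod_v2.orders"]
print(filter_entities(corpus, "prod"))
# -> ['prod', 'prod.orders', 'prod.orders.total']
```

Because dotted datasource names are rejected, a single `.` after the name is enough to separate a datasource from its descendants.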
Embedding storage moved to a SQLite sidecar in YAMLStorage
Before 0.6.3, YAMLStorage persisted every embedding row to a single embeddings.yaml list. Each save/get/list/delete operation read and re-wrote the entire file, with the embedding vector (typically 768-1536 floats per row) inline. On every slayer ingest, EmbeddingService._apply_pending re-parsed embeddings.yaml M times for hash-skip and then M more times for writes (M = model + visible columns + named measures + custom aggregations). Even moderate corpora made ingest unusably slow.
0.6.3 introduces a SidecarEmbeddingStore helper backed by SQLite. YAMLStorage now persists embeddings to <base_dir>/embeddings.db while keeping models, datasources, datasource priority, and memories in their original git-diffable YAML form. SQLiteStorage delegates to the same helper. The SQL lives in exactly one place.
Net effect on a JaffleShop-sized model tree: slayer ingest goes from "minutes of YAML round-tripping" to "one batched read + one batched write per refresh."
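To make the sidecar idea concrete, here is a rough sketch of a SQLite-backed store with batched save and get. It is an illustration under assumptions: the class name `SidecarSketch`, the table schema, the float32 BLOB encoding, and the chunk size are all hypothetical and are not taken from `SidecarEmbeddingStore`.

```python
import sqlite3
from array import array
from typing import Dict, List, Sequence, Tuple

CHUNK = 500  # assumed chunk size, kept well under SQLite's bound-parameter limit


class SidecarSketch:
    """Hypothetical stand-in for an embeddings sidecar; not SLayer's schema."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS embeddings ("
            " canonical_id TEXT NOT NULL,"
            " model_name TEXT NOT NULL,"
            " vector BLOB NOT NULL,"
            " PRIMARY KEY (canonical_id, model_name))"
        )

    def save_many(self, rows: Sequence[Tuple[str, str, List[float]]]) -> None:
        if not rows:
            return  # empty input: nothing to write
        payload = [(cid, model, array("f", vec).tobytes()) for cid, model, vec in rows]
        with self.conn:  # one transaction for the whole batch
            self.conn.executemany(
                "INSERT OR REPLACE INTO embeddings VALUES (?, ?, ?)", payload
            )

    def get_many(self, canonical_ids: Sequence[str], model_name: str) -> Dict[str, List[float]]:
        found: Dict[str, List[float]] = {}
        # Split large id lists so the IN (...) clause never exceeds the parameter limit.
        for start in range(0, len(canonical_ids), CHUNK):
            chunk = list(canonical_ids[start:start + CHUNK])
            marks = ",".join("?" * len(chunk))
            cursor = self.conn.execute(
                "SELECT canonical_id, vector FROM embeddings "
                f"WHERE model_name = ? AND canonical_id IN ({marks})",
                [model_name, *chunk],
            )
            for cid, blob in cursor:
                vec = array("f")
                vec.frombytes(blob)
                found[cid] = vec.tolist()
        return found
```

Storing vectors as binary blobs in a keyed table means a save touches only the affected rows, whereas the old single-file layout required re-parsing and re-writing every vector on each operation.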
New batched embedding APIs
The StorageBackend ABC gained two methods, plumbed all the way through EmbeddingService:
- `save_embeddings(rows: List[Embedding]) -> None`
- `get_embeddings_for_canonical_ids(*, canonical_ids: List[str], embedding_model_name: str) -> Dict[str, Embedding]`
Default implementations fall back to M-iteration over the existing single-row methods, so third-party storage backends continue to work without modification. The bundled SQLite + YAML backends override them to issue single batched round-trips via SidecarEmbeddingStore.save_many / get_many. EmbeddingService._apply_pending (used by every save_memory, edit_model, and slayer ingest) now makes exactly one batched read and one batched write per call, independent of subtree size.
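The fallback pattern described above might look roughly like the sketch below. Only the two batched method names and their signatures come from the release notes; the `Embedding` fields, the single-row method names `save_embedding` / `get_embedding`, and the synchronous style are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Embedding:  # assumed shape; the real model may differ
    canonical_id: str
    embedding_model_name: str
    vector: List[float]


class StorageBackendSketch(ABC):
    # Hypothetical single-row methods that every backend already implements.
    @abstractmethod
    def save_embedding(self, row: Embedding) -> None: ...

    @abstractmethod
    def get_embedding(self, *, canonical_id: str, embedding_model_name: str) -> Optional[Embedding]: ...

    # Batched defaults: M iterations over the single-row methods, so a
    # backend that overrides nothing keeps working.
    def save_embeddings(self, rows: List[Embedding]) -> None:
        for row in rows:
            self.save_embedding(row)

    def get_embeddings_for_canonical_ids(
        self, *, canonical_ids: List[str], embedding_model_name: str
    ) -> Dict[str, Embedding]:
        found: Dict[str, Embedding] = {}
        for cid in canonical_ids:
            row = self.get_embedding(canonical_id=cid, embedding_model_name=embedding_model_name)
            if row is not None:
                found[cid] = row
        return found


class DictBackend(StorageBackendSketch):
    """Third-party-style backend that implements only the single-row methods."""

    def __init__(self) -> None:
        self.rows: Dict[Tuple[str, str], Embedding] = {}

    def save_embedding(self, row: Embedding) -> None:
        self.rows[(row.canonical_id, row.embedding_model_name)] = row

    def get_embedding(self, *, canonical_id: str, embedding_model_name: str) -> Optional[Embedding]:
        return self.rows.get((canonical_id, embedding_model_name))
```

A backend that can do better, as the bundled ones do, overrides the two batched methods and issues one round-trip each instead of looping.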
Cascade-delete fix: descendants only, not character prefixes
StorageBackend.delete_embeddings_for_canonical(canonical_id_prefix=X) previously used LIKE X || '%', which silently matched anything starting with the same characters. Concrete consequences:
- `delete_memory(4)` cascaded to `delete_embeddings_for_canonical("memory:4")`, which also wiped `memory:42`, `memory:43`, `memory:400`, and so on.
- `delete_datasource("orders")` also wiped embeddings rooted at sibling datasources like `orders_archive`, `orders123`.
- `delete_model("orders", "customers")` also wiped embeddings rooted at `orders.customers_v2`, `orders.customers123`.
The cascade now matches the canonical id exactly OR strict dotted-path descendants (<root>.<...>). Pinned by regression tests.
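The difference between the two match rules can be reproduced against an in-memory SQLite table. This is a sketch under assumptions: the `embeddings` table, its single column, and the `substr`-based predicate are illustrative and may not match the SQL that SLayer actually issues.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (canonical_id TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO embeddings VALUES (?)",
    [(cid,) for cid in (
        "memory:4", "memory:42",
        "orders", "orders.customers", "orders.customers.name",
        "orders.customers_v2", "orders_archive.items",
    )],
)


def greedy_matches(root: str) -> List[str]:
    """Old behaviour: character prefix, matches anything starting with root."""
    rows = conn.execute(
        "SELECT canonical_id FROM embeddings WHERE canonical_id LIKE ? || '%'",
        (root,),
    )
    return sorted(r[0] for r in rows)


def strict_matches(root: str) -> List[str]:
    """New behaviour: exact id, or id that starts with root followed by a dot."""
    rows = conn.execute(
        "SELECT canonical_id FROM embeddings "
        "WHERE canonical_id = :root "
        "OR substr(canonical_id, 1, length(:root) + 1) = :root || '.'",
        {"root": root},
    )
    return sorted(r[0] for r in rows)


print(greedy_matches("memory:4"))          # -> ['memory:4', 'memory:42']
print(strict_matches("memory:4"))          # -> ['memory:4']
print(strict_matches("orders.customers"))  # -> ['orders.customers', 'orders.customers.name']
```

Comparing with `substr` rather than `LIKE` also sidesteps `_` and `%` being treated as wildcards when they appear in a name such as `orders_archive`.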
Datasource-name validation tightened
To make the dotted-path namespace unambiguous:
- `DatasourceConfig.name` now rejects `.`, `/`, `\`, NUL, empty/whitespace-only. `__` is deliberately allowed (datasource names never appear in SQL alias positions).
- `SlayerModel.data_source` rejects the same set (was previously missing the `.`, whitespace, and NUL checks).
- Storage-layer `_validate_path_component` (used by `get_model` / `delete_model` / `get_datasource` / `delete_datasource`) likewise rejects `.`.
Upgrade note: if any of your existing datasource names contain a ., save/load will start failing validation on 0.6.3. Rename them before upgrading. __ is still fine.
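A pre-upgrade audit along these lines could flag names that will stop validating. This is a sketch that mirrors the rules as listed above, not SLayer's validator; `check_datasource_name` and `names_needing_rename` are hypothetical helpers, and you would need to supply your own list of datasource names.

```python
from typing import Iterable, List

# Characters listed as rejected in the release notes: dot, slash, backslash, NUL.
FORBIDDEN = (".", "/", "\\", "\x00")


def check_datasource_name(name: str) -> None:
    """Raise ValueError if the name breaks the 0.6.3 rules as described."""
    if not name or not name.strip():
        raise ValueError("datasource name is empty or whitespace-only")
    for ch in FORBIDDEN:
        if ch in name:
            raise ValueError(f"datasource name {name!r} contains forbidden character {ch!r}")


def names_needing_rename(names: Iterable[str]) -> List[str]:
    """Return the names that would fail validation after the upgrade."""
    bad = []
    for name in names:
        try:
            check_datasource_name(name)
        except ValueError:
            bad.append(name)
    return bad


print(names_needing_rename(["orders", "prod.eu", "sales__emea", "a/b"]))
# -> ['prod.eu', 'a/b']
```

Note that `sales__emea` passes, consistent with `__` remaining allowed.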
Memory ids no longer use a separate counter store
counters.yaml (YAML) and the id_counters table (SQLite) are gone. The next id is derived directly from the memories corpus:
- YAML: `max(int_ids) + 1` over the current rows in `memories.yaml` (or `1` for an empty file).
- SQLite: `INSERT ... RETURNING id` against the `memories` table. Id assignment happens inside SQLite's write lock atomically with the insert, so two concurrent `save_memory` calls can never collide on the same id.
Behaviour change: memory ids of deleted memories may now be reused by future saves. Cascade-on-delete in delete_memory already removes the matching embedding row, so reuse never strands data. (Pre-0.6.3 documentation claimed "ids are never reused"; that contract is gone, replaced with "ids increase monotonically while the corpus grows; freed ids may be reused.")
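The YAML-side rule and its reuse consequence can be shown with a tiny sketch. `next_memory_id` is a hypothetical name, and the ids are held in a plain list rather than read from `memories.yaml`.

```python
from typing import Iterable


def next_memory_id(existing_ids: Iterable[int]) -> int:
    """max + 1 over the current corpus, or 1 when the corpus is empty."""
    return max(existing_ids, default=0) + 1


ids = [1, 2, 3]
ids.append(next_memory_id(ids))  # allocates 4
ids.remove(4)                    # delete the newest memory
print(next_memory_id(ids))       # -> 4 (the freed tail id is handed out again)

ids.remove(2)                    # delete a memory in the middle
print(next_memory_id(ids))       # -> 4 (max is still 3; the gap at 2 is not refilled)
print(next_memory_id([]))        # -> 1
```

Under a max + 1 rule, reuse happens only when the deleted memory held the highest id, which is why callers should no longer treat an id as permanently unique.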
Fixed: concurrent save_memory race on SQLite
The pre-0.6.3 SQLiteStorage._next_seq_sync + _save_memory_sync flow used two separate SQLite connections. Concurrent save_memory calls could both read the same MAX(id) + 1 before either inserted, then both INSERT OR REPLACE the same id, silently clobbering one of the two memories. 0.6.3 collapses both steps into one INSERT ... RETURNING id transaction. Pinned by a regression test firing 25 concurrent saves and asserting unique ids.
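The single-statement allocation can be exercised with a small concurrency sketch. It is illustrative only: the `memories` schema, the `save_memory` function body, and the thread-pool harness are assumptions, and the code falls back to `lastrowid` where the linked SQLite library predates `RETURNING` (added in SQLite 3.35).

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor
from typing import List


def save_memory(db_path: str, body: str) -> int:
    """Insert one row and return its id; the id is assigned by the INSERT itself."""
    conn = sqlite3.connect(db_path, timeout=30)  # wait rather than fail while another writer holds the lock
    try:
        with conn:  # commits on success, rolls back on error
            if sqlite3.sqlite_version_info >= (3, 35, 0):
                row = conn.execute(
                    "INSERT INTO memories (body) VALUES (?) RETURNING id", (body,)
                ).fetchone()
                return row[0]
            # Older SQLite: same single INSERT, id read from the cursor instead.
            cursor = conn.execute("INSERT INTO memories (body) VALUES (?)", (body,))
            return cursor.lastrowid
    finally:
        conn.close()


def run_concurrent_saves(n: int = 25) -> List[int]:
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "memories.db")
        setup = sqlite3.connect(db_path)
        setup.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
        setup.commit()
        setup.close()
        with ThreadPoolExecutor(max_workers=8) as pool:
            return list(pool.map(lambda i: save_memory(db_path, f"memory {i}"), range(n)))


ids = run_concurrent_saves()
print(len(set(ids)) == len(ids))
```

There is no separate "read the next id" step for a second writer to interleave with, which is the property the regression test pins.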
Migration
YAMLStorage.__init__ performs an idempotent one-time rename on first open at 0.6.3:
| If present | Renamed to |
| --- | --- |
| <base_dir>/embeddings.yaml | embeddings.yaml.legacy |
| <base_dir>/counters.yaml | counters.yaml.legacy |
If a .legacy file already exists, neither file is touched (idempotent, never clobbers an existing backup). Embeddings are regeneratable artifacts; re-run slayer ingest (or rely on slayer serve --ingest-on-startup from 0.6.1) to repopulate embeddings.db. Until then the embedding-similarity channel of search contributes nothing and emits a single warning. Tantivy full-text and BM25 entity-overlap continue to work.
For SQLiteStorage, no migration is required. The legacy id_counters table, if present from a pre-0.6.3 database, is left in place as harmless dead data and is never queried.
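The YAML-side rename could be approximated as follows. This is a sketch, not the code in `YAMLStorage.__init__`; `rename_legacy_files` is a hypothetical helper, and it assumes "neither file is touched" applies per file pair (the original and its `.legacy` backup).

```python
from pathlib import Path
from typing import List

LEGACY_FILES = ("embeddings.yaml", "counters.yaml")


def rename_legacy_files(base_dir: Path) -> List[str]:
    """Rename pre-0.6.3 files to *.legacy; safe to call on every open."""
    renamed = []
    for name in LEGACY_FILES:
        src = base_dir / name
        dst = base_dir / (name + ".legacy")
        # Skip when there is nothing to rename, or when a backup already
        # exists and must not be overwritten.
        if src.exists() and not dst.exists():
            src.rename(dst)
            renamed.append(name)
    return renamed
```

Calling it twice in a row is a no-op the second time, and an existing backup is never clobbered even if a fresh `embeddings.yaml` reappears.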
Public API surface
Added (concrete on StorageBackend, with default M-iteration impls):
- `save_embeddings(rows)`
- `get_embeddings_for_canonical_ids(*, canonical_ids, embedding_model_name)`
Behaviour change:
- `delete_embeddings_for_canonical(canonical_id_prefix=...)`: semantics narrowed to exact-id OR strict dotted-path descendant. Previously a character prefix; now namespace-aware.
- Memory ids of deleted memories may be reused.
- `DatasourceConfig.name` and `SlayerModel.data_source` reject `.`, leading/trailing whitespace, NUL bytes.
- `search(...)` gains optional `datasource` argument across MCP, REST, CLI, and the Python client.
Internal (not a public contract, but documented for backend authors):
- New helper class `slayer.storage.sidecar_embedding_store.SidecarEmbeddingStore`.
- New mixin `SidecarEmbeddingsMixin` providing the embedding CRUD forwards by delegating to `self._embeddings_store`. Both bundled backends inherit it.
- New pure helper `slayer.memories.resolver.canonical_id_rooted_at(canonical_id, datasource) -> bool`.
Removed (internal):
- `SQLiteStorage._next_seq_sync`, `_seed_counter_sync`, `_COUNTER_SEED_TABLES`: counter machinery dead-code-removed.
- `YAMLStorage._read_counters`, `_write_counters`, `_max_memory_id`.
- `id_counters` table creation in `SQLiteStorage._init_db` (existing tables left untouched).
Docs
- `docs/configuration/storage.md`: documents the new `embeddings.db` sidecar, the `.legacy` rename, and the new memory-id allocation.
- `docs/concepts/memories.md`: id-reuse note, no more counter-store language.
- `docs/concepts/search.md`: embedding sidecar paragraph, cascade-semantics clarification, and the `datasource` filter section.
- `CLAUDE.md`: memories + embeddings + search sections updated.
Test coverage added
- `tests/test_sidecar_embedding_store.py` (new): helper CRUD round-trips, batched APIs (incl. empty-input short-circuit verified by `sqlite3.connect` spy), the prefix-greedy regression set, and the `get_many` chunked-IN regression at 2000 ids.
- `tests/test_embeddings_storage.py` (extended): parametrised across both backends; YAML legacy-rename idempotency; same prefix-greedy regression set via the public ABC.
- `tests/test_memories_storage.py` (extended): id reuse on tail delete; empty-corpus first id is `1`; new save never collides with existing id; concurrent `save_memory` produces 25 unique ids; `delete_datasource` / `get_datasource` reject bad inputs.
- `tests/test_models.py` (extended): full validator coverage for `DatasourceConfig.name` and `SlayerModel.data_source`.
- `tests/test_embeddings_service.py` (extended): pins that `_apply_pending` issues exactly one batched read + one batched write per call.
- `tests/test_canonical_id_helpers.py` (new): dotted-namespace rule coverage for the new helper.
- `tests/test_search_datasource_filter.py` (new): memory scoping (kept / cross-datasource / dropped / untagged), entity scoping at channel 2, validation, recency fallback, empty-corpus, `None == no filter`.
- `tests/test_search_surfaces.py` (extended): acceptance + rejection of `datasource` across MCP, REST, CLI, and the Python client.
Total: 2584 unit tests, all green; ruff check clean.
Contributors
Egor Kraev. Generated with Claude Code.
Breaking Changes
- Datasource names now reject '.', '/', '\', NUL, and empty or whitespace-only values; existing names containing '.' will fail validation on upgrade.
- Memory IDs may be reused after deletion (previously never reused).