This release includes 1 breaking change for platform teams planning a safe upgrade.
✓ No known CVEs patched in this version
ReleasePort's take
Light signal: the recall_memories API is removed entirely, with no deprecation shim; migrate callers to channel 1 of the unified search tool. The SlayerModel schema moves from v5 to v6 with auto-migration on load.
Why it matters: recall_memories callers must move to search before upgrading to v0.6.0, because the removal is breaking and ships without a deprecation shim. The search tool provides unified retrieval, and existing storage auto-migrates on load.
Summary
AI summary: recall_memories is removed and fully replaced by search channel 1.
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Breaking | Medium | recall_memories surface entirely removed with no deprecation shim. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Unified search tool retrieves memories and schema entities via channels. (Source: llm_adapter@2026-05-21; confidence: high) | - |
| Feature | Medium | Dense embedding similarity channel via litellm added to search. (Source: llm_adapter@2026-05-21; confidence: high) | - |
| Feature | Medium | BM25Plus entity-overlap ranker scoped to memories only in search. (Source: llm_adapter@2026-05-21; confidence: high) | - |
| Feature | Medium | Demo default size reduced from 4 years to 2 years. (Source: llm_adapter@2026-05-21; confidence: high) | - |
| Feature | Medium | New slayer search refresh-samples CLI command updates sample cache. (Source: llm_adapter@2026-05-21; confidence: high) | - |
| Feature | Medium | Existing storage auto-migrates on load via converter chain. (Source: llm_adapter@2026-05-21; confidence: high) | - |
| Feature | Medium | Tantivy full-text index channel covers memories and entity docs. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Cascade delete removes associated embeddings by canonical-id prefix. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Embeddings persisted with SHA256 content_hash for idempotency. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Demo jafgen logs streamed live to terminal for real-time progress. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Per-entity embedding failures are non-fatal, warnings appended. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Memory entity introduced at schema version v1. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Embedding entity introduced at schema version v1. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Default embedding model openai/text-embedding-3-small, overridable. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Canonical field enables literal entity string lookup in search. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | SearchResponse includes resolved_input_entities for diagnostics. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Memory hits partitioned by query type with independent caps. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Column samples populated on ingest, edit_model, and inspect_model. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Fallback to line-pumping for non-TTY streams like ipykernel. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Column.sampled field caches profile strings for search indexing. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Search degrades gracefully when embeddings unavailable or missing. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Reciprocal Rank Fusion merges search channels with k=60. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Feature | Medium | Channel 2 builds an in-memory Tantivy full-text index covering memories and entity documents, tokenized with Porter stemmer and exact-match `canonical` field. (Source: granite4.1:30b@2026-05-23-audit; confidence: low) | - |
| Feature | Medium | Live streaming of jafgen logs shows real-time progress; falls back to line-pumping for non-TTY streams like ipykernel. (Source: granite4.1:30b@2026-05-23-audit; confidence: low) | - |
| Feature | Medium | Embeddings are persisted with a SHA256 `content_hash` for idempotent re-runs; failures are non-fatal and logged as warnings. (Source: granite4.1:30b@2026-05-23-audit; confidence: low) | - |
| Feature | Medium | `Memory` and `Embedding` schema entities introduced at version v1, each gaining an explicit `version` field. (Source: granite4.1:30b@2026-05-23-audit; confidence: low) | - |
| Dependency | Medium | Embedding search gated behind optional embedding_search extra. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Dependency | Low | Embedding search functionality requires the optional `embedding_search` extra, pulling in `litellm` and `numpy`. (Source: granite4.1:30b@2026-05-23-audit; confidence: low) | - |
| Performance | Medium | Sampled cache makes search indexing cheap and inspect_model stable. (Source: llm_adapter@2026-05-21; confidence: high) | - |
| Refactor | Medium | delete_model and delete_datasource moved to StorageBackend ABC. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Refactor | Medium | SlayerModel schema version bumped from v5 to v6. (Source: llm_adapter@2026-05-21; confidence: low) | - |
| Refactor | Medium | `delete_model` and `delete_datasource` moved to the `StorageBackend` ABC, handling cascade deletion of associated embeddings by canonical-id prefix. (Source: granite4.1:30b@2026-05-23-audit; confidence: low) | - |
Full changelog
SLayer 0.6.0 Release Notes
A feature release: three PRs since 0.5.1 collapse the agent retrieval surface into a single search tool, add three-channel ranking over both memories and schema entities (entity-overlap BM25 + tantivy full-text + optional dense embeddings via litellm), introduce a persisted per-column sampled snapshot that feeds the search index, and tighten the bundled Jaffle Shop demo (streamed jafgen logs, default size dropped from 4 years to 2). One breaking change rides along: recall_memories is gone -- no deprecation shim, no alias -- and is fully replaced by search channel 1. SlayerModel bumps from v5 to v6 (forward no-op converter) and two new entities pick up an explicit version field: Memory and Embedding, both at v1.
Unified search tool with two retrieval channels (DEV-1375, BREAKING)
A new search tool retrieves memories AND schema entities (datasources, non-hidden models, non-hidden columns, named ModelMeasures, custom Aggregations) through two parallel channels merged by Reciprocal Rank Fusion (k=60). Channel 1 is the BM25Plus entity-overlap ranker recall_memories used to wrap, now scoped to memories only. Channel 2 builds a fresh in-memory tantivy index per call covering memories union entity docs, tokenised by en_stem (Porter stemmer plus default tokenisation on _ and .) so "shipped" matches "shipping" and "customer" matches customer_id; an exact-match canonical field lets agents paste a literal <ds>.<model>.<col> string and get the doc back directly.
The behaviour matrix is documented in docs/concepts/search.md:
- both inputs run both channels;
- entity-only input runs channel 1 only;
- question-only input runs channel 2 only;
- the empty input returns the newest learning-only plus query-bearing memories with a warning.
Memory hits are partitioned by Memory.query is None: learning-only memories land in memories, query-bearing memories in example_queries, each capped independently so bulky example queries cannot crowd out small learning-only notes. SearchResponse.resolved_input_entities echoes the resolver output for diagnostics. The same surface is exposed across MCP search, REST POST /search, CLI slayer search [--entity ...] [--question ...] [--query ...], and SlayerClient.search.
The breaking part: the entire recall_memories surface is gone -- MCP tool, REST POST /memories/recall, CLI slayer memory recall, SlayerClient.recall_memories, the RecallHit / RecallResponse Pydantic models, and MemoryService.recall_memories. Channel 1 of search is the exact same BM25 ranker on the exact same canonical-entity index; migration is a one-call swap.
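To make the channel selection and the merge step concrete, here is a minimal Python sketch of the two-channel behaviour described above. It is an illustration, not SLayer's implementation: `select_channels` and `rrf_fuse` are hypothetical names, and the only values taken from the release notes are the input matrix and the constant k=60. The fusion follows the standard Reciprocal Rank Fusion rule, where each ranked list contributes 1 / (k + rank) to a document's score.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

RRF_K = 60  # constant quoted in the release notes


def select_channels(has_entities: bool, has_question: bool) -> List[int]:
    """Return which channels run for a given input shape (hypothetical helper)."""
    if has_entities and has_question:
        return [1, 2]
    if has_entities:
        return [1]
    if has_question:
        return [2]
    # Empty input: per the notes, the newest memories come back with a warning.
    return []


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = RRF_K) -> List[str]:
    """Merge ranked lists of document ids with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


channel_1 = ["m1", "m2", "m3"]  # e.g. entity-overlap ranking
channel_2 = ["m3", "m1", "m4"]  # e.g. full-text ranking
fused = rrf_fuse([channel_1, channel_2])
```

A document that ranks well in both lists (here `m1`) outranks one that tops only a single list, which is the property that makes RRF a reasonable way to combine channels whose raw scores are not comparable.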
Persisted per-column sample-value cache (DEV-1375)
Every Column gains an optional sampled: Optional[str] field caching the per-column profile string the search index renders into each entity's text field. Previously this was recomputed by inspect_model on every call; the cache makes search indexing cheap and gives inspect_model a stable rendered snapshot. Populated on slayer ingest / ingest_datasource_models MCP / POST /ingest for every table-backed model in the touched datasource; on the new slayer search refresh-samples [--data-source X] [--model M ...] CLI; on edit_model (column-level edits refresh that column; model-level filter / sql / source-query changes refresh every column); and lazily on inspect_model with best-effort write-back when the cache is None. sql-mode and query-backed models are silently skipped in this release. The schema bump to SlayerModel v6 is a no-op forward converter (slayer/storage/v6_migration.py); the field defaults to None and is populated by the first subsequent ingest.
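The lazy path on inspect_model can be pictured as a compute-once cache. The sketch below is an assumption-laden illustration: the `Column` dataclass is a stand-in for the real model class, and `ensure_sampled` and the `profile` callable are hypothetical names. Only the `sampled: Optional[str]` field and the "populate when the cache is None" behaviour come from the notes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Column:
    """Stand-in for the real Column model; only `sampled` mirrors the notes."""
    name: str
    sampled: Optional[str] = None  # cached profile string; None means not sampled yet


def ensure_sampled(column: Column, profile: Callable[[str], str]) -> str:
    """Return the cached profile string, computing and writing it back only if missing."""
    if column.sampled is None:
        # Hypothetical profiler call; in practice this would query the datasource.
        column.sampled = profile(column.name)
    return column.sampled
```

Once the field is populated, later reads reuse the stored string instead of re-profiling the column.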
Optional embedding-based third retrieval channel (DEV-1386)
A third channel runs alongside tantivy and entity-overlap BM25: dense embedding similarity via litellm, with the question embedded once per call and cosine similarity computed in numpy over a corpus matrix loaded fresh from storage. Memory rankings are RRF-fused across all three channels; entity hits -- previously tantivy-only with raw scores -- are RRF-fused across channels 2 and 3. The channel is gated behind the optional embedding_search extra (pip install motley-slayer[embedding_search] -- pulls litellm and numpy); when the extra is missing, no provider key is configured, or no embedding rows exist for the active model, the channel emits one warning into SearchResponse.warnings and search degrades gracefully via tantivy + BM25. The default model is openai/text-embedding-3-small; override via the SLAYER_EMBEDDING_MODEL env var in <provider>/<model-name> litellm format. Provider credentials (OPENAI_API_KEY, AZURE_API_KEY, etc.) are read by litellm directly.
Embeddings are persisted in a sidecar table (SQLite embeddings table / YAML embeddings.yaml) keyed by (canonical_id, embedding_model_name), with a SHA256 content_hash on each row so idempotent re-runs skip the litellm call when the source text hasn't changed. Storage is JSON lists of floats (~6 KB per 1536-dim row) -- portable, debuggable, dialect-neutral. Per-entity embed failures (rate limits, transient network errors, bad keys) are non-fatal: the failing row is not written and a warning is appended to the response.
Switching SLAYER_EMBEDDING_MODEL mid-project leaves old rows inert; re-run slayer ingest or re-save the memories to populate the new model's rows. Dimension mismatch between the question embedding and stored rows is detected and warns instead of crashing. Refresh edges match Column.sampled: slayer ingest, edit_model, and save_memory. New Memory and Embedding schema entities both ship at v1.
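The idempotency rule (skip the embedding call when the stored content_hash matches the current text) can be sketched as follows. This is a hedged illustration using an in-memory dict in place of the sidecar table. `upsert_embedding` and the injected `embed` callable are hypothetical; the key shape, the SHA256 hash, and the non-fatal failure handling follow the description above.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Key = Tuple[str, str]  # (canonical_id, embedding_model_name)
Row = Dict[str, object]  # {"content_hash": str, "vector": List[float]}


def content_hash(text: str) -> str:
    """SHA256 hex digest of the source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def upsert_embedding(
    store: Dict[Key, Row],
    canonical_id: str,
    model_name: str,
    text: str,
    embed: Callable[[str], List[float]],
    warnings: List[str],
) -> bool:
    """Embed `text` unless an up-to-date row exists. Returns True if `embed` was called."""
    key = (canonical_id, model_name)
    digest = content_hash(text)
    existing = store.get(key)
    if existing is not None and existing["content_hash"] == digest:
        return False  # source text unchanged: skip the provider call
    try:
        vector = embed(text)  # stand-in for the real provider call
    except Exception as exc:  # failures are non-fatal: warn and write nothing
        warnings.append(f"embedding failed for {canonical_id}: {exc}")
        return True
    store[key] = {"content_hash": digest, "vector": vector}
    return True
```

Because the model name is part of the key, rows written under a previous model are never matched after a model switch, which is consistent with the notes describing them as inert.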
Storage cascade-delete refactor (DEV-1386)
delete_model and delete_datasource are now defined on the StorageBackend ABC -- backends implement only the row-level _delete_model_row / _delete_datasource_row primitives and the ABC wrappers handle embedding-row cascade by canonical-id prefix. delete_model drops every embedding under <ds>.<model>% (model doc + columns + measures + aggregations); delete_datasource drops every row under <ds>%; delete_memory drops the matching memory:<id> row. Matches the existing pattern from DEV-1361 where collision and validation rules live in the ABC, not duplicated per backend.
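The split between backend primitives and ABC-level wrappers might look roughly like the sketch below. The method names `delete_model`, `delete_datasource`, `_delete_model_row`, and `_delete_datasource_row` come from the notes; their signatures, the `StorageBackendSketch` class name, the `_drop_embeddings` helper, and the in-memory backend are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Set, Tuple


class StorageBackendSketch(ABC):
    """Illustrative ABC: wrappers own the cascade, backends own the row delete."""

    def __init__(self) -> None:
        self.embeddings: Dict[str, List[float]] = {}  # keyed by canonical id

    @abstractmethod
    def _delete_model_row(self, datasource: str, model: str) -> None: ...

    @abstractmethod
    def _delete_datasource_row(self, datasource: str) -> None: ...

    def _drop_embeddings(self, prefix: str) -> None:
        # Collect keys first so the dict is not mutated while iterating.
        for canonical_id in [c for c in self.embeddings if c.startswith(prefix)]:
            del self.embeddings[canonical_id]

    def delete_model(self, datasource: str, model: str) -> None:
        self._delete_model_row(datasource, model)
        self._drop_embeddings(f"{datasource}.{model}")  # mirrors <ds>.<model>%

    def delete_datasource(self, datasource: str) -> None:
        self._delete_datasource_row(datasource)
        self._drop_embeddings(datasource)  # mirrors <ds>%


class InMemoryBackend(StorageBackendSketch):
    """Toy backend implementing only the row-level primitives."""

    def __init__(self) -> None:
        super().__init__()
        self.models: Set[Tuple[str, str]] = set()

    def _delete_model_row(self, datasource: str, model: str) -> None:
        self.models.discard((datasource, model))

    def _delete_datasource_row(self, datasource: str) -> None:
        self.models = {m for m in self.models if m[0] != datasource}
```

One caveat of plain prefix matching in this sketch: a prefix such as `shop` also matches ids under a datasource named `shop2`, so a real implementation may need a delimiter-aware check.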
Bundled demo: streamed logs and faster default (PR #104, follow-up d34b8bf)
slayer datasources create demo now forwards jafgen's stdout/stderr live to the user's terminal so the multi-minute generation step shows real-time progress instead of a silent wait. When the active stream lacks a real file descriptor (e.g. an ipykernel OutStream shim), the implementation falls back to a line-pumping Popen so notebook integration tests still pass; on real TTYs it keeps the inheriting fast path so Rich progress animations render correctly. The default demo size drops from 4 years to 2 -- enough to exercise every Jaffle Shop schema feature, but fast enough that slayer serve --demo and slayer mcp --demo finish inside MCP-client startup timeouts. Override with --years N on slayer datasources create demo.
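The fast-path versus line-pumping decision can be sketched with the standard library alone. This is not the project's code: `has_real_fd` and `run_streaming` are hypothetical helpers, and the child command is a trivial Python one-liner standing in for jafgen. The idea is to let the child inherit the descriptor when the stream has one, and otherwise read the child's output from a pipe and forward it line by line.

```python
import io
import subprocess
import sys
from typing import List, TextIO


def has_real_fd(stream: TextIO) -> bool:
    """True when the stream is backed by an OS-level file descriptor."""
    try:
        stream.fileno()
        return True
    except (AttributeError, OSError, ValueError):
        # io.UnsupportedOperation (raised by in-memory streams) subclasses
        # both OSError and ValueError, so it is covered here.
        return False


def run_streaming(cmd: List[str], out: TextIO) -> int:
    """Run `cmd`, forwarding its combined stdout/stderr to `out` as it is produced."""
    if has_real_fd(out):
        # Fast path: the child writes straight to the inherited descriptor.
        return subprocess.run(cmd, stdout=out, stderr=subprocess.STDOUT).returncode
    # Fallback: pump the pipe one line at a time into the Python-level stream.
    with subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    ) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            out.write(line)
            out.flush()
    return proc.returncode


buffer = io.StringIO()  # stands in for a notebook stream without a descriptor
exit_code = run_streaming([sys.executable, "-c", "print('generating...')"], buffer)
```

The fallback gives up direct terminal access for the child, which is why the notes keep the inheriting path on real TTYs where progress animations need it.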
Schema versions
SlayerModel v5 -> v6 (forward no-op for the new optional Column.sampled field). SlayerQuery remains v3. DatasourceConfig remains v1. New entities: Memory v1 and Embedding v1. Existing storage migrates automatically on load via the converter chain in slayer/storage/migrations.py.
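A converter chain of this kind is commonly written as a mapping from a stored version to a function that lifts a document one version forward. The sketch below assumes documents are plain dicts with a `version` key; the names `CONVERTERS`, `v5_to_v6`, and `migrate` are hypothetical and do not reflect the contents of slayer/storage/migrations.py.

```python
from typing import Any, Callable, Dict

Doc = Dict[str, Any]
CURRENT_VERSION = 6


def v5_to_v6(doc: Doc) -> Doc:
    """Forward no-op: only the version stamp changes; the new field stays unset."""
    return {**doc, "version": 6}


# Each entry lifts a document from the keyed version to the next one.
CONVERTERS: Dict[int, Callable[[Doc], Doc]] = {5: v5_to_v6}


def migrate(doc: Doc, target: int = CURRENT_VERSION) -> Doc:
    """Apply converters one step at a time until the document reaches `target`."""
    doc = dict(doc)  # never mutate the caller's copy
    while doc["version"] < target:
        step = CONVERTERS.get(doc["version"])
        if step is None:
            raise ValueError(f"no converter from version {doc['version']}")
        doc = step(doc)
    return doc
```

Running the chain on load means documents already at the current version pass through untouched, while older ones are upgraded without a separate migration command.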
Breaking Changes
- `recall_memories` API surface (MCP tool, REST `/memories/recall`, CLI `slayer memory recall`, `SlayerClient.recall_memories` and related Pydantic models) is removed and replaced by channel 1 of the new unified `search` tool.
Earlier breaking changes
- v0.6.3 Datasource names now reject dots, slashes, nulls, empty/whitespace; existing names containing '.' will fail validation on upgrade.
- v0.5.1 Two-mode reference semantics enforced: SQL mode accepts arbitrary SQL; DSL mode strictly resolves identifiers.
- v0.5.1 RecallHit.match_count renamed to RecallHit.score across MCP, REST, CLI, and SlayerClient.