SLayer, a semantic layer maintained by your agent

v0.6.0 Breaking

This release includes one breaking change that platform teams should review before upgrading.

Published 2mo AI Agents & Assistants

✓ No known CVEs patched

Topics

semantic-layer

Affected surfaces

breaking_upgrade deps

ReleasePort's take

Light signal

editorial:auto 2mo

recall_memories API removed entirely without deprecation shim; migrate to unified search channel. Schema v5→v6 with auto-migration on load.

Why it matters: recall_memories is removed with no deprecation shim, so every caller must switch to the unified search tool (channel 1) as part of the upgrade to v0.6.0. Stored schemas migrate automatically on load.

Summary

AI summary

recall_memories is removed and fully replaced by search channel 1.

Changes in this release

Type	Severity	Summary	Source	Confidence	CVE
Breaking	Medium	recall_memories surface entirely removed with no deprecation shim.	llm_adapter@2026-05-21	low	—
Feature	Medium	Unified search tool retrieves memories and schema entities via channels.	llm_adapter@2026-05-21	high	—
Feature	Medium	Dense embedding similarity channel via litellm added to search.	llm_adapter@2026-05-21	high	—
Feature	Medium	BM25Plus entity-overlap ranker scoped to memories only in search.	llm_adapter@2026-05-21	high	—
Feature	Medium	Demo default size reduced from 4 years to 2 years.	llm_adapter@2026-05-21	high	—
Feature	Medium	New slayer search refresh-samples CLI command updates sample cache.	llm_adapter@2026-05-21	high	—
Feature	Medium	Existing storage auto-migrates on load via converter chain.	llm_adapter@2026-05-21	high	—
Feature	Medium	Tantivy full-text index channel covers memories and entity docs.	llm_adapter@2026-05-21	low	—
Feature	Medium	Cascade delete removes associated embeddings by canonical-id prefix.	llm_adapter@2026-05-21	low	—
Feature	Medium	Embeddings persisted with SHA256 content_hash for idempotency.	llm_adapter@2026-05-21	low	—
Feature	Medium	Demo jafgen logs streamed live to terminal for real-time progress.	llm_adapter@2026-05-21	low	—
Feature	Medium	Per-entity embedding failures are non-fatal, warnings appended.	llm_adapter@2026-05-21	low	—
Feature	Medium	Memory entity introduced at schema version v1.	llm_adapter@2026-05-21	low	—
Feature	Medium	Embedding entity introduced at schema version v1.	llm_adapter@2026-05-21	low	—
Feature	Medium	Default embedding model openai/text-embedding-3-small, overridable.	llm_adapter@2026-05-21	low	—
Feature	Medium	Canonical field enables literal entity string lookup in search.	llm_adapter@2026-05-21	low	—
Feature	Medium	SearchResponse includes resolved_input_entities for diagnostics.	llm_adapter@2026-05-21	low	—
Feature	Medium	Memory hits partitioned by query type with independent caps.	llm_adapter@2026-05-21	low	—
Feature	Medium	Column samples populated on ingest, edit_model, and inspect_model.	llm_adapter@2026-05-21	low	—
Feature	Medium	Fallback to line-pumping for non-TTY streams like ipykernel.	llm_adapter@2026-05-21	low	—
Feature	Medium	Column.sampled field caches profile strings for search indexing.	llm_adapter@2026-05-21	low	—
Feature	Medium	Search degrades gracefully when embeddings unavailable or missing.	llm_adapter@2026-05-21	low	—
Feature	Medium	Reciprocal Rank Fusion merges search channels with k=60.	llm_adapter@2026-05-21	low	—
Feature	Medium	Channel 2 builds an in-memory Tantivy full-text index covering memories and entity documents, tokenized with Porter stemmer and exact-match `canonical` field.	granite4.1:30b@2026-05-23-audit	low	—
Feature	Medium	Live streaming of jafgen logs shows real-time progress; falls back to line-pumping for non-TTY streams like ipykernel.	granite4.1:30b@2026-05-23-audit	low	—
Feature	Medium	Embeddings are persisted with a SHA256 `content_hash` for idempotent re-runs; failures are non-fatal and logged as warnings.	granite4.1:30b@2026-05-23-audit	low	—
Feature	Medium	`Memory` and `Embedding` schema entities introduced at version v1, each gaining an explicit `version` field.	granite4.1:30b@2026-05-23-audit	low	—
Dependency	Medium	Embedding search gated behind optional embedding_search extra.	llm_adapter@2026-05-21	low	—
Dependency	Low	Embedding search functionality requires the optional `embedding_search` extra, pulling in `litellm` and `numpy`.	granite4.1:30b@2026-05-23-audit	low	—
Performance	Medium	Sampled cache makes search indexing cheap and inspect_model stable.	llm_adapter@2026-05-21	high	—
Refactor	Medium	delete_model and delete_datasource moved to StorageBackend ABC.	llm_adapter@2026-05-21	low	—
Refactor	Medium	SlayerModel schema version bumped from v5 to v6.	llm_adapter@2026-05-21	low	—
Refactor	Medium	`delete_model` and `delete_datasource` moved to the `StorageBackend` ABC, handling cascade deletion of associated embeddings by canonical-id prefix.	granite4.1:30b@2026-05-23-audit	low	—

Full changelog

SLayer 0.6.0 Release Notes

A feature release: three PRs since 0.5.1 collapse the agent retrieval surface into a single search tool, add three-channel ranking over both memories and schema entities (entity-overlap BM25 + tantivy full-text + optional dense embeddings via litellm), introduce a persisted per-column sampled snapshot that feeds the search index, and tighten the bundled Jaffle Shop demo (streamed jafgen logs, default size dropped from 4 years to 2). One breaking change rides along: recall_memories is gone -- no deprecation shim, no alias -- and is fully replaced by search channel 1. SlayerModel bumps from v5 to v6 (forward no-op converter) and two new entities pick up an explicit version field: Memory and Embedding, both at v1.

Unified `search` tool with two retrieval channels (DEV-1375, BREAKING)

A new search tool retrieves memories AND schema entities (datasources, non-hidden models, non-hidden columns, named ModelMeasures, custom Aggregations) through two parallel channels merged by Reciprocal Rank Fusion (k=60). Channel 1 is the BM25Plus entity-overlap ranker recall_memories used to wrap, now scoped to memories only. Channel 2 builds a fresh in-memory tantivy index per call covering memories union entity docs, tokenised by en_stem (Porter stemmer plus default tokenisation on _ and .) so "shipped" matches "shipping" and "customer" matches customer_id; an exact-match canonical field lets agents paste a literal <ds>.<model>.<col> string and get the doc back directly.

The behaviour matrix is documented in docs/concepts/search.md -- both inputs run both channels; entity-only input runs channel 1 only; question-only input runs channel 2 only; the empty input returns the newest learning-only plus query-bearing memories with a warning. Memory hits are partitioned by Memory.query is None: learning-only memories land in memories, query-bearing memories in example_queries, each capped independently so bulky example queries cannot crowd out small learning-only notes. SearchResponse.resolved_input_entities echoes the resolver output for diagnostics.

The same surface is exposed across MCP search, REST POST /search, CLI slayer search [--entity ...] [--question ...] [--query ...], and SlayerClient.search.

The breaking part: the entire recall_memories surface is gone -- MCP tool, REST POST /memories/recall, CLI slayer memory recall, SlayerClient.recall_memories, the RecallHit / RecallResponse Pydantic models, and MemoryService.recall_memories. Channel 1 of search is the exact same BM25 ranker on the exact same canonical-entity index; migration is a one-call swap.
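The channel selection and fusion described above can be sketched in a few lines. This is a minimal illustration, not SLayer's code: the function names `channels_for` and `rrf_fuse` are hypothetical, the 1-based rank convention is the textbook Reciprocal Rank Fusion formula (score = sum over channels of 1 / (k + rank)), and only k=60 and the two-channel behaviour matrix come from the release notes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def channels_for(has_entities: bool, has_question: bool) -> List[int]:
    """Two-channel behaviour matrix: entities feed channel 1, a question feeds channel 2.

    An empty list corresponds to the "empty input" case, where the notes say
    the newest memories are returned together with a warning.
    """
    channels: List[int] = []
    if has_entities:
        channels.append(1)  # BM25Plus entity-overlap ranker (memories only)
    if has_question:
        channels.append(2)  # tantivy full-text index (memories and entity docs)
    return channels


def rrf_fuse(rankings: Iterable[Sequence[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Merge ranked id lists with Reciprocal Rank Fusion.

    Each ranking lists document ids from best to worst. Ranks start at 1,
    which is an assumption of this sketch.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    channel_1 = ["m1", "m2", "m3"]  # hypothetical memory ids
    channel_2 = ["m2", "m4", "m1"]
    for doc_id, score in rrf_fuse([channel_1, channel_2]):
        print(doc_id, round(score, 5))
```

Because RRF uses only ranks, the raw BM25 and tantivy scores never need to be put on a common scale, which is presumably why it suits channels with incomparable scoring.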

Persisted per-column sample-value cache (DEV-1375)

Every Column gains an optional sampled: Optional[str] field caching the per-column profile string the search index renders into each entity's text field. Previously this was recomputed by inspect_model on every call; the cache makes search indexing cheap and gives inspect_model a stable rendered snapshot. Populated on slayer ingest / ingest_datasource_models MCP / POST /ingest for every table-backed model in the touched datasource; on the new slayer search refresh-samples [--data-source X] [--model M ...] CLI; on edit_model (column-level edits refresh that column; model-level filter / sql / source-query changes refresh every column); and lazily on inspect_model with best-effort write-back when the cache is None. sql-mode and query-backed models are silently skipped in this release. The schema bump to SlayerModel v6 is a no-op forward converter (slayer/storage/v6_migration.py); the field defaults to None and is populated by the first subsequent ingest.
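The lazy path on inspect_model (fill the cache when it is None, then write back on a best-effort basis) might look roughly like the sketch below. Everything except the `sampled: Optional[str]` field is an assumption made for illustration: the real Column is not a dataclass as far as these notes say, and `inspect_column`, `profile`, and `save` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Column:
    """Stand-in for a schema column; only `sampled` mirrors the release notes."""
    name: str
    sampled: Optional[str] = None


def inspect_column(
    column: Column,
    profile: Callable[[str], str],
    save: Callable[[Column], None],
) -> str:
    """Return the cached profile string, computing it only when the cache is empty."""
    if column.sampled is None:
        column.sampled = profile(column.name)  # the expensive step, e.g. a query
        try:
            save(column)  # best-effort write-back
        except Exception:
            # A failed write-back must not fail the inspection itself;
            # the value is simply recomputed on a later call.
            pass
    return column.sampled


if __name__ == "__main__":
    calls = []

    def fake_profile(name: str) -> str:
        calls.append(name)
        return f"{name}: 3 distinct values"

    col = Column("status")
    print(inspect_column(col, fake_profile, lambda c: None))
    print(inspect_column(col, fake_profile, lambda c: None))
    print("profile calls:", len(calls))
```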

Optional embedding-based third retrieval channel (DEV-1386)

A third channel runs alongside tantivy and entity-overlap BM25: dense embedding similarity via litellm, with the question embedded once per call and cosine similarity computed in numpy over a corpus matrix loaded fresh from storage. Memory rankings are RRF-fused across all three channels; entity hits -- previously tantivy-only with raw scores -- are RRF-fused across channels 2 and 3. The channel is gated behind the optional embedding_search extra (pip install motley-slayer[embedding_search] -- pulls litellm and numpy); when the extra is missing, no provider key is configured, or no embedding rows exist for the active model, the channel emits one warning into SearchResponse.warnings and search degrades gracefully via tantivy + BM25.

The default model is openai/text-embedding-3-small; override via the SLAYER_EMBEDDING_MODEL env var in <provider>/<model-name> litellm format. Provider credentials (OPENAI_API_KEY, AZURE_API_KEY, etc.) are read by litellm directly.

Embeddings are persisted in a sidecar table (SQLite embeddings table / YAML embeddings.yaml) keyed by (canonical_id, embedding_model_name), with a SHA256 content_hash on each row so idempotent re-runs skip the litellm call when the source text hasn't changed. Storage is JSON lists of floats (~6 KB per 1536-dim row) -- portable, debuggable, dialect-neutral.

Per-entity embed failures (rate limits, transient network errors, bad keys) are non-fatal: the failing row is not written and a warning is appended to the response. Switching SLAYER_EMBEDDING_MODEL mid-project leaves old rows inert; re-run slayer ingest or re-save the memories to populate the new model's rows. Dimension mismatch between the question embedding and stored rows is detected and warns instead of crashing. Refresh edges match Column.sampled: slayer ingest, edit_model, and save_memory. New Memory and Embedding schema entities both ship at v1.
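The content_hash idempotency and the non-fatal failure handling can be illustrated with a small sketch. It assumes an in-memory dict in place of the SQLite or YAML sidecar, and `upsert_embedding`, `Store`, and the `embed` callable are hypothetical names; only the (canonical_id, embedding_model_name) key, the SHA256 hash, and the "skip the row, append a warning" behaviour are taken from the notes.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

# (canonical_id, embedding_model_name) -> (content_hash, vector)
Store = Dict[Tuple[str, str], Tuple[str, List[float]]]


def content_hash(text: str) -> str:
    """SHA256 hex digest of the text that would be embedded."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def upsert_embedding(
    store: Store,
    canonical_id: str,
    model: str,
    text: str,
    embed: Callable[[str], List[float]],
    warnings: List[str],
) -> bool:
    """Embed `text` unless an up-to-date row exists. Returns True when a row was written."""
    key = (canonical_id, model)
    digest = content_hash(text)
    existing = store.get(key)
    if existing is not None and existing[0] == digest:
        return False  # source text unchanged: skip the provider call entirely
    try:
        vector = embed(text)
    except Exception as exc:  # rate limits, network errors, bad keys, ...
        warnings.append(f"embedding failed for {canonical_id}: {exc}")
        return False  # non-fatal: the failing row is simply not written
    store[key] = (digest, vector)
    return True


if __name__ == "__main__":
    store: Store = {}
    warnings: List[str] = []
    fake_embed = lambda text: [float(len(text))]  # stands in for a litellm call
    model = "openai/text-embedding-3-small"
    print(upsert_embedding(store, "shop.orders", model, "orders table", fake_embed, warnings))
    print(upsert_embedding(store, "shop.orders", model, "orders table", fake_embed, warnings))
```

Keying rows by model name as well as canonical id is what makes rows for a previous model inert rather than wrong: a lookup under the new model name never sees them.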

Storage cascade-delete refactor (DEV-1386)

delete_model and delete_datasource are now defined on the StorageBackend ABC -- backends implement only the row-level _delete_model_row / _delete_datasource_row primitives and the ABC wrappers handle embedding-row cascade by canonical-id prefix. delete_model drops every embedding under <ds>.<model>% (model doc + columns + measures + aggregations); delete_datasource drops every row under <ds>%; delete_memory drops the matching memory:<id> row. Matches the existing pattern from DEV-1361 where collision and validation rules live in the ABC, not duplicated per backend.
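The split between ABC wrappers and row-level primitives is a template-method pattern, sketched below under stated assumptions. The class names `SketchStorageBackend` and `InMemoryBackend` and the `_delete_embeddings_with_prefix` helper are hypothetical; `_delete_model_row`, `_delete_datasource_row`, and the canonical-id prefix rule are the parts named in the notes.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Set, Tuple


class SketchStorageBackend(ABC):
    """The public delete methods live here; backends supply only the primitives."""

    def delete_model(self, datasource: str, model: str) -> None:
        self._delete_model_row(datasource, model)
        # Cascade: model doc, columns, measures, aggregations share this prefix.
        self._delete_embeddings_with_prefix(f"{datasource}.{model}")

    def delete_datasource(self, datasource: str) -> None:
        self._delete_datasource_row(datasource)
        self._delete_embeddings_with_prefix(datasource)

    @abstractmethod
    def _delete_model_row(self, datasource: str, model: str) -> None: ...

    @abstractmethod
    def _delete_datasource_row(self, datasource: str) -> None: ...

    @abstractmethod
    def _delete_embeddings_with_prefix(self, prefix: str) -> None: ...


class InMemoryBackend(SketchStorageBackend):
    """Toy backend used only to exercise the wrappers above."""

    def __init__(self) -> None:
        self.models: Set[Tuple[str, str]] = set()
        self.embeddings: Dict[str, List[float]] = {}

    def _delete_model_row(self, datasource: str, model: str) -> None:
        self.models.discard((datasource, model))

    def _delete_datasource_row(self, datasource: str) -> None:
        self.models = {m for m in self.models if m[0] != datasource}

    def _delete_embeddings_with_prefix(self, prefix: str) -> None:
        for canonical_id in [c for c in self.embeddings if c.startswith(prefix)]:
            del self.embeddings[canonical_id]


if __name__ == "__main__":
    backend = InMemoryBackend()
    backend.models = {("shop", "orders"), ("shop", "customers")}
    backend.embeddings = {
        "shop.orders": [0.1], "shop.orders.id": [0.2],
        "shop.customers": [0.3], "memory:1": [0.4],
    }
    backend.delete_model("shop", "orders")
    print(sorted(backend.embeddings))
```

Note that a bare prefix match also catches sibling names that merely start with the same characters (for example a model called orders_archive when deleting orders); how the actual implementation guards against that is not stated in these notes.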

Bundled demo: streamed logs and faster default (PR #104, follow-up d34b8bf)

slayer datasources create demo now forwards jafgen's stdout/stderr live to the user's terminal so the multi-minute generation step shows real-time progress instead of a silent wait. When the active stream lacks a real file descriptor (e.g. an ipykernel OutStream shim), the implementation falls back to a line-pumping Popen so notebook integration tests still pass; on real TTYs it keeps the inheriting fast path so Rich progress animations render correctly. The default demo size drops from 4 years to 2 -- enough to exercise every Jaffle Shop schema feature, but fast enough that slayer serve --demo and slayer mcp --demo finish inside MCP-client startup timeouts. Override with --years N on slayer datasources create demo.
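The descriptor check and the line-pumping fallback can be sketched with the standard library alone. This is an illustration of the general technique, not the project's code: `has_real_fd` and `run_streaming` are hypothetical names, and the fast path here hands the stream to the child explicitly, whereas the notes describe plain inheritance of the terminal.

```python
import io
import subprocess
import sys
from typing import Sequence, TextIO


def has_real_fd(stream: TextIO) -> bool:
    """True when the stream is backed by an OS-level file descriptor."""
    try:
        stream.fileno()
        return True
    except (AttributeError, OSError, ValueError):
        # io.UnsupportedOperation (raised by io.StringIO and similar shims)
        # is a subclass of both OSError and ValueError.
        return False


def run_streaming(cmd: Sequence[str], stream: TextIO) -> int:
    """Run `cmd`, forwarding its stdout and stderr to `stream` as they are produced."""
    if has_real_fd(stream):
        # Fast path: the child writes to the descriptor directly, so
        # terminal-aware output such as progress bars renders as usual.
        stream.flush()
        return subprocess.run(cmd, stdout=stream, stderr=subprocess.STDOUT).returncode
    # Fallback: read the child's output line by line and re-emit it.
    with subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    ) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            stream.write(line)
            stream.flush()
    return proc.returncode


if __name__ == "__main__":
    child = [sys.executable, "-c", "print('tick'); print('tock')"]
    buffer = io.StringIO()  # no file descriptor, so the fallback is taken
    code = run_streaming(child, buffer)
    print(code, buffer.getvalue().splitlines())
```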

Schema versions

SlayerModel v5 -> v6 (forward no-op for the new optional Column.sampled field). SlayerQuery remains v3. DatasourceConfig remains v1. New entities: Memory v1 and Embedding v1. Existing storage migrates automatically on load via the converter chain in slayer/storage/migrations.py.
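A converter chain of this kind is commonly a table of per-version functions applied until the document reaches the current version. The sketch below shows the idea with a no-op v5 to v6 step; it is not the contents of slayer/storage/migrations.py, and the `version` key, `CONVERTERS`, and `migrate` names are assumptions for illustration.

```python
from typing import Callable, Dict

Converter = Callable[[dict], dict]

CURRENT_VERSION = 6


def v5_to_v6(doc: dict) -> dict:
    """Forward no-op: nothing is rewritten, only the version number moves.

    The new optional field needs no backfill because it defaults to None.
    """
    upgraded = dict(doc)
    upgraded["version"] = 6
    return upgraded


# Maps a source version to the converter producing the next version.
CONVERTERS: Dict[int, Converter] = {5: v5_to_v6}


def migrate(doc: dict) -> dict:
    """Apply converters in order until the document is at CURRENT_VERSION."""
    version = doc.get("version", CURRENT_VERSION)
    while version < CURRENT_VERSION:
        converter = CONVERTERS.get(version)
        if converter is None:
            raise ValueError(f"no converter registered for version {version}")
        doc = converter(doc)
        version = doc["version"]
    return doc


if __name__ == "__main__":
    stored = {"version": 5, "name": "orders", "columns": [{"name": "id"}]}
    print(migrate(stored))
```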

Breaking Changes

`recall_memories` API surface (MCP tool, REST `/memories/recall`, CLI `slayer memory recall`, `SlayerClient.recall_memories` and related Pydantic models) is removed and replaced by channel 1 of the new unified `search` tool.

Related context

Earlier breaking changes

v0.7.1 Changes `search()` to return a single flat `results` list capped by `max_results`, removing the separate `memories`, `example_queries`, and `entities` buckets and their per-bucket caps.
v0.6.3 Datasource names now reject dots, slashes, nulls, empty/whitespace; existing names containing '.' will fail validation on upgrade.
v0.5.1 Two-mode reference semantics enforced: SQL mode accepts arbitrary SQL; DSL mode strictly resolves identifiers.
v0.5.1 RecallHit.match_count renamed to RecallHit.score across MCP, REST, CLI, and SlayerClient.