No immediate action

v0.9.12 Breaking risk 1d

Case-sensitive YAML IDs

Open

Config change

v0.9.11 Bug fix 6d

Auth

Duplicate CTE name fix

Open

No immediate action

v0.9.10 Bug fix 8d

Connection string handling fix

Open

No immediate action

v0.9.9 Bug fix 9d

Quotes reserved words

Open

No immediate action

v0.9.8 New feature 11d

OSI importer

Open

No immediate action

v0.9.7 Maintenance 13d

Routine maintenance and dependency updates.

Open

Config change

v0.9.5 New feature 19d

RBAC

Cross‑join row‑level security

Open

No immediate action

v0.9.4 Bug fix 20d

Case‑folding SQL fixes + psql \d

Open

No immediate action

v0.9.3 Feature 24d

`root_hint` for recommend_root_model

Open

No immediate action

v0.9.2 New feature 24d

Postgres schema per datasource + MetricFlow improvements

Open

Config change

v0.9.0 New feature 1mo

RBAC

Row-level security

Open

No immediate action

v0.8.3 Bug fix 1mo

ColumnCycleError fix

Open

No immediate action

v0.8.1 Maintenance 1mo

Routine maintenance and dependency updates.

Open

No immediate action

v0.8.0 New feature 1mo

Postgres facade

Open

No immediate action

v0.7.5 Bug fix 1mo

Quoted/capitalised table name fix

Open

No immediate action

v0.7.4 New feature 1mo

Tier 1 Snowflake + BigQuery

Open

No immediate action

v0.7.3 Breaking risk 1mo

Compact search/model summaries

Open

No immediate action

v0.7.2 Mixed 1mo

Deduplication opt‑out + SQLite type widening

Open

No immediate action

v0.7.1 Breaking risk 1mo

MSSQL datasource + search flattening

Open

No immediate action

v0.7.0 New feature 1mo

Postgres protocol access

Open

No immediate action

v0.6.10 New feature 2mo

sampled_values + distinct_count

Open

No immediate action

v0.6.9 Breaking risk 2mo

String memory IDs

Open

Config change

v0.6.8 Breaking risk 2mo

Rejects colon in identifiers

Open

No immediate action

v0.6.7 New feature 2mo

Multi-stage queries

Open

No immediate action

v0.6.6 New feature 2mo

Nested queries API

Open

No immediate action

v0.6.5 New feature 2mo

Circular reference detection

Open

No immediate action

v0.6.4 Maintenance 2mo

Routine maintenance and dependency updates.

Open

Review required

v0.6.3 Breaking risk 2mo

Auth RBAC

SQLite embedding store + datasource filter

Open

No immediate action

v0.6.2 Maintenance 2mo

Routine maintenance and dependency updates.

Open

No immediate action

v0.6.1 Breaking risk 2mo

Boot-time auto-ingest

Open

Review required

v0.6.0 Breaking risk 2mo

Breaking upgrade Dependencies

removal of recall_memories

Open

v0.5.1 Breaking risk 2mo

Breaking changes

`RecallHit.match_count: int` is renamed to `RecallHit.score: float` across all API surfaces (MCP, REST, CLI, SlayerClient).

Notable features

BM25 ranking for memory recall via new `bm25_rank` module and `rank_bm25.BM25Plus`.
Two-mode reference semantics locked down: SQL mode accepts arbitrary functions; DSL mode enforces strict identifier resolution with a string‑hygiene allowlist (lower, upper, trim, replace, substr, instr, length, concat).
MCP `query` tool now accepts inline `source_model` dicts (`str | ModelExtension | SlayerModel`).

Full changelog

SLayer 0.5.1 Release Notes

A maintenance release: six PRs since 0.5.0 tighten the agent-facing surface that the bird-interact-agents benchmark pounded on. The headline items are a BM25 ranker for the memory layer, a cleaned-up two-mode reference model that pins what is and isn't valid in Column.sql / Column.filter / SlayerModel.filters versus DSL formula and filter strings, SQL-mode model filters now properly accepting arbitrary SQL functions plus a small string-hygiene allowlist for DSL-mode query filters (lower, upper, trim, replace, substr, instr, length, concat, and ||), an MCP query tool that finally accepts inline source-model dicts, and clearer diagnostics for two filter-parser edge cases (path-qualified LIKE and subquery-shaped predicates). One small breaking change rides along: RecallHit.match_count: int is renamed to RecallHit.score: float -- no alias.

BM25 ranking for memory recall (DEV-1365, BREAKING)

MemoryService.recall_memories previously ranked candidates by raw entity-overlap count, which trivially favoured memories tagged with large entity sets -- a memory linked to 50 entities would out-overlap a precisely-tagged one of 2 regardless of relevance. The new slayer/memories/ranker.py module exposes bm25_rank(memories, query_entities) over canonical entity sets, using rank_bm25.BM25Plus (now a core dep). IDF / avgdl are computed across the full memory corpus (recall calls storage.list_memories(entities=None) rather than the intersection-filtered form); an explicit set-intersection pre-filter still enforces "must overlap on at least one entity" and BM25 is used purely to rank the eligible set. BM25Plus (rather than BM25Okapi) is chosen specifically because at small corpus sizes Okapi's IDF goes negative for terms appearing in even a moderate fraction of documents, which inverts BM25's length normalisation and reproduces the exact bug DEV-1365 was meant to fix; the rationale is documented in the module header. The empty-about recency fallback is unchanged. The breaking part: RecallHit.match_count: int is replaced by RecallHit.score: float across MCP, REST, CLI, and SlayerClient. Hard rename, no alias -- callers introspecting the field break loudly rather than silently misreading. inspect_model's Learnings section is unchanged (it is a per-model browsing view, not a retrieval query, and still pulls memories via the storage entity-intersection filter in insertion order).

Two-mode reference semantics (DEV-1369, BREAKING)

SLayer has always had two distinct expression layers but the implementation had drifted; this release pins the intended semantics and enforces them. SQL mode (Column.sql, Column.filter, SlayerModel.filters) is sqlglot-parsed free SQL: arbitrary function calls (json_extract, coalesce, nullif, CASE WHEN, ...), dialect operators, and __-delimited join paths (customers__regions.name) are all accepted; DSL constructs (aggregation colon syntax, transform calls, raw OVER (...) in filters) are rejected. DSL mode (ModelMeasure.formula, SlayerQuery.{measures, filters, dimensions, time_dimensions, order, main_time_dimension}) is the Python-AST DSL: bare names strict-resolve at enrichment time against the model's defined Column / ModelMeasure / custom-aggregation / query-level alias set, and raw SQL function calls or unknown names raise actionable errors. The DEV-1336 escape hatch -- auto-promoting a query filter against a Column whose sql contained a window function to a post-aggregation outer WHERE -- is removed; the rank-family transforms (DEV-1353) cover top-N filtering in pure DSL and the new error message points at them or at multi-stage source_queries. Opportunistic consolidation: the new slayer/core/refs.py is now the single source of truth for identifier-shape regexes (AGG_REF_RE, IDENT_OR_PATH_RE, DOTTED_IDENT_REF_RE, IDENTIFIER_RE), the agg-suffix parser (canonical_agg_name, strip_agg_suffix), and the user-input dunder helper -- four prior duplicates in formula.py, dbt/converter.py, engine/enrichment.py, and memories/resolver.py re-export from here. _walk_join_chain collapses the two near-duplicate join walkers in query_engine.py. Migration: stored YAML / SQLite models with aggregation colon syntax or transform calls inside Column.filter / SlayerModel.filters will fail at load (move the predicate to a query-level filter or a ModelMeasure.formula); stored SlayerQuery objects that filter on a windowed Column will raise (switch to filters=["rank(<measure>) <= N"] or factor the window column into a multi-stage source); stored queries that filter on bare names not declared as Column entries on the model (silent pass-through to underlying-table columns) will raise at enrichment (declare the column on the model first). The new docs/concepts/references.md is the single source of truth; CLAUDE.md, docs/concepts/queries.md, and docs/concepts/models.md point at it.

Filters referencing unjoined models now raise at enrichment (DEV-1367)

A query filter of the form <other_model>.<col> <op> <value>, where <other_model> is not in the source model's joins, used to silently render SQL with an unbound table reference in the WHERE clause and fail at execution with a cryptic "no such column" -- the agent received an opaque SQL-runtime error instead of a clear translation-time error. DEV-1369's strict-resolution check fired only on the bare-name branch of resolve_filter_columns; this PR closes the dotted-path gap. The dotted-path branch now raises ValueError at enrichment when the path's head segment doesn't match any ModelJoin.target_model on the source model (error: "Filter '' references model '' but it is not in joins for source model ''. Add it to source_model.joins or rewrite the filter to use a local derived column.") or when the path resolves through joins but the leaf column doesn't exist on the terminal model. The cross-model-measure rerooting CTE path (_build_rerooted_enriched) inherits the outer query's filter list and some of those filters reference models reachable from the outer source but not from the re-rooted source; a new drop_unreachable_filters: bool = False kwarg threads through enrich_query / _enrich onto resolve_filter_columns's drop_if_unresolved parameter. The rerooting call sets it True so unresolved dotted paths drop from the parsed-filter list rather than raising; outer-query callers keep the strict default.

MCP `query` tool accepts inline `source_model` (DEV-1372)

The MCP query tool's source_model parameter is now typed str | ModelExtension | SlayerModel, matching SlayerQuery.source_model's native polymorphism. Previously typed str, which forced agents to JSON-encode inline dicts -- SLayer then validated the JSON blob as a model name and rejected it with Invalid model name '{...}': must not contain '/'. Wire-level evidence from a single 15-task households / Haiku 4.5 benchmark run: 15 / 88 query-tool calls (17 %) failed this way. The run-by-name shortcut (engine.execute(str) with model-name strings) is preserved -- it layers model.query_variables -> stage -> runtime precedence that the regular SlayerQuery path doesn't reach -- and is gated on isinstance(source_model, str) so inline values fall through to SlayerQuery.model_validate. The variable-precedence asymmetry that forces this gate is filed separately as DEV-1373.

SQL-mode predicate routing and string-hygiene allowlist (DEV-1378)

Closes two implementation gaps left over from the DEV-1369 reference-semantics work. First, Column.filter and SlayerModel.filters (SQL mode) were authored as raw SQL but re-parsed at enrichment time through the DSL parser, so any model filter containing json_extract, coalesce, CASE WHEN, or any other arbitrary SQL function raised Unknown filter function 'json_extract' at runtime. The new slayer/sql/sql_predicate.py module exposes a dedicated parse_sql_predicate that rejects DSL constructs (colon syntax, transform calls, raw OVER (...)) up front, extracts column-shaped identifiers, and otherwise accepts any SQL function call; every enrichment site that handles an SQL-mode filter (measure_def.filter, the model-filter validation loop, _collect_needed_paths, and _resolve_joins) now routes through this parser. DSL-mode query filters (SlayerQuery.filters) continue through the DSL parser unchanged. Strictness is split per mode: model filters resolve with strict=False, query filters with strict=True. Second, DSL-mode query filters now accept a small lowercase string-hygiene allowlist -- lower, upper, trim, replace, substr, instr, length, concat -- exposed as the STRING_HYGIENE_OPS frozenset in slayer/core/formula.py; the SQL || operator is rewritten to concat(...) by a new _preprocess_concat pass so agents can write lower(name) || '_x' = 'foo_x' in filter strings. Names are lowercase only, matching SLayer's existing transform convention; sqlglot translates per-dialect at SQL-generation time. Also folded in: the SQL generator's predicate parser now wraps user-supplied fragments in SELECT 1 WHERE ... before parsing (_parse_predicate), which dodges a sqlglot trap on SQLite / MySQL where REPLACE at the start of a fragment was parsed as the REPLACE INTO statement keyword rather than the function call. Docs updated in docs/concepts/queries.md; CLAUDE.md and .claude/skills/slayer-query.md reflect the new allowlist.

Filter-parser fixes for path-qualified LIKE and subquery-in-filter (DEV-1376)

Two DSL filter-parser gaps surfaced by the bird-interact-agents benchmark. Path-qualified LIKE / NOT LIKE (<joined_model>.<col> like '...') was rejected with "Unsupported filter syntax" because _preprocess_like's LHS regex excluded .; the four-step preprocess collapses into a single regex (_LIKE_RE) whose LHS accepts dotted paths and whose pattern group captures the closing quote in one shot. Subquery-shaped predicates (housenum in (select ...), not in (select ...), exists (select ...)) used to surface Python's misleading "Perhaps you forgot a comma?" advice and send agents on a nonsense recovery path; a new _SUBQUERY_IN_FILTER_RE early-rejects in parse_filter next to the existing has_window_function check and points at source_queries / Column.sql / joins instead. Narrowly scoped to IN (SELECT...) / NOT IN (SELECT...) / EXISTS (SELECT...); existing UNION SELECT / ; SELECT 1 paths keep their original handling.

Schema versions

SlayerModel remains version 5, SlayerQuery remains version 3, DatasourceConfig remains version 1. No storage migrations were needed for this release.

View release on GitHub

v0.5.0 Breaking risk 2mo

⚠ Upgrade required

Automatic v4 → v5 storage migration runs on load; use `slayer storage migrate-types [--dry-run]` for explicit post‑migration refinement.
The Jaffle Shop demo now depends on the upstream `jafgen` package (`^0.4.14`) as a core dependency, removing the need for manual git installs.

Breaking changes

DataType enum values are now uppercase sqlglot‑aligned names (TEXT, INT, DOUBLE, BOOLEAN, DATE, TIMESTAMP) instead of lowercase strings.
Unused aggregation pseudo-types (`count`, `sum`, `avg`, `min`, `max`, `last`, `count_distinct`) have been removed from DataType.

Notable features

Schema‑drift detection and idempotent re‑ingestion with `engine.validate_models` and `ingest_datasource_idempotent()`.
Agent memory layer for saving/retrieving free‑form learnings and example queries via MCP, REST, CLI, and async client methods.

Full changelog

SLayer 0.5.0 Release Notes

A feature release: five PRs since 0.4.3 add an agent-memory layer, a schema-drift validation and idempotent re-ingestion subsystem, three new ranking transforms, and type-aware CAST emission. Two breaking changes ride along, both covered by an automatic v4 -> v5 storage migration: DataType enum values are now uppercase sqlglot-aligned names (TEXT, INT, DOUBLE, BOOLEAN, DATE, TIMESTAMP), and the unused aggregation pseudo-types (count, sum, avg, min, max, last, count_distinct) have been dropped from DataType.

Schema-drift detection and idempotent re-ingestion (DEV-1356)

A new read-only diagnostic, engine.validate_models(data_source=...), diffs persisted models against live database schemas and returns the minimal list of deletes needed for SQL generation to remain valid (EditModelDelete for individual columns / measures / joins, WholeModelDelete when an entire backing table is gone). The same surface is exposed as the MCP validate_models tool, the REST POST /validate-models endpoint, and the CLI slayer validate-models [--data-source X] [--json]; --force-clean actually applies the deletes for batch repair workflows. The new top-level slayer/engine/schema_drift.py module owns the cascade walking and a two-pass invariant so dropping a column also prunes anything that referenced it. Query-time DBAPI errors that are caused by drift are now wrapped as the new SchemaDriftError (in slayer/core/errors.py) carrying a structured payload instead of opaque driver text. ingest_datasource_idempotent() in slayer/engine/ingestion.py walks the live datasource and only adds to existing sql_table-mode models -- it never overwrites user customizations and silently skips sql-mode and query-backed models. See docs/concepts/schema-drift.md and docs/examples/schema-drift.md.

Agent memory layer: learnings and saved queries (DEV-1357)

A new slayer.memories package (models.py, service.py, resolver.py) lets agents save free-form notes ("learnings") and example queries indexed by the schema entities they reference, then recall them before drafting new queries. Three MCP tools (save_memory, forget_memory, recall_memories), three REST endpoints (POST /memories, DELETE /memories/{memory_id}, POST /memories/recall), three CLI subcommands (slayer memory save|forget|recall), and matching SlayerClient async methods all front the same MemoryService orchestrator. Entity references are canonicalized through slayer.memories.resolver; unresolvable references raise the new EntityResolutionError (HTTP 422), and missing memory IDs raise the new MemoryNotFoundError (HTTP 404). inspect_model now renders a Learnings section listing memories that reference the inspected model. Storage adds a memories table (SQLite) plus an entity-overlap index, and a memories.yaml + counters.yaml pair (YAML backend). See docs/concepts/memories.md and specs/DEV-1357-memories.md.

Type-aware CAST emission and DataType realignment (DEV-1361, BREAKING)

DataType enum values were lowercase strings ("string", "number", "integer", "boolean", "date", "time") plus seven unused aggregation pseudo-types ("count", "sum", ...); they are now uppercase sqlglot-aligned names (TEXT, INT, DOUBLE, BOOLEAN, DATE, TIMESTAMP), and the pseudo-types are gone. Generated SQL now emits explicit CAST(... AS <type>) on column SQL and measure expressions (idempotent against pre-cast input, so CAST(... AS INT) is not double-wrapped) so result tuples match downstream comparators that care about driver-returned types. Auto-ingestion now narrows generic numeric SQL columns to INT vs DOUBLE based on live precision/scale via slayer/storage/type_refinement.py instead of always emitting DOUBLE. The SlayerModel schema bumps to v5: the v4 -> v5 converter in slayer/storage/v5_migration.py is a pure dict transform that rewrites legacy lowercase types and strips the dropped pseudo-types so the field falls through to its default after Pydantic loads; it recurses into source_queries[].source_model inline dicts so multi-stage models migrate cleanly. Migration is automatic on load via the converter chain in slayer/storage/migrations.py. The new slayer storage migrate-types [--dry-run] [--data-source X] CLI exposes the optional post-migration datasource introspection refinement step explicitly for batch workflows. Lenient before-validators on Column.type and ModelMeasure.type still accept legacy lowercase input from external callers without erroring.

Ranking transform family (DEV-1353)

Three new first-class transforms -- percent_rank, dense_rank, ntile -- join the existing rank. All four accept an optional partition_by= kwarg to rank within subsets of rows (a column name, a dotted join path, or a list of either) and require no time dimension; ntile additionally takes a positive-integer n= kwarg. The SQL generator emits native PERCENT_RANK(), DENSE_RANK(), NTILE(n) window functions; positional args beyond the wrapped measure are rejected with an actionable error pointing at the keyword form. See the rank-family entries in slayer/help/topics/04_transforms.md and docs/concepts/formulas.md.

Demo: upstream `jafgen` is now a core dependency

The Jaffle Shop demo dataset switched from a vendored, extras-only git-URL install to the upstream jafgen package pinned to ^0.4.14 as a core dependency, so pip install motley-slayer alone enables slayer datasources create demo (and the --demo flags on slayer serve / slayer mcp) without any manual git-based install. slayer/demo/jaffle_shop.py was rewritten to drive the upstream CLI rather than ship its own vendored DDL.

Schema versions

SlayerModel is now version 5 (was 4 in 0.4.3). SlayerQuery remains version 3 and DatasourceConfig remains version 1. All older versions are upgraded automatically on load via the converter chain in slayer/storage/migrations.py; saves always emit the current version.

View release on GitHub

v0.4.3 Breaking risk 2mo

⚠ Upgrade required

New `data_source` argument required on `engine.execute`, MCP model tools, HTTP `/models` endpoints (`?data_source=`), and `SlayerClient`. Existing bare‑name calls will raise `AmbiguousModelError` if multiple datasources match.
Use new datasource‑priority APIs (`set_datasource_priority`, `get_datasource_priority`, CLI `slayer datasources priority`) to resolve ambiguous lookups without passing `data_source` on every call.

Breaking changes

Models are now keyed by `(data_source, name)` instead of bare `name`; schema bumps from v3 to v4 with automatic migration moving YAML files; bare-name lookups that match more than one datasource raise `AmbiguousModelError` (HTTP 409).

Notable features

Recursive expansion of derived columns allows any column on the same or joined model—including other derived columns—to be referenced in SQL expressions.
Auto‑join discovery for filter‑only cross‑table references automatically adds implied joins when a filter uses a derived column that spans a join.
Multi‑stage query `source_queries` can now declare `joins.target_model` against prior sibling stages, forming a true DAG instead of being limited to final stage joins.

Full changelog

SLayer 0.4.3 Release Notes

A bug-fix and quality release: thirteen PRs since 0.4.2 covering storage, the query planner, SQL generation, MCP introspection, and a large drop in test-suite wall time.

Datasource-scoped model storage

Models are now keyed by (data_source, name) instead of bare name, so two datasources sharing a table name no longer collide silently. The schema bumps from v3 to v4, with an automatic migration that moves YAML files from models/<name>.yaml into models/<data_source>/<name>.yaml and rebuilds the SQLite models table with a composite primary key. A new data_source argument flows through engine.execute, the MCP create_model / edit_model / inspect_model / delete_model tools, the HTTP /models endpoints (?data_source=), and SlayerClient. Bare-name lookups that match more than one datasource raise the new AmbiguousModelError (HTTP 409). New datasource-priority APIs -- set_datasource_priority / get_datasource_priority, GET/PUT /datasources/priority, and the slayer datasources priority CLI -- let you disambiguate without passing data_source on every call.

Recursive expansion of derived columns

A Column.sql may now reference any other column on the same model or on a joined model -- including derived columns whose own sql is itself an expression. Chains like "A.bar / B.foo_normalized" (where foo_normalized is itself derived) used to fail at execute with no such column; expansion now happens at enrichment time and applies uniformly to dimensions, time dimensions, measures, cross-model measures, and filters. Reference cycles raise a clear error.

Auto-join discovery for filter-only cross-table refs

When a filter references a bare-named local derived column whose Column.sql itself crosses a join (e.g., filters=["is_eu = 1"] where is_eu.sql reads customers.region), the planner now walks the column's reference chain, discovers the implied joins, and adds them. Previously the column also had to appear in dimensions to force the join.

Window functions in filters

Filter expressions that resolve to SQL containing OVER (...) no longer crash with near "OVER": syntax error. A model column whose sql contains a window function is auto-promoted to a computed column on the inner SELECT and referenced via an outer post-filter wrap, so it can appear in a query filter naturally. Raw OVER text inside a ModelMeasure.formula or an inline query filter is rejected at construction or parse time with an actionable error pointing at the rank() / first() / last() / lag() / lead() transforms.

Multi-stage query joins form a DAG

Non-final named stages in source_queries can now declare joins.target_model against prior named sibling stages, so a multi-stage query forms a true DAG instead of being limited to "final stage joins anything, inner stages join nothing". Forward references and self-references surface as clear errors instead of the generic Model 'X' not found.

Honor user-supplied measure names on inner stages

{"formula": "amount:sum", "name": "rev"} on an inner SlayerQuery stage now emits the alias rev instead of the canonical amount_sum. Downstream stages that reference the renamed measure no longer error or surface NULLs.

SQLite `json_extract()` semantics fix

sqlglot's default SQLite generator was emitting JSONExtract as col -> '$.path', which in SQLite returns the JSON-quoted form ('"Owned"' with literal quotes). Equality and CASE-WHEN matches against bare-string literals silently produced all-NULL or 0 -- wrong answers, no error. The generator now rewrites every JSONExtract to JSON_EXTRACT(...) on SQLite while leaving Postgres / DuckDB / MySQL paths and JSONExtractScalar (->>) untouched.

Preserve `log10(x)` / `log2(x)` literal form

sqlglot was canonicalizing log10(x) and log2(x) into LOG(10, x) / LOG(2, x), which broke benchmark agents reading back inspect_model.last_sql and triggered runtime errors on dialects without 2-arg LOG(b, x) (Oracle in some configs, embedded SQLite builds without log). The generator now rewrites the canonicalized form back to log10(...) / log2(...) on every dialect that natively supports the alias. A strict-error log2 UDF for SQLite is registered alongside the existing _log10 for parity with Postgres semantics.

`aggN` placeholder leak fix

When a measure formula wrapped a colon-syntax aggregation reference inside a non-transform SQL call (nullif, coalesce, ...) and combined it with another aggregation in arithmetic, the embedded __aggN__ placeholder leaked all the way to emitted SQL, crashing execution with no such column: __aggN__. The formula AST walker now recurses into non-transform Call args and keywords so the placeholder is registered correctly.

Multi-hop derived-column qualification regression tests

Six regression tests pin the contract that derived Column.sql reached via a multi-hop join path emits correctly qualified SQL across dim refs, cross-model measure aggregations, filters, time dimensions, the verbatim solar-panels reproduction from the original issue, and the diamond-join arm-isolation case. The bug itself was incidentally fixed by the derived-column expansion work above; the tests pin the fix so it cannot silently regress.

`meta` field surfaced in `inspect_model`

The meta field on ModelMeasure and Aggregation (added in 0.4.2) round-tripped through storage but never appeared in inspect_model output, so callers concluded the value had been dropped. Markdown columns / measures / aggregations tables now include a meta cell (compact JSON, pruned when no entity in the section uses it); the model header gets a **meta:** bullet; the JSON form emits meta unconditionally on every entity.

Unit suite ~30x faster

_retry_with_backoff was retrying every OperationalError, including deterministic ones like no such table, syntax errors, and permission denied -- adding a 3-second backoff to every such failure. A new _is_transient_db_error filter narrows retries to DisconnectionError plus a known-transient DBAPI signal allowlist (database is locked, deadlock, lost connection, broken pipe, server closed, connection refused/reset). Combined with a deferred dbt.cli.main import (saving ~4s of CLI startup) and session-scope MCP and FastAPI test fixtures, the unit suite drops from ~243s to ~8.3s. --cov is also dropped from the default pytest addopts so coverage is now opt-in via pytest --cov=slayer --cov-report=term-missing.

Integration suite ~85s faster

Module-scope Postgres and DuckDB fixtures plus a cached Jaffle Shop DuckDB shared across tests cut the integration-suite wall time without weakening per-test isolation guarantees.

View release on GitHub

v0.4.2 New feature 2mo

Notable features

Added meta field to ModelMeasure and Aggregation

Full changelog

What's Changed

Add meta field to ModelMeasure and Aggregation by @ZmeiGorynych in https://github.com/MotleyAI/slayer/pull/83

Full Changelog: https://github.com/MotleyAI/slayer/compare/v0.4.1...v0.4.2

View release on GitHub

v0.4.1 New feature 2mo

Security fixes

Validates `percentile(p=…)` argument as numeric in [0,1] to prevent injection attacks such as `pg_sleep(10)`

Notable features

Seven statistical aggregations (stddev_samp, var_pop, corr, covar_samp, etc.) exposed via colon syntax
Scalar math UDFs (ln, log10, log(b,x), exp, sqrt, pow/power) added to SQLite

Full changelog

SLayer 0.4.1

Bug-fix-and-capability release that brings SQLite to math/stat parity
with Postgres, DuckDB, MySQL, and ClickHouse, and exposes seven new
statistical aggregations as built-in colon-syntax aggregates. Motivated
by DEV-1317: a
BIRD-Interact KB-translation effort that hit ~30–40 formulas which were
unencodable on bare SQLite for want of basic math/stat operators.

Backwards-compatible. No schema migrations.

What's new

Scalar math UDFs on SQLite

Use these inside Column.sql, ModelMeasure.formula, or
Aggregation.formula. NULL inputs return NULL (matching cross-dialect
SQL semantics). Math-domain errors propagate as
sqlite3.OperationalError — matching Postgres's strict semantics, not
SQLite ≥3.35's silent-NULL log() built-in.

| Name | Args | Notes |
|---|---|---|
| ln(x) | 1 | Natural logarithm |
| log10(x) | 1 | Base-10 logarithm |
| log(B, X) | 2 | Returns log_B(X). Base first, value second — matches Postgres LOG(b, x), sqlglot's emission, and the SQLite ≥3.35 built-in. The UDF registers unconditionally and overrides any built-in so the strict-error policy is uniform across SQLite versions. |
| exp(x) | 1 | e^x |
| sqrt(x) | 1 | Square root |
| pow(x, n) / power(x, n) | 2 | Both spellings registered. Uses math.pow internally — rejects negative-base + non-integer exponent (Python's ** would silently return a complex), and bounds huge exponents to IEEE-754 (Python's ** would build unbounded big-ints). |

Statistical aggregations (colon syntax)

Available as built-in aggregations on every numeric column:

{"formula": "latency:stddev_samp"}
{"formula": "latency:var_pop"}
{"formula": "price:corr(other=quantity)"}
{"formula": "price:covar_samp(other=quantity)"}

| Aggregation | Postgres-matching edge cases |
|---|---|
| stddev_samp / var_samp / covar_samp | NULL when N ≤ 1 |
| stddev_pop / var_pop / covar_pop | NULL at N = 0; 0 at N = 1 |
| corr | NULL when fewer than 2 non-null pairs OR either side has zero variance |

The two-column aggregations (corr, covar_samp, covar_pop) take
the second column as a named other= parameter — same shape as
weighted_avg(weight=…). NULL pairs (either x or y NULL) are dropped
entirely from the calculation.

Memory profile: implemented with Welford online accumulators —
O(1) memory regardless of group size, and numerically more stable than
the naive two-pass formula. Safe for SQLite analytics workloads with
millions of rows per group.

Cross-dialect support

| Dialect | Stat aggregates | Math scalars |
|---|---|---|
| SQLite | All seven via Python UDFs (slayer/sql/sqlite_udfs.py); var_samp/var_pop aliased to variance/variance_pop to handle sqlglot's transpilation rewrite. | All seven UDFs above. |
| Postgres / DuckDB | All native: STDDEV_* / VAR_* / CORR / COVAR_*. | All native. |
| ClickHouse | All native (sqlglot may emit camelCase, e.g. varSamp). | All native. |
| MySQL | STDDEV_SAMP / STDDEV_POP / VAR_SAMP / VAR_POP are native. corr / covar_samp / covar_pop raise NotImplementedError at SQL generation time (no native function, no Python-UDF mechanism). Use MariaDB or compute client-side. | All native. |

Internal improvements

These don't change behaviour but are worth knowing about:

Unified aggregate builders. The three dialect-aware builders
(_build_percentile, _build_formula_agg, _build_stat_agg in
slayer/sql/generator.py) now share two helpers — _wrap_filter and
SQLGenerator._resolve_agg_param — and emit fully model-qualified
column references via _resolve_sql. Cognitive complexity of
_build_stat_agg dropped from 22 → ~6, addressing a Sonar
python:S3776 finding.
percentile(p=…) validation. p is now validated as a numeric
literal in [0, 1] after parameter resolution. Closes a security gap
where a malicious model-level default like p=pg_sleep(10) could
bypass the SQL-injection regex check (Codex review finding).

Verify your work

# Full unit suite
poetry run pytest -m "not integration"

# SQLite + DuckDB integration (no Docker needed)
poetry run pytest tests/integration/test_integration.py -m integration
poetry run pytest tests/integration/test_integration_duckdb.py -m integration

# Lint
poetry run ruff check slayer/ tests/ examples/

If you run the Docker-backed examples, the new aggregations are smoke-
tested against live ClickHouse and MySQL via
examples/{clickhouse,mysql}/verify.py (with check_stddev_var() and,
on ClickHouse, check_corr_covar() — MySQL skips the latter since
corr/covar_* aren't supported there).

Documentation

CLAUDE.md — Key Conventions + Aggregation caveats sections updated.
docs/concepts/{formulas,queries,models,terminology}.md — new aggregations + scalar functions documented end-to-end, with the log(B, X) arg-order pin called out explicitly.
docs/database-support.md — per-dialect support matrix + SQLite UDF caveats.
docs/examples/07_aggregations/aggregations.md — extended quick-reference table.
slayer/help/topics/03_aggregations.md + .claude/skills/slayer-{query,models}.md — agent-facing summaries updated.

Known limitations & follow-ups

Filed as sibling Linear issues for future releases:

DEV-1320 — SQLite string/regex UDFs (regexp, regexp_replace, split_part, …). Not blocking any known KB pattern but rounds out parity.
DEV-1321 — ClickHouse stat aggregates may return NaN (not SQL NULL) for single-row / empty / zero-variance groups, contradicting the documented Postgres-style semantics. Fix needs IF(isNaN(x), NULL, x) wrapping plus ClickHouse integration coverage.
DEV-1322 — Type-check kwarg-as-column args for weighted_avg(weight=…) / corr(other=…) / covar_*(other=…). Today amount:corr(other=string_column) slips past enrichment and fails at the database with a dialect-specific runtime error rather than a clear "not numeric" message.

Acknowledgements

CodeRabbit, SonarCloud, and Codex (via the codex-review skill) all
contributed review feedback that shaped the final shape of this
release.

View release on GitHub

v0.4.0 Breaking risk 2mo

⚠ Upgrade required

Run any existing model through `engine.save_model()` once to trigger the v1→v2→v3 migration; stored YAML/SQLite will be upgraded automatically.
Update Python code constructing `SlayerQuery` objects: replace `fields=` with `measures=` and remove `dry_run`/`explain` from query bodies.
When using CLI, prefer `--storage` over the deprecated `--models-dir`; default storage locations have changed per OS.

Breaking changes

`SlayerQuery.fields` → `SlayerQuery.measures` (all interfaces)
`SlayerModel.dimensions` + `SlayerModel.measures` → unified `columns`; old `Dimension`/`Measure` classes removed
`dry_run` and `explain` are no longer fields on `SlayerQuery`; pass them as kwargs to `engine.execute()`

Notable features

Query-backed models as a first-class source mode (`source_queries`, cached on save)
Windowed aggregations (`sum(window='30d')`, `avg(window='1y2m')`) and `consecutive_periods` transform
`median` and `percentile` support across Postgres, DuckDB, SQLite, ClickHouse

Full changelog

SLayer 0.4.0 — Schema overhaul, query-backed models, windowed aggregations

The 0.4.0 release lands the v1 → v2 → v3 schema migration: a unified column model, query-backed models as a first-class source mode, windowed sum/avg, cross-dialect median/percentile, and a richer MCP surface. The wire format changes are backward-compatible on read (a migration chain auto-upgrades v1/v2 YAML and SQLite stores at load time), but the Python types and JSON shapes have moved — see Breaking changes below.

Every change in this release is covered by the unit suite (1264 tests) and integration suites against SQLite, Postgres, and DuckDB.

Highlights

Unified columns and query-time aggregation (v2 schema)

SlayerModel no longer has separate dimensions and measures lists. Both have been merged into a single columns: List[Column]. Each column carries a data type and can be used either as a group-by dimension or as the input to an aggregation — the role is decided per query. SlayerModel.measures is repurposed to hold a library of named formulas (ModelMeasure) — saved metrics that queries can reference by bare name.

Aggregations are picked at query time using colon syntax: revenue:sum, *:count, price:weighted_avg(weight=quantity). Function-style calls (sum(revenue)) auto-correct to the colon form with a deprecation warning.

# v2/v3 model
columns:
  - {name: id, sql: id, type: number, primary_key: true}
  - {name: status, sql: status, type: string}
  - {name: revenue, sql: amount, type: number}
  - {name: quantity, sql: qty, type: number}

measures:
  - {name: aov, formula: "revenue:sum / *:count", label: "Average Order Value"}

// v2/v3 query
{
  "source_model": "orders",
  "measures": ["revenue:sum", {"formula": "aov"}],
  "dimensions": ["status"]
}

SlayerQuery.fields is now SlayerQuery.measures, and model is now source_model.

Query-backed models — first-class source mode

A SlayerModel can now be backed by one or more saved SlayerQuery stages:

await engine.create_model_from_query(
    query={"source_model": "orders", "measures": ["amount:sum"], "dimensions": ["region"]},
    name="monthly_revenue",
    variables={"region": "US"},  # default placeholder values
)

# Run by name
result = await engine.execute("monthly_revenue", variables={"region": "EU"})

source_queries joins sql_table and sql as the third source mode (mutually exclusive). query_variables provides defaults for {var} placeholders inside the saved stages, and a save-time dry-run populates the columns + backing_query_sql cache. The cache refreshes only on save paths — engine.execute never writes back to storage (closes a lost-update race against parallel edit_model calls).

Surfaced across every interface: REST POST /query accepts {"name": "...", "variables": {...}}; MCP query tool accepts a model-name shortcut; CLI slayer query <model_name> --variables k=v.

Schema versioning + read-time migrations

Every persisted entity (SlayerModel, SlayerQuery, DatasourceConfig) now carries a version: int field. Older versions are upgraded transparently at load time via a converter chain:

v1 → v2: merges dimensions + measures into columns; renames SlayerQuery.fields → measures; converts the deprecated type: sum/avg alias on legacy measures into allowed_aggregations.
v2 → v3: strips dry_run/explain from SlayerQuery (they're now engine.execute() kwargs); walks SlayerModel.source_queries recursively.

The migration hook lives on the Pydantic class itself (@model_validator(mode="before")), so every storage backend — YAMLStorage, SQLiteStorage, third-party backends, plus the MCP/REST/dbt entry points — picks up migrations automatically with no backend-side changes. See docs/concepts/models.md#schema-versioning.

Windowed aggregations + `consecutive_periods` transform

sum and avg accept an optional window parameter for trailing-time aggregations:

{
  "measures": [
    {"formula": "revenue:sum(window='30d')", "name": "revenue_30d"},
    {"formula": "revenue:avg(window='1y2m')", "name": "avg_14m"}
  ],
  "time_dimensions": [{"dimension": "created_at", "granularity": "month"}]
}

Compact duration grammar: y, m, w, d, h, min, s — combinable as 1y2m3w5d6h7min8s. The window is computed over raw source rows ending at each output bucket, so it can be larger, equal to, or smaller than the query's time granularity.

consecutive_periods(predicate) is a new time-ordered transform returning the trailing run length where the predicate is true at the query grain — usable in selected measures, filters, and comparisons against integer literals.

Time-ordered transforms (cumsum, lag, lead, consecutive_periods) now partition by the query's non-time dimensions, so running calculations don't bleed across groups.

`median` and `percentile` across tier-1 databases

revenue:median and revenue:percentile(p=0.95) now work on Postgres, DuckDB, SQLite, and ClickHouse:

Postgres: native PERCENTILE_CONT(p) WITHIN GROUP (ORDER BY x).
DuckDB: sqlglot rewrites to QUANTILE_CONT(x, p ORDER BY x).
SQLite: Python aggregate UDFs registered on every connection.
ClickHouse: native median(x) and parametric quantile(p)(x).
MySQL: not supported — raises NotImplementedError at SQL generation time. Use MariaDB or compute client-side.

MCP `inspect_model` section budgeting

inspect_model is the heaviest MCP tool an agent uses on each new model. It now accepts three params so callers can drill in without paying for the full envelope every call:

sections: subset of ["columns", "measures", "aggregations", "joins", "reachable_fields", "samples"]. Default = all six. Omitted sections collapse to a one-line backticked CSV of names (or are dropped entirely for reachable_fields / samples).
descriptions_max_chars: trim long descriptions with a literal ... [truncated] marker.
reachable_fields_depth: configurable BFS depth for joined-model field expansion (default 5, range [0, 20]).

When any section is trimmed, a quoted-Markdown footer at the end of the response shows what was returned, what was names-only, and what was omitted, with a hint on how to re-fetch. JSON output mirrors the same structure with <section>_names siblings and top-level omitted_sections / names_only_sections / unknown_sections arrays.

Default behaviour is unchanged: inspect_model(model_name="orders") produces the same payload as before.

One-line setup commands

slayer serve --demo and slayer mcp --demo spin up the bundled Jaffle Shop DuckDB datasource and ingest its models on startup. Idempotent. Requires the duckdb extra and jafgen.
slayer datasources create postgresql://user:${DB_PASSWORD}@host/db --ingest creates a datasource and ingests it in one command. Connection strings are parsed; secrets can be supplied via --password-stdin.

Auto-correct for function-style aggregations

SUM(amount), count(*), etc. inside a formula now auto-correct to the colon-aggregation form (amount:sum, *:count) with a one-shot warning. The colon syntax is canonical, but agents that emit SQL-flavoured aggregation calls no longer error out.

`first()` transform

Adds first(x) as the FIRST_VALUE-window mirror of the existing last(x) transform — broadcasts the earliest time bucket's value across all rows. Useful for "% growth from baseline" queries.

Breaking changes

The wire format is backward-compatible on read — old YAML and SQLite stores upgrade transparently via the v1 → v2 → v3 migration chain. But the Python types and JSON shapes have moved:

SlayerQuery.fields → SlayerQuery.measures. All four interfaces (REST body, MCP tool, CLI JSON, Python kwarg) now use measures. The Field class no longer exists; use {"formula": "..."} dicts or pass strings.
SlayerQuery.model → SlayerQuery.source_model in JSON bodies. The Python kwarg has always been source_model.
SlayerModel.dimensions + SlayerModel.measures → SlayerModel.columns. The old Dimension and Measure row-level classes are gone; a unified Column replaces both. SlayerModel.measures still exists but now holds named formulas (ModelMeasure), not row-level SQL.
SlayerQuery v3 sets extra="forbid". Direct Python construction with an unknown field (SlayerQuery(typo=...)) now raises ValidationError. Persisted v1/v2 queries with dry_run/explain are auto-migrated with a one-shot DeprecationWarning.
dry_run and explain are no longer fields on SlayerQuery. Pass them as kwargs to engine.execute(query, dry_run=..., explain=...). The MCP query tool, CLI --dry-run/--explain flags, and REST POST /query body still accept them at the surface layer; the routing pops them before constructing the SlayerQuery.
Query-backed model caches refresh only on save paths (engine.save_model, create_model_from_query(save=True)). Direct storage.save_model() calls bypass the cache refresh — go through the engine. engine.execute() no longer writes to storage as a side effect.
CLI: --models-dir is deprecated (still works); use --storage. Default storage path is now platform-appropriate (~/.local/share/slayer on Linux, ~/Library/Application Support/slayer on macOS, %LOCALAPPDATA%\slayer on Windows) instead of ./slayer_data. Override with $SLAYER_STORAGE.

Migration

For most users, no action is needed: load your v1/v2 store with 0.4.0, save any model once, and on-disk YAML/SQLite is upgraded to v3.

A few callers will need explicit fixes:

Anything constructing SlayerQuery in Python code: rename fields= → measures=. The old model= kwarg was already aliased to source_model= in v0.3 but is now removed; use source_model= directly.
Anything reading model.dimensions or model.measures (where measures was the row-level list): switch to model.columns. If you specifically want the named-formula library, that's still model.measures but contains ModelMeasure objects ({formula, name, label, description}), not the old Measure shape.
Any code passing dry_run=True inside a query body: move it to the engine.execute() kwarg.

What's new (full list)

Schema and migrations

#57 — Add version: int to every persisted entity + read-time migration chain in slayer/storage/migrations.py. Foundational; functionally a no-op until the v2 cut.
#61 (S3) — v2 schema: merge dimensions + measures into columns; repurpose measures as ModelMeasure formulas; rename SlayerQuery.fields → measures; v1 → v2 converter in slayer/storage/v2_migration.py.
#63 (S3a) — Bare-name ModelMeasure resolution ({"formula": "aov"} expands to the saved formula); 8 review follow-ups.
#66 (S4) — Auto-ingestion and dbt converter aligned with v2 schema (one Column per non-joined column, *:count always available, no auto-generated measures).
#77 — v3 schema: drop dry_run/explain from SlayerQuery (now engine kwargs); add extra="forbid"; v2 → v3 converter in slayer/storage/v3_migration.py.

New features

#67 (S2a) — Query-backed models as a first-class source mode (SlayerModel.source_queries, query_variables, engine.create_model_from_query, engine.execute("name") shorthand). Surfaced through MCP, REST, and CLI.
#64 (S5) — Windowed sum/avg via (window="..."); new consecutive_periods(predicate) transform; partition time-ordered transforms by non-time dimensions.
#60 (S7) — median and percentile work across Postgres, DuckDB, SQLite, ClickHouse. SQLite uses Python aggregate UDFs registered per-connection.
#70 (S8) — inspect_model section budgeting (sections, descriptions_max_chars, reachable_fields_depth).
#43 — One-line slayer datasources create <connection-string> [--ingest]. Supports --password-stdin for secure credentials.
#44 — slayer serve --demo / slayer mcp --demo boots the bundled Jaffle Shop DuckDB.
#45 — Auto-correct function-style aggregations (sum(x) → x:sum); add first() transform; convert remaining dataclasses to Pydantic.

Bug fixes

#46 — Skip information_schema PK fallback for SQLite (was triggering a silent error on auto-ingestion).
#56 — Log the underlying DBAPI exception + SQL excerpt on transient DB retries.
#75 — Fix ClickHouse type ingestion mapping numerics to STRING.
#76 — sqlite:///:memory: now works across multiple async calls on a single SlayerSQLClient (per-instance StaticPool engine with check_same_thread=False). Two separate clients on :memory: are still isolated; use mode=memory&cache=shared to share.
#79 — Read paths (engine.execute, inspect_model, get_column_types, MCP query, REST /query) no longer write to storage. Eliminates a lost-update race against parallel edit_model calls.

Internal

#78 — Windowed-CTE and consecutive_periods builders refactored from string concatenation to sqlglot AST; pure refactor, no behaviour change.
#80 (S9) — Docs, skills, help topics, READMEs, runnable examples, and the slayer-query skill all aligned with v2/v3 schema. Also corrected a long-standing spec drift: *:count produces result key orders._count, not orders.count.

Links

View release on GitHub

v0.3.1 Bug fix 3mo

Notable features

Query‑time multi‑hop join resolution eliminates stale metadata storage
INNER join support with automatic mirroring for bidirectional relationships
Per‑measure CTE isolation enabling cross‑model filtered measures and arithmetic formulas

Full changelog

Join overhaul: query-time multi-hop resolution, INNER join support, per-measure CTE isolation

Motivation

The join system had fundamental limitations exposed by the ACME insurance benchmark (gpt-5.3-codex scored 1/11 correct). Baked-in multi-hop joins at ingestion created stale metadata when the graph changed. All joins were LEFT, causing fan-out on entity relationships that should be INNER (50-118x inflated sums on benchmark queries). Cross-model filtered measures sharing a single base query caused JOIN intersections that zeroed out results. This PR rewrites the join and SQL generation architecture to fix these issues.

What changed

1. Query-time multi-hop join resolution

Ingestion now emits only direct joins (one ModelJoin per FK). Multi-hop paths like orders.customers.regions are resolved at query time by walking each intermediate model's own joins. This eliminates redundant storage and keeps join metadata consistent when the graph changes.

Files: ingestion.py, enrichment.py (_collect_needed_paths, _resolve_joins)

2. LEFT/INNER join support

New JoinType enum (LEFT, INNER). dbt entity-based joins (both foreign->primary and peer) now use INNER, matching MetricFlow behavior. Inner joins are automatically mirrored: if A->B is INNER, B->A is created too.

Join synchronization is centralized in a new JoinSyncStorage decorator that wraps every storage backend, handling mirroring on save, orphan cleanup on delete, and one-time reconciliation of hand-edited YAML.

Files: enums.py, models.py, entities.py, converter.py, join_sync.py (new), base.py

3. Per-measure CTE isolation

Major SQL generation refactor. The generator now follows a multi-stage pattern:

Stage 1: Base CTE (simple measures + dims) + per-measure CTEs for cross-model and cross-model-filtered measures
Stage 2: Combined CTE (LEFT JOIN all CTEs on shared dims), expressions/transforms stacked on top

This fixes two classes of bugs: filtered measures needing different INNER JOINs no longer intersect in a single base query, and cross-model measures in arithmetic formulas (previously "not yet supported") now work.

CTE builders were also converted from f-string SQL to sqlglot AST building for dialect safety.

Files: generator.py (major rewrite), enrichment.py

4. Re-rooted subqueries for cross-model measures

Cross-model measure CTEs previously only joined source -> target, so filters on tables reachable from the target's join graph were silently dropped. Now each cross-model measure gets a re-rooted EnrichedQuery where the target model is the FROM table, giving it access to its full join graph for correct filter application. Unreachable dimensions/filters are dropped gracefully (scalar CROSS JOIN).

Files: query_engine.py (_build_rerooted_enriched), enriched.py, generator.py

5. dbt entity system fixes

Peer joins: Models sharing a primary entity (e.g., claim and claim_coverage both declaring claim_identifier) now get bidirectional joins, enabling paths like claim.claim_coverage.policy
Foreign entity fan-out: Resolution now iterates ALL matching primary models, not just the first
Metric filter qualification: Filters referencing dimensions on peer models are now qualified correctly (e.g., loss_payment.has_loss_payment)
Deterministic resolution: get_primary_model and peer dimension selection use lexicographic ordering instead of depending on registration order
Peer join dedup: Uses (model, local_expr, peer_expr) signature instead of model-name-only

Files: entities.py, filters.py, converter.py

6. LLM-friendliness improvements

Auto-promote fields to dimensions: When an LLM puts a bare dimension name in fields, it's automatically moved to dimensions (with validation: no aggregation, resolves as dimension, not a measure)
Source model prefix stripping: New SlayerQuery.strip_source_model_prefix() centralizes the removal of redundant prefixes like orders.revenue:sum when source_model=orders, handling all reference types (dims, time dims, fields, filters, order, main_time_dimension)
Dimension aggregation: claim_identifier:count_distinct now works; dimensions are looked up when no measure matches, with type-safety checks (rejects sum/avg on strings)
Cross-model dimension aggregation: Formulas like policies.id:count_distinct work via a synthetic Measure wrapper
Constant-SQL filter fix: Dimensions with literal SQL (e.g., sql="1") no longer produce broken premium.1 qualifications
inspect_model show_sql: Now controls ALL SQL visibility (model SQL, filters, dim/measure SQL columns), reducing noise in LLM context

Files: query_engine.py, query.py, enrichment.py, generator.py, server.py

7. Runtime column type inference and measure profiling

New get_column_types() probes measure types via the engine pipeline (properly joined SQL), supporting DuckDB, Postgres, MySQL, and SQLite with dialect-specific OID maps
inspect_model now shows type (number/string/time/boolean) and value range (min..max or "all NULL") columns for measures, helping LLMs avoid errors like SUM(VARCHAR)

Files: client.py, query_engine.py, server.py

8. Time shift rework

time_shift now shifts the time dimension column via SQL INTERVAL instead of computing shifted date range strings in Python. Eliminates _shift_date, _build_row_number_join, and the row-number vs calendar join branching
change(x) and change_pct(x) are desugared at enrichment time into a hidden time_shift + arithmetic expression, simplifying the generator

Files: enrichment.py, generator.py, enriched.py

9. Additional bug fixes

ORDER BY with custom field names: field_name_aliases map consulted by the ORDER BY resolver
ORDER BY in computed/transform path: Uses _resolve_order_column() instead of manual f-string formatting
Nested WITH clause: _build_combined() returns structured CTE tuples instead of assembled SQL
CTE name collisions: Same measure with different aggregations (e.g., revenue:sum + revenue:avg) gets distinct CTE names
CTE WHERE leaking: Filters narrowed to only conditions whose table aliases are available in each CTE
Inline-SQL model joins: JOIN injection moved from string replacement to sqlglot AST level
FK dedup: Uses (target, src_col, tgt_col) signature, preserving multiple FKs to the same table
Self-referencing join mirroring: Skipped to prevent duplicates
First/last isolated measures: Get their own ranked subquery within their CTE instead of referencing non-existent _last_rn columns from the base

Test coverage

~3,400 new test lines across 12 test files:

test_sql_generator.py: ~2,000 new lines covering CTE isolation, re-rooted subqueries, cross-model dimension aggregation, ORDER BY resolution, first/last isolation, SQL validity checks (_assert_valid_sql validates every generated query with sqlglot)
test_join_sync.py (new): 321 lines: mirroring, save/delete sync, reconciliation
test_models.py (new): 303 lines: strip_source_model_prefix exhaustive coverage
test_dbt_converter.py: 326 new lines: peer joins, foreign entity, inner join mirroring
test_dbt_filters.py: 100 new lines: peer dimension qualification, deterministic selection
test_sql_client.py (new): 86 lines: type code mapping across drivers
test_mcp_server.py: 154 new lines: show_sql gating, measure types, inner join mirroring
Integration tests: expanded with cross-model filter scenarios

Documentation

Updated across ingestion, joins, formulas, transforms, queries, terminology, and help topics to reflect direct FK joins at ingestion, query-time traversal, INTERVAL-based time shifting, and dimension aggregation.

View release on GitHub

v0.3.0 Breaking risk 3mo

⚠ Upgrade required

`SlayerQueryEngine.execute()` is now `async def`; use `execute_sync()` for synchronous contexts (CLI, notebooks, scripts)
`slayer serve` and `slayer mcp` default to platform‑specific storage paths (`~/.local/share/slayer`, `~/Library/Application Support/slayer`, `%LOCALAPPDATA%\slayer`) when no `--storage` flag or `$SLAYER_STORAGE` is set
dbt import requires installing the `dbt` extra: `pip install motley-slayer[dbt]`

Breaking changes

`datasource_summary` removed; replaced by `models_summary`
`inspect_model` output format changed to complete single-model view with sampled values, measure metadata, join fields, and auto‑pruned markdown tables

Notable features

New async‑first engine, storage backends, and SQL client using native async drivers (`asyncpg`, `aiomysql`) and thread fallbacks
Filter placeholders `{variable}` supporting reusable filter templates with substitution from a query's `variables` dict
`models`, `dimensions`, and `measures` now support an optional `meta: Dict[str, Any]` field for arbitrary JSON metadata

Full changelog

SLayer 0.3 Release Notes

Redesigned MCP capabilities

The MCP server has been substantially reworked. datasource_summary is replaced by models_summary — a per-datasource brief overview — and inspect_model now returns a complete single-model view with sampled dimension values, measure metadata, reachable join fields, and auto-pruned markdown tables. A new help tool provides contextual explanations of SLayer concepts (queries, transforms, joins, etc.) directly inside the agent conversation. All MCP tools are now fully async, and query output supports format parameter ("markdown", "json", "csv").

Async-first architecture

The engine, storage backends, and SQL client are now async-first. SlayerQueryEngine.execute() is async def; use execute_sync() for CLI, notebooks, and scripts. Postgres and MySQL queries use native async drivers (asyncpg, aiomysql) with pooled connections. SQLite and DuckDB fall back to asyncio.to_thread. Storage backends (YAMLStorage, SQLiteStorage) are async, ready for true async implementations like Postgres-backed multi-tenant storage. All tests use pytest-asyncio with asyncio_mode = "auto".

Variables in filters

Filters now support {variable} placeholders, substituted from the query's variables dict. This keeps filter templates reusable across different parameter values without string concatenation: "status = '{status}' AND amount > {min}" with variables={"status": "completed", "min": 100}. Variable names must be alphanumeric, values must be strings or numbers (inserted as-is). {{/}} produce literal braces. Undefined variables raise clear errors.

Meta fields on models, measures, and dimensions

Models, dimensions, and measures now have an optional meta: Dict[str, Any] field for arbitrary user-defined JSON metadata (e.g., {"owner": "analytics", "pii": true}). Persisted in YAML and SQLite storage, editable via MCP edit_model (pass null to clear), HTTP API, and CLI. The MCP sentinel pattern distinguishes "not provided" from "explicitly clear."

Default storage path

slayer serve and slayer mcp now work with zero configuration. When no --storage flag or $SLAYER_STORAGE is set, SLayer uses a platform-appropriate default: ~/.local/share/slayer on Linux (XDG-compliant), ~/Library/Application Support/slayer on macOS, %LOCALAPPDATA%\slayer on Windows.

dbt Semantic Layer import

SLayer can now import models from a dbt project via slayer import-dbt. Semantic models and metrics from dbt_project.yml are converted into SLayer models with dimensions, measures, and joins. The --include-hidden-models flag additionally imports every regular dbt model not wrapped by a semantic_model as a hidden SLayer model (queryable by name but excluded from discovery endpoints). Requires the dbt extra: pip install motley-slayer[dbt].

View release on GitHub

v0.2.0 Breaking risk 3mo

⚠ Upgrade required

Models must validate `allowed_aggregations` lists at creation time; existing models without this field will be rejected until updated.
Queries previously embedding aggregation functions directly in measure definitions need to be rewritten using the new colon syntax.
Filters relying on legacy operator syntax should be migrated to the SQL‑style operators (`=`, `<>`, `IN`, `IS NULL`).

Breaking changes

Measures must be defined as pure row‑level expressions (`{name, sql}`) without aggregation; aggregation is now specified at query time using colon syntax (e.g., `"revenue:sum").
Query dimensions, order, and fields can accept plain strings instead of object wrappers.
Filters use SQL‑style operators (`=`, `<>`, `IN`, `IS NULL`) and support multi‑hop notation.

Notable features

Cross‑model measures with dot syntax (e.g., `"customers.score:avg"`), supporting multi‑hop aggregation.
Multistage queries allowing a query to serve as the source for another query or be saved as a permanent model via `create_model_from_query`.
Dynamic joins with automatic resolution of diamond inheritance and path‑based aliases for disambiguation.

Full changelog

SLayer 0.2 Release Notes

Aggregation separated from measures

Measures are now pure row-level expressions ({name: "revenue", sql: "amount"}); aggregation is chosen at query time with colon syntax ("revenue:sum", "*:count", "price:weighted_avg(weight=quantity)"). Built-in aggregations: sum, avg, min, max, count, count_distinct, first, last, weighted_avg, median, percentile. Custom aggregations with SQL templates can be defined at the model level. Per-measure allowed_aggregations whitelists are validated at model creation and query time.

Cross-model measures

Query measures from joined models with dot syntax and colon aggregation ("customers.*:count", "customers.score:avg", multi-hop: "customers.regions.population:sum"). Sub-query isolation prevents JOIN row multiplication. Transforms compose on cross-model measures ("cumsum(customers.score:avg)").

Multistage queries

Use a query as the source for another query, or save any query as a permanent model with create_model_from_query. ModelExtension extends models inline with extra dimensions, measures, or joins at query time.

Dynamic joins with diamond support

Joins are auto-resolved at query time by walking the join graph. Path-based aliases (customers__regions vs warehouses__regions) disambiguate when the same table is reachable via multiple FK paths.

Model filters

Always-applied WHERE conditions on models (e.g., "deleted_at IS NULL").

DuckDB support

New Tier 1 database, fully integration-tested, no Docker required.

Query introspection

dry_run previews generated SQL without executing; explain runs dialect-aware EXPLAIN (Postgres, SQLite, MySQL, DuckDB, ClickHouse, Snowflake, SQL Server, Oracle each use the correct syntax).

Simpler query syntax

Dimensions, time dimensions, and order accept plain strings ("status" instead of {"name": "status"}). Fields accept strings with colon syntax ("revenue:sum" instead of {"formula": "revenue_sum"}).

SQL-style filters

Operators =, <>, IN, IS NULL; multi-hop filters ("customers.regions.name = 'US'"); computed-column filters ("change(revenue:sum) > 0").

Extended CLI

Full CRUD for datasources: slayer datasources create, create-inline, delete, test (connectivity check)
slayer ingest now supports --include and --exclude table filters
--storage flag on all commands: pass a directory for YAML or a .db file for SQLite storage
--password-stdin for secure credential input
Rich help text with usage examples on every subcommand (slayer query --help, etc.)

uvx / uv support

SLayer now works with uvx --from motley-slayer slayer serve and uv tool install motley-slayer out of the box. PEP 621 [project] metadata added to pyproject.toml.

Documentation overhaul

Getting Started split into four audience-specific guides: MCP (AI agents), CLI (terminal), REST API (any language), Python SDK
Interfaces → Reference — interface docs repositioned as reference, getting-started pages are task-oriented
Each getting-started page has a "Verify it works" section
Pre-commit hooks with ruff (lint + format check)
CONTRIBUTING.md with dev setup, test matrix, and extension guides

Tutorials

Seven new tutorial chapters with Jupyter notebooks covering dynamic models, SQL vs DSL, auto-ingestion, time operations, joins, cross-model measures, multistage queries, and aggregations — all using the Jaffle Shop dataset on DuckDB.

View release on GitHub

v0.1.0 Maintenance 3mo

Notable features

Agent-Native Interfaces: full MCP support via stdio, HTTP SSE, REST and CLI
Nested Formula API for complex derived metrics (e.g., change(cumsum(revenue)))
Time‑Aware Analytics with precise gap handling using CTEs

Full changelog

SLayer v0.1.0 – First Stable Release 🚀

SLayer is a lightweight, open-source semantic layer designed to give AI agents (and humans) a PhD in your data. It represents a shift toward "intent-based" data access – maximizing the reliability and minimizing SQL boilerplate while constructing complex real-world business reports.

Core Architecture

Agent-Native Interfaces: Full support for the Model Context Protocol (MCP) via stdio and HTTP SSE, enabling seamless integration with AI environments like Claude Code and Cursor, as well as REST and CLI interfaces
Flexible Deployment: Operates as a standalone service or an embeddable & extendable Python library. Its pluggable StorageBackend allows for dynamic models updates without redeploys.
Dialect Agnostic: Leverages sqlglot to provide consistent behavior across 10+ databases.

Advanced Query Engine

Nested Formula API: Support for complex, derived metrics via arithmetic and nested transforms at query time (e.g., change(cumsum(revenue))), abstracting away the complexity of window functions.
- Time-Aware Analytics: A robust implementation of time_shift, change, and change_pct transforms, and last as aggregation method. SLayer utilizes self-join CTEs to ensure precise gap handling and eliminate spurious NULLs during period-over-period comparisons.
Smart Filter Routing: Logic-aware filtering that automatically extracts inline expressions and routes them to the appropriate WHERE or HAVING clauses based on aggregation context.

Automation & Modeling

Auto-Ingestion: Intelligent schema introspection that auto-generates denormalized models and FK-based rollup joins, significantly reducing the time to first query.
Flexible Storage & Auditability: Models and datasources are managed via a version-control friendly YAML layer, a single-file SQLite backend for embedded use, or any custom implementation via the pluggable StorageBackend protocol.

View release on GitHub

All releases

SLayer 0.5.1 Release Notes

BM25 ranking for memory recall (DEV-1365, BREAKING)

Two-mode reference semantics (DEV-1369, BREAKING)

Filters referencing unjoined models now raise at enrichment (DEV-1367)

MCP query tool accepts inline source_model (DEV-1372)

SQL-mode predicate routing and string-hygiene allowlist (DEV-1378)

Filter-parser fixes for path-qualified LIKE and subquery-in-filter (DEV-1376)

Schema versions

SLayer 0.5.0 Release Notes

Schema-drift detection and idempotent re-ingestion (DEV-1356)

Agent memory layer: learnings and saved queries (DEV-1357)

Type-aware CAST emission and DataType realignment (DEV-1361, BREAKING)

Ranking transform family (DEV-1353)

Demo: upstream jafgen is now a core dependency

Schema versions

SLayer 0.4.3 Release Notes

Datasource-scoped model storage

Recursive expansion of derived columns

Auto-join discovery for filter-only cross-table refs

Window functions in filters

Multi-stage query joins form a DAG

Honor user-supplied measure names on inner stages

SQLite json_extract() semantics fix

Preserve log10(x) / log2(x) literal form

__aggN__ placeholder leak fix

Multi-hop derived-column qualification regression tests

meta field surfaced in inspect_model

Unit suite ~30x faster

Integration suite ~85s faster

What's Changed

SLayer 0.4.1

What's new

Scalar math UDFs on SQLite

Statistical aggregations (colon syntax)

Cross-dialect support

Internal improvements

Verify your work

Documentation

Known limitations & follow-ups

Acknowledgements

SLayer 0.4.0 — Schema overhaul, query-backed models, windowed aggregations

Highlights

Unified columns and query-time aggregation (v2 schema)

Query-backed models — first-class source mode

Schema versioning + read-time migrations

Windowed aggregations + consecutive_periods transform

median and percentile across tier-1 databases

MCP inspect_model section budgeting

One-line setup commands

Auto-correct for function-style aggregations

first() transform

Breaking changes

Migration

What's new (full list)

Schema and migrations

New features

Bug fixes

Internal

Links

Join overhaul: query-time multi-hop resolution, INNER join support, per-measure CTE isolation

Motivation

What changed

1. Query-time multi-hop join resolution

2. LEFT/INNER join support

3. Per-measure CTE isolation

4. Re-rooted subqueries for cross-model measures

5. dbt entity system fixes

6. LLM-friendliness improvements

7. Runtime column type inference and measure profiling

8. Time shift rework

9. Additional bug fixes

Test coverage

Documentation

SLayer 0.3 Release Notes

Redesigned MCP capabilities

Async-first architecture

Variables in filters

Meta fields on models, measures, and dimensions

Default storage path

MCP `query` tool accepts inline `source_model` (DEV-1372)

Demo: upstream `jafgen` is now a core dependency

SQLite `json_extract()` semantics fix

Preserve `log10(x)` / `log2(x)` literal form

`aggN` placeholder leak fix

`meta` field surfaced in `inspect_model`

Windowed aggregations + `consecutive_periods` transform

`median` and `percentile` across tier-1 databases

MCP `inspect_model` section budgeting

`first()` transform