This release adds 2 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
AI summary: New CLI command `aegis check drift` adds an offline entropy-based drift detector with privacy guarantees.
Full changelog
What's New
`aegis check drift` CLI
Offline entropy-based drift detector for saved agent traces. Same signal that `auto_instrument()` exposes at runtime, now runnable on any JSONL trace from LangSmith, OTel, or custom loggers.
```bash
aegis check drift --trace path/to/trace.jsonl
aegis check drift --trace trace.jsonl --baseline gpt-4o-retail.json
aegis check drift --trace trace.jsonl --json --strict
```
Privacy invariant: reads only the `tool_name` field — never args, CoT, or prompts — so enterprise users can score prod traces without exfiltrating PII. Stdlib-only (Counter + math.log, no numpy).
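As a rough illustration of the idea described above, here is a minimal stdlib-only sketch using `Counter` and `math.log`. It is not the actual aegis implementation. It assumes each JSONL line is an object with a `tool_name` key, and it assumes "drift" means the entropy of the first half of a trace minus the entropy of the second half. The helper names (`tool_names`, `shannon_entropy`, `entropy_drop`) are hypothetical.

```python
import json
import math
from collections import Counter


def tool_names(jsonl_lines):
    """Yield only the tool_name of each record; no other field is kept or returned."""
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        name = json.loads(line).get("tool_name")
        if name is not None:
            yield name


def shannon_entropy(names):
    """Shannon entropy of the tool-name distribution, in nats."""
    counts = Counter(names)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def entropy_drop(names):
    """Entropy of the first half minus the second half (assumed definition of drift)."""
    names = list(names)
    mid = len(names) // 2
    return shannon_entropy(names[:mid]) - shannon_entropy(names[mid:])
```

In this sketch, a trace that starts with four different tools and then repeats a single tool yields a drop of `log(4)`, about 1.39 nats, while a trace whose tool mix stays constant yields a drop near zero.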
Research: 1,960 Tau-Bench Agent Trajectories
Measured tool-distribution drift on the public sierra-research/tau-bench trajectories. Of the 812 scored trajectories, 39.8% show measurable collapse (Δ entropy ≥ 0.3 nats). On the same retail task family, the cross-model gap is Sonnet 3.5 New at 48.2% vs GPT-4o at 28.1% (1.7× ratio, n=599). The distribution is bimodal: agents either stay open or fall off a cliff.
- Post: https://acacian.github.io/aegis/research/tau-bench-tool-distribution-drift/
- Reproduces in ~30 seconds on a laptop (stdlib only)
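To make the collapse criterion concrete, the sketch below counts the share of trajectories whose entropy drop meets the 0.3-nat threshold and re-derives the quoted 1.7× ratio from the two reported percentages. The `collapse_rate` helper and the `toy_deltas` values are hypothetical and for illustration only; they are not tau-bench measurements.

```python
COLLAPSE_THRESHOLD_NATS = 0.3  # threshold quoted in the research summary


def collapse_rate(deltas, threshold=COLLAPSE_THRESHOLD_NATS):
    """Fraction of trajectories whose entropy drop meets or exceeds the threshold."""
    deltas = list(deltas)
    if not deltas:
        return 0.0
    return sum(1 for d in deltas if d >= threshold) / len(deltas)


# Toy per-trajectory entropy drops (illustrative, not real data).
toy_deltas = [0.02, 0.05, 0.31, 0.90, 0.00, 1.10, 0.10, 0.04]
print(f"toy collapse rate: {collapse_rate(toy_deltas):.1%}")

# Arithmetic check on the reported cross-model gap: 48.2% vs 28.1%.
sonnet_rate, gpt4o_rate = 0.482, 0.281
print(f"ratio: {sonnet_rate / gpt4o_rate:.1f}x")
```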
4 pillars of differentiation
Unlike LLM-as-judge approaches (Patronus, Braintrust) and fine-tuned classifiers (Galileo, Maxim), the `check drift` metric is simultaneously:
- Deterministic: no second LLM judges the first, so two runs give bit-identical results
- Privacy-preserving: tool names only, no prompt content ever read
- Cross-model comparable: normalized Δ on the same scale across GPT-4o and Sonnet
- 30-second reproducible: 120 lines of stdlib Python, no numpy or GPU
Other
- 15 new tests in `tests/cli/test_check.py` including a hard privacy-invariant assertion (PII planted in fixture traces must never appear in any output)
- `ScholarlyArticle` JSON-LD schema for `/research/*` pages, sitemap tier 0.8, `llms.txt` canonical facts section for LLM crawlers
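A privacy-invariant assertion of the kind mentioned in the first bullet could look roughly like the sketch below. This is not the code in `tests/cli/test_check.py`: the `score_trace` function, the fixture layout, and the planted strings are all hypothetical stand-ins that show the pattern of planting PII in a trace and asserting it never reaches the output.

```python
import json
from collections import Counter

# Hypothetical planted secrets; a real fixture would use its own values.
PLANTED_PII = ["jane.doe@example.com", "4111-1111-1111-1111"]


def score_trace(jsonl_text):
    """Stand-in scorer: builds a JSON report from tool names only."""
    names = []
    for line in jsonl_text.splitlines():
        if line.strip():
            name = json.loads(line).get("tool_name")
            if name is not None:
                names.append(name)
    counts = Counter(names)
    return json.dumps({"n_calls": len(names), "tools": dict(counts)})


def make_fixture():
    """Trace with PII planted in args and prompt fields, never in tool_name."""
    records = [
        {"tool_name": "lookup_order", "args": {"email": PLANTED_PII[0]}},
        {"tool_name": "refund", "args": {"card": PLANTED_PII[1]},
         "prompt": "Refund the order for " + PLANTED_PII[0]},
    ]
    return "\n".join(json.dumps(r) for r in records)


def test_privacy_invariant():
    report = score_trace(make_fixture())
    for secret in PLANTED_PII:
        # The planted strings must not appear anywhere in the output.
        assert secret not in report
```

A plain substring check is deliberately blunt: it fails if the secret leaks through any output path, including error messages or debug fields, provided those are captured in the report under test.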
Full Changelog: https://github.com/Acacian/aegis/compare/v0.9.3...v0.9.4
About Acacian/aegis
Policy-based governance for AI agent tool calls. YAML policies, approval gates, risk assessment, and audit logging. Cross-platform: LangChain, OpenAI, Anthropic, MCP.