This release adds 3 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
AI summary: Plan-and-Execute architecture formalized, agent identity cards with capability enforcement, and built-in eval framework introduced.
Full changelog
v1.8.4
Planning, identity, and evaluation.
Features
- Plan-and-Execute architecture formalized. Planning and execution are now explicit phases with typed interfaces, so you can swap planners without touching executors.
- Agent identity cards with capability enforcement. Every spawn carries a signed identity card; the orchestrator refuses tool calls outside the card's declared capabilities.
- Built-in eval framework with per-model accuracy reporting — useful for A/B-ing planners, routers, or adapter configs.
- Canary deployments for prompt/model versions. Route a fraction of traffic to a new prompt or model; promote automatically on metric parity.
- Opus alias upgraded to Claude Opus 4.7 — default "opus" now resolves to the 4.7 snapshot.
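The identity-card feature above can be pictured with a minimal sketch. The release notes do not say how cards are signed or checked, so everything here is an assumption for illustration: the names `IdentityCard`, `issue_card`, and `authorize` are hypothetical, and HMAC-SHA256 over a shared key stands in for whatever signing scheme the project actually uses.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass

# Hypothetical names throughout: nothing here is taken from the bernstein codebase.
SECRET = b"demo-orchestrator-key"  # assumption: a key held only by the orchestrator


@dataclass(frozen=True)
class IdentityCard:
    agent_id: str
    capabilities: tuple[str, ...]  # tool names this agent may call
    signature: str                 # hex HMAC over agent_id + capabilities


def _sign(agent_id: str, capabilities: tuple[str, ...]) -> str:
    # Canonical JSON so the same card contents always produce the same signature.
    payload = json.dumps(
        {"agent_id": agent_id, "capabilities": sorted(capabilities)},
        sort_keys=True,
    ).encode("utf-8")
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def issue_card(agent_id: str, capabilities: list[str]) -> IdentityCard:
    """Create a signed card at spawn time."""
    caps = tuple(sorted(capabilities))
    return IdentityCard(agent_id, caps, _sign(agent_id, caps))


def authorize(card: IdentityCard, tool: str) -> bool:
    """Allow a tool call only if the card is authentic and declares the tool."""
    expected = _sign(card.agent_id, card.capabilities)
    if not hmac.compare_digest(expected, card.signature):
        return False  # card was altered after signing
    return tool in card.capabilities
```

With a card issued as `issue_card("agent-1", ["read_file", "run_tests"])`, `authorize(card, "read_file")` passes and `authorize(card, "git_push")` is refused. A card whose capability list was edited after signing fails the signature check, so it cannot grant itself extra tools.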
Docs
- Added growth metrics breakdown.
- Small README polish.
CI
- npm publish: NPM_TOKEN is exported so .npmrc interpolation works under GitHub Actions.
Full changelog: https://github.com/chernistry/bernstein/compare/v1.8.3...v1.8.4
About chernistry/bernstein
Deterministic multi-agent orchestrator for 18 CLI coding agents (Claude Code, Codex, Cursor, Aider, Gemini CLI, OpenAI Agents SDK, and more). MCP server mode (stdio + HTTP/SSE) exposes the orchestrator to any MCP client. Git worktree isolation per agent, HMAC-chained audit trail, cost-aware model routing via contextual bandit. ~11K monthly PyPI downloads, Apache 2.0.