This release adds 2 notable features for engineering teams evaluating rollout.
No known CVEs patched in this version.
Summary
Exempted loopback clients from API rate limiting, fixing mass agent failures.
Full changelog
Highlights
Agent lifecycle reliability — Fixed the root cause of mass agent failures: loopback API requests were being rate-limited by our own server (28K+ 429 errors per run). Internal traffic now bypasses rate limiting. Also fixed stale claim detection using wrong timestamp, silent stderr, heartbeat race condition, and worktree failures not blocking spawn.
LinUCB bandit routing — The contextual bandit model router is now wired into the orchestrator. Learns optimal model selection per task type from historical outcomes.
Audit integrity on startup — HMAC chain verification of audit logs runs automatically on orchestrator boot. SOC 2 control mappings for compliance evidence export.
TUI improvements — Split-pane view, toast notifications, progress bars, dark/light theme switching, resize debounce, viewport clipping.
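To make the LinUCB routing highlight concrete, here is a minimal sketch of a disjoint LinUCB router in pure Python. It is not Bernstein's implementation: the class names, the feature vector layout, and the reward scale are all assumptions for illustration. Each model (arm) keeps an inverse design matrix and a reward vector, and the router picks the model with the highest upper confidence bound for the current task context.

```python
import math


class LinUCBArm:
    """Per-model state for disjoint LinUCB (hypothetical helper)."""

    def __init__(self, dim):
        # A^-1 starts as the identity (ridge prior); b starts at zero.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim

    def _a_inv_times(self, v):
        return [sum(row[j] * v[j] for j in range(len(v))) for row in self.a_inv]

    def score(self, x, alpha):
        a_inv_x = self._a_inv_times(x)
        theta = self._a_inv_times(self.b)  # ridge-regression estimate
        mean = sum(t * xi for t, xi in zip(theta, x))
        bonus = alpha * math.sqrt(sum(xi * v for xi, v in zip(x, a_inv_x)))
        return mean + bonus

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of A^-1 for A <- A + x x^T.
        # A^-1 is symmetric, so (A^-1 x)(x^T A^-1) is the outer product below.
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        for i in range(n):
            self.b[i] += reward * x[i]


class LinUCBRouter:
    """Chooses a model name for a task context vector (illustrative only)."""

    def __init__(self, models, dim, alpha=1.0):
        self.alpha = alpha
        self.arms = {name: LinUCBArm(dim) for name in models}

    def choose(self, x):
        return max(self.arms, key=lambda m: self.arms[m].score(x, self.alpha))

    def record(self, model, x, reward):
        self.arms[model].update(x, reward)
```

In this sketch the context `x` could be a one-hot encoding of the task type, and `record` would be called with the observed outcome (for example, success discounted by cost) once an agent finishes. A larger `alpha` explores rarely tried models more aggressively.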
Features
- Zombie cleanup on startup — detects and kills orphaned agent processes from prior crashed runs
- MCP readiness probe before agent spawn — validates MCP servers are responsive
- Prompt size pre-check with model fallback chain (AGENT-003, AGENT-004)
- Command execution and network endpoint anomaly detection
- Immutable audit trail with Sigstore/Rekor attestation + HIPAA compliance mode
- Semantic diff improvements for default-arg detection
- SLA monitoring and SSO/OIDC integration workflow specs (ENT-009, ENT-010)
- Spawn failure categorization into actionable types with retry strategies
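The prompt size pre-check with a model fallback chain listed above can be pictured roughly as follows. This is a sketch under stated assumptions, not the project's code: `ModelSpec`, `estimate_tokens`, `pick_model`, the four-characters-per-token heuristic, and the reserve size are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelSpec:
    name: str
    context_tokens: int  # maximum tokens the model accepts


def estimate_tokens(prompt: str) -> int:
    # Crude heuristic (assumption): roughly 4 characters per token, rounded up.
    return (len(prompt) + 3) // 4


def pick_model(prompt: str,
               chain: Sequence[ModelSpec],
               reserve_tokens: int = 1024) -> Optional[ModelSpec]:
    """Return the first model in the chain whose context window fits the
    prompt plus a reserve for the response, or None if none fits."""
    needed = estimate_tokens(prompt) + reserve_tokens
    for spec in chain:
        if needed <= spec.context_tokens:
            return spec
    return None
```

Ordering the chain from preferred to last resort means the check falls back only when the prompt is too large, and a `None` result lets the caller fail before spawning an agent rather than after.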
Docs
- Comprehensive troubleshooting guide for top 20 failure modes
- Security hardening guide and interactive quickstart tutorial
- Adapter selection guide with comparison matrix
- Lifecycle FSM diagrams with Mermaid rendering
- ReadTheDocs + MkDocs Material configuration
- Performance tuning and deployment guide fixes
Fixes
- Exempt loopback clients from API rate limiting (root cause of 466 agent failures)
- Stale claim detection: use `claimed_at` instead of `created_at`, timeout 10m → 15m
- Agent stderr captured to `.stderr.log` instead of /dev/null
- Heartbeat file touched before spawn, not after (eliminated race condition)
- Worktree creation failure now blocks spawn (was silently swallowed)
- Stabilized spawner rate limiting tests
- Bumped cryptography 46.0.6 → 46.0.7
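The loopback exemption in the first fix can be illustrated with a small sliding-window limiter that skips the check for loopback peers. This is a sketch only: `SlidingWindowLimiter`, `is_loopback`, and the limits are assumptions, and the actual server's limiter may work differently.

```python
import ipaddress
import time
from collections import defaultdict, deque


def is_loopback(client_host: str) -> bool:
    """True for 127.0.0.0/8, ::1, and IPv4-mapped forms such as ::ffff:127.0.0.1.
    Hostnames like "localhost" are not resolved and return False."""
    try:
        addr = ipaddress.ip_address(client_host)
    except ValueError:
        return False
    mapped = getattr(addr, "ipv4_mapped", None)  # only set on IPv6 addresses
    if mapped is not None:
        addr = mapped
    return addr.is_loopback


class SlidingWindowLimiter:
    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock
        self.hits = defaultdict(deque)  # client host -> request timestamps

    def allow(self, client_host: str) -> bool:
        # Internal traffic (agents calling the local API) bypasses limiting.
        if is_loopback(client_host):
            return True
        now = self.clock()
        hits = self.hits[client_host]
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # the caller would answer with HTTP 429
        hits.append(now)
        return True
```

A check like this should use the socket peer address rather than a client-supplied header such as `X-Forwarded-For`, since a header can be spoofed to claim a loopback origin.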
Full changelog: https://github.com/chernistry/bernstein/compare/v1.5.4...v1.5.5
About chernistry/bernstein
Deterministic multi-agent orchestrator for 18 CLI coding agents (Claude Code, Codex, Cursor, Aider, Gemini CLI, OpenAI Agents SDK, and more). MCP server mode (stdio + HTTP/SSE) exposes the orchestrator to any MCP client. Git worktree isolation per agent, HMAC-chained audit trail, cost-aware model routing via contextual bandit. ~11K monthly PyPI downloads, Apache 2.0.
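For readers unfamiliar with HMAC-chained audit trails, the idea is that each entry's MAC covers the previous entry's MAC, so editing or deleting any entry invalidates everything after it. The sketch below uses only Python's standard `hmac` and `hashlib` modules; the entry layout, the all-zero genesis value, and the function names are assumptions and do not reflect Bernstein's on-disk format.

```python
import hashlib
import hmac

GENESIS = b"\x00" * 32  # assumed fixed starting value for the chain


def entry_mac(key: bytes, prev_mac: bytes, payload: bytes) -> bytes:
    # prev_mac is always 32 bytes, so the concatenation is unambiguous.
    return hmac.new(key, prev_mac + payload, hashlib.sha256).digest()


def append_entry(log: list, key: bytes, payload: bytes) -> None:
    prev = log[-1][1] if log else GENESIS
    log.append((payload, entry_mac(key, prev, payload)))


def verify_chain(log: list, key: bytes):
    """Return the index of the first entry that fails verification,
    or None if the whole chain is intact."""
    prev = GENESIS
    for index, (payload, mac) in enumerate(log):
        expected = entry_mac(key, prev, payload)
        if not hmac.compare_digest(mac, expected):
            return index
        prev = mac
    return None
```

A boot-time check along these lines would walk the stored log once and stop at the first mismatch, which also pinpoints where tampering or corruption began.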