This release adds 2 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
AI summary: CI system gains self-healing, bot-PR elimination, cross-discipline hygiene, and a new AI-BOM export.
Changes in this release
| Type | Severity | Summary | CVE | Source | Confidence |
|---|---|---|---|---|---|
| Feature | High | Hotfix R-counter blocks further auto-merge after two consecutive hotfixes, preventing recursive lint drift cycles. | none | granite4.1:30b@2026-05-20-audit | low |
| Feature | High | Trunk health SLO with Andon gate holds merges when trunk is red, stopping bug spread (Toyota Lean intervention). | none | granite4.1:30b@2026-05-20-audit | low |
| Feature | Medium | Auto-heal v2 shipped with 26 parameters, classifier, heal-branch, admin-merge. | none | granite4.1:8b-q6_K@2026-05-19 | high |
| Feature | Medium | Weekly aggregated digest issue replaces multiple auto-release-skipped notifications (alarm fatigue intervention). | none | granite4.1:30b@2026-05-20-audit | low |
| Feature | Medium | Auto-triage main-red events to culprit PR, halving median MTTR for main-red incidents. | none | granite4.1:30b@2026-05-20-audit | low |
| Feature | Low | Contract-drift autofix now inline-pushes regenerated lockfile instead of opening a PR, eliminating bot-PR class source. | none | granite4.1:30b@2026-05-20-audit | low |
| Feature | Low | Idempotency self-check in contract-drift regeneration ensures second run is a no-op; non-deterministic regen aborts workflow (SPC intervention). | none | granite4.1:30b@2026-05-20-audit | low |
| Feature | Low | Hotfix R-counter allow-list and classifier treat benign doc-format drift as non-hotfix events (EDGE-4). | none | granite4.1:30b@2026-05-20-audit | low |
| Feature | Low | GH API rate-limit guard with token-bucket and 429 backoff for long-running agents (EDGE-7). | none | granite4.1:30b@2026-05-20-audit | low |
| Feature | Low | Scoped CI concurrency groups by branch to prevent queue cancellation during rapid merges. | none | granite4.1:30b@2026-05-20-audit | low |
| Feature | Low | AI-BOM export supports Bernstein JSON, CycloneDX 1.5 AI/ML extension, and SPDX 2.3 with AI annotations; deterministic via Hypothesis tests. | none | granite4.1:30b@2026-05-20-audit | low |
| Feature | Low | Diary system writes structured entries per closed task with redaction; synthesizer clusters diaries by tag overlap to draft markdown reports (HITL-gated). | none | granite4.1:30b@2026-05-20-audit | low |
| Feature | Low | Consensus relay provides HMAC-chained handoff of cycle decisions, blockers, and open questions for operator restarts. | none | granite4.1:30b@2026-05-20-audit | low |
| Feature | Low | Operator GUI is now an installable PWA with service worker caching; tunnel command publishes via registered drivers and prints QR code onboarding. | none | granite4.1:30b@2026-05-20-audit | low |
| Bugfix | Medium | Added 404-cordon to prevent masking errors in typos-cli fetch URL. | none | granite4.1:8b-q6_K@2026-05-19 | low |
| Bugfix | Medium | Included agents-md drift class in classifier to stop lint drift misclassification. | none | granite4.1:8b-q6_K@2026-05-19 | low |
| Bugfix | Medium | Explicitly dispatch heal-branch CI when auto-heal pushes a fix branch to avoid trigger leak. | none | granite4.1:30b@2026-05-20-audit | low |
| Bugfix | Medium | Trunk-Andon override escapes allow forced merges when Andon detects breakage that is the fix itself (EDGE-5). | none | granite4.1:30b@2026-05-20-audit | low |
| Bugfix | Medium | Fixed macOS runner saturation by splitting matrix jobs and adding nightly full-matrix run; resolved stale heartbeat test bug on macOS. | none | granite4.1:30b@2026-05-20-audit | low |
| Bugfix | Low | Reordered composition so ruff runs after agents-md sync, preventing whitespace tweaks from appearing as lint regressions. | none | granite4.1:30b@2026-05-20-audit | low |
| Bugfix | Low | Contract-drift fallback comments with patch when inline-push lacks write permission on fork PRs. | none | granite4.1:30b@2026-05-20-audit | low |
| Bugfix | Low | Advisory PR push-lock prevents race conditions during parallel agent waves (EDGE-6). | none | granite4.1:30b@2026-05-20-audit | low |
| Other | Low | affected_surface | none | granite4.1:8b-q6_K@2026-05-19 | low |
Full changelog
v2.1.0 closed the loop on routing observability. v2.2.0 is about the CI immune system: auto-heal grew teeth, the bot-PR class got eliminated, and five cross-discipline interventions (alarm fatigue, epidemiology, Toyota Lean, bisect on red, SPC) stopped recurring failure modes that had been costing real wall time. Three feature workstreams that slipped from v2.1 also landed.
Self-healing CI grew teeth
Auto-heal v2 shipped in v2.1 (#1393, 26 parameters, classifier + heal-branch + admin-merge) and produced zero successful heals in the first three weeks. Every main-red event still required a human-dispatched hotfix. Three things were wrong:
- #1452 typos-cli 404. The fetch URL was stale; the workflow failed before classification. Added a 404-cordon so the daemon now opens a self-issue and stops rather than masking errors.
- #1452 agents-md drift class was missing from the classifier. Lint drift from bernstein agents-md sync not running on doc-only commits looked like a new failure class to the heuristic. Added it.
- #1452 composition order: ruff was running before agents-md sync, so the sync's whitespace tweaks looked like lint regressions. Reordered.
Plus the trigger leak: #1460 auto-heal pushed its fix branch but the heal-branch CI never started, because push events from GITHUB_TOKEN don't fire downstream workflows by default. Now explicitly dispatches.
Bot-PR class eliminated
#1449 moved contract-drift autofix from "open a PR with the regenerated lockfile" to "inline-push the regenerated lockfile to the PR head." That was the dominant bot-PR-class source. The recursive lint drift cycle that ate a Saturday afternoon is gone.
Cross-discipline CI hygiene wave
Five interventions, each borrowed from a discipline that already solved an analogous problem:
| PR | Discipline | Intervention |
| --- | --- | --- |
| #1454 | alarm fatigue (anesthesiology) | Weekly aggregated digest issue. Replaces N auto-release-skipped notifications with one rolling summary. |
| #1455 | epidemiology (R0) | Hotfix R-counter. Detects when a hotfix begets another hotfix. Two-in-a-row blocks further auto-merge until human triage. |
| #1456 | Toyota Lean (Andon cord) | Trunk health SLO + Andon gate. Holds merges on red trunk. Blocks the bug spread that auto-merge would otherwise inflict. |
| #1457 | bisect on red | Auto-triage main-red to culprit PR. Halves the median MTTR for main-red events. |
| #1467 | SPC (control charts, META F) | Idempotency self-check in regen_contract_drift. Second run of the same regen must be a no-op; if not, the regen is non-deterministic and the workflow halts. |
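
The idempotency self-check in the last row can be illustrated with a minimal sketch. It is not the project's regen_contract_drift code: the `regen` callable, the exception name, and the helper names below are hypothetical, and the only idea taken from the changelog is "run the regeneration twice and halt if the second run is not a no-op".

```python
import hashlib
from typing import Callable


class NonDeterministicRegenError(RuntimeError):
    """Hypothetical error: two consecutive regenerations disagreed."""


def digest(content: bytes) -> str:
    # Compare outputs by SHA-256 so large artifacts are cheap to log and diff.
    return hashlib.sha256(content).hexdigest()


def regen_with_self_check(regen: Callable[[], bytes]) -> bytes:
    """Run `regen` twice and require byte-identical output.

    `regen` stands in for whatever produces the regenerated lockfile
    (an assumption for illustration only).
    """
    first = regen()
    second = regen()
    if digest(first) != digest(second):
        # The second run changed something, so the regen is not idempotent.
        raise NonDeterministicRegenError("second regen run was not a no-op; halting")
    return first
```

A deterministic generator passes straight through, while one that embeds a counter or timestamp in its output raises on the second run, which is the point where a workflow would stop instead of pushing churn.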
Seven edge-case hardenings
The first three followed from the wave above. The next four are independent:
- #1458 contract-drift fork-PR fallback shape. The inline-push path needs write to the PR head; on fork PRs that's denied. Now falls back to a comment with the regenerated patch.
- #1459 R-counter benign-drift allow-list + classifier (EDGE-4). Auto-formatting churn on docs files is not a hotfix-class event. Distinct path.
- #1463 advisory PR push-lock for parallel-agent waves (EDGE-6). Six-agent waves were racing on the same PR's branch. Soft lock prevents the lost-write that bricked one PR last cycle.
- #1464 GH API rate-limit guard for long-running agent loops (EDGE-7). Token-bucket plus 429 backoff. Replaces the "wait two minutes and retry" pattern that triggered the secondary rate limit anyway.
- #1465 trunk-Andon override escapes (EDGE-5). Two override paths (force-merge label, commit-message token) for the case where the Andon-detected breakage is the fix.
- #1455 hotfix R-counter (also above) — paired with the Andon gate so the override loop has bounded depth.
- #1450 hygiene for five noise-prone workflows (auto-release filter, scheduled cleanup, telegram dedupe, release-please if-cond guard, delete-master removal).
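
The token-bucket plus 429 backoff pattern named in #1464 can be sketched as follows. This is an illustration under stated assumptions, not the shipped guard: the class and function names, the `request` callable returning `(status, retry_after)`, and the exponential fallback delay are all hypothetical.

```python
import time
from typing import Callable, Optional, Tuple


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity
        self.clock = clock
        self.last = clock()

    def _refill(self) -> None:
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def take(self) -> bool:
        """Consume one token if available."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self) -> float:
        """Seconds until one token is available (0.0 if one is available now)."""
        self._refill()
        return 0.0 if self.tokens >= 1.0 else (1.0 - self.tokens) / self.rate


def call_with_guard(request: Callable[[], Tuple[int, Optional[float]]],
                    bucket: TokenBucket,
                    sleep: Callable[[float], None] = time.sleep,
                    max_attempts: int = 5,
                    base_delay: float = 1.0) -> int:
    """Call `request` under the bucket; on 429, back off and retry.

    `request` is a hypothetical callable returning (status, retry_after_seconds).
    """
    for attempt in range(max_attempts):
        while not bucket.take():
            sleep(bucket.wait_time())
        status, retry_after = request()
        if status != 429:
            return status
        # Prefer the server's hint; otherwise fall back to exponential backoff.
        sleep(retry_after if retry_after is not None else base_delay * (2 ** attempt))
    raise RuntimeError("rate limit persisted after retries")
```

Compared with a fixed "wait two minutes and retry", the bucket spreads calls out before a limit is hit, and the backoff grows only when the server actually answers 429.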
Branch-scoped CI concurrency
#1470 scopes the CI concurrency group by branch so rapid-merge bursts drain the queue instead of cancelling each other's downstream signals. Plus #1472 hotfix repair for three follow-on root causes (QR dep skip on macOS, GUI URL test path, release-please conditional). Plus #1473 and #1474 clearing actionlint annotation-cap noise via level=error and -shellcheck= flag — the cap was eating real signal under a wall of style nags.
macOS runner saturation fix
The macOS hosted-runner queue depth was 20-70 minutes during burst-merge waves. Issue #1468 categorised the failure mode. #1475 split macOS off the per-PR default matrix into two new gated jobs (test-macos, adapter-integration-macos) that fire on push-to-main, on macos_sensitive path changes, or on a macos-needed label. Added .github/workflows/ci-macos-nightly.yml for the full matrix daily at 06:00 UTC. CI-gate accepts legitimate macOS skips.
Caught a real bug a week later: #1476. The test_reaps_stale_heartbeat test was patching one binding of _is_process_alive but _refresh_heartbeat_from_signals had a separate binding defined locally in bernstein.core.agents.agent_lifecycle. The unpatched call fell through to a real os.kill(999, 0). On Linux and Windows that raised; on macos-latest PID 999 was owned by a system daemon, so the call succeeded, the heartbeat got refreshed, and the test failed. Test-only fix; production reap path was correct.
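
The general pitfall behind #1476 is that patching a name in one namespace does not affect a second binding of the same function elsewhere. The sketch below reproduces that with two throwaway modules. The module names `demo_helpers` and `demo_lifecycle` are hypothetical, and the second binding is created here with a `from ... import`, which is only one assumed way such a duplicate binding can arise.

```python
import sys
import types
from unittest import mock

# Hypothetical module that owns the helper.
helpers = types.ModuleType("demo_helpers")
exec("def is_process_alive(pid):\n    return True\n", helpers.__dict__)
sys.modules["demo_helpers"] = helpers

# Hypothetical consumer that copies the name into its own namespace.
lifecycle = types.ModuleType("demo_lifecycle")
exec(
    "from demo_helpers import is_process_alive\n"
    "def heartbeat_is_fresh(pid):\n"
    "    return is_process_alive(pid)\n",
    lifecycle.__dict__,
)
sys.modules["demo_lifecycle"] = lifecycle

# Patching the defining module leaves the consumer's binding untouched,
# so the real function still runs.
with mock.patch("demo_helpers.is_process_alive", return_value=False):
    patched_wrong_binding = lifecycle.heartbeat_is_fresh(999)

# Patching the name where it is looked up is what actually takes effect.
with mock.patch("demo_lifecycle.is_process_alive", return_value=False):
    patched_right_binding = lifecycle.heartbeat_is_fresh(999)
```

In the first case the call falls through to the unpatched function, which is the same shape as a test reaching a real system call and then depending on what happens to be running on the host.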
AI-BOM export (#1438)
bernstein bom emit and bernstein bom verify. Three encoders behind one dispatcher: Bernstein-native JSON, CycloneDX 1.5 with the AI/ML extension shape, and SPDX 2.3 with AI-specific annotations. Pure projection from existing lineage / cost / adapter state -- no recomputed hashes, no I/O during generate_bom. Determinism enforced by Hypothesis property tests across all three formats. Tamper detection via sha256 chain. Closes #1371.
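
A minimal sketch of how a sha256 chain over deterministically encoded records gives tamper detection. The record layout, the `GENESIS` value, and the function names are assumptions for illustration and do not reflect the actual Bernstein, CycloneDX, or SPDX output shapes.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # assumed starting value for the chain


def canonical(obj: Any) -> bytes:
    # Sorted keys and fixed separators make the encoding independent of dict order.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def link_hash(prev: str, component: Dict[str, Any]) -> str:
    return hashlib.sha256(prev.encode("ascii") + canonical(component)).hexdigest()


def emit_chain(components: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Attach to each component a hash that also covers the previous link."""
    prev = GENESIS
    entries = []
    for component in components:
        current = link_hash(prev, component)
        entries.append({"component": component, "prev": prev, "hash": current})
        prev = current
    return entries


def verify_chain(entries: List[Dict[str, Any]]) -> bool:
    """Recompute every link; any edited, reordered, or dropped entry breaks it."""
    prev = GENESIS
    for entry in entries:
        if entry["prev"] != prev or entry["hash"] != link_hash(prev, entry["component"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers its predecessor, changing one record invalidates that record and everything after it. The canonical encoding is what lets the same input always produce the same bytes, the kind of property a determinism test would assert.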
Diary + synthesis (#1432)
Two-tier knowledge layer over closed task transcripts. Diary writes one structured entry per closed task (tried/worked/failed/rationale/tags) with redaction of OpenAI keys, GitHub tokens, AWS access keys, PEM banners, and high-entropy hex. Synthesizer clusters diaries by tag-overlap Jaccard (stdlib only, no embeddings in v1) and drafts a markdown report. HITL-gated: reports default to approved: false. 142 tests including 20 Hypothesis property tests. Closes #1369.
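
Tag-overlap clustering with Jaccard similarity needs only the standard library. The sketch below is one simple way to do it. The greedy single-link strategy, the 0.5 threshold, and the function names are assumptions, not the synthesizer's actual algorithm.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """|a ∩ b| / |a ∪ b|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_tags(diaries: Dict[str, Set[str]], threshold: float = 0.5) -> List[List[str]]:
    """Greedy clustering: a diary joins the first cluster that contains
    at least one member whose tag set is similar enough."""
    clusters: List[List[str]] = []
    for task_id in sorted(diaries):  # sorted for a stable, repeatable result
        for cluster in clusters:
            if any(jaccard(diaries[task_id], diaries[m]) >= threshold for m in cluster):
                cluster.append(task_id)
                break
        else:
            clusters.append([task_id])
    return clusters


diaries = {
    "t1": {"ci", "macos"},
    "t2": {"ci", "macos", "flaky"},
    "t3": {"bom", "spdx"},
}
clusters = cluster_by_tags(diaries)
```

Here t1 and t2 share two of three distinct tags (similarity 2/3) and land together, while t3 shares none and forms its own cluster.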
Consensus relay (#1435)
HMAC-chained per-cycle handoff so an operator restarting a long evolution cycle can pull the prior cycle's decisions/blockers/open-questions/next-action into context without rediscovery. Atomic-write store at .sdd/runtime/consensus/<cycle>.json. bernstein consensus list|show|export|next|verify. 73 unit + 12 integration tests. Closes #1368.
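
An HMAC chain differs from a plain hash chain in that only a holder of the key can extend or validate it. The following sketch shows the idea. The record fields, the empty-string starting MAC, and the function names are hypothetical and are not the on-disk format under .sdd/runtime/consensus/.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, List


def sign(key: bytes, prev_mac: str, payload: Dict[str, Any]) -> str:
    """MAC over the previous record's MAC plus a canonical encoding of this payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, prev_mac.encode("ascii") + body, hashlib.sha256).hexdigest()


def append_cycle(chain: List[Dict[str, Any]], key: bytes,
                 payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return a new chain with one more cycle record linked to the last one."""
    prev_mac = chain[-1]["mac"] if chain else ""
    record = {"payload": payload, "prev_mac": prev_mac, "mac": sign(key, prev_mac, payload)}
    return chain + [record]


def verify(chain: List[Dict[str, Any]], key: bytes) -> bool:
    """Check every link in order, using a constant-time MAC comparison."""
    prev_mac = ""
    for record in chain:
        expected = sign(key, prev_mac, record["payload"])
        if record["prev_mac"] != prev_mac or not hmac.compare_digest(record["mac"], expected):
            return False
        prev_mac = record["mac"]
    return True
```

An operator restarting a cycle would verify the chain before trusting the prior cycle's decisions, blockers, and open questions. An edit to any earlier record, or a wrong key, makes verification fail.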
PWA + tunnel + QR onboarding (#1442)
Operator GUI is now an installable PWA: web app manifest, service worker with stale-while-revalidate for /api/projects and /api/cost, programmatic maskable icons mounted under both / and /ui/. iOS Safari and Android Chrome install cleanly. bernstein gui serve --tunnel publishes through the existing tunnel driver registry (cloudflared / ngrok / bore / tailscale, auto-select), issues a URL-safe bearer token + 6-word diceware passphrase persisted at ~/.bernstein/dashboard.passphrase (0600), and prints an ASCII QR. bernstein gui qr [--rotate] reprints or rotates. 106 unit + 22 integration tests. Closes #1218.
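
Issuing a URL-safe bearer token and a six-word passphrase, then persisting the secret with owner-only permissions, can be sketched with the standard library. The tiny word list, the hyphen separator, and the function names are assumptions. A real diceware list has 7776 words, and this is not the project's implementation.

```python
import os
import secrets
from pathlib import Path
from typing import List

# Stand-in word list for illustration only; far too small for real use.
WORDS = ["amber", "brook", "cedar", "delta", "ember", "fjord", "grove", "haven"]


def new_bearer_token(nbytes: int = 32) -> str:
    """Random token using only URL-safe characters."""
    return secrets.token_urlsafe(nbytes)


def new_passphrase(words: List[str], count: int = 6) -> str:
    """Pick `count` words with a cryptographically secure RNG."""
    return "-".join(secrets.choice(words) for _ in range(count))


def persist_secret(path: Path, secret: str) -> None:
    """Write the secret so that only the owner can read or write it (0600)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(secret + "\n")
    # The mode passed to os.open applies only when the file is created,
    # so tighten permissions explicitly for the pre-existing-file case.
    os.chmod(path, 0o600)
```

Creating the file with mode 0o600 at open time avoids a window in which the secret exists on disk with broader permissions. Rotation would amount to generating a new passphrase and calling `persist_secret` again on the same path.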
Upgrade
pip install -U bernstein==2.2.0 or uv tool upgrade bernstein. No config migration. Existing diaries / consensus stores / BOMs are read-compatible.
About chernistry/bernstein
Deterministic multi-agent orchestrator for 18 CLI coding agents (Claude Code, Codex, Cursor, Aider, Gemini CLI, OpenAI Agents SDK, and more). MCP server mode (stdio + HTTP/SSE) exposes the orchestrator to any MCP client. Git worktree isolation per agent, HMAC-chained audit trail, cost-aware model routing via contextual bandit. ~11K monthly PyPI downloads, Apache 2.0.