chernistry/bernstein

v2.2.0 Feature

This release adds 2 notable features for engineering teams evaluating rollout.

✓ No known CVEs patched

Topics

agent-framework agent-orchestrator agentic-ai ai-agents ai-coding aider
anthropic claude-code cli-tool codex-cli coding-agent deterministic-scheduler hmac-audit llm mcp-server model-context-protocol multi-agent parallel-worktrees python swe-bench

Summary

AI summary

CI system gains self‑healing, bot‑PR elimination, cross‑discipline hygiene and a new AI‑BOM export.

Changes in this release

Feature High

Hotfix R-counter blocks further auto-merge after two consecutive hotfixes, preventing recursive lint drift cycles.

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature High

Trunk health SLO with Andon gate holds merges when trunk is red, stopping bug spread (Toyota Lean intervention).

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature Medium

Auto-heal v2 (shipped in v2.1 with 26 parameters, classifier, heal-branch, admin-merge) got the fixes it needed after producing zero successful heals in its first three weeks.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Weekly aggregated digest issue replaces multiple auto-release-skipped notifications (alarm fatigue intervention).

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature Medium

Auto-triage main-red events to culprit PR, halving median MTTR for main-red incidents.

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature Low

Contract-drift autofix now inline-pushes the regenerated lockfile to the PR head instead of opening a separate PR, removing the dominant source of bot PRs.

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature Low

Idempotency self-check in contract-drift regeneration ensures second run is a no-op; non‑deterministic regen aborts workflow (SPC intervention).

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature Low

Hotfix R-counter allow‑list and classifier treat benign doc‑format drift as non‑hotfix events (EDGE-4).

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature Low

GH API rate‑limit guard with token‑bucket and 429 backoff for long‑running agents (EDGE-7).

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature Low

Scoped CI concurrency groups by branch to prevent queue cancellation during rapid merges.

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature Low

AI-BOM export supports Bernstein JSON, CycloneDX 1.5 with the AI/ML extension, and SPDX 2.3 with AI annotations; determinism is enforced by Hypothesis property tests.

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature Low

Diary system writes structured entries per closed task with redaction; synthesizer clusters diaries by tag overlap to draft markdown reports (HITL‑gated).

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature Low

Consensus relay provides HMAC‑chained handoff of cycle decisions, blockers, and open questions for operator restarts.

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature Low

Operator GUI is now an installable PWA with service worker caching; tunnel command publishes via registered drivers and prints QR code onboarding.

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Bugfix Medium

Added 404-cordon to prevent masking errors in typos-cli fetch URL.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: low

Bugfix Medium

Included agents-md drift class in classifier to stop lint drift misclassification.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: low

Bugfix Medium

Explicitly dispatch heal-branch CI when auto-heal pushes a fix branch to avoid trigger leak.

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Bugfix Medium

Trunk‑Andon override escapes allow forced merges when Andon detects breakage that is the fix itself (EDGE-5).

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Bugfix Medium

Fixed macOS runner saturation by splitting matrix jobs and adding nightly full‑matrix run; resolved stale heartbeat test bug on macOS.

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Bugfix Low

Reordered composition so ruff runs after agents-md sync, preventing whitespace tweaks from appearing as lint regressions.

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Bugfix Low

Contract-drift fallback comments with patch when inline-push lacks write permission on fork PRs.

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Bugfix Low

Advisory PR push‑lock prevents race conditions during parallel agent waves (EDGE-6).

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Full changelog

v2.1.0 closed the loop on routing observability. v2.2.0 is about the CI immune system: auto-heal grew teeth, the bot-PR class got eliminated, and five cross-discipline interventions (alarm fatigue, epidemiology, Toyota Lean, bisect-on-red, SPC) stopped recurring failure modes that had been costing real wall time. Three feature workstreams that slipped from v2.1 also landed.

Self-healing CI grew teeth

Auto-heal v2 shipped in v2.1 (#1393, 26 parameters, classifier + heal-branch + admin-merge) and produced zero successful heals in the first three weeks. Every main-red event still required a human-dispatched hotfix. Three things were wrong:

  • #1452 typos-cli 404. The fetch URL was stale, so the workflow failed before classification. A new 404-cordon makes the daemon open a self-issue and stop instead of masking the error.
  • #1452 missing agents-md drift class. When bernstein agents-md sync did not run on doc-only commits, the resulting lint drift looked like a new failure class to the heuristic. The classifier now includes it.
  • #1452 composition order. ruff was running before agents-md sync, so the sync's whitespace tweaks looked like lint regressions. ruff now runs after the sync.

Plus the trigger leak (#1460): auto-heal pushed its fix branch but the heal-branch CI never started, because events triggered with GITHUB_TOKEN do not start new workflow runs (workflow_dispatch and repository_dispatch are the exceptions). Auto-heal now dispatches the heal-branch CI explicitly.
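
An explicit dispatch can be sketched against GitHub's REST endpoint for workflow_dispatch events. This is a minimal illustration, not the auto-heal code: the function name, the workflow file name (ci.yml) and the branch name are hypothetical, and the target workflow is assumed to declare a workflow_dispatch trigger.

```python
import json
import urllib.request


def build_dispatch_request(owner: str, repo: str, workflow_file: str,
                           ref: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a workflow_dispatch request for one branch."""
    url = (f"https://api.github.com/repos/{owner}/{repo}"
           f"/actions/workflows/{workflow_file}/dispatches")
    body = json.dumps({"ref": ref}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )


# Hypothetical values: workflow file and branch name are placeholders.
req = build_dispatch_request("chernistry", "bernstein", "ci.yml",
                             "auto-heal/fix-example", "<token>")
# urllib.request.urlopen(req) would send it; GitHub answers 204 on success.
```

Building the request separately from sending it keeps the sketch testable without network access.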

Bot-PR class eliminated

#1449 moved contract-drift autofix from "open a PR with the regenerated lockfile" to "inline-push the regenerated lockfile to the PR head." That was the dominant bot-PR-class source. The recursive lint drift cycle that ate a Saturday afternoon is gone.

Cross-discipline CI hygiene wave

Five interventions, each borrowed from a discipline that already solved an analogous problem:

| PR | Discipline | Intervention |
| --- | --- | --- |
| #1454 | alarm fatigue (anesthesiology) | Weekly aggregated digest issue. Replaces N auto-release-skipped notifications with one rolling summary. |
| #1455 | epidemiology (R0) | Hotfix R-counter. Detects when a hotfix begets another hotfix. Two-in-a-row blocks further auto-merge until human triage. |
| #1456 | Toyota Lean (Andon cord) | Trunk health SLO + Andon gate. Holds merges on red trunk. Blocks the bug spread that auto-merge would otherwise inflict. |
| #1457 | bisect on red | Auto-triage main-red to culprit PR. Halves the median MTTR for main-red events. |
| #1467 | SPC (control charts, META F) | Idempotency self-check in regen_contract_drift. Second run of the same regen must be a no-op; if not, the regen is non-deterministic and the workflow halts. |
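
The idempotency self-check in the last row can be sketched as "run the regen twice and compare digests". The sketch below assumes the regen is a function from a file map to a file map; the helper names and the two toy regen functions are hypothetical and do not reflect the internals of regen_contract_drift.

```python
import hashlib


class NonDeterministicRegen(RuntimeError):
    """Raised when a second regen run changes the output of the first."""


def tree_digest(files: dict[str, bytes]) -> str:
    """Hash paths and contents in sorted order so dict order cannot matter."""
    h = hashlib.sha256()
    for path in sorted(files):
        h.update(path.encode("utf-8") + b"\0" + files[path] + b"\0")
    return h.hexdigest()


def regen_with_self_check(regen, files: dict[str, bytes]) -> dict[str, bytes]:
    """Run regen, then run it again on its own output; the rerun must be a no-op."""
    first = regen(files)
    second = regen(first)
    if tree_digest(first) != tree_digest(second):
        raise NonDeterministicRegen("second regen run changed the output")
    return first


def stable_regen(files):
    # Toy regen: sort and de-duplicate lines, which is idempotent.
    return {p: b"\n".join(sorted(set(body.split(b"\n")))) for p, body in files.items()}


def stamping_regen(files):
    # Toy regen that appends a marker on every run, so it never converges.
    return {p: body + b"\n# regenerated" for p, body in files.items()}


result = regen_with_self_check(stable_regen, {"contract.lock": b"b==2\na==1"})

try:
    regen_with_self_check(stamping_regen, {"contract.lock": b"a==1"})
    halted = False
except NonDeterministicRegen:
    halted = True
```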

Seven edge-case hardenings

The first three followed from the wave above. The next four are independent:

  • #1458 contract-drift fork-PR fallback shape. The inline-push path needs write to the PR head; on fork PRs that's denied. Now falls back to a comment with the regenerated patch.
  • #1459 R-counter benign-drift allow-list + classifier (EDGE-4). Auto-formatting churn on docs files is not a hotfix-class event. Distinct path.
  • #1463 advisory PR push-lock for parallel-agent waves (EDGE-6). Six-agent waves were racing on the same PR's branch. Soft lock prevents the lost-write that bricked one PR last cycle.
  • #1464 GH API rate-limit guard for long-running agent loops (EDGE-7). Token-bucket plus 429 backoff. Replaces the "wait two minutes and retry" pattern that triggered the secondary rate limit anyway.
  • #1465 trunk-Andon override escapes (EDGE-5). Two override paths (force-merge label, commit-message token) for the case where the Andon-detected breakage is the fix.
  • #1455 hotfix R-counter (also listed above), paired with the Andon gate so the override loop has bounded depth.
  • #1450 hygiene for five noise-prone workflows (auto-release filter, scheduled cleanup, telegram dedupe, release-please if-cond guard, delete-master removal).

Branch-scoped CI concurrency

#1470 scopes the CI concurrency group by branch so rapid-merge bursts drain the queue instead of cancelling each other's downstream signals. #1472 is a hotfix repair for three follow-on root causes (QR dep skip on macOS, GUI URL test path, release-please conditional). #1473 and #1474 clear actionlint annotation-cap noise via level=error and the -shellcheck= flag; the cap was eating real signal under a wall of style nags.

macOS runner saturation fix

The macOS hosted-runner queue depth was 20-70 minutes during burst-merge waves. Issue #1468 categorised the failure mode. #1475 split macOS off the per-PR default matrix into two new gated jobs (test-macos, adapter-integration-macos) that fire on push-to-main, on macos_sensitive path changes, or on a macos-needed label. Added .github/workflows/ci-macos-nightly.yml for the full matrix daily at 06:00 UTC. CI-gate accepts legitimate macOS skips.
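
The three gating conditions can be read as one predicate. The sketch below only illustrates that logic in Python; the real gate lives in workflow configuration, and the path patterns standing in for macos_sensitive are hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical stand-ins for the macos_sensitive path filter.
MACOS_SENSITIVE = ("src/*/platform/*", "*.plist")


def should_run_macos(event: str, ref: str, changed_paths: list[str],
                     labels: list[str]) -> bool:
    """Return True when the gated macOS jobs should fire for this run."""
    if event == "push" and ref == "refs/heads/main":
        return True
    if "macos-needed" in labels:
        return True
    return any(fnmatch(path, pattern)
               for path in changed_paths
               for pattern in MACOS_SENSITIVE)


docs_only_pr = should_run_macos("pull_request", "refs/pull/1/merge", ["docs/index.md"], [])
labelled_pr = should_run_macos("pull_request", "refs/pull/1/merge", ["docs/index.md"], ["macos-needed"])
```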

Caught a real bug a week later: #1476. The test_reaps_stale_heartbeat test was patching one binding of _is_process_alive, but _refresh_heartbeat_from_signals had a separate binding defined locally in bernstein.core.agents.agent_lifecycle. The unpatched call fell through to a real os.kill(999, 0). On Linux and Windows that raised; on macos-latest PID 999 was owned by a system daemon, so the call succeeded, the heartbeat got refreshed, and the test failed. Test-only fix; production reap path was correct.
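
The general shape of this bug, patching a name in one module while the code under test looks it up in another, can be reproduced in a few lines. The two throwaway modules below are hypothetical stand-ins; how the second binding was actually created in agent_lifecycle is not shown in these notes, so a from-import is assumed here.

```python
import sys
import types
from unittest import mock

# Module that defines the helper.
procutil = types.ModuleType("procutil")
exec("def is_process_alive(pid):\n    return 'real'\n", procutil.__dict__)
sys.modules["procutil"] = procutil

# Module that holds its own binding of the same helper.
lifecycle = types.ModuleType("lifecycle")
exec(
    "from procutil import is_process_alive  # second, local binding\n"
    "def refresh_heartbeat(pid):\n"
    "    return is_process_alive(pid)\n",
    lifecycle.__dict__,
)
sys.modules["lifecycle"] = lifecycle

# Patching the defining module leaves lifecycle's own binding untouched.
with mock.patch("procutil.is_process_alive", return_value="fake"):
    missed = lifecycle.refresh_heartbeat(999)

# Patching the name where it is looked up replaces the call that matters.
with mock.patch("lifecycle.is_process_alive", return_value="fake"):
    caught = lifecycle.refresh_heartbeat(999)

print(missed, caught)  # prints: real fake
```

When the unpatched path happens to behave the same on most platforms, a test like this stays green until one runner disagrees.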

AI-BOM export (#1438)

Adds bernstein bom emit and bernstein bom verify. Three encoders sit behind one dispatcher: Bernstein-native JSON, CycloneDX 1.5 with the AI/ML extension shape, and SPDX 2.3 with AI-specific annotations. The BOM is a pure projection from existing lineage / cost / adapter state: no recomputed hashes, no I/O during generate_bom. Determinism is enforced by Hypothesis property tests across all three formats. Tamper detection via sha256 chain. Closes #1371.
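
The determinism property amounts to "the same inputs in any order produce byte-identical output". A stdlib-only sketch of that idea follows; the record fields, the schema tag and the function names are hypothetical, and a single SHA-256 digest stands in for the sha256 chain, whose format these notes do not describe.

```python
import hashlib
import itertools
import json


def emit_bom(components: list[dict]) -> bytes:
    """Serialize components canonically: sorted records, sorted keys, fixed separators."""
    ordered = sorted(components, key=lambda c: (c["name"], c["version"]))
    doc = {"schema": "example-bom/1", "components": ordered}
    return json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")


def digest(bom_bytes: bytes) -> str:
    return hashlib.sha256(bom_bytes).hexdigest()


components = [
    {"name": "adapter-b", "version": "1.0", "kind": "adapter"},
    {"name": "model-a", "version": "2.1", "kind": "model"},
    {"name": "adapter-a", "version": "0.3", "kind": "adapter"},
]

# Every input ordering must collapse to one digest.
digests = {digest(emit_bom(list(p))) for p in itertools.permutations(components)}

# Any edit to a record changes the digest, which is what a verify step would catch.
tampered = [dict(c) for c in components]
tampered[0]["version"] = "1.1"
tamper_detected = digest(emit_bom(tampered)) not in digests
```

A Hypothesis property test would generate the component lists instead of enumerating permutations of one fixed list.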

Diary + synthesis (#1432)

Two-tier knowledge layer over closed task transcripts. Diary writes one structured entry per closed task (tried/worked/failed/rationale/tags) with redaction of OpenAI keys, GitHub tokens, AWS access keys, PEM banners, and high-entropy hex. Synthesizer clusters diaries by tag-overlap Jaccard (stdlib only, no embeddings in v1) and drafts a markdown report. HITL-gated: reports default to approved: false. 142 tests including 20 Hypothesis property tests. Closes #1369.
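
Tag-overlap clustering with Jaccard similarity needs nothing beyond sets. The sketch below is one plausible reading, assuming greedy single-link grouping and a 0.5 threshold; both choices and all names are hypothetical, since the notes only say the synthesizer uses tag-overlap Jaccard with the stdlib.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection over size of the union (0.0 for two empty sets)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_by_tags(diaries: list[dict], threshold: float = 0.5) -> list[list[dict]]:
    """Put each diary into the first cluster containing a sufficiently similar diary."""
    clusters: list[list[dict]] = []
    for diary in diaries:
        tags = set(diary["tags"])
        for cluster in clusters:
            if any(jaccard(tags, set(other["tags"])) >= threshold for other in cluster):
                cluster.append(diary)
                break
        else:
            clusters.append([diary])
    return clusters


diaries = [
    {"id": "t1", "tags": ["ci", "macos"]},
    {"id": "t2", "tags": ["gui", "pwa"]},
    {"id": "t3", "tags": ["ci", "macos", "flaky"]},
]
grouped = [[d["id"] for d in cluster] for cluster in cluster_by_tags(diaries)]
```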

Consensus relay (#1435)

HMAC-chained per-cycle handoff so an operator restarting a long evolution cycle can pull the prior cycle's decisions/blockers/open-questions/next-action into context without rediscovery. Atomic-write store at .sdd/runtime/consensus/<cycle>.json. bernstein consensus list|show|export|next|verify. 73 unit + 12 integration tests. Closes #1368.
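
An HMAC chain ties each cycle's record to the one before it, so editing, dropping or reordering a record invalidates everything after it. The sketch below shows the general technique only; the record layout, the genesis value and the function names are assumptions rather than the on-disk format of the consensus store.

```python
import hashlib
import hmac
import json

GENESIS = b"\x00" * 32  # hypothetical starting value for the first link


def canonical(payload: dict) -> bytes:
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def seal(key: bytes, prev_mac: bytes, payload: dict) -> bytes:
    """MAC over the previous link plus this payload, keyed with a shared secret."""
    return hmac.new(key, prev_mac + canonical(payload), hashlib.sha256).digest()


def append(chain: list[dict], key: bytes, payload: dict) -> None:
    prev = bytes.fromhex(chain[-1]["mac"]) if chain else GENESIS
    chain.append({"payload": payload, "mac": seal(key, prev, payload).hex()})


def verify(chain: list[dict], key: bytes) -> bool:
    prev = GENESIS
    for record in chain:
        expected = seal(key, prev, record["payload"])
        if not hmac.compare_digest(expected.hex(), record["mac"]):
            return False
        prev = expected
    return True


key = b"example-key"
chain: list[dict] = []
append(chain, key, {"cycle": 1, "decisions": ["keep Andon gate"], "blockers": [],
                    "open_questions": ["macOS queue depth?"], "next_action": "rerun"})
append(chain, key, {"cycle": 2, "decisions": [], "blockers": ["flaky test"],
                    "open_questions": [], "next_action": "triage"})
intact = verify(chain, key)

chain[0]["payload"]["decisions"] = ["drop Andon gate"]  # simulate tampering
still_valid = verify(chain, key)
```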

PWA + tunnel + QR onboarding (#1442)

Operator GUI is now an installable PWA: web app manifest, service worker with stale-while-revalidate for /api/projects and /api/cost, programmatic maskable icons mounted under both / and /ui/. iOS Safari and Android Chrome install cleanly. bernstein gui serve --tunnel publishes through the existing tunnel driver registry (cloudflared / ngrok / bore / tailscale, auto-select), issues a URL-safe bearer token + 6-word diceware passphrase persisted at ~/.bernstein/dashboard.passphrase (0600), and prints an ASCII QR. bernstein gui qr [--rotate] reprints or rotates. 106 unit + 22 integration tests. Closes #1218.
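
Issuing the credentials and persisting the passphrase with 0600 permissions can be sketched with the stdlib. This is an illustration for POSIX systems, not the tunnel implementation: the word list is a tiny stand-in for a real diceware list, and the hyphen separator, token length and function names are hypothetical.

```python
import os
import secrets
import stat
import tempfile

# Tiny stand-in; a real diceware list has thousands of words.
WORDS = ["amber", "birch", "cedar", "delta", "ember", "fjord", "grove", "harbor"]


def new_credentials(wordlist=WORDS) -> tuple[str, str]:
    """Return a URL-safe bearer token and a 6-word passphrase."""
    token = secrets.token_urlsafe(32)
    passphrase = "-".join(secrets.choice(wordlist) for _ in range(6))
    return token, passphrase


def persist_private(path: str, text: str) -> None:
    """Write text so that only the owner can read or write the file."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        os.fchmod(fh.fileno(), 0o600)  # tighten the mode if the file already existed
        fh.write(text + "\n")


token, passphrase = new_credentials()
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dashboard.passphrase")
    persist_private(path, passphrase)
    mode = stat.S_IMODE(os.stat(path).st_mode)
```

Creating the file with the restrictive mode up front avoids a window in which the passphrase is readable by other users.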

Upgrade

pip install -U bernstein==2.2.0 or uv tool upgrade bernstein. No config migration. Existing diaries / consensus stores / BOMs are read-compatible.

About chernistry/bernstein

Deterministic multi-agent orchestrator for 18 CLI coding agents (Claude Code, Codex, Cursor, Aider, Gemini CLI, OpenAI Agents SDK, and more). MCP server mode (stdio + HTTP/SSE) exposes the orchestrator to any MCP client. Git worktree isolation per agent, HMAC-chained audit trail, cost-aware model routing via contextual bandit. ~11K monthly PyPI downloads, Apache 2.0.
