chernistry/bernstein

v2.3.0 Security

This release includes 4 security fixes that security teams reviewing exposed deployments should note.

No CVE identifiers are listed for this release; the 4 security fixes are itemized under Security Fixes.

Topics

agent-framework agent-orchestrator agentic-ai ai-agents ai-coding aider anthropic claude-code cli-tool codex-cli coding-agent deterministic-scheduler hmac-audit llm mcp-server model-context-protocol multi-agent parallel-worktrees python swe-bench

Affected surfaces

auth

Summary

AI summary

Broad release touching tracker adapters, orchestration, security, and internal code quality.

Changes in this release

Feature Medium

10 backlog-tracker adapters now ship under a single TrackerContract.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Webhook ingestion and plugin hookspec for third-party tracker plugins added.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Issue-to-PR pipeline introduced in orchestration loop.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Tracker comments used as multi-agent handoff message bus.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Review-bot acknowledgement gate blocks merge until must-address findings are addressed or acknowledged.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Signed lineage audit log captures tracker state moves.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Playwright self-testing sandbox for UI/web agent runs added.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Secrets broker provides short-lived per-task tokens.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Progress-watch liveness probe via session-log growth implemented.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Scheduled upstream-signal sweep with operator rollup added.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Directory-based instance registry for multi-instance hosts introduced.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

YAML eval harness implemented.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Telemetry-grounded autofix MVP added.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Long-running session memory feature included.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Run-failure classification with structured tracker writeback added.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Stacked branches and per-snapshot undo for git operations introduced.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Adapter contract check distinguishes a broken upstream --help from real drift; the CI workflow treats runtime_failure as a warning.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Feature Medium

Bulk refurb auto-fix wave 1 across src/ applied.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: low

Feature Medium

CI dependency updates: actions/checkout v4 -> v6, actions/upload-artifact v4 -> v7, Python pin to <=3.13.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: low

Bugfix Medium

Split scorecard job so SARIF upload completes separately.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Bugfix Medium

Close urllib / SHA1 / Trivy alerts.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Bugfix Medium

tracker_pipeline review follow-ups fixed.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Bugfix Medium

Commit-completion module review-bot follow-ups resolved.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Bugfix Medium

Lock aider adapter-integration job to Python 3.13.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Bugfix Medium

Honor SARIF suppressions before Code Scanning upload.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Bugfix Medium

Dispatch audit events outside broker lock; index tokens by value.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: low

Bugfix Medium

Mask credentials in logger calls.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: low

Bugfix Medium

Replace subprocess shell=True with list-form args.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: low

Bugfix Medium

Playwright runner review follow-ups fixed, including asyncio.CancelledError propagation and rejection of unsafe task_id values.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: low

Bugfix Medium

Fix the startup banner regression in the TUI and add test coverage.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: low

Full changelog

v2.3.0

127 commits since v2.2.0. The headline is the tracker-adapter family landing: 10 backlog-tracker adapters now ship under a single TrackerContract, plus webhook ingestion and a plugin hookspec for third-party tracker plugins. The orchestration loop also gained an issue-to-PR pipeline, a retry-with-continuation path for success-without-commit runs, and a multi-agent handoff message bus that piggybacks on tracker comments. The supporting workstreams (review-bot acknowledgement gate, signed lineage audit log, secrets broker, telemetry-grounded autofix, Playwright self-testing sandbox) close several long-standing reliability and security gaps.

Highlights

  • Tracker-adapter family. 10 adapters land, all conforming to the single TrackerContract (Jira Cloud + DC, GitLab Issues, Linear, Plane, Asana, ServiceNow, ClickUp, GitHub Projects v2, plus webhook ingestion). Closes the gap operators have hit when integrating non-GitHub backlogs.
  • Tracker plugin hookspec + registry + CLI. Third-party tracker integrations now plug in via the same pluggy spec the orchestrator uses internally (#1599).
  • Issue -> plan-comment -> PR pipeline. New orchestration mode that walks a tracker issue through plan synthesis, plan-comment posting for human review, and PR creation in one path (#1600).
  • Tracker comments as a multi-agent handoff bus. Worker agents now coordinate over tracker comments so a session can resume across CLI restarts and across operator machines (#1606).
  • Review-bot acknowledgement gate. CodeRabbit and Sourcery findings classified as must-address now block merge until they are addressed in a fixup commit or acknowledged in the PR body with a structured marker. Nightly sweeper + reusable shepherd workflow template ship in the same PR (#1583).
  • Lineage v2 - signed audit log of tracker state moves. Each tracker-side state transition is captured as a signed lineage entry, so operators can audit the full chain when a ticket loses or gains the wrong label (#1602).
  • Playwright-based sandbox for UI/web agent runs. A new self-testing layer drives a Playwright context against the dev server, captures screenshots / console / network errors, and hands the structured result back to an LLM judge for verdict (#1603).
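
The signed lineage entries described above can be pictured as a hash chain. The following is a minimal sketch only, assuming an HMAC-SHA256 chain in the spirit of the project's HMAC-chained audit trail; the key, field names, and helper functions (`sign_entry`, `append_transition`, `verify_chain`) are hypothetical and are not taken from the bernstein codebase.

```python
import hashlib
import hmac
import json

# Hypothetical key for illustration; a real deployment would load it from a secret store.
SIGNING_KEY = b"example-key-not-for-production"


def sign_entry(prev_sig: str, entry: dict) -> str:
    """Return an HMAC-SHA256 over the previous signature plus this entry."""
    payload = prev_sig.encode() + json.dumps(entry, sort_keys=True).encode()
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


def append_transition(log: list, ticket: str, old: str, new: str) -> None:
    """Append one tracker state move, chained to the previous entry's signature."""
    prev_sig = log[-1]["sig"] if log else ""
    entry = {"ticket": ticket, "from": old, "to": new}
    log.append({**entry, "sig": sign_entry(prev_sig, entry)})


def verify_chain(log: list) -> bool:
    """Recompute every signature in order; an edited or reordered entry fails."""
    prev_sig = ""
    for record in log:
        entry = {k: v for k, v in record.items() if k != "sig"}
        if not hmac.compare_digest(record["sig"], sign_entry(prev_sig, entry)):
            return False
        prev_sig = record["sig"]
    return True
```

Because each signature covers the one before it, changing an early entry invalidates every later one, which is what lets an operator audit the whole chain rather than single rows.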

New features

| Area | Change |
|---|---|
| trackers | 10 adapters land under TrackerContract (Asana, ClickUp, GitHub Projects v2, GitLab Issues, Jira Cloud, Jira DC, Linear, Plane, ServiceNow, plus webhook ingestion) (#1560, #1570-#1577, #1601) |
| plugins | Tracker plugin hookspec + registry + bernstein trackers CLI (#1599) |
| orchestration | Issue -> plan-comment -> PR pipeline (#1600), tracker comments as handoff bus (#1606), multi-tracker federation layer (#1561), retry-with-continuation on success-without-commit (#1596) |
| security | Secrets broker for short-lived per-task tokens (#1605) |
| reliability | Progress-watch liveness probe via session-log growth (#1597) |
| sandbox | Playwright self-testing for UI/web agent runs (#1603) |
| lineage | Signed audit log of tracker state moves (#1602), content-addressed trace store + viewer (#1564), per-ticket transcript bundle (#1562) |
| devops | Scheduled upstream-signal sweep with operator rollup (#1594) |
| fleet | Directory-based instance registry for multi-instance hosts (#1592) |
| eval | YAML eval harness (#1565) |
| autofix | Telemetry-grounded autofix MVP (#1566) |
| memory | Long-running session memory (#1559) |
| observability | Run-failure classification with structured tracker writeback (#1569) |
| git | Stacked branches + per-snapshot undo (#1563) |
| quality | Review-bot acknowledgement gate + nightly sweeper + reusable shepherd template (#1583) |
| cost | Hard per-ticket cost cap with clean termination and tracker writeback (#1578) |
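
As an illustration of the reliability row above, a liveness probe based on session-log growth might look like the sketch below. The class name `ProgressWatch`, the `stall_seconds` threshold, and the file-size heuristic are assumptions made for this example; the release notes do not describe how #1597 is implemented.

```python
import os
import time


class ProgressWatch:
    """Treat a session as live while its log file keeps growing (hypothetical sketch)."""

    def __init__(self, log_path: str, stall_seconds: float) -> None:
        self.log_path = log_path
        self.stall_seconds = stall_seconds  # assumed threshold, not a bernstein setting
        self._last_size = -1
        self._last_growth = time.monotonic()

    def is_alive(self, now=None) -> bool:
        """Return True if the log grew within the last stall_seconds."""
        now = time.monotonic() if now is None else now
        try:
            size = os.path.getsize(self.log_path)
        except OSError:
            size = -1  # a missing log counts as "no growth"
        if size > self._last_size:
            self._last_size = size
            self._last_growth = now
        return (now - self._last_growth) < self.stall_seconds
```

A supervisor would call `is_alive()` on a timer and escalate once it returns False; using log growth rather than process state catches agents that are running but stuck.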

Fixes

  • fix(adapters): refresh aider contract for the upstream --yes -> --yes-always rename; contract checker now distinguishes a broken upstream --help from real drift; CI workflow treats the new runtime-failure exit code as a warning rather than a hard fail (#1595).
  • fix(security): dispatch audit events outside the broker lock; index tokens by value (#1607). Split scorecard job so SARIF upload completes (#1613). Mask credentials in logger calls (#1519). Replace subprocess shell=True with list-form args (#1513). Close urllib / SHA1 / Trivy alerts (#1518).
  • fix(orchestration): tracker_pipeline review follow-ups (#1609); commit-completion module review-bot follow-ups (#1608).
  • fix(sandbox): Playwright runner review follow-ups, including asyncio.CancelledError propagation through broad except handlers and unsafe-task_id rejection (#1610).
  • fix(tui): restore the startup banner after a regression + add coverage (#1568).
  • fix(ci): lock aider adapter-integration job to Python 3.13 (#1586); honour SARIF suppressions before Code Scanning upload (#1520); emit CI gate for paths-ignored-only PRs (#1521); restore minimum-required write permissions broken by security hardening (#1481).
  • fix(review): apply deferred review-bot findings batch (#1584).
  • fix(quality): bulk refurb auto-fix wave 1 across src/ (#1558).
  • fix(test): repair main-red after refurb auto-fix removed str() in _run_git (#1591).
  • fix(docs): sync agents-md module map for the devops sub-package (#1612).
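
The sandbox fix above mentions asyncio.CancelledError propagation through broad except handlers. The sketch below shows the general pattern, not the actual Playwright runner code: `run_step` and `main` are hypothetical names, and the broad handler stands in for whatever boundary the runner uses.

```python
import asyncio


async def run_step(step):
    """Run one step; report ordinary failures but let cancellation propagate."""
    try:
        return await step()
    except asyncio.CancelledError:
        # Re-raise so that a caller's task.cancel() actually stops this task.
        raise
    except BaseException as exc:  # deliberately broad, as at a runner boundary
        return f"failed: {exc!r}"


async def main():
    async def hang():
        await asyncio.sleep(3600)

    task = asyncio.create_task(run_step(hang))
    await asyncio.sleep(0)  # yield once so the task starts and reaches its await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        return "cancelled"
    return "swallowed"


result = asyncio.run(main())
print(result)
```

Without the explicit re-raise, the broad handler would convert the cancellation into an ordinary "failed" result and the task would appear to finish normally.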

Internal / quality

  • Bulk refurb auto-fix wave 2. FURB113 (repeated append -> list.extend, 259 sites), FURB107 (try/except: pass -> contextlib.suppress, 267 sites), FURB173 (dict spread -> | merge, 178 sites), FURB108 (chained == -> in {...}) - landed via libcst rewriter + ruff autofix (#1582).
  • Bulk refurb auto-fix wave 1. Initial refurb sweep across src/ (#1558).
  • CI dependency churn. actions/checkout v4 -> v6 (#1598), actions/upload-artifact v4 -> v7 (#1611), python pin to <=3.13 until adapter 3.14 compat is confirmed (#1590), aider adapter-integration job locked to Python 3.13 (#1586).
  • Adapter contract check. Truncated upstream --help output is no longer reported as N missing flags; it now surfaces in a dedicated runtime_failure field that the workflow treats as a warning rather than drift (part of #1595).
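
For readers unfamiliar with the refurb rules named in wave 2, the before/after shapes look roughly like this. The variable names and values are made up for illustration; only the rewrite patterns come from the list above.

```python
import contextlib
import os
import tempfile

# FURB113: repeated append -> a single extend
names = []
names.extend(("aider", "codex"))  # was: names.append("aider"); names.append("codex")

# FURB107: try/except/pass -> contextlib.suppress
missing = os.path.join(tempfile.gettempdir(), "bernstein-example-missing.tmp")
with contextlib.suppress(FileNotFoundError):  # was: try: ... except FileNotFoundError: pass
    os.remove(missing)

# FURB173: dict spread -> | merge (Python 3.9+)
defaults = {"retries": 3, "timeout": 30}
overrides = {"timeout": 60}
config = defaults | overrides  # was: {**defaults, **overrides}

# FURB108: chained == comparisons -> membership test
status = "done"
finished = status in {"done", "failed"}  # was: status == "done" or status == "failed"
```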

Upgrade notes

  • No manual operator action required. pip install --upgrade bernstein (or uv pip install --upgrade bernstein) brings v2.3.0 in.
  • Operators integrating with non-GitHub backlogs can now register their tracker via the new plugin hookspec (bernstein trackers --help for the CLI surface).
  • The new review-bot acknowledgement gate runs on every PR. Must-address findings need either a fixup commit (bot-ack: <id> in the commit message) or a PR-body marker (<!-- bot-ack: <id> reason=... -->).
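
To make the two acknowledgement forms concrete, here is a rough sketch of how such markers could be detected. The regular expressions, the id character set, the function names, and the sample ids (`CR-12`, `SRC-7`, `CR-99`) are all assumptions for illustration; the gate's real grammar may differ.

```python
import re

# Assumed grammar: an id is a run of letters, digits, '_', '.' or '-'.
COMMIT_ACK = re.compile(r"\bbot-ack:\s*([\w.-]+)")
BODY_ACK = re.compile(r"<!--\s*bot-ack:\s*([\w.-]+)\s+reason=(.*?)\s*-->", re.DOTALL)


def acked_ids(commit_messages, pr_body):
    """Collect finding ids acknowledged in commit messages or PR-body markers."""
    ids = set()
    for message in commit_messages:
        ids.update(COMMIT_ACK.findall(message))
    ids.update(match.group(1) for match in BODY_ACK.finditer(pr_body))
    return ids


def unresolved(must_address, commit_messages, pr_body):
    """Return must-address ids that have neither a fixup commit nor a body marker."""
    return sorted(set(must_address) - acked_ids(commit_messages, pr_body))


commits = ["fixup: handle None input\n\nbot-ack: CR-12"]
body = "Summary\n<!-- bot-ack: SRC-7 reason=false positive in test fixture -->"
print(unresolved(["CR-12", "SRC-7", "CR-99"], commits, body))
```

In this sketch a merge would stay blocked while `unresolved(...)` returns a non-empty list.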

Security Fixes

  • Dispatch audit events outside broker lock; index tokens by value (#1607)
  • Mask credentials in logger calls (#1519)
  • Replace subprocess shell=True with list-form args to avoid injection (#1513)
  • Close urllib / SHA1 / Trivy alerts (#1518)
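
The shell=True fix follows a standard pattern, sketched below under the assumption that untrusted text reaches a subprocess call. `run_echo` is a hypothetical helper, not a bernstein function; it uses the current Python interpreter as the child process so the example runs anywhere.

```python
import subprocess
import sys


def run_echo(user_value: str) -> str:
    """Pass user_value as a single argv entry so no shell ever parses it."""
    # was (unsafe): subprocess.run(f"echo {user_value}", shell=True)
    completed = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_value],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


hostile = "main; echo INJECTED"
print(run_echo(hostile))
```

With list-form arguments the `;` is delivered to the child as literal text, whereas a shell would have treated it as a command separator.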


About chernistry/bernstein

Deterministic multi-agent orchestrator for 18 CLI coding agents (Claude Code, Codex, Cursor, Aider, Gemini CLI, OpenAI Agents SDK, and more). MCP server mode (stdio + HTTP/SSE) exposes the orchestrator to any MCP client. Git worktree isolation per agent, HMAC-chained audit trail, cost-aware model routing via contextual bandit. ~11K monthly PyPI downloads, Apache 2.0.
