This release includes 4 security fixes relevant to teams reviewing exposed deployments.
Summary
AI summary: Broad release touches Highlights, Internal / quality, adapters, and orchestration.
Changes in this release
| Type | Severity | Summary | Confidence | CVE |
|---|---|---|---|---|
| Feature | Medium | 10 backlog-tracker adapters now ship under single TrackerContract. | high | - |
| Feature | Medium | Webhook ingestion and plugin hookspec for third-party tracker plugins added. | high | - |
| Feature | Medium | Issue-to-PR pipeline introduced in orchestration loop. | high | - |
| Feature | Medium | Tracker comments used as multi-agent handoff message bus. | high | - |
| Feature | Medium | Review-bot acknowledgement gate blocks merge until must-address findings addressed. | high | - |
| Feature | Medium | Signed lineage audit log captures signed tracker state moves. | high | - |
| Feature | Medium | Playwright self-testing sandbox for UI/web agent runs added. | high | - |
| Feature | Medium | Secrets broker provides short-lived per-task tokens. | high | - |
| Feature | Medium | Progress-watch liveness probe via session-log growth implemented. | high | - |
| Feature | Medium | Scheduled upstream-signal sweep with operator rollup added. | high | - |
| Feature | Medium | Directory-based instance registry for multi-instance hosts introduced. | high | - |
| Feature | Medium | YAML eval harness implemented. | high | - |
| Feature | Medium | Telemetry-grounded autofix MVP added. | high | - |
| Feature | Medium | Long-running session memory feature included. | high | - |
| Feature | Medium | Run-failure classification with structured tracker writeback added. | high | - |
| Feature | Medium | Stacked branches and per-snapshot undo for git operations introduced. | high | - |
| Feature | Medium | Adapter contract check distinguishes upstream --help from real drift; treats runtime_failure as warning. | high | - |
| Feature | Medium | Bulk refurb auto-fix wave 1 across src/ applied. | low | - |
| Feature | Medium | CI dependency updates: actions/checkout v4 -> v6, actions/upload-artifact v4 -> v7, Python pin to <=3.13. | low | - |
| Bugfix | Medium | Split scorecard job so SARIF upload completes separately. | high | - |
| Bugfix | Medium | Close urllib / SHA1 / Trivy alerts. | high | - |
| Bugfix | Medium | Tracker_pipeline review follow-ups fixed. | high | - |
| Bugfix | Medium | Commit-completion module review-bot follow-ups resolved. | high | - |
| Bugfix | Medium | Lock aider adapter-integration job to Python 3.13. | high | - |
| Bugfix | Medium | Honor SARIF suppressions before Code Scanning upload. | high | - |
| Bugfix | Medium | Dispatch audit events outside broker lock; index tokens by value. | low | - |
| Bugfix | Medium | Mask credentials in logger calls. | low | - |
| Bugfix | Medium | Replace subprocess shell=True with list-form args. | low | - |
| Bugfix | Medium | Playwright runner review follow-ups, including asyncio.CancelledError propagation and unsafe-task_id rejection fixed. | low | - |
| Bugfix | Medium | Restore startup banner regression and add coverage in TUI. | low | - |

Summary source for all rows: granite4.1:8b-q6_K@2026-05-19.
Full changelog
v2.3.0
127 commits since v2.2.0. The headline is the tracker-adapter family landing: 10 backlog-tracker adapters now ship under a single TrackerContract, plus webhook ingestion and a plugin hookspec for third-party tracker plugins. The orchestration loop also gained an issue-to-PR pipeline, a retry-with-continuation path for success-without-commit runs, and a multi-agent handoff message bus that piggybacks on tracker comments. The supporting workstreams (review-bot acknowledgement gate, signed lineage audit log, secrets broker, telemetry-grounded autofix, Playwright self-testing sandbox) close several long-standing reliability and security gaps.
Highlights
- Tracker-adapter family. 10 adapters land, all conforming to the single `TrackerContract` (Jira Cloud + DC, GitLab Issues, Linear, Plane, Asana, ServiceNow, ClickUp, GitHub Projects v2, plus webhook ingestion). Closes the gap operators have hit when integrating non-GitHub backlogs.
- Tracker plugin hookspec + registry + CLI. Third-party tracker integrations now plug in via the same pluggy spec the orchestrator uses internally (#1599).
- Issue -> plan-comment -> PR pipeline. New orchestration mode that walks a tracker issue through plan synthesis, plan-comment posting for human review, and PR creation in one path (#1600).
- Tracker comments as a multi-agent handoff bus. Worker agents now coordinate over tracker comments so a session can resume across CLI restarts and across operator machines (#1606).
- Review-bot acknowledgement gate. CodeRabbit and Sourcery findings classified as must-address now block merge until they are addressed in a fixup commit or acknowledged in the PR body with a structured marker. Nightly sweeper + reusable shepherd workflow template ship in the same PR (#1583).
- Lineage v2 - signed audit log of tracker state moves. Each tracker-side state transition is captured as a signed lineage entry, so operators can audit the full chain when a ticket loses or gains the wrong label (#1602).
- Playwright-based sandbox for UI/web agent runs. A new self-testing layer drives a Playwright context against the dev server, captures screenshots / console / network errors, and hands the structured result back to an LLM judge for verdict (#1603).
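
The notes do not describe the format of a signed lineage entry. As a rough illustration of how a chained, signed log of state moves can make tampering detectable, here is a minimal sketch that assumes HMAC-SHA256 chaining; the `GENESIS` value, the function names, and the entry fields are all hypothetical and are not taken from the bernstein codebase.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hypothetical starting value for the chain


def sign_entry(key: bytes, prev_sig: str, entry: dict) -> str:
    """Sign an entry together with the signature of the previous record."""
    payload = json.dumps(entry, sort_keys=True).encode()
    return hmac.new(key, prev_sig.encode() + payload, hashlib.sha256).hexdigest()


def append(log: list, key: bytes, entry: dict) -> None:
    """Append a state move, chaining it to the last signature in the log."""
    prev_sig = log[-1]["sig"] if log else GENESIS
    log.append({"entry": entry, "sig": sign_entry(key, prev_sig, entry)})


def verify(log: list, key: bytes) -> bool:
    """Recompute every signature in order; any edited entry breaks the chain."""
    prev_sig = GENESIS
    for record in log:
        expected = sign_entry(key, prev_sig, record["entry"])
        if not hmac.compare_digest(expected, record["sig"]):
            return False
        prev_sig = record["sig"]
    return True


key = b"demo-key"  # illustration only; a real key would come from a secret store
log = []
append(log, key, {"ticket": "T-1", "from": "todo", "to": "in_progress"})
append(log, key, {"ticket": "T-1", "from": "in_progress", "to": "done"})
intact = verify(log, key)

log[0]["entry"]["to"] = "done"  # simulate someone rewriting history
tampered_ok = verify(log, key)
```

Because each signature covers the previous one, editing an early entry invalidates it and every record after it, which is the property that makes the full chain auditable.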
New features
| Area | Change |
|---|---|
| trackers | 10 adapters land under TrackerContract (Asana, ClickUp, GitHub Projects v2, GitLab Issues, Jira Cloud, Jira DC, Linear, Plane, ServiceNow, plus webhook ingestion) (#1560, #1570-#1577, #1601) |
| plugins | Tracker plugin hookspec + registry + bernstein trackers CLI (#1599) |
| orchestration | Issue -> plan-comment -> PR pipeline (#1600), tracker comments as handoff bus (#1606), multi-tracker federation layer (#1561), retry-with-continuation on success-without-commit (#1596) |
| security | Secrets broker for short-lived per-task tokens (#1605) |
| reliability | Progress-watch liveness probe via session-log growth (#1597) |
| sandbox | Playwright self-testing for UI/web agent runs (#1603) |
| lineage | Signed audit log of tracker state moves (#1602), content-addressed trace store + viewer (#1564), per-ticket transcript bundle (#1562) |
| devops | Scheduled upstream-signal sweep with operator rollup (#1594) |
| fleet | Directory-based instance registry for multi-instance hosts (#1592) |
| eval | YAML eval harness (#1565) |
| autofix | Telemetry-grounded autofix MVP (#1566) |
| memory | Long-running session memory (#1559) |
| observability | Run-failure classification with structured tracker writeback (#1569) |
| git | Stacked branches + per-snapshot undo (#1563) |
| quality | Review-bot acknowledgement gate + nightly sweeper + reusable shepherd template (#1583) |
| cost | Hard per-ticket cost cap with clean termination and tracker writeback (#1578) |
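
The reliability row mentions a liveness probe driven by session-log growth (#1597), but the notes do not say how it is implemented. The sketch below shows one way to build such a probe: a session counts as alive while its log file keeps growing within a stall window. `ProgressWatch`, `stall_after`, and the explicit `now` argument are hypothetical names chosen for illustration.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class ProgressWatch:
    path: str
    stall_after: float      # seconds without log growth before declaring a stall
    last_size: int = -1     # -1 means "never polled"
    last_growth: float = 0.0

    def poll(self, now: float) -> bool:
        """Return True while the session log has grown recently enough."""
        size = os.path.getsize(self.path) if os.path.exists(self.path) else 0
        if size > self.last_size:
            self.last_size = size
            self.last_growth = now
        return (now - self.last_growth) < self.stall_after


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "session.log")
    watch = ProgressWatch(log_path, stall_after=30.0)

    with open(log_path, "w") as fh:
        fh.write("step 1\n")
    alive_at_0 = watch.poll(now=0.0)    # first growth observed
    alive_at_20 = watch.poll(now=20.0)  # no growth, still inside the window

    with open(log_path, "a") as fh:
        fh.write("step 2\n")
    alive_at_40 = watch.poll(now=40.0)  # growth resets the stall timer
    alive_at_80 = watch.poll(now=80.0)  # 40 s without growth: stalled
```

Passing `now` explicitly keeps the sketch deterministic; a real probe would more likely read a monotonic clock.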
Fixes
- fix(adapters): refresh aider contract for the upstream `--yes` -> `--yes-always` rename; contract checker now distinguishes a broken upstream `--help` from real drift; CI workflow treats the new runtime-failure exit code as a warning rather than a hard fail (#1595).
- fix(security): dispatch audit events outside the broker lock; index tokens by value (#1607). Split scorecard job so SARIF upload completes (#1613). Mask credentials in logger calls (#1519). Replace `subprocess` `shell=True` with list-form args (#1513). Close urllib / SHA1 / Trivy alerts (#1518).
- fix(orchestration): tracker_pipeline review follow-ups (#1609); commit-completion module review-bot follow-ups (#1608).
- fix(sandbox): Playwright runner review follow-ups, including `asyncio.CancelledError` propagation through broad `except` handlers and unsafe-`task_id` rejection (#1610).
- fix(tui): restore startup banner regression + add coverage (#1568).
- fix(ci): lock aider adapter-integration job to Python 3.13 (#1586); honour SARIF suppressions before Code Scanning upload (#1520); emit CI gate for paths-ignored-only PRs (#1521); restore minimum-required write permissions broken by security hardening (#1481).
- fix(review): apply deferred review-bot findings batch (#1584).
- fix(quality): bulk refurb auto-fix wave 1 across `src/` (#1558).
- fix(test): repair main-red after refurb auto-fix removed `str()` in `_run_git` (#1591).
- fix(docs): sync agents-md module map for the devops sub-package (#1612).
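
The sandbox fix concerns `asyncio.CancelledError` being swallowed by broad `except` handlers. The actual Playwright runner code is not shown in these notes; the following is a generic sketch of the pattern, with hypothetical function names, in which cancellation is handled first and re-raised so the caller still sees it.

```python
import asyncio


async def run_step(log: list) -> None:
    try:
        await asyncio.sleep(10)  # stands in for a long-running browser step
    except asyncio.CancelledError:
        log.append("cleanup")
        raise  # re-raise so the task is actually marked as cancelled
    except BaseException as exc:  # broad handler; would swallow cancellation if listed alone
        log.append(f"error: {exc!r}")


async def main() -> list:
    log = []
    task = asyncio.create_task(run_step(log))
    await asyncio.sleep(0)  # yield once so the task starts running
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        log.append("cancelled")
    return log


result = asyncio.run(main())
```

Since Python 3.8, `asyncio.CancelledError` derives from `BaseException`, so `except Exception` no longer catches it, but bare `except:` and `except BaseException` still do and need the explicit re-raise.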
Internal / quality
- Bulk refurb auto-fix wave 2. FURB113 (repeated `append` -> `list.extend`, 259 sites), FURB107 (`try/except: pass` -> `contextlib.suppress`, 267 sites), FURB173 (dict spread -> `|` merge, 178 sites), FURB108 (chained `==` -> `in {...}`) - landed via libcst rewriter + ruff autofix (#1582).
- Bulk refurb auto-fix wave 1. Initial refurb sweep across `src/` (#1558).
- CI dependency churn. `actions/checkout` v4 -> v6 (#1598), `actions/upload-artifact` v4 -> v7 (#1611), python pin to <=3.13 until adapter 3.14 compat is confirmed (#1590), aider adapter-integration job locked to Python 3.13 (#1586).
- Adapter contract check. Truncated upstream `--help` output is no longer reported as N missing flags; surfaces on a dedicated `runtime_failure` field that the workflow treats as a warning rather than drift (part of #1595).
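
For readers unfamiliar with the four refurb rewrites named in wave 2, the snippet below shows what each looks like after the fix, with the earlier form in a comment. The variable names and values are made up for illustration and do not come from the bernstein source.

```python
import contextlib

# FURB113: repeated append calls become a single list.extend
names = []
names.extend(("alpha", "beta"))  # was: names.append("alpha"); names.append("beta")

# FURB107: try/except: pass becomes contextlib.suppress
port = 8080
with contextlib.suppress(ValueError):  # was: try: ... except ValueError: pass
    port = int("not-a-number")         # raises, so port keeps its default

# FURB173: dict spread becomes the | merge operator (Python 3.9+)
defaults = {"retries": 3, "timeout": 30}
overrides = {"timeout": 60}
merged = defaults | overrides  # was: {**defaults, **overrides}

# FURB108: chained equality checks become a membership test
status = "done"
finished = status in {"done", "failed"}  # was: status == "done" or status == "failed"
```

All four rewrites are intended to preserve behaviour, which is why they suit bulk application, although the `str()` regression fixed in #1591 shows that automated sweeps still need a test run afterwards.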
Upgrade notes
- No manual operator action required. `pip install --upgrade bernstein` (or `uv pip install --upgrade bernstein`) brings v2.3.0 in.
- Operators integrating with non-GitHub backlogs can now register their tracker via the new plugin hookspec (`bernstein trackers --help` for the CLI surface).
- The new review-bot acknowledgement gate runs on every PR. Must-address findings need either a fixup commit (`bot-ack: <id>` in the commit message) or a PR-body marker (`<!-- bot-ack: <id> reason=... -->`).
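
To make the two acknowledgement forms concrete, here is a small sketch of how a gate might collect acknowledged finding ids from a PR body and commit messages. Only the marker shapes come from the notes above. The regular expressions, the function names, the sample ids, and the assumptions that ids contain no whitespace and that the commit marker sits on its own line are all hypothetical.

```python
import re

# Assumed shape: <!-- bot-ack: <id> reason=<free text> -->
PR_MARKER = re.compile(
    r"<!--\s*bot-ack:\s*(?P<id>\S+)\s+reason=(?P<reason>.*?)\s*-->", re.DOTALL
)
# Assumed shape: a commit-message line of the form "bot-ack: <id>"
COMMIT_MARKER = re.compile(r"^bot-ack:\s*(?P<id>\S+)\s*$", re.MULTILINE)


def acknowledged_ids(pr_body: str, commit_messages: list) -> set:
    """Collect every finding id acknowledged in the PR body or in a commit."""
    ids = {m.group("id") for m in PR_MARKER.finditer(pr_body)}
    for message in commit_messages:
        ids.update(m.group("id") for m in COMMIT_MARKER.finditer(message))
    return ids


def unacknowledged(must_address: set, pr_body: str, commit_messages: list) -> set:
    """Return the must-address findings that would still block merge."""
    return must_address - acknowledged_ids(pr_body, commit_messages)


pr_body = (
    "Fixes flaky test.\n"
    "<!-- bot-ack: CR-12 reason=false positive, value is validated upstream -->"
)
commits = ["fixup: handle None\n\nbot-ack: SRC-7"]
remaining = unacknowledged({"CR-12", "SRC-7", "CR-99"}, pr_body, commits)
```

In this example `CR-99` has neither a fixup commit nor a PR-body marker, so it is the only finding left blocking the merge.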
Security Fixes
- Dispatch audit events outside broker lock; index tokens by value (#1607)
- Mask credentials in logger calls (#1519)
- Replace subprocess shell=True with list-form args to avoid injection (#1513)
- Close urllib / SHA1 / Trivy alerts (#1518)
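
Two of these fixes follow well-known patterns that are easy to show in isolation. The sketch below is not the project's code. It assumes a hypothetical `run_echo` helper to show list-form `subprocess` arguments, and a hypothetical `MaskingFilter` with an illustrative regex to show credential masking in log records.

```python
import logging
import re
import subprocess
import sys


def run_echo(user_input: str) -> str:
    """Pass untrusted text as one argv entry; no shell ever parses it."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


# Illustrative pattern only; real secret formats vary.
SECRET_PATTERN = re.compile(r"(?i)\b(token|password|api[_-]?key)=([^\s&]+)")


def mask_credentials(message: str) -> str:
    """Replace the value of known credential keys with ***."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=***", message)


class MaskingFilter(logging.Filter):
    """Mask credentials in the fully formatted message before it is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = mask_credentials(record.getMessage())
        record.args = ()  # message is already formatted
        return True


echoed = run_echo("hello; echo injected")
masked = mask_credentials("GET /hook?token=abc123&user=bob")
```

With `shell=True` the `;` in the input would start a second command. In list form it is just part of one argument, so the string comes back unchanged.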
About chernistry/bernstein
Deterministic multi-agent orchestrator for 18 CLI coding agents (Claude Code, Codex, Cursor, Aider, Gemini CLI, OpenAI Agents SDK, and more). MCP server mode (stdio + HTTP/SSE) exposes the orchestrator to any MCP client. Git worktree isolation per agent, HMAC-chained audit trail, cost-aware model routing via contextual bandit. ~11K monthly PyPI downloads, Apache 2.0.