This release adds 5 features and 4 bug fixes, plus a repository restructure, for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
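AI summary: A mixed release that adds Frontend/UI and App-development task categories (content-patch band 15/25/40, page-build band 40/60/90), a `3-round` review mode, new METR threshold entries, and opt-in audit logging, alongside routing, alias, and documentation fixes.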
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Feature | Medium | Adds Frontend/UI task category with separate content-patch and page-build bands. (Source: llm_adapter@2026-05-25; confidence: high) | — |
| Feature | Medium | Adds App-development task category with generic cold L-style prior and UI human-comparison multiplier. (Source: llm_adapter@2026-05-25; confidence: high) | — |
| Feature | Medium | Adds `3-round` review mode with a 35-minute additive review tier. (Source: llm_adapter@2026-05-25; confidence: high) | — |
| Feature | Medium | Adds METR threshold entries for Opus 4.7 and GPT-5.5; retains `opus_4_x` alias. (Source: llm_adapter@2026-05-25; confidence: high) | — |
| Feature | Medium | Adds opt-in structured audit logging via `AGENT_ESTIMATE_AUDIT_*` environment variables. (Source: llm_adapter@2026-05-25; confidence: high) | — |
| Bugfix | Medium | Routes research-grounded brainstorms to the research band instead of the flat brainstorm band. (Source: llm_adapter@2026-05-25; confidence: high) | — |
| Bugfix | Medium | Refreshes Claude `/estimate` skill to v0.7.0 parity with Codex slice (frontend/app_dev types, `3-round` review mode, refreshed METR keys). (Source: llm_adapter@2026-05-25; confidence: high) | — |
| Bugfix | Medium | Corrects Codex skill install path in `skills/estimate/README.md` to `.codex/skills/...`. (Source: llm_adapter@2026-05-25; confidence: low) | — |
| Bugfix | Medium | Updates Codex model-key alias to resolve to GPT-5.5 METR threshold; retains GPT-5.4 availability. (Source: llm_adapter@2026-05-25; confidence: low) | — |
| Refactor | Medium | Updates repository structure: adds Makefile, scripts/preflight.py, and multi-runtime `skills/estimate/` layout. (Source: llm_adapter@2026-05-25; confidence: low) | — |
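
The audit-logging entry is described in the changelog as emitting secret-scrubbed JSON events. As a rough illustration of that idea only, the Python sketch below redacts values whose keys look sensitive before serializing an event. Everything in it is an assumption: `scrub`, `emit_audit_event`, the `[REDACTED]` marker, and the key pattern are hypothetical names, not agent-estimate's API. Because the release notes do not list the individual `AGENT_ESTIMATE_AUDIT_*` variables, the sink is passed as a parameter rather than read from the environment.

```python
import json
import re
import sys
from typing import Any, TextIO

# Hypothetical rule: any key containing one of these words is treated as a secret.
SECRET_KEY_PATTERN = re.compile(r"(token|secret|password|api_key)", re.IGNORECASE)


def scrub(value: Any) -> Any:
    """Recursively replace values stored under secret-looking keys."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if SECRET_KEY_PATTERN.search(str(key)) else scrub(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [scrub(item) for item in value]
    return value


def emit_audit_event(event: dict, sink: TextIO = sys.stderr) -> str:
    """Write one scrubbed event as a single JSON line and return that line."""
    line = json.dumps(scrub(event), sort_keys=True)
    sink.write(line + "\n")
    return line
```

Writing one JSON object per line keeps the output easy to consume whether the sink is stdout, stderr, or a file opened by the caller.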
Full changelog
What's New in v0.7.0
Added
- Frontend/UI task category with separate content-patch (15/25/40) and page-build (40/60/90) bands.
- App-development task category with a generic cold L-style prior and app/UI human-comparison multiplier.
- `3-round` review mode with a 35-minute additive review tier.
- METR threshold entries for Opus 4.7 (current) and GPT-5.5; `opus_4_x` retained as a forward-compatible alias.
- Opt-in structured audit logging via `AGENT_ESTIMATE_AUDIT_*` environment variables, emitting secret-scrubbed JSON events to stdout, stderr, or a file.
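
To make the band figures concrete, here is a minimal Python sketch of how a three-value band and the additive review tier could combine. It assumes the triples 15/25/40 and 40/60/90 are optimistic / most likely / pessimistic minutes and that they are collapsed with the classic PERT weighted mean; the release notes state neither, and `Band` and `estimate_minutes` are hypothetical names rather than agent-estimate's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    # Assumption: the three figures are optimistic / most likely / pessimistic minutes.
    optimistic: float
    most_likely: float
    pessimistic: float

    def pert_mean(self) -> float:
        """Classic PERT weighted mean: (O + 4M + P) / 6."""
        return (self.optimistic + 4 * self.most_likely + self.pessimistic) / 6


# Band values copied from the release notes.
CONTENT_PATCH = Band(15, 25, 40)
PAGE_BUILD = Band(40, 60, 90)

# The 3-round review tier is described as additive, so it is added after the mean.
THREE_ROUND_REVIEW_MINUTES = 35


def estimate_minutes(band: Band, three_round_review: bool = False) -> float:
    """Hypothetical helper: PERT mean plus the review tier when enabled."""
    review = THREE_ROUND_REVIEW_MINUTES if three_round_review else 0
    return band.pert_mean() + review
```

Under these assumptions a content patch comes out near 25.8 minutes, and a page build with `3-round` review near 96.7 minutes.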
Changed
- Research-grounded brainstorms now route to the research band instead of the flat brainstorm band.
- Codex model-key alias now resolves to the GPT-5.5 METR threshold; GPT-5.4 remains available.
- Corrected the Codex skill install path in `skills/estimate/README.md` to `.codex/skills/...`.
- Version bumped to v0.7.0 across package, plugin, action, issue template, and tests.
- Claude runtime `/estimate` skill refreshed to v0.7.0 parity with the Codex slice (frontend/app_dev types, `3-round` review mode, refreshed METR keys).
- `claude`/`claude_opus` model-key aliases now resolve to `opus_4_7` (Opus 4.7); `opus_4_6` retained for backward compatibility.
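
The Claude alias change can be pictured as a small lookup table. The sketch below is an assumption about shape only: the key names `claude`, `claude_opus`, `opus_4_7`, and `opus_4_6` come from the notes above, while `MODEL_KEY_ALIASES`, `KNOWN_MODEL_KEYS`, and `resolve_model_key` are hypothetical and may not match how agent-estimate stores its METR keys.

```python
# Aliases as described in the release notes; the dict layout is hypothetical.
MODEL_KEY_ALIASES = {
    "claude": "opus_4_7",
    "claude_opus": "opus_4_7",
}

# Canonical keys named in the notes; opus_4_6 stays for backward compatibility.
KNOWN_MODEL_KEYS = {"opus_4_7", "opus_4_6"}


def resolve_model_key(key: str) -> str:
    """Map an alias to its canonical key, passing canonical keys through unchanged."""
    canonical = MODEL_KEY_ALIASES.get(key, key)
    if canonical not in KNOWN_MODEL_KEYS:
        raise KeyError(f"unknown model key: {key!r}")
    return canonical
```

With this layout, configurations pinned to `opus_4_6` keep resolving to that key, while the unversioned `claude` aliases follow the newer entry.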
This release also brings the repository structure up to date: a Makefile for dev shortcuts, a scripts/preflight.py pre-PR check, and the multi-runtime skills/estimate/ layout (shared spec + per-runtime Claude/Codex slices).
About Agent-estimate