Agenda Intel MD

v0.8.1 Breaking

This release includes 2 breaking changes that platform teams should review before upgrading.

Published 14d ago · MCP Developer Tools
✓ No known CVEs patched

Topics

a2a a2a-protocol agenda-intelligence agent-infrastructure agentic-workflow ai-agents cloudflare-workers deal-risk-gate decision-grade geopolitical-intelligence geopolitical-risk json-schema mcp mcp-server model-context-protocol policy-analysis regulatory-compliance risk-analysis sanctions strategic-intelligence

Affected surfaces

auth

Summary

AI summary

audit.validation_score is now machine-verified and overwritten by the server; the schema adds optional machine_verified and self_assessed_score fields.

Changes in this release

Feature Medium

audit.validation_score now machine-verified, not self-graded

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

system prompt includes explicit output-format block

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

README quickstart mentions [llm] extra for Anthropic API integration

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

schema adds optional audit.machine_verified and audit.self_assessed_score fields

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: low

Feature Low

audit.self_assessed_score field added to preserve LLM's original grade

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature Low

audit.machine_verified boolean flag added to indicate server verification

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature Low

OUTPUT FORMAT section forbids markdown fences and surrounding prose

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Feature Low

OUTPUT FORMAT lists required top‑level keys and provides JSON skeleton

Source: granite4.1:30b@2026-05-20-audit

Confidence: low

Bugfix Medium

analyze overwrites self-graded audit score with machine-verified value

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Refactor Medium

system_prompt ends with dedicated OUTPUT FORMAT — STRICT section

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: low

Other Medium

Added test_analyze_overrides_self_graded_audit_score to verify audit score rewriting

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: low

Full changelog

v0.8.1 — Honest audit + tighter system prompt

v0.8.1 closes three rough edges in the product shell shipped in v0.8.0, found during an end-to-end agent run.

Fixed — audit.validation_score is now machine-verified, not self-graded

Before v0.8.1, analyze returned the memo with the LLM's own audit.validation_score and audit.validation_details, so a model could write validation_score: 0.99 and the schema would accept it. From v0.8.1, the server overwrites both fields with values computed from six observable structural checks (schema valid, fact/assessment separation, unknowns acknowledged, modules match routing, watch_next present, evidence_mode within contract). The LLM's self-grade is preserved in a clearly labeled audit.self_assessed_score field for transparency, and audit.machine_verified: true makes the rewrite explicit. audit.provenance is substantive content (per-claim basis labels) and is preserved as the model wrote it.

The score remains structural only — it is not a claim about whether the analysis is factually correct.
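The override described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name apply_machine_audit, the snake_case check names (other than unknowns_acknowledged, which the test description mentions), the shape of validation_details, and the scoring rule (unweighted fraction of checks passed) are all assumptions. The release notes name the six checks but do not say how they are evaluated or weighted, so the sketch takes the check results as input.

```python
from copy import deepcopy

# Check names are snake_case renderings of the six checks listed in the
# release notes; how each one is evaluated is NOT shown there, so the
# caller supplies the results.
CHECKS = (
    "schema_valid",
    "fact_assessment_separation",
    "unknowns_acknowledged",
    "modules_match_routing",
    "watch_next_present",
    "evidence_mode_within_contract",
)


def apply_machine_audit(memo: dict, check_results: dict) -> dict:
    """Return a copy of memo whose audit block carries server-computed values."""
    missing = [name for name in CHECKS if name not in check_results]
    if missing:
        raise ValueError(f"missing check results: {missing}")

    result = deepcopy(memo)
    audit = result.setdefault("audit", {})

    # Keep the model's own grade visible instead of silently dropping it.
    if "validation_score" in audit:
        audit["self_assessed_score"] = audit["validation_score"]

    # Assumed scoring rule: unweighted fraction of checks that passed.
    passed = sum(1 for name in CHECKS if check_results[name])
    audit["validation_score"] = round(passed / len(CHECKS), 2)
    audit["validation_details"] = {name: bool(check_results[name]) for name in CHECKS}
    audit["machine_verified"] = True
    # audit["provenance"] is deliberately left untouched.
    return result


# Example: a self-graded 0.99 with one failed structural check.
memo = {
    "audit": {
        "validation_score": 0.99,
        "provenance": [{"claim": "c1", "basis": "reported"}],
    }
}
results = {name: True for name in CHECKS}
results["unknowns_acknowledged"] = False
out = apply_machine_audit(memo, results)
```

Working on a deep copy keeps the model's raw response available for logging or comparison, which seems in the spirit of the transparency goal, though the actual server may well mutate the memo in place.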

Improved — system prompt has an explicit output-format block

The assembled system_prompt now ends with a dedicated ===== OUTPUT FORMAT — STRICT ===== section that:

  • explicitly forbids markdown fences and surrounding prose,
  • lists the required top-level keys,
  • gives a compact valid skeleton the model can pattern-match against,
  • tells the model that audit.validation_score and validation_details are advisory and will be overwritten by the server.

This raises the chance that weaker host models return parseable JSON on the first attempt.
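As a rough illustration of the four points above, a block of this kind might be assembled as shown here. The function name, the example key names, and the wording of each instruction are hypothetical; only the section header text and the four behaviours are taken from the release notes.

```python
import json


def build_output_format_block(required_keys: list[str], skeleton: dict) -> str:
    """Assemble a strict output-format section to append to a system prompt."""
    lines = [
        # Header text as quoted in the release notes.
        "===== OUTPUT FORMAT — STRICT =====",
        "Return a single JSON object and nothing else.",
        "Do not wrap the JSON in markdown fences or add prose before or after it.",
        "Required top-level keys: " + ", ".join(required_keys),
        "Skeleton:",
        # Compact separators keep the skeleton short enough to pattern-match.
        json.dumps(skeleton, separators=(",", ":")),
        "audit.validation_score and audit.validation_details are advisory; "
        "the server will overwrite them.",
    ]
    return "\n".join(lines)


# Placeholder keys for illustration only; the real required keys are not
# listed in the release notes.
block = build_output_format_block(
    ["summary", "audit"],
    {"summary": "", "audit": {"validation_score": 0.0}},
)
```

Placing the block at the very end of the prompt, as the release notes describe, means the format rules are the last thing the model reads before it starts generating.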

Added — schema fields for machine-verified audit

agenda-memo.schema.json gains two optional audit properties: machine_verified (bool) and self_assessed_score (number, 0–1). Documentation on validation_score clarifies that it is structural only and, when machine_verified is true, was computed by the server.
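The two new properties could look something like the fragment below, written as a Python dict alongside a small hand-rolled checker. This is a sketch under assumptions: the real text of agenda-memo.schema.json is not shown in the release notes, the 0–1 range is assumed to be inclusive, and check_audit_additions is a hypothetical helper, not part of the package.

```python
# Hypothetical fragment: only the field names, types, and range come from
# the release notes. Both properties are optional.
AUDIT_ADDITIONS = {
    "machine_verified": {"type": "boolean"},
    "self_assessed_score": {"type": "number", "minimum": 0, "maximum": 1},
}


def check_audit_additions(audit: dict) -> list[str]:
    """Return a list of problems; an empty list means the new fields look valid."""
    problems = []
    if "machine_verified" in audit and not isinstance(audit["machine_verified"], bool):
        problems.append("machine_verified must be a boolean")
    if "self_assessed_score" in audit:
        score = audit["self_assessed_score"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(score, bool) or not isinstance(score, (int, float)):
            problems.append("self_assessed_score must be a number")
        elif not 0 <= score <= 1:
            problems.append("self_assessed_score must be between 0 and 1")
    return problems
```

Because both fields are optional, a memo whose audit block has neither field still passes this check.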

Added — README quickstart mentions the [llm] extra

The Quickstart now explains how to install with pip install "agenda-intelligence-md[llm]" and set ANTHROPIC_API_KEY to let analyze call the Anthropic API directly. Without the extra, the tool still returns a usable system_prompt for the host model to complete.
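A caller that wants to know in advance which of the two modes applies could check for both prerequisites, roughly as sketched below. The helper direct_call_available is hypothetical, and how analyze itself decides is not documented in these notes; the sketch only tests the two conditions the Quickstart names (the anthropic package being importable and ANTHROPIC_API_KEY being set).

```python
import importlib.util
import os


def direct_call_available(env=None, sdk_installed=None) -> bool:
    """Guess whether analyze could call the Anthropic API directly."""
    env = os.environ if env is None else env
    if sdk_installed is None:
        # The [llm] extra is what pulls in the anthropic package.
        sdk_installed = importlib.util.find_spec("anthropic") is not None
    # Both the SDK and a non-empty API key are needed for a direct call.
    return bool(sdk_installed) and bool(env.get("ANTHROPIC_API_KEY"))
```

Both arguments are optional so that the function can be exercised in tests without touching the real environment; called with no arguments, it inspects the current process.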

Tests

tests/test_product_shell.py adds test_analyze_overrides_self_graded_audit_score, which feeds the analyze pipeline a mocked LLM response with validation_score: 0.99 and missing unknowns, and asserts that the server rewrites the score downward, marks machine_verified: true, preserves self_assessed_score: 0.99, flags unknowns_acknowledged as failed, and keeps the provenance entries intact.

Unchanged

  • 16 MCP tools, request/memo schemas, geography routing, signal vendoring — all behave as in v0.8.0.
  • Live source retrieval is still not implemented.
  • No new hard dependencies; the anthropic SDK is still gated behind the [llm] extra.

Breaking Changes

  • `audit.validation_score` and `audit.validation_details` are now overwritten with server-computed values; the LLM's original score is preserved in `audit.self_assessed_score`
  • Schema updated: added optional `audit.machine_verified` (bool) and `audit.self_assessed_score` (number 0–1)

Related context

Earlier breaking changes

  • v0.8.0 MCP tool count increased from 11 to 16, adding five new tools.
