Forge

v0.7.4 Breaking

This release includes 1 breaking change for platform teams planning a safe upgrade.

Published 1mo AI Agents & Assistants

View tool

✓ No known CVEs patched

Read the diff → Tool health → What is this tool? →

✓ No known CVEs patched in this version

Topics

agentic-ai agentic-workflow agents function-calling llama-cpp llamafile

+5 more

llm ollama python self-hosted tool-calling

ReleasePort's take

Light signal

editorial:auto 1mo

ReleasePort v0.7.4 introduces the "--max-tool-errors" flag to limit consecutive tool‑argument errors per request and fixes several parser crashes by routing malformed arguments to the tool‑error channel.

Why it matters: The new --max-tool-errors flag (default 2) caps error bursts, preventing runaway retries; bugfixes avert crashes when non‑object arguments are encountered, ensuring stable proxy and guardrails operation.

Summary

AI summary

Malformed tool-call arguments are corrected via the tool‑error channel, adding a --max-tool-errors flag.

Changes in this release

Type	Severity	Summary	CVE
Breaking	High	Deprecates pydantic `.model_` API on `ToolCall` and `TextResponse` dataclasses; construction no longer validates argument shape. Deprecates pydantic `.model_` API on `ToolCall` and `TextResponse` dataclasses; construction no longer validates argument shape. Source: llm_adapter@2026-06-03 Confidence: high	—
Feature
Feature	Low	Adds proxy option "--max-tool-errors" with default 2 to bound consecutive tool-argument errors per request. Adds proxy option "--max-tool-errors" with default 2 to bound consecutive tool-argument errors per request. Source: llm_adapter@2026-06-03 Confidence: high	—
Feature	Low	Adds 32GB model tier (Mistral‑Small‑3.2, Qwen3.5/3.6, Nemotron‑3 Nano) to eval suite and dashboard. Adds 32GB model tier (Mistral‑Small‑3.2, Qwen3.5/3.6, Nemotron‑3 Nano) to eval suite and dashboard. Source: llm_adapter@2026-06-03 Confidence: high	—
Feature	Low	Adds evaluation‑generation tracking in the dashboard, deduping to newest generation per config. Adds evaluation‑generation tracking in the dashboard, deduping to newest generation per config. Source: llm_adapter@2026-06-03 Confidence: high	—
Bugfix
Bugfix	Medium	Non‑object tool arguments no longer crash the parser; they are routed to the tool-error channel. Non‑object tool arguments no longer crash the parser; they are routed to the tool-error channel. Source: llm_adapter@2026-06-03 Confidence: high	—
Bugfix	Medium	`Guardrails.check()` now supports `action="tool_error"` for tool‑call faults. `Guardrails.check()` now supports `action="tool_error"` for tool‑call faults. Source: llm_adapter@2026-06-03 Confidence: low	—
Bugfix	Medium	Malformed tool-call arguments now self‑correct via the tool-error channel instead of triggering a retry nudge. Malformed tool-call arguments now self‑correct via the tool-error channel instead of triggering a retry nudge. Source: llm_adapter@2026-06-03 Confidence: low	—
Bugfix	Low	`Guardrails.check()` gains `action="tool_error"` for unknown tool or malformed argument faults. `Guardrails.check()` gains `action="tool_error"` for unknown tool or malformed argument faults. Source: granite4.1:30b@2026-06-03-audit Confidence: low	—

Full changelog

[0.7.4] — 2026-06-03

Malformed tool-call arguments now self-correct on the tool-error channel, and the eval suite gains its first model-size upgrade — a 32GB tier (Qwen3.5 / 3.6 27–35B, Nemotron-3 Nano, Mistral-Small-3.2) surfaced in the dashboard alongside the existing 8–14B lineup.

Added

Proxy --max-tool-errors (default 2) — bounds consecutive tool-argument errors per request, mirroring the WorkflowRunner budget. Threaded through ProxyServer and the HTTP handler.
32GB model tier in the published eval and dashboard: Mistral-Small-3.2 24B, Qwen3.5 27B / 35B-A3B, Qwen3.6 27B / 35B-A3B, Nemotron-3 Nano 30B-A3B (moved Unpublished → Current in the Model Registry).
Eval-generation tracking in the dashboard. Results gathered against different code states fold into a single view, deduped to the newest generation per config. Runs not yet re-swept (e.g. the Anthropic ablation) are carried forward and superscript-badged with a commit/date legend; Retired-tier models are carried forward but hidden behind a Show retired toggle.

Changed

Malformed tool-call arguments ride the tool-error channel. A model that emits a structurally valid call whose arguments are unparseable or not an object is now corrected via a tool-error result (role="tool", anchored to its tool_call_id) draining max_tool_errors, uniformly across all OpenAI-shape clients and all three integration modes (WorkflowRunner, proxy, Guardrails facade). This supersedes 0.7.3's "malformed args drive a retry nudge" behavior. The change is a native-mode conditioning bet — a small model plausibly self-corrects better on the channel it was pretrained on than via a trailing user nudge; in prompt mode the tool role is downgraded to a user message, so behavior there is unchanged. See ADR-016.
Guardrails.check() gains action="tool_error" for tool-call faults (unknown tool, malformed args) so middleware loops account for them on the tool channel. No consumers depended on the prior action vocabulary.
ToolCall / TextResponse are now plain dataclasses (args: Any); arg-shape validation moved to ResponseValidator. Attribute access and keyword construction are unchanged — but the pydantic .model_* API on these two exported types is gone, and construction no longer raises on a non-dict args. Only affects callers that serialized these objects via pydantic or relied on construction-time validation.

Fixed

Non-object tool args no longer crash the parser. Previously arguments decoding to a list / scalar / null raised at ToolCall construction; it is now caught at validation and routed to the tool-error channel. StepTracker.check_prerequisites additionally guards against a non-dict args reaching a direct dispatch.

Breaking Changes

.model_* API on `ToolCall` and `TextResponse` dataclasses removed, validation moved to `ResponseValidator`

View diff on GitHub

Weekly OSS security release digest.

The CVE patches and breaking changes that affected production tools this week. One email, every Sunday.

No spam, unsubscribe anytime.

Share this release

Share on X Share on Bluesky

Track Forge

Get notified when new releases ship.

About Forge

All releases →

Related context

Related tools

Earlier breaking changes

v0.7.5 Changes default behavior to replay no reasoning blocks.
v0.7.3 Renames `--mode {native,prompt}` to `--backend-capability {native,prompt}`; no deprecation alias.
v0.7.0 Unknown‑tool handling now replies with [UnknownToolError] on the tool channel instead of user nudges.
v0.7.0 Changes error reporting: step enforcement and prerequisite violations now emit tool‑channel messages with [StepEnforcementError] / [PrereqError].