This release adds 3 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
All CLI flags are now exposed via MCP tools, and two new tools, compare_agents and replay, have been added.
Full changelog
What's new
- Full MCP feature parity — all CLI flags now exposed via MCP tools (heal, strict, statistical, budget, tags, variants, and more)
- New MCP tools: compare_agents (A/B test two endpoints) and replay (trajectory diff viewer)
- 33 MCP regression tests covering protocol, schema contracts, flag wiring, routing, and timeouts
Fixes
- Stable JSON response contract on run_check regardless of flags
- --report no longer opens a browser from the MCP server
- Replay timeout increased to 120s
- Subprocess calls use stdin=DEVNULL to prevent hangs
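
The stdin=DEVNULL fix can be illustrated with Python's standard subprocess module. This is a minimal sketch and not evalview's actual code: the helper name run_child and its timeout default are assumptions made for illustration. It shows why detaching stdin helps. A child process that tries to read input receives end-of-file immediately instead of waiting on, or consuming, the parent's input stream, which matters for a server that presumably talks to its client over stdio.

```python
import subprocess
import sys


def run_child(args, timeout=120):
    """Run a child process with stdin detached (hypothetical helper, not evalview's API)."""
    completed = subprocess.run(
        args,
        stdin=subprocess.DEVNULL,  # child reads EOF at once instead of inheriting the parent's stdin
        capture_output=True,       # collect stdout/stderr rather than writing to the parent's streams
        text=True,
        timeout=timeout,           # raises subprocess.TimeoutExpired if the child runs too long
    )
    return completed.returncode, completed.stdout


# The child tries to read all of stdin; with DEVNULL it gets an empty string and exits.
code, out = run_child(
    [sys.executable, "-c", "import sys; print(repr(sys.stdin.read()))"]
)
print(code, out.strip())
```

Without stdin=subprocess.DEVNULL, the same child would inherit the parent's stdin and could block until that stream is closed.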
Install / Upgrade
pip install --upgrade evalview
Full changelog: https://github.com/hidai25/eval-view/blob/main/CHANGELOG.md
About hidai25/eval-view
Regression testing framework for AI agents. Save golden baselines, detect behavioral drift, and block regressions in CI. Works with LangGraph, CrewAI, OpenAI, Claude, and any HTTP API.