hidai25/eval-view

v0.6.1 Feature

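This release adds three notable items for engineering teams evaluating rollout: full MCP parity with the CLI flags, two new MCP tools (compare_agents and replay), and 33 MCP regression tests.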

✓ No known CVEs patched

Topics

agent-benchmark agent-evaluation agentic-ai ai-agents anthropic autogen cli crewai evaluation langchain-agent langgraph llm mcp openai-assistants pytest python regression-testing testing

Summary

AI summary

All CLI flags are now exposed via MCP tools, and two new MCP tools are added: compare_agents and replay.

Full changelog

What's new

  • Full MCP feature parity — all CLI flags now exposed via MCP tools (heal, strict, statistical, budget, tags, variants, and more)
  • New MCP tools: compare_agents (A/B test two endpoints) and replay (trajectory diff viewer)
  • 33 MCP regression tests — protocol, schema contracts, flag wiring, routing, timeouts
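To make the idea of a trajectory diff concrete, here is a minimal sketch using Python's standard `difflib`. It is not the replay tool's implementation, which these notes do not describe. The function name `diff_trajectories` and the tool-call names are hypothetical, and a trajectory is assumed to be an ordered list of tool-call names.

```python
import difflib


def diff_trajectories(baseline, current):
    """Return unified-diff lines between two ordered tool-call sequences."""
    return list(
        difflib.unified_diff(
            baseline,
            current,
            fromfile="baseline",
            tofile="current",
            lineterm="",
        )
    )


# Hypothetical trajectories: tool-call names recorded from two agent runs.
baseline = ["search_docs", "summarize", "answer"]
current = ["search_docs", "web_search", "summarize", "answer"]

# Lines prefixed with "+" or "-" mark steps added or removed relative to the baseline.
for line in diff_trajectories(baseline, current):
    print(line)
```

An empty result means the two runs took the same sequence of steps, which is the simplest form of a "no drift" check.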

Fixes

  • Stable JSON response contract on run_check regardless of flags
  • --report no longer opens browser from MCP server
  • Replay timeout increased to 120s
  • Subprocess calls use stdin=DEVNULL to prevent hangs
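The last two fixes can be illustrated with a small sketch built on the standard `subprocess` module. This is an assumption about the general technique, not the project's actual code. The helper name `run_child` is hypothetical, and the 120-second default only mirrors the replay timeout mentioned above.

```python
import subprocess
import sys


def run_child(args, timeout_s=120):
    """Run a child process that cannot block waiting on inherited stdin."""
    return subprocess.run(
        args,
        stdin=subprocess.DEVNULL,  # child reads EOF immediately instead of waiting
        capture_output=True,
        text=True,
        timeout=timeout_s,  # raises subprocess.TimeoutExpired if exceeded
    )


# A child that tries to read all of stdin: with DEVNULL it returns at once.
result = run_child(
    [sys.executable, "-c", "import sys; print(repr(sys.stdin.read()))"]
)
print(result.stdout.strip())
```

This matters for a server that speaks a protocol over its own stdin. If a child inherits that stream, it can either wait forever for input or consume bytes meant for the server, so giving it `DEVNULL` avoids both problems.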

Install / Upgrade

pip install --upgrade evalview
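After upgrading, one way to confirm which version is installed is to query the package metadata from Python. This sketch assumes the distribution name is `evalview`, as in the pip command above. The helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the installed version, or None if evalview is not installed.
print(installed_version("evalview"))
```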

Full changelog: https://github.com/hidai25/eval-view/blob/main/CHANGELOG.md


About hidai25/eval-view

Regression testing framework for AI agents. Save golden baselines, detect behavioral drift, and block regressions in CI. Works with LangGraph, CrewAI, OpenAI, Claude, and any HTTP API.
