hidai25/eval-view

v0.5.3 Breaking

This release is flagged as breaking. Platform teams should review the changes below before upgrading.

Published 4mo Developer Productivity

✓ No known CVEs patched in this version

Topics

agent-benchmark agent-evaluation agentic-ai ai-agents anthropic autogen cli crewai evaluation langchain-agent langgraph llm mcp openai-assistants pytest python regression-testing testing

Summary

AI summary

HTML report UI redesign introduces compact KPI strips, adaptive trace collapsing, collapsible diffs, and larger diagrams.

Full changelog

HTML Report Redesign

Overview tab — Compact KPI strip replaces 6-card hero grid. Removed duplicate Agent Model/Token Usage cards and Distribution donut. Full-width score chart. No-judge notice when hallucination/safety checks are skipped.

Execution Trace tab — Adaptive collapse: with 4 or fewer tests, all are expanded; with 5 or more, only the first is expanded. Larger chevron buttons.
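
The adaptive collapse rule can be sketched in a few lines of Python. This is an illustration of the behavior described in the changelog, not eval-view's actual code: the function name `initially_expanded` and the `threshold` parameter are hypothetical.

```python
from typing import List


def initially_expanded(num_tests: int, threshold: int = 4) -> List[bool]:
    """Return one flag per test: True if its trace starts expanded.

    Hypothetical helper mirroring the changelog rule: up to `threshold`
    tests are all expanded; beyond that, only the first one is.
    """
    if num_tests <= threshold:
        return [True] * num_tests
    # 5+ tests: expand only the first trace, collapse the rest.
    return [i == 0 for i in range(num_tests)]


if __name__ == "__main__":
    print(initially_expanded(3))
    print(initially_expanded(6))
```

Keeping the threshold as a parameter makes the cutoff easy to adjust if a report with many short traces reads better fully expanded.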

Diffs tab — Collapsible items (passed collapsed, changes expanded). Removed duplicate tool tags. Lazy-rendered trajectory diagrams behind toggle. Baseline→current score display (86.0 → 87.5 +1.5). Tooltips on lexical/semantic similarity.
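
As a rough sketch of the baseline→current score display, the string in the example above can be produced with a signed format specifier. The helper name `format_score_delta` is an assumption made for illustration; the changelog does not say how eval-view builds this string.

```python
def format_score_delta(baseline: float, current: float) -> str:
    """Format a score change in the style 'baseline → current ±delta'.

    Hypothetical helper; the '+' flag in the format spec forces an
    explicit sign on the delta, so regressions show as negative.
    """
    delta = current - baseline
    return f"{baseline:.1f} → {current:.1f} {delta:+.1f}"


if __name__ == "__main__":
    print(format_score_delta(86.0, 87.5))
```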

Timeline tab — KPI summary strip. Side-by-side latency + cost charts. Color-coded bars by test.

All tabs — Larger Mermaid diagram fonts. Removed SVG max-height cap.


About hidai25/eval-view

Regression testing framework for AI agents. Save golden baselines, detect behavioral drift, and block regressions in CI. Works with LangGraph, CrewAI, OpenAI, Claude, and any HTTP API.
