hidai25/eval-view

v0.5.0 Feature

This release adds 3 notable features for engineering teams evaluating rollout.

Published 4mo Developer Productivity

View tool

✓ No known CVEs patched

Read the diff → Tool health → What is this tool? →

✓ No known CVEs patched in this version

Topics

agent-benchmark agent-evaluation agentic-ai ai-agents anthropic autogen

+12 more

cli crewai evaluation langchain-agent langgraph llm mcp openai-assistants pytest python regression-testing testing

Summary

AI summary

Continuous regression detection with Slack alerts and JSONL history export is added to evalview monitor.

Full changelog

What's New

Production Monitoring (`evalview monitor`)

Continuous regression detection — runs evalview check in a loop with configurable interval (default: 5 min)
Slack alerts — webhook notifications on new regressions, recovery notifications when resolved
Smart dedup — only alerts on NEW failures, no re-alerts on persistent issues
JSONL history export — --history monitor.jsonl appends cycle data for trend analysis and dashboards
Graceful shutdown — Ctrl+C stops cleanly with cost summary
Config support — CLI flags, config.yaml, or EVALVIEW_SLACK_WEBHOOK env var

evalview monitor                                         # Check every 5 min
evalview monitor --interval 60                           # Every minute
evalview monitor --slack-webhook https://hooks.slack.com/services/...
evalview monitor --history monitor.jsonl                 # Save trends

Community Contributions

CSV export — evalview check --csv results.csv (@muhammadrashid4587)
Timeout flag — evalview check --timeout 60 (@zamadye)
Better errors — human-friendly connection failure messages (@passionworkeer)
JSONL history — --history flag for monitor (@clawtom)

Bug Fixes & Refactoring

Fixed severity comparison bug (was using string matching instead of enum comparison)
Fixed JSONL history pass count (was using fail_on filter instead of actual counts)
Extracted shared _parse_fail_statuses utility for consistent fail_on parsing
Eliminated redundant config loading in monitor loop

Deployment

# Quick background run
nohup evalview monitor --slack-webhook https://... &

# Docker
docker run -d -v $(pwd):/app -w /app evalview monitor --slack-webhook https://...

Full Changelog: https://github.com/hidai25/eval-view/compare/v0.4.1...v0.5.0

View diff on GitHub

Weekly OSS security release digest.

The CVE patches and breaking changes that affected production tools this week. One email, every Sunday.

No spam, unsubscribe anytime.

Share this release

Share on X Share on Bluesky

Track hidai25/eval-view

Get notified when new releases ship.

About hidai25/eval-view

Regression testing framework for AI agents. Save golden baselines, detect behavioral drift, and block regressions in CI. Works with LangGraph, CrewAI, OpenAI, Claude, and any HTTP API.

All releases →

hidai25/eval-view

Summary

What's New

Production Monitoring (`evalview monitor`)

Community Contributions

Bug Fixes & Refactoring

Deployment

Related context

Related tools

hidai25/eval-view

Summary

What's New

Production Monitoring (evalview monitor)

Community Contributions

Bug Fixes & Refactoring

Deployment

Related context

Related tools

Production Monitoring (`evalview monitor`)