hidai25/eval-view

v0.5.0 Feature

This release adds three notable features for engineering teams evaluating a rollout.

✓ No known CVEs patched

Topics

agent-benchmark agent-evaluation agentic-ai ai-agents anthropic autogen
cli crewai evaluation langchain-agent langgraph llm mcp openai-assistants pytest python regression-testing testing

Summary

Continuous regression detection with Slack alerts and JSONL history export is added to evalview monitor.

Full changelog

What's New

Production Monitoring (evalview monitor)

  • Continuous regression detection: runs evalview check in a loop at a configurable interval (default: 5 min)
  • Slack alerts: webhook notifications on new regressions, plus recovery notifications when they are resolved
  • Smart dedup: alerts only on NEW failures, with no re-alerts on persistent issues
  • JSONL history export: --history monitor.jsonl appends cycle data for trend analysis and dashboards
  • Graceful shutdown: Ctrl+C stops cleanly and prints a cost summary
  • Config support: CLI flags, config.yaml, or the EVALVIEW_SLACK_WEBHOOK env var

evalview monitor                                         # Check every 5 min
evalview monitor --interval 60                           # Every minute
evalview monitor --slack-webhook https://hooks.slack.com/services/...
evalview monitor --history monitor.jsonl                 # Save trends
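
The "smart dedup" behavior described above can be pictured as a set difference between consecutive cycles. The following is a minimal sketch of that idea, not evalview's actual implementation: the AlertState class, the run_cycles helper, and the test names are all hypothetical and exist only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AlertState:
    """Remembers which tests were failing in the previous monitor cycle (hypothetical)."""

    failing: set[str] = field(default_factory=set)

    def update(self, current_failures: set[str]) -> tuple[set[str], set[str]]:
        """Return (new_regressions, recoveries) and store the current cycle."""
        # Failing now but not in the previous cycle: worth a new alert.
        new_regressions = current_failures - self.failing
        # Failing in the previous cycle but passing now: worth a recovery notice.
        recoveries = self.failing - current_failures
        self.failing = set(current_failures)
        return new_regressions, recoveries


def run_cycles(cycles: list[set[str]]) -> list[tuple[list[str], list[str]]]:
    """Feed a sequence of per-cycle failure sets through one AlertState."""
    state = AlertState()
    results = []
    for failures in cycles:
        new, recovered = state.update(failures)
        results.append((sorted(new), sorted(recovered)))
    return results


# Made-up test names: "checkout" keeps failing in cycle 2, so it is not re-alerted.
results = run_cycles([{"checkout"}, {"checkout", "search"}, {"search"}, set()])
for cycle_number, (new, recovered) in enumerate(results, start=1):
    print(cycle_number, "new:", new, "recovered:", recovered)
```

Under this assumption, a persistent failure produces exactly one alert when it first appears and one recovery notice when it clears, which matches the behavior the release notes describe.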

Community Contributions

  • CSV export: evalview check --csv results.csv (@muhammadrashid4587)
  • Timeout flag: evalview check --timeout 60 (@zamadye)
  • Better errors: human-friendly connection failure messages (@passionworkeer)
  • JSONL history: --history flag for monitor (@clawtom)
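
As a rough illustration of the trend analysis a JSONL history file makes possible, the sketch below computes a pass rate per cycle. The record schema is an assumption: the release notes do not document the fields written by --history, so the "timestamp", "passed", and "failed" keys and the sample records here are invented for the example and should be checked against a real monitor.jsonl.

```python
import json
from typing import Iterable


def pass_rates(lines: Iterable[str]) -> list[float]:
    """Compute passed / (passed + failed) for each JSONL record.

    Assumes hypothetical integer fields "passed" and "failed" in every record.
    """
    rates = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        total = record["passed"] + record["failed"]
        rates.append(record["passed"] / total if total else 0.0)
    return rates


# Invented sample records standing in for two monitor cycles.
sample = [
    '{"timestamp": "2025-01-01T00:00:00Z", "passed": 9, "failed": 1}',
    '{"timestamp": "2025-01-01T00:05:00Z", "passed": 10, "failed": 0}',
]
print(pass_rates(sample))
```

To run it against a real file, pass an open file handle instead of the sample list, for example pass_rates(open("monitor.jsonl")), after adjusting the field names to whatever the file actually contains.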

Bug Fixes & Refactoring

  • Fixed severity comparison bug (was using string matching instead of enum comparison)
  • Fixed JSONL history pass count (was using fail_on filter instead of actual counts)
  • Extracted shared _parse_fail_statuses utility for consistent fail_on parsing
  • Eliminated redundant config loading in monitor loop
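
The release notes do not say exactly how the severity bug manifested. One common failure mode when severities are handled as strings is that ordering becomes alphabetical rather than semantic. The sketch below shows that pitfall and an enum-based alternative; the Severity levels and both function names are hypothetical and are not taken from evalview's code.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity levels, ordered by their integer values."""

    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def meets_threshold_by_string(severity: str, threshold: str) -> bool:
    """Buggy variant: compares names alphabetically, so "high" < "medium"."""
    return severity >= threshold


def meets_threshold(severity: Severity, threshold: Severity) -> bool:
    """Enum variant: compares the underlying integer ranks."""
    return severity >= threshold


print(meets_threshold_by_string("high", "medium"))        # alphabetical order
print(meets_threshold(Severity.HIGH, Severity.MEDIUM))    # rank order
```

With IntEnum, adding or renaming a level only requires keeping the integer ranks in order, and comparisons no longer depend on how the names happen to sort.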

Deployment

# Quick background run
nohup evalview monitor --slack-webhook https://... &

# Docker
docker run -d -v "$(pwd)":/app -w /app evalview monitor --slack-webhook https://...

Full Changelog: https://github.com/hidai25/eval-view/compare/v0.4.1...v0.5.0

About hidai25/eval-view

Regression testing framework for AI agents. Save golden baselines, detect behavioral drift, and block regressions in CI. Works with LangGraph, CrewAI, OpenAI, Claude, and any HTTP API.