hidai25/eval-view

Developer Productivity

Regression testing framework for AI agents. Save golden baselines, detect behavioral drift, and block regressions in CI. Works with LangGraph, CrewAI, OpenAI, Claude, and any HTTP API.

Python · Latest v0.8.0 · 19d ago

Features

  • Detect silent behavior regressions in AI agents beyond simple pass/fail tests
  • Classify drift as provider/model change vs. system regression with graded confidence
  • Auto‑heal flaky failures and provide deterministic replay via captured tool calls
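
The first feature above, catching behavioral drift against a saved golden baseline, can be illustrated with a minimal sketch. This is not eval-view's API: the trace shape (a list of steps with a `tool` key), the function names `tool_sequence`, `drift_score`, and `check_regression`, and the 0.1 threshold are all hypothetical, chosen only to show the idea of comparing a new run's tool-call sequence with a recorded one.

```python
from difflib import SequenceMatcher


def tool_sequence(trace):
    """Return the ordered tool names from a trace (hypothetical shape: list of dicts)."""
    return [step["tool"] for step in trace]


def drift_score(golden, current):
    """Return 0.0 for identical tool sequences, rising toward 1.0 as they diverge."""
    matcher = SequenceMatcher(None, tool_sequence(golden), tool_sequence(current))
    return 1.0 - matcher.ratio()


def check_regression(golden, current, threshold=0.1):
    """Flag a run whose drift from the golden baseline exceeds the (assumed) threshold."""
    score = drift_score(golden, current)
    return {"drift": score, "regressed": score > threshold}


# Hypothetical golden baseline captured from a known-good run.
golden = [
    {"tool": "search", "args": {"q": "refund policy"}},
    {"tool": "summarize", "args": {}},
]

# A later run that silently inserted an extra tool call.
changed = [
    {"tool": "search", "args": {"q": "refund policy"}},
    {"tool": "send_email", "args": {"to": "user@example.com"}},
    {"tool": "summarize", "args": {}},
]

print(check_regression(golden, golden)["regressed"])   # False: sequences are identical
print(check_regression(golden, changed)["regressed"])  # True: an unexpected tool call appeared
```

Both runs could still return a plausible final answer, which is why a plain pass/fail assertion on the output would miss the change. In CI, a check like this would typically exit non-zero when `regressed` is true.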

Recent releases

32 releases in total. The five most recent:

  • v0.8.0 (new feature, no immediate action): Cassettes + schedule cron
  • v0.7.1 (breaking risk, no immediate action): TOML test cases + CSV log import
  • v0.7.0 (new feature, no immediate action): Aider CLI adapter
  • v0.6.2 (new feature, no immediate action): Closed-model drift detection
  • v0.6.1 (new feature, no immediate action): MCP feature parity + new tools

About

  • Stars: 112
  • Forks: 20
  • Languages: Python, Makefile, TypeScript
  • Downloads/week: 17 (↑5%)
  • NPM maintainers: 1
  • Contributors: 15
  • TypeScript: types included ✓

Install & Platforms

Install via pip