hidai25/eval-view

v0.7.0 Feature

This release adds 3 notable features for engineering teams evaluating rollout.

✓ No known CVEs patched

Topics

agent-benchmark agent-evaluation agentic-ai ai-agents anthropic autogen cli crewai evaluation langchain-agent langgraph llm mcp openai-assistants pytest python regression-testing testing

Summary

AI summary

Adds an Aider CLI adapter so that EvalView can drive Aider during evaluations.

Full changelog

Minor release — 33 commits since 0.6.2, 14 new user-facing features.

Highlights

  • Aider CLI adapter — drive Aider as an EvalView adapter
  • Autopr loop — prod-incident → regression-test → PR, closed loop
  • Flake quarantine — known-flaky tests don't block CI, with governance metadata
  • Release verdict + evalview since — graded ship/hold verdict + change brief
  • progress / drift / slack-digest — investigative loop commands
  • Noise confirmation gate + --strict bypass — two-cycle rule before alerting
  • Slow-agent warning — real wall-clock latency regression detection
  • Observability signals — trust score, tool-loop, brittle-recovery, gaming checks
  • Improvement recommendation engine — prioritized stabilize / tighten / add-check suggestions
  • Simulation harness + decision-rationale (schema v2) — scripted multi-turn scenarios, machine-readable reasons
  • snapshot --json — CI-friendly, hardened for edge cases
  • check --explain — deep trace narrative for root-cause hypotheses
  • Token cost breakdown in check — input/output/cached tokens + cost delta vs baseline
  • Skill-doctor char-budget refinement — disable-model-invocation skills excluded

Plus ~10 fixes (mypy narrowing, dogfood hardening, slack-digest type errors, noise strict-bucket leak, snapshot --json CI hardening) and README/CLI doc improvements.
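
The changelog does not say how the noise confirmation gate works internally. As a rough illustration only, one plausible reading of a "two-cycle rule" with a `--strict` bypass is sketched below: a failure raises an alert only when the same test also failed in the previous cycle, unless strict mode is on. The class name `ConfirmationGate`, its fields, and the `observe` method are hypothetical and are not part of EvalView's API.

```python
from dataclasses import dataclass, field


@dataclass
class ConfirmationGate:
    """Hypothetical two-cycle gate: alert only once a failure repeats."""

    # Assumption: strict mode skips confirmation and alerts on the first failure.
    strict: bool = False
    # Tests that failed in their most recent cycle.
    seen_failing: set = field(default_factory=set)

    def observe(self, test_id: str, failed: bool) -> bool:
        """Record one check cycle and return True if an alert should fire."""
        if not failed:
            # A passing cycle resets the confirmation state for this test.
            self.seen_failing.discard(test_id)
            return False
        # Confirmed if strict, or if the previous cycle also failed.
        confirmed = self.strict or test_id in self.seen_failing
        self.seen_failing.add(test_id)
        return confirmed


if __name__ == "__main__":
    gate = ConfirmationGate()
    for cycle, failed in enumerate([True, True, False, True], start=1):
        print(cycle, gate.observe("checkout-flow", failed))
```

In this sketch a single noisy failure is held back, a second consecutive failure confirms it, and a passing cycle in between resets the count. Whether EvalView counts cycles per test, per run, or per metric is not stated in the release notes.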

Install

pip install evalview==0.7.0
# or
npm install [email protected]
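
After installing with pip, a quick way to confirm which version ended up in the environment is to query the package metadata from Python's standard library. This is a minimal sketch: the distribution name `evalview` is taken from the pip command above, and the helper name `installed_version` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("evalview")
    if found is None:
        print("evalview is not installed in this environment")
    else:
        print(f"evalview {found} is installed")
```

Pinning with `==0.7.0` and then checking the reported version in CI helps catch cases where a cached or pre-installed older release is picked up instead.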

Full changelog: https://github.com/hidai25/eval-view/blob/v0.7.0/CHANGELOG.md

About hidai25/eval-view

Regression testing framework for AI agents. Save golden baselines, detect behavioral drift, and block regressions in CI. Works with LangGraph, CrewAI, OpenAI, Claude, and any HTTP API.