This release adds 3 notable features for engineering teams evaluating rollout.
Summary
Added variance-aware statistical testing with configurable confidence levels.
Full changelog
What's New
Statistical Pass/Fail System
- Variance-aware testing - Run tests multiple times to get statistically significant results
- Confidence levels - Configure how confident you want to be in pass/fail decisions
- CLI integration - New --runs flag to run tests multiple times
# Run each test 5 times for statistical analysis
evalview run --runs 5
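To illustrate why repeated runs help, here is a minimal sketch of one way a pass/fail decision can be made from several runs at a chosen confidence level. It is not EvalView's implementation: the function names, the Wilson score interval, and the `required_pass_rate` parameter are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def wilson_interval(passes: int, runs: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for the true pass rate (illustrative helper, not an EvalView API)."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = passes / runs
    denom = 1 + z * z / runs
    centre = p + z * z / (2 * runs)
    margin = z * math.sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs))
    return (centre - margin) / denom, (centre + margin) / denom


def decide(results: list[bool], required_pass_rate: float = 0.5, confidence: float = 0.95) -> str:
    """Return 'pass', 'fail', or 'inconclusive' for a list of per-run outcomes (hypothetical rule)."""
    lower, upper = wilson_interval(sum(results), len(results), confidence)
    if lower >= required_pass_rate:
        return "pass"          # even the pessimistic estimate clears the bar
    if upper < required_pass_rate:
        return "fail"          # even the optimistic estimate misses the bar
    return "inconclusive"      # more runs are needed to decide


if __name__ == "__main__":
    print(decide([True, True, True, True, True]))
    print(decide([True, True, True, False, False]))
```

Under these assumptions, a mixed result such as 3 passes out of 5 stays inconclusive, which is the situation where raising the run count is most useful.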
LangGraph Adapter Fix
- Fixed adapter compatibility issues for better LangGraph integration
Config-Free Runs
- Run evalview run without requiring a config file
- Automatically discovers test cases in the current directory
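As a rough picture of what config-free discovery could look like, the sketch below collects candidate test files from a directory. The YAML file patterns, the non-recursive search, and the `discover_test_cases` name are assumptions for illustration; the release notes do not specify EvalView's actual discovery rules.

```python
from pathlib import Path


def discover_test_cases(root: str = ".", patterns: tuple[str, ...] = ("*.yaml", "*.yml")) -> list[Path]:
    """Return test-case files directly under `root` (hypothetical helper, not an EvalView API)."""
    base = Path(root)
    found: set[Path] = set()
    for pattern in patterns:
        # Non-recursive match; only regular files count as test cases here.
        found.update(p for p in base.glob(pattern) if p.is_file())
    # Sort for a stable, reproducible run order.
    return sorted(found)


if __name__ == "__main__":
    for path in discover_test_cases():
        print(path)
```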
Templates
- Added test case templates for common evaluation patterns
- Quick-start templates for tool calling, RAG, and multi-turn scenarios
Node SDK License Fix
- Fixed license mismatch - now correctly uses Apache 2.0
Documentation Improvements
- Added FAQ section and comparison table to README
- Added "Run examples directly" section
- Added design partners section
- Improved README structure for better clarity
Full Changelog
https://github.com/hidai25/eval-view/compare/v0.1.4...v0.1.5
About hidai25/eval-view
Regression testing framework for AI agents. Save golden baselines, detect behavioral drift, and block regressions in CI. Works with LangGraph, CrewAI, OpenAI, Claude, and any HTTP API.