EvalView GitHub Action
Pytest-style testing framework for AI agents — now available as a GitHub Action.
Usage
- uses: hidai25/[email protected]
  with:
    openai-api-key: ${{ secrets.OPENAI_API_KEY }}
Features
- 🧪 Test LangGraph, CrewAI, OpenAI, Anthropic, and custom agents
- ⚡ Parallel test execution (4 workers by default)
- 📊 Auto-generated HTML reports
- 💬 PR comments with test results
- 🤖 LLM-as-judge output evaluation
- 💰 Cost and latency threshold checks
Action Inputs
| Input | Description | Default |
|----------------|---------------------------------|-----------------------|
| openai-api-key | OpenAI API key for LLM-as-judge | - |
| config-path | Path to config file | .evalview/config.yaml |
| max-workers | Parallel workers | 4 |
| fail-on-error | Fail on test failure | true |
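The inputs in the table can be combined into a complete workflow. The following is a minimal sketch, not taken from the project's documentation: the file name, the trigger, the `permissions` block, and the checkout step are assumptions, and `<version>` is a placeholder for whichever release tag you pin to. The input values shown are simply the defaults listed in the table.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/evalview.yml
name: EvalView agent tests

on:
  pull_request:

permissions:
  contents: read
  pull-requests: write  # assumption: likely required for the PR comment feature

jobs:
  evalview:
    runs-on: ubuntu-latest
    steps:
      # Check out the repository so the config file and tests are available
      - uses: actions/checkout@v4

      # <version> is a placeholder; replace it with the release tag you use
      - uses: hidai25/eval-view@<version>
        with:
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          config-path: .evalview/config.yaml  # default from the inputs table
          max-workers: 4                      # default from the inputs table
          fail-on-error: true                 # default from the inputs table
```

Because every input except `openai-api-key` has a default, the three extra `with:` entries can be dropped unless you need to override them, for example setting `fail-on-error: false` to report results without failing the job.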
Full Documentation
See https://github.com/hidai25/eval-view#github-action-recommended for complete usage examples.
About hidai25/eval-view
Regression testing framework for AI agents. Save golden baselines, detect behavioral drift, and block regressions in CI. Works with LangGraph, CrewAI, OpenAI, Claude, and any HTTP API.