hidai25/eval-view

v0.2.1 Feature

This release adds three notable features: new chat-mode slash commands, natural-language command execution, and execution tracing across all adapters.

Published 6mo ago · Developer Productivity

✓ No known CVEs patched

Topics

agent-benchmark agent-evaluation agentic-ai ai-agents anthropic autogen

cli crewai evaluation langchain-agent langgraph llm mcp openai-assistants pytest python regression-testing testing

Summary

AI summary

Added slash commands /run, /test, /adapters, and /compare for chat mode execution.

Full changelog

What's New

Chat Mode Enhancements

New slash commands: /run, /test, /adapters, /compare
Natural language execution: LLM suggests commands and prompts user to run them
/compare command: Side-by-side regression detection between test runs
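
To make the idea behind /compare concrete, here is a minimal sketch of side-by-side regression detection between two test runs. It is not evalview's implementation: the CaseResult shape, the compare_runs and format_side_by_side helpers, the status labels, and the 0.05 score tolerance are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseResult:
    """One test outcome in a run (hypothetical shape, not evalview's schema)."""
    name: str
    passed: bool
    score: float


def compare_runs(baseline, candidate, tolerance=0.05):
    """Pair results by test name and label each pair with a status."""
    base = {r.name: r for r in baseline}
    cand = {r.name: r for r in candidate}
    rows = []
    for name in sorted(base.keys() | cand.keys()):
        b, c = base.get(name), cand.get(name)
        if b is None:
            status = "new"            # only present in the candidate run
        elif c is None:
            status = "missing"        # only present in the baseline run
        elif b.passed and not c.passed:
            status = "regressed"      # pass -> fail
        elif c.passed and not b.passed:
            status = "improved"       # fail -> pass
        elif c.score < b.score - tolerance:
            status = "regressed"      # score dropped beyond the tolerance
        elif c.score > b.score + tolerance:
            status = "improved"       # score rose beyond the tolerance
        else:
            status = "unchanged"
        rows.append((name, b, c, status))
    return rows


def format_side_by_side(rows):
    """Render the comparison as a plain-text table, one test per line."""
    lines = [f"{'test':<20}{'baseline':>10}{'candidate':>11}  status"]
    for name, b, c, status in rows:
        left = f"{b.score:.2f}" if b else "-"
        right = f"{c.score:.2f}" if c else "-"
        lines.append(f"{name:<20}{left:>10}{right:>11}  {status}")
    return "\n".join(lines)


# Illustrative data only; these are not real evalview results.
baseline = [CaseResult("refund_flow", True, 0.9), CaseResult("greeting", True, 0.8)]
candidate = [CaseResult("refund_flow", True, 0.6), CaseResult("greeting", True, 0.8)]
print(format_side_by_side(compare_runs(baseline, candidate)))
```

Keying both runs by test name keeps the comparison independent of execution order, and the tolerance keeps small score fluctuations from being reported as regressions.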

Execution Tracing

OpenTelemetry-style spans across all adapters
LLM call tracking with token usage and costs
Tool execution spans with timing
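
The tracing features above can be pictured with a small, self-contained sketch of nested spans that record timing, token usage, and cost. It uses only the Python standard library and is not evalview's tracer or the OpenTelemetry SDK: the Span and Tracer classes, the attribute names, and the per-1K-token prices are hypothetical.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """A timed unit of work with free-form attributes (hypothetical shape)."""
    name: str
    kind: str  # "agent", "llm", or "tool" in this sketch
    parent: Optional["Span"] = None
    start: float = 0.0
    end: float = 0.0
    attributes: dict = field(default_factory=dict)

    @property
    def duration_ms(self) -> float:
        return (self.end - self.start) * 1000.0


class Tracer:
    """Collects finished spans and tracks nesting with a simple stack."""

    def __init__(self):
        self.spans = []
        self._stack = []

    @contextmanager
    def span(self, name, kind, **attributes):
        parent = self._stack[-1] if self._stack else None
        s = Span(name, kind, parent, start=time.perf_counter(),
                 attributes=dict(attributes))
        self._stack.append(s)
        try:
            yield s
        finally:
            # Close the span even if the wrapped code raises.
            s.end = time.perf_counter()
            self._stack.pop()
            self.spans.append(s)

    def total_cost(self):
        """Sum the cost recorded on LLM spans."""
        return sum(s.attributes.get("cost_usd", 0.0)
                   for s in self.spans if s.kind == "llm")


def llm_cost_usd(prompt_tokens, completion_tokens, in_price_per_1k, out_price_per_1k):
    """Cost from token counts; the prices passed in are assumptions, not real rates."""
    return (prompt_tokens / 1000 * in_price_per_1k
            + completion_tokens / 1000 * out_price_per_1k)


tracer = Tracer()
with tracer.span("agent_run", "agent"):
    with tracer.span("chat_completion", "llm") as llm:
        llm.attributes.update(prompt_tokens=1000, completion_tokens=500,
                              cost_usd=llm_cost_usd(1000, 500, 0.5, 1.5))
    with tracer.span("search", "tool", tool_name="search"):
        pass  # a real tool call would run here

for s in tracer.spans:
    print(s.kind, s.name, f"{s.duration_ms:.3f} ms", s.attributes)
```

Spans are appended when they finish, so child spans appear before their parent in tracer.spans; the parent field preserves the hierarchy.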

Documentation

Expanded Chat Mode section with slash commands table
Added Natural Language Execution examples
Updated Architecture section with tracing components

Bug Fixes

Fixed 20 mypy type errors in chat.py
Corrected Evaluations attribute access
Fixed variable shadowing issues

Installation

pip install evalview==0.2.1
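
After installing, one way to confirm which version is active in the current environment is to query the package metadata from Python. This is a minimal sketch using the standard library's importlib.metadata; the installed_version helper is a hypothetical name, and the distribution name evalview is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("evalview")
print("evalview:", found or "not installed", "(expected 0.2.1)")
```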

Full Changelog

https://github.com/hidai25/eval-view/compare/v0.2.0...v0.2.1

About hidai25/eval-view

Regression testing framework for AI agents. Save golden baselines, detect behavioral drift, and block regressions in CI. Works with LangGraph, CrewAI, OpenAI, Claude, and any HTTP API.
