hidai25/eval-view

v0.1.4 Feature

This release adds 3 notable features for engineering teams evaluating rollout.

✓ No known CVEs patched

Topics

agent-benchmark agent-evaluation agentic-ai ai-agents anthropic autogen cli crewai evaluation langchain-agent langgraph llm mcp openai-assistants pytest python regression-testing testing

Summary

AI summary

Ollama support enables free local LLM-as-judge evaluations.

Full changelog

What's New

Ollama Support (Free Local Evaluation)

- Ollama as LLM-as-judge: run evaluations locally with zero API costs
- Auto-detection: automatically detects Ollama running on localhost:11434
- New adapter: test LangGraph agents powered by local Llama models

# Free local evaluation
evalview run --judge-provider ollama --judge-model llama3.2
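
Before running the command above, it can help to confirm that an Ollama server is reachable and that the judge model is installed. The following is a minimal sketch of such a probe, not EvalView's own detection logic (the release notes do not describe it). It assumes Ollama's standard `/api/tags` endpoint on the default address; the function name `list_ollama_models` is hypothetical.

```python
import json
import urllib.request
from typing import List, Optional

# Default address named in the release notes.
OLLAMA_URL = "http://localhost:11434"


def list_ollama_models(base_url: str = OLLAMA_URL, timeout: float = 1.0) -> Optional[List[str]]:
    """Return the names of locally installed models, or None if Ollama is unreachable.

    Hypothetical helper for illustration; EvalView's real auto-detection may differ.
    """
    # Bypass any configured HTTP proxy, since the probe targets the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts, and HTTP errors;
        # ValueError covers a response body that is not valid JSON.
        return None
    if not isinstance(payload, dict):
        return None
    return [m["name"] for m in payload.get("models", []) if "name" in m]


if __name__ == "__main__":
    models = list_ollama_models()
    if models is None:
        print("Ollama not reachable on localhost:11434")
    else:
        print("Ollama models available:", ", ".join(models) or "(none pulled yet)")
```

Returning `None` for "unreachable" and an empty list for "running but no models pulled" keeps the two cases distinguishable, which matters when deciding whether to fall back to a cloud judge or prompt the user to pull a model.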

Improved Hallucination Detection

- Reduced false positives for local models
- Unit conversions and formatting no longer flagged as hallucinations
- Confidence threshold: 90% for Ollama, 70% for cloud providers
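
One plausible reading of the threshold change is that a hallucination is reported only when the judge's confidence meets a provider-specific bar, with the higher bar for Ollama reducing false positives from smaller local models. The sketch below illustrates that reading only. The 90% and 70% values come from these notes, while the function name, the provider keys, and the comparison rule are assumptions rather than EvalView's actual implementation.

```python
# Threshold values are from the release notes; how EvalView applies them is assumed.
OLLAMA_THRESHOLD = 0.90
CLOUD_THRESHOLD = 0.70


def should_flag_hallucination(provider: str, confidence: float) -> bool:
    """Return True if a judge verdict of 'hallucination' should be reported.

    `provider` is a hypothetical judge-provider key such as "ollama" or "openai";
    `confidence` is the judge's confidence in the verdict, in the range 0.0 to 1.0.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    threshold = OLLAMA_THRESHOLD if provider.lower() == "ollama" else CLOUD_THRESHOLD
    return confidence >= threshold
```

Under this reading, a verdict with 0.80 confidence would be reported when it comes from a cloud judge but suppressed when it comes from a local Ollama judge.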

README Updates

- Added "Who is EvalView for?" section
- Added LangSmith/Langfuse complement positioning
- New Ollama example in /examples/ollama/

Fixes

- Fixed mypy type annotation error
- Fixed action.yml description length for Marketplace

Full Changelog: https://github.com/hidai25/eval-view/compare/v0.1.3...v0.1.4

About hidai25/eval-view

Regression testing framework for AI agents. Save golden baselines, detect behavioral drift, and block regressions in CI. Works with LangGraph, CrewAI, OpenAI, Claude, and any HTTP API.