Verdict

v0.2.0 Feature

This release adds four notable features for engineering teams evaluating rollout.

Published 3mo AI Coding Tools

✓ No known CVEs patched in this version

Topics

benchmarking evals llm llm-benchmarking llm-evaluation model-evaluation model-selection python

Summary

AI summary

Added CSV support, arbitrary JSONL field mapping, label‑free evaluation, and multi‑dimensional LLM‑as‑judge metrics.

Full changelog

Dataset

CSV support via Dataset.from_csv(): the default column names are input and ideal, and the input_field / output_field parameters override them for custom schemas
Arbitrary JSONL field mapping via --input-field / --output-field CLI flags and Python API
Label-free evaluation — datasets without reference answers work end-to-end; reference-based metrics raise a clear error upfront
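
The dataset behaviour listed above can be illustrated with a minimal sketch. This is not Verdict's implementation: load_csv, Example, and require_references are hypothetical names written against the standard library only. The sketch assumes that a missing or empty reference column marks an example as label-free and that reference-based metrics are checked before any model is called.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class Example:
    input: str
    ideal: Optional[str] = None  # None means the row has no reference answer


def load_csv(text, input_field="input", output_field="ideal"):
    """Hypothetical stand-in for the column mapping described in the changelog."""
    examples = []
    for row in csv.DictReader(io.StringIO(text)):
        if input_field not in row:
            raise KeyError(f"missing input column: {input_field!r}")
        # A missing or empty reference column yields a label-free example.
        ideal = row.get(output_field) or None
        examples.append(Example(input=row[input_field], ideal=ideal))
    return examples


def require_references(examples, metric_name):
    """Fail up front if a reference-based metric is given unlabeled data."""
    missing = sum(1 for ex in examples if ex.ideal is None)
    if missing:
        raise ValueError(
            f"{metric_name} needs reference answers, but {missing} of "
            f"{len(examples)} examples have none"
        )


# Custom schema: map the "question" and "answer" columns explicitly.
labeled = load_csv(
    "question,answer\n2+2?,4\n", input_field="question", output_field="answer"
)
# Label-free dataset: only an input column is present.
unlabeled = load_csv("input\nSummarize this text\n")
```

In this sketch, require_references(labeled, "exact_match") passes silently, while the same call on unlabeled raises ValueError before any evaluation work starts.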

Metrics

Multi-dimensional LLM-as-judge via the dimensions parameter — score multiple criteria (e.g. fluency, accuracy, safety) in a single judge call
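
One way to picture scoring several criteria in a single judge call is the sketch below. It is an illustration under stated assumptions, not Verdict's API: build_judge_prompt, parse_scores, and fake_judge are hypothetical helpers, the 1 to 5 scale and JSON reply format are assumed, and fake_judge stands in for a real model call.

```python
import json


def build_judge_prompt(question, answer, dimensions):
    """Name every criterion in one prompt so one judge call covers them all."""
    criteria = ", ".join(dimensions)
    return (
        f"Rate the answer on each of: {criteria}.\n"
        "Reply with a JSON object mapping each criterion to an integer from 1 to 5.\n"
        f"Question: {question}\nAnswer: {answer}"
    )


def parse_scores(reply, dimensions):
    """Keep only the requested dimensions and reject replies that omit one."""
    data = json.loads(reply)
    missing = [d for d in dimensions if d not in data]
    if missing:
        raise ValueError(f"judge reply is missing dimensions: {missing}")
    return {d: int(data[d]) for d in dimensions}


def fake_judge(prompt):
    # Hypothetical stand-in for a model call; returns a fixed reply.
    return '{"fluency": 5, "accuracy": 4, "safety": 5}'


dims = ["fluency", "accuracy", "safety"]
prompt = build_judge_prompt("What is 2+2?", "4", dims)
scores = parse_scores(fake_judge(prompt), dims)
```

Validating the reply against the requested dimensions keeps a partial or malformed judge response from being silently recorded as a complete score.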

About Verdict