Release history
phoenix releases
AI Observability & Evaluation
Assistant enablement + trace policy
drag-to-zoom + agent skills + model selector
- Sandboxing and Code Evaluators break existing agent configurations.
- Code Evaluators allow custom Python/TypeScript evaluate() functions for composable evaluation strategies (composite scoring, embedding similarity, LLM juries).
- Agents now support provider‑native web search/fetch capabilities when available.
Full changelog
16.0.0 (2026-05-21)
⚠ BREAKING CHANGES
- Sandboxing and Code Evaluators (#13290)
Features
Phoenix now lets you compose evaluation strategies in code.
Most eval tooling hands you a fixed menu of judge templates. Real evaluation is rarely that tidy.
Code Evaluators enable you to build evaluation criteria the way you want. You write a Python or TypeScript evaluate() function in the Phoenix UI — no SDK, no local runtime, no deploy step — and Phoenix runs it server-side, recording labels and scores as annotations on every experiment run.
Because it's just code, you control the whole strategy:
• Composite scoring: blend sub-scores (LLM judgment + deterministic rules) into one weighted metric
• Embedding-based evaluation: cosine similarity over embeddings instead of brittle string matching
• LLM juries: poll multiple models and combine verdicts into a weighted consensus
Sandboxed Code Evaluators also open the door to using agents as judges. We're excited about where this is heading.
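A minimal sketch of how the three strategies above might combine in a single evaluate() function. The signature, argument names, weights, and the 0.7 pass threshold are all assumptions made for illustration; these notes do not specify the contract Phoenix expects. In practice the embeddings and judge verdicts would come from model calls rather than being passed in.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def jury_consensus(verdicts: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of per-judge scores, each expected to lie in [0, 1]."""
    total = sum(weights[name] for name in verdicts)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(score * weights[name] for name, score in verdicts.items()) / total


def evaluate(output_embedding, reference_embedding, verdicts, judge_weights, output_text):
    # Hypothetical signature: the real evaluate() arguments may differ.
    similarity = cosine_similarity(output_embedding, reference_embedding)
    jury = jury_consensus(verdicts, judge_weights)
    rule = 1.0 if output_text.strip() else 0.0  # deterministic rule: output is non-empty
    # Illustrative weights blending the three sub-scores into one metric.
    score = 0.4 * similarity + 0.4 * jury + 0.2 * rule
    return {"score": round(score, 4), "label": "pass" if score >= 0.7 else "fail"}
```

Keeping each sub-score as its own function makes the weighting explicit and lets individual parts be swapped out without touching the rest.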
- agents: Enable provider native web search / fetch when available (#13333) (41eb4fc)
- Sandboxing and Code Evaluators (#13290) (e294d93)
Bug Fixes
Generative UI + lockfile tolerance + header handling
/agents sessions summary + ATIF v1.7
- Added /agents/{agent_id}/sessions/{session_id}/summary endpoint
- Playground manipulation tools with confirmation dialog
- Default provider/model set as user preference in playground
Full changelog
15.6.0 (2026-05-10)
Features
- agents: add /agents/{agent_id}/sessions/{session_id}/summary endpoint (#13095) (419c3a0)
- agents: Playground manipulation tools w/ confirmation (#13093) (37dc417)
- playground: default provider/model as a user preference (#13135) (4487740)
- tracing: always show metrics aside, remove legacy stats header (#13134) (ea3d7e7)
Bug Fixes
Documentation
Fixed a bug where deleting a dataset evaluator would remove the built‑in evaluator.
Full changelog
- deps: Multiple CVEs fixed by raising the litellm floor to 1.83.14
- /chat-v2 tool wiring behind experimental toggle
- Frontend REST calls typed against OpenAPI schema
- Support x-project-name HTTP header for OTLP trace ingestion
Full changelog
15.5.0 (2026-05-08)
Features
- agents: wire /chat-v2 with tools behind experimental toggle (#13009) (7706554)
- app: type frontend REST calls against the OpenAPI schema (#13060) (590669d)
- support x-project-name HTTP header for OTLP trace ingestion (#12865) (7d10386)
- Update session details turns layout (#13042) (a1be820)
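As a rough illustration of the x-project-name header listed above, the sketch below builds (but does not send) an OTLP/HTTP request that names its target project. The collector URL, the empty payload, and the project name are placeholders chosen for this example, not values taken from the release notes.

```python
import urllib.request

# Assumption: a locally running Phoenix collector at this address.
PHOENIX_OTLP_URL = "http://localhost:6006/v1/traces"


def build_otlp_request(payload: bytes, project_name: str) -> urllib.request.Request:
    """Build an OTLP/HTTP POST whose header names the project to ingest into."""
    return urllib.request.Request(
        PHOENIX_OTLP_URL,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/x-protobuf",
            "x-project-name": project_name,  # header named in the 15.5.0 notes
        },
    )


# Placeholder payload; a real request would carry serialized OTLP trace data.
request = build_otlp_request(b"", "checkout-service")
```

HTTP header names are case-insensitive, so urllib storing the key as "X-project-name" should not matter to the server. An OpenTelemetry exporter that accepts custom headers could presumably pass the same header instead of a hand-built request.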
Bug Fixes
- add types-aiobotocore-bedrock-runtime in container/aws extras (#13113) (0e2c175)
- deps: bump litellm floor to 1.83.14 to fix multiple CVEs (#13020) (ccf1880)
Documentation
- Filter-based DELETE endpoints for span/trace/session annotations
- Token counts included in span/trace/session payloads
Full changelog
15.4.0 (2026-05-05)
Features
- agents: agent set_time_range tool with hardened context injection (#13022) (115c08b)
- agents: ToolPart styles and subcomponents (#12894) (0d4e16b)
- api: add filter-based DELETE endpoints for span/trace/session annotations (#12928) (12779fb)
- rest-api: include token counts in span/trace/session payloads (#12926) (c0c0edb)
- simplify trace/span status icons, use status badge in panel views (#12972) (d8e3915)
- vendor passthrough tools support (#12533) (41e8fe0)
Bug Fixes
- Removed user instructions from agent PXI
Full changelog
15.3.0 (2026-05-05)
Features
- agent: remove user instructions from PXI (#13010) (6062d64)
- chat empty state shader (#12990) (5c643d9)
Bug Fixes
- api: include dataset metadata in experiment CSV export (#12897) (64cb481)
- deps: update arize-phoenix-client to 2.4.0 (#12799) (1300763)
- deps: update arize-phoenix-otel to 0.16.1 (#12997) (7bff140)
Documentation
- Move PXI prompt assembly to server side and cache prompts in agents.
- Add adapter tests for evals.
Minor fixes and improvements.
Full changelog
- agents: drop user-supplied connection params from /chat (SSRF)
- TanStack AI tracing integration added
- PXI prompt assembly moved server‑side with caching
- Enhanced filter condition filtering
Full changelog
15.2.0 (2026-05-03)
Features
- add TanStack AI tracing integration (#12984) (382b165)
- agents: Move PXI prompt assembly server-side and cache prompts (#12959) (f223f92)
- enhance filter condition filter (#12938) (b92bfbc)
Bug Fixes
- agents: drop user-supplied connection params from /chat (SSRF) (#12974) (775b270)
- cost: update built-in model token prices (#12960) (746247c)
- replace crypto.randomUUID with a context-safe fallback (#12987) (#12988) (62b6ace)
- ui: remove redundant Ask PXI button from trace details drawer (#12957) (863e9ca)
Documentation
- CLI: named auth profile management
- CLI: session annotations and notes
- Phoenix‑client: TS trace annotations with clarified note semantics
Full changelog
15.1.0 (2026-04-30)
Features
- api: query annotations by identifier on GET endpoints (#12952) (e2a1de9)
- cli: add named auth profile management (#12529) (ab62d3d)
- cli: add session annotations and notes (#12925) (4e20267)
- phoenix-client: add TS trace annotations + clarify note semantics + skills audit (#12923) (2993b04)
Bug Fixes
- playground: accept unwrapped AWS Bedrock tool schemas (#12937) (b450f10)
- ui: remove unused sticky positioning on Drawer dialog header (#12954) (d6a5f90), closes #12953
Documentation
- Dataset upsert
- Add session notes endpoint
- Gate v15 dataset upload params by server version
- Extend db prompt types for invocation parameters
Full changelog
2.6.0 (2026-04-29)
⚠ BREAKING CHANGES
- dataset upsert (#11860)
Features
- api: add session notes endpoint (#12902) (187df7e)
- client: gate v15 dataset upload params by server version (#12934) (e381885)
- dataset upsert (#11860) (9575738)
- extend db prompt types for invocation parameters (#12855) (1499ca9)
Miscellaneous Chores
- Dataset upsert operation added
- Database prompt types extended to support invocation parameter definitions
Full changelog
- Support custom providers and secret store values for agents
- Add session notes endpoint
- Add JSON schema for settings configuration
- Advertise Phoenix page context to agent chat
- Move advanced PXI toggles into Assistant config
- Redact sensitive GraphQL fields with RedactedString
Full changelog
14.16.0 (2026-04-28)
Features
- agents: advertise phoenix page context to chat (#12835) (917da7b)
- agents: move advanced PXI toggles into Assistant config (#12895) (f8da519), closes #12893
- server: redact sensitive GraphQL fields with RedactedString (#12807) (7dca138)
Bug Fixes
- app: own loader-created Relay query refs in route pages (#12873) (b342096)
- cost: update built-in model token prices (#12874) (61cfc60)
- playground: avoid duplicate Responses output messages (#12890) (5570e89)
Documentation
- Custom provider credential test now requires authentication (SSRF)
- Sessions UI flag with Turns/Traces toggle
- Phoenix skills audit skill with weekly GitHub Actions
- Trace note REST endpoint for Phoenix CLI
Full changelog
- Trace note support for px CLI
- Dedicated span notes column in spans table
- Trace note REST endpoint for Phoenix CLI
- Re-exported openinference helpers, decorators, and semconv
Full changelog
14.13.0 (2026-04-24)
Features
- Add a dedicated span notes column and clean up annotation selection (#12789) (ec6565e)
- add trace note REST endpoint for Phoenix CLI (#12710) (ed66c02)
- phoenix-otel: re-export openinference helpers, decorators, and semconv (#12844) (23576da)
- ui: add trace notes column to spans table (#12847) (c83d9ec)
Bug Fixes
Documentation
- Re-exported openinference helpers, decorators, and semconv
Full changelog
- Trace annotations column to spans table
- Aggregate span info on traces
- Portal page controls in top nav
- Session summary evaluation with sidebar-title prompt rewrite
- Root span name display in session turn list
- Secrets settings page
Full changelog
14.11.0 (2026-04-22)
Features
- agent: session summary eval + sidebar-title prompt rewrite (#12780) (fab5d89)
- app: show root span name in session turn list (#12795) (c31fd97), closes #12792
- secrets: Add secrets settings page (#12797) (f911992)
- ui: add Show Table Aside toggle to project filter config (feature-flagged) (#12773) (81db7aa)
Bug Fixes
- Type-aware attribute filter for GET /v1/spans
- Trace ID passthrough to experiment evaluators
- authlib version must be <1.7.0
- Token count display for chat sessions
Full changelog
Fixed Azure AsyncEngine wrapping to use ObjectProxy instead of patching dispose.
Full changelog
- Claude Opus 4.7 support in playground
- PXI consent and trace-sharing controls
- Type-aware attribute filter for GET /v1/spans REST API
Full changelog
14.9.0 (2026-04-20)
Features
- Add PXI consent and trace-sharing controls (#12740) (ebc9997)
- agent: empty-state screen for PXI chat (#12726) (0a5a180)
- agents: vendor use-stick-to-bottom for PXI chat (#12715) (3f2571d)
- cli: split out span note support (#12739) (aa8b34b)
- playground: add Claude Opus 4.7 (#12738) (9f149c6)
- rest-api: add type-aware attribute filter to GET /v1/spans (#12524) (c0badfa)
- ui: expose agentsConfig via GraphQL and surface on settings page (#12723) (9b43f53)
- ui: resizable Drawer component with Modal simplification (#12707) (f7b37b1)
Bug Fixes
- cost: update built-in model token prices (#12728) (2532320)
- graphql: remove default_factory=dict on Strawberry JSON fields to fix introspection (#12727) (e5e7f1a)
- remove force_flush from get_db_traces (#12766) (0bf480c)
Documentation
- Azure managed identity authentication for PostgreSQL
Full changelog
- Wrapt v2.x is now supported
- Session pagination for efficient trace navigation
- Trace annotation dataloader for improved data loading
- Projects page auto-refreshes on mount and 60-second interval
Full changelog
14.7.0 (2026-04-16)
Features
- session pagination (#12695) (465ff59)
- trace annotation dataloader (#12699) (8e2effd)
- ui: refresh projects page on mount and on 60s interval (#12694) (3f05e80)
Bug Fixes
- allow wrapt v2 (<3 upper bound) (#12714) (d6b0135)
- ui: simplify onboarding to use only phoenix-otel for TypeScript (#12708) (1dd5f5f)
Documentation
- Eval harness for agent system
- List-detail layout for session turns
- Slack alerts for CI model sync
- Feedback actions for agent assistant messages
- Chat metadata for assistant messages
- Span and trace annotation commands to the Phoenix CLI
- Agent capability menu
- Refactored PXI frontend tool wiring around a capability registry
Full changelog
14.3.1 (2026-04-14)
Bug Fixes
- Harden authFetch and add it to dataset upload path
- Assistant agent settings page
- Re-export of openinference-core from phoenix-otel with documentation
- WebSocket support for GraphQL subscriptions removed, replaced with HTTP multipart
- Name-based project URL redirects via /redirects/projects/:project_name endpoint
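A small sketch of how a link to the name-based redirect endpoint above might be built. The base URL and the helper name are hypothetical; only the /redirects/projects/:project_name path comes from the notes. Percent-encoding the project name is a precaution for names containing spaces or slashes, and whether the server accepts encoded slashes is not confirmed here.

```python
from urllib.parse import quote


def project_redirect_url(base_url: str, project_name: str) -> str:
    """Return a redirect URL for a project, percent-encoding the name."""
    encoded = quote(project_name, safe="")  # encode "/" as well as spaces
    return f"{base_url.rstrip('/')}/redirects/projects/{encoded}"


# Hypothetical base URL and project name.
url = project_redirect_url("http://localhost:6006/", "my project/v2")
```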
Full changelog
- Deprecate evals 1.0 and remove legacy experiments module
- CLI flags now follow subcommand (e.g., 'phoenix serve --dev' instead of 'phoenix --dev serve')
- Remove /v1/evaluations endpoint and Evaluations plumbing
- Deprecate evals 1.0 and remove legacy experiments module
- Agent ask_user elicitation tool
- Ephemeral experiments with ExperimentSweeper daemon
- PostgreSQL read replica routing
2.3.0 (2026-04-03)
Features
- atif to trace trajectory conversion utility
- get_traces method in Python and TypeScript clients
- GET /v1/user endpoint
- Python 3.14 support
2.13.0 (2026-04-01)
Features
- add Python 3.14 support (except Windows)
- Python 3.14 support (except Windows)
- Agent session summary improvements
- REST endpoint for secrets management
Full changelog
- REST API: DELETE endpoint for removing tags from prompt versions
- REST API: DELETE endpoint for removing prompts by identifier
- New CopyField UI component and settings layout cleanup
Full changelog
Model menu search field no longer loses focus on first keystroke.
- Agent session switcher with delete controls
- Tool call collapsing in agent chat
- PromptInput compound component
13.18.2 (2026-03-24)
Bug Fixes
- deps: update arize-phoenix-evals to 2.12.0
- Pin LiteLLM <1.82.7 to mitigate supply chain attack
- Structured data eval inputs
13.18.1 (2026-03-24)
Bug Fixes
- Pin LiteLLM <1.82.7 to mitigate supply chain attack
- Prompt version diff view for comparing prompt iterations
- Markdown rendering replaced with streamdown
Full changelog
- Resolves high and moderate dependency vulnerabilities
- px-gql simulated bash tool
- Copy spanID to clipboard
- Backend and frontend skills for Phoenix development
- Agent browser bash tool execution
- Expanded integration snippets for TypeScript and Python
2.1.0 (2026-03-18)
Features
- client: add span filter params to getSpans
- phoenix db migrate subcommand for standalone migrations
- GET /v1/projects/{project_identifier}/traces endpoint
- Span filtering by name, span_kind, status_code
- client.annotations module removed; use client.spans instead
- Session conversation API for Python and TypeScript clients
- Provider additions including Perplexity and Together AI
- Session conversation API for Python and TypeScript clients
- phoenix-pr-screenshot slash command skill
- client.annotations module removed; use client.spans instead
- PHOENIX_ALLOWED_PROVIDERS and PHOENIX_HIDDEN_PROVIDERS environment variables
- DELETE session API
- Dataset column drag-drop UI
- parent_id filter to GET spans endpoints
- Feature-flagged tracing onboarding empty state
- Perplexity AI as built-in provider with Sonar models
- Together AI as built-in provider with 8 model families
- 35 new model entries for cost tracking
- Cerebras, Fireworks, Groq, Moonshot as first-class providers
- Cost tracking for 298 new models
- Latest OpenAI GPT models including gpt-5 family
- trace_id filter to GET spans REST endpoints
- Filetype-agnostic dataset upload
- Session retrieval methods for Python client SDK
- Timeout parameter and list_sessions alias for sessions API
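The span filters listed above (name, span_kind, status_code, parent_id, trace_id) could be combined into a query string along these lines. This is a sketch only: the /spans path is modeled on the /traces path in the list, and the query parameter names are assumed to match the filter names. Neither is confirmed by these notes.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def spans_query_url(base_url: str, project_identifier: str, **filters: Optional[str]) -> str:
    """Build a GET URL for spans, skipping filters whose value is None."""
    params = {key: value for key, value in filters.items() if value is not None}
    # Assumed path, by analogy with /v1/projects/{project_identifier}/traces.
    path = f"/v1/projects/{quote(project_identifier, safe='')}/spans"
    query = urlencode(params)
    return f"{base_url.rstrip('/')}{path}" + (f"?{query}" if query else "")


# Hypothetical host, project, and filter values.
url = spans_query_url(
    "http://localhost:6006",
    "default",
    name="llm_call",
    status_code="ERROR",
    trace_id=None,  # dropped because it is None
)
```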
Full changelog
- Brute force login protection
- Agents pydantic vercel data stream protocol
- File dropzone component
- Container image updated to Debian 13
Full changelog
- GET endpoints added for sessions REST API
Full changelog
- ReDoS vulnerability fix in minimatch
- Multiple dependency vulnerabilities addressed
- Inline document annotation UI for retriever spans
- CLI sessions and session commands
- Configurable password policy
- GET endpoints for sessions REST API
- Conciseness classification evaluator
- Refusal evaluator
- Prompt to evals template conversion utility
- Refusal evaluator
- Migrate bundler from Vite/Rollup to rolldown-vite
- Claude Agent SDK integration
- Conciseness classification evaluator
- JSONDistanceEvaluator error message improvements
- AWS Bedrock cross-region inference model prefix preference
- Sonnet 4.6 model name support for AWS Bedrock
13.1.1 (2026-02-18)
Bug Fixes
- use store-and-network fetch policy for ModelMenu query
- Merge phoenix-sdk-python skill into phoenix-evals
- Session chat with dynamic markdown rendering
- Table column resizing in EvaluatorsTable and DatasetEvaluatorsTable
13.0.3 (2026-02-13)
Bug Fixes
- Prevent playground rerun when opening trace slide-over
- Tool response handling evaluator template
- Dataset evaluators support
- Mustache prompt template support
13.0.2 (2026-02-13)
Bug Fixes
- deps: update arize-phoenix-client to 1.29.0
1.29.0 (2026-02-13)
Features
- dataset evaluators
Prevents .gitignore negation patterns from leaking files into source distributions.
- Dataset evaluators system refactored
- Built-in LLM evaluator configs added to GraphQL
- Evaluator creation page and UI improvements
- Playground evaluator selection and display enhancements
- Claude Opus 4.6 model added to playground
1.28.1 (2026-02-09)
Bug Fixes
- add timezone validation to log_spans_dataframe
- Tool selection evaluator for both libraries
- FaithfulnessEvaluator
- Tool invocation accuracy metric
- trace_id in Scores
- Configurable email extraction from OAuth2 attributes
- Evals skill
- Tracing skill
- Tool invocation accuracy metric
- HallucinationEvaluator deprecated; migrate to FaithfulnessEvaluator
- FaithfulnessEvaluator for LLM evaluation
- span_id_key parameter to link dataset examples to traces
Full changelog
1.28.0 (2026-01-21)
Features
- add FaithfulnessEvaluator and deprecate HallucinationEvaluator (#10962) (fc8b1b5)
- add span_id_key to link dataset examples to traces (#10942) (01eb1fb)
- FaithfulnessEvaluator
- Dataset and experiment CLI commands
- LLM classification metric cursor rule
- Display span event attributes in UI
- Connection timeout error handling
- Correctness evaluator
- Sync and async LLM client kwargs
Fixes context.span_id column handling when DataFrame has integer index.
- JSONL dataset uploads
- Tool selection correctness metric