Release history

ART releases

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen3.5, GPT-OSS, Llama, and more!

All releases

v0.5.17 Breaking risk
⚠ Upgrade required
  • vLLM upgraded to 0.17.0
  • unsloth and unsloth zoo downgraded (#602), then unsloth upgraded to 2026.3.3 (#604)
  • pyarrow added to Tinker extra
Breaking changes
  • Removed beta KL divergence from training loss
Notable features
  • W&B run config API
  • Tenant-scoped Tinker model aliases
  • Improved metrics in ART
Full changelog

What's Changed

  • feat: Add W&B run config API (#615)
  • feat: add tenant-scoped Tinker model aliases (#614)
  • feat: Update Tinker renderers (#613)
  • feat: Update Tinker renderers (#612)
  • fix: Add pyarrow to Tinker extra (#611)
  • build: Upgrade vLLM to 0.17.0 (#610)
  • feat: Improved metrics in ART (#609)
  • ci: Auto-build and upload uv cache on miss (#608)
  • Remove beta KL divergence from training loss (#607)
  • Fix ty type checker errors and warnings (#606)
  • Clean up unused adapters before saving checkpoint (#605)
  • build: Upgrade to unsloth 2026.3.3 (#604)
  • Release v0.5.16 (#603)
  • build: Downgrade unsloth and unsloth zoo (#602)

Full Changelog: https://github.com/OpenPipe/ART/compare/v0.5.16...v0.5.17

v0.5.16 Breaking risk
Notable features
  • Serverless SFT training now records provenance on completion
Full changelog

What's Changed

  • build: Downgrade unsloth and unsloth zoo (#602)
  • Release v0.5.15 (#599)
  • ci: Use hatch build in release workflow (#598)
  • Release v0.5.14 (#597)
  • Temporarily downgrade transformers (#596)
  • ci: Add package install smoke test workflow (#595)
  • Release v0.5.13 (#593)
  • fix: Use uv build in release workflow (#592)
  • Record serverless SFT provenance on training completion (#590)

Full Changelog: https://github.com/OpenPipe/ART/compare/v0.5.15...v0.5.16

v0.5.15 Breaking risk
Notable features
  • Serverless SFT training now records provenance on completion
Full changelog

What's Changed

  • ci: Use hatch build in release workflow (#598)
  • Temporarily downgrade transformers (#596)
  • ci: Add package install smoke test workflow (#595)
  • fix: Use uv build in release workflow (#592)
  • Record serverless SFT provenance on training completion (#590)

Full Changelog: https://github.com/OpenPipe/ART/compare/v0.5.14...v0.5.15

v0.5.14 Breaking risk

Minor fixes and improvements.

Full changelog

What's Changed

  • Temporarily downgrade transformers (#596)
  • ci: Add package install smoke test workflow (#595)
  • Release v0.5.13 (#593)
  • fix: Use uv build in release workflow (#592)

Full Changelog: https://github.com/OpenPipe/ART/compare/v0.5.13...v0.5.14

v0.5.13 Breaking risk
Notable features
  • Transformers v5.x update
  • MoE LoRA conversion
  • Flex Attention for Megatron
v0.5.11 Breaking risk

Minor fixes and improvements.

Full changelog

What's Changed

  • fix: lazy-import heavy deps in CLI so lightweight commands work without extras (#571)
  • Release v0.5.10 (#569)

Full Changelog: https://github.com/OpenPipe/ART/compare/v0.5.10...v0.5.11

v0.5.10 Breaking risk

Minor fixes and improvements.

Full changelog

What's Changed

  • ci: bootstrap GH-managed prek cache image workflow (#566)
  • WIP: SFT (local backend) (#530)

Full Changelog: https://github.com/OpenPipe/ART/compare/prek-uv-cache...v0.5.10

v0.5.9 Breaking risk
Breaking changes
  • SkyPilot backend removed
  • TorchTune service removed
Notable features
  • Tool support for RULER evaluation
  • TinkerNativeBackend
  • Backend-First Training API
v0.5.7 Breaking risk
Breaking changes
  • Minimum openai version 2.14.0 required
  • vLLM pinned to 0.13.0
Full changelog

What's Changed

  • fix: Bump minimum openai version to 2.14.0 (#504)
  • Release v0.5.6 (#502)
  • fix: Pin vLLM to 0.13.0 (#501)

Full Changelog: https://github.com/OpenPipe/ART/compare/v0.5.6...v0.5.7
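
The two constraints introduced in v0.5.7 (openai at least 2.14.0, vLLM pinned to 0.13.0) can be checked against an environment before upgrading. The following is a minimal sketch and not part of ART: the helper names `parse_version`, `check_v0_5_7_requirements`, and `installed_versions` are hypothetical, and it assumes vLLM only needs checking when it is actually installed.

```python
import re
from importlib import metadata


def parse_version(text):
    """Return the leading numeric components of a version string,
    padded to at least three entries (e.g. "2.14" -> (2, 14, 0))."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    return parts + (0,) * max(0, 3 - len(parts))


def check_v0_5_7_requirements(installed):
    """Return a list of human-readable problems for a {name: version} map.

    A version of None means the package is not installed.
    """
    problems = []

    openai_version = installed.get("openai")
    if openai_version is None:
        problems.append("openai is not installed")
    elif parse_version(openai_version) < (2, 14, 0):
        problems.append(f"openai {openai_version} is older than 2.14.0")

    # Assumption: vLLM is optional, so it is only checked when present.
    vllm_version = installed.get("vllm")
    if vllm_version is not None and parse_version(vllm_version)[:3] != (0, 13, 0):
        problems.append(f"vllm {vllm_version} does not match the 0.13.0 pin")

    return problems


def installed_versions(names):
    """Look up installed versions via importlib.metadata (None if missing)."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for problem in check_v0_5_7_requirements(installed_versions(["openai", "vllm"])):
        print(problem)
```

Run as a script, it prints one line per mismatch and nothing when the environment satisfies both constraints.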

v0.5.6 Breaking risk
⚠ Upgrade required
  • vLLM pinned to 0.13.0
Notable features
  • Support for LocalBackend Tinker model service
Full changelog

What's Changed

  • fix: Pin vLLM to 0.13.0 (#501)
  • release: Bump version to 0.5.5 (#500)
  • feat: Add support for a LocalBackend Tinker model service (#499)

Full Changelog: https://github.com/OpenPipe/ART/compare/v0.5.5...v0.5.6

v0.5.5 Breaking risk
Notable features
  • LocalBackend Tinker model service
  • CISPO default loss
  • vLLM 0.11+ support
v0.5.3 Breaking risk
Notable features
  • strip_logprobs utility function
  • OpenEnv integration example
v0.4.12 Breaking risk
Notable features
  • Client-side error capturing
  • Serverless metrics reporting
  • Playwright agent example and LangGraph RULER integration
v0.4.11 New feature
Notable features
  • art.mcp package for Model Context Protocol integration
v0.4.9 New feature
Notable features
  • LangGraph integration with auto trajectory generation
  • RULER compatibility for general-purpose rewards
  • Multi-step agent training
v0.4.8 Bug fix
Notable features
  • Experimental standard deviation learning rate scheduling
  • Truncated importance sampling support
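
For readers unfamiliar with the term, truncated importance sampling caps the ratio between the current policy's probability and the sampling policy's probability so that a few tokens with very large ratios cannot dominate an update. The sketch below illustrates the general technique only; it is not ART's implementation, and the function name and the default `cap` value are assumptions chosen for illustration.

```python
import math


def truncated_importance_weights(new_logprobs, old_logprobs, cap=2.0):
    """Per-token importance weights min(exp(new - old), cap).

    new_logprobs: log-probabilities under the policy being trained.
    old_logprobs: log-probabilities under the policy that sampled the tokens.
    cap: upper bound on each weight (illustrative default, not ART's).
    """
    if len(new_logprobs) != len(old_logprobs):
        raise ValueError("log-probability sequences must have equal length")
    weights = []
    for new_lp, old_lp in zip(new_logprobs, old_logprobs):
        ratio = math.exp(new_lp - old_lp)  # pi_new(token) / pi_old(token)
        weights.append(min(ratio, cap))    # truncate from above only
    return weights
```

Ratios below the cap pass through unchanged, so the weighting only differs from plain importance sampling on tokens whose probability has grown sharply under the new policy.
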
v0.4.5 New feature

Introduces the GSPO algorithm for stable Mixture-of-Experts model training.

v0.4.3 Breaking risk
Breaking changes
  • SkyPilot is now an optional dependency; install openpipe-art[skypilot] to use SkyPilotBackend
v0.4.0 New feature
Notable features
  • RULER reward function
  • LLM-as-judge trajectory ranking
v0.3.12 Mixed
Notable features
  • Multi-device training
  • Langfuse tracing integration
  • W&B Weave integration
v0.3.11 Maintenance

What's Changed

  • Limit number of metrics shown in `gather_trajectory_groups`

v0.3.9 Maintenance
Notable features
  • Model config serialization
  • Trajectory and metrics logging via remote backend
  • Asymmetric PPO clipping and Qwen 3 support
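
Asymmetric PPO clipping, in general terms, uses different lower and upper bounds on the policy ratio instead of a single symmetric epsilon. The sketch below shows the per-token clipped surrogate under that idea; it is an illustration of the general technique rather than ART's code, and the function name and the default values of `eps_low` and `eps_high` are assumptions.

```python
def asymmetric_clipped_surrogate(ratio, advantage, eps_low=0.2, eps_high=0.28):
    """PPO-style clipped objective for one token with separate clip bounds.

    ratio: pi_new(token) / pi_old(token).
    advantage: advantage estimate for the token.
    eps_low, eps_high: clip widths below and above 1 (illustrative values).
    """
    # Clip the ratio into [1 - eps_low, 1 + eps_high].
    clipped_ratio = max(1.0 - eps_low, min(ratio, 1.0 + eps_high))
    # Take the more pessimistic of the unclipped and clipped objectives.
    return min(ratio * advantage, clipped_ratio * advantage)
```

With `eps_high` larger than `eps_low`, tokens with positive advantage can have their probability raised further before the objective flattens than tokens with negative advantage can have theirs lowered.
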
v0.3.6 Maintenance
Notable features
  • Model deployment support
  • S3 functions exposed on CLI server
  • Span duration tracking for rollouts
v0.2.0 Mixed
Notable features
  • Servable API
  • S3 model loading/pushing helpers
  • New benchmarking code
v0.1.24 Breaking risk

Trajectories no longer include default values; users must now explicitly specify values.

v0.1.20 Bug fix

What's Changed

  • fix: Patch MultiStepModelRunner for Unsloth compatibility

v0.1.19 New feature

Enables training on longer sequences in memory-constrained environments.

v0.1.9 Breaking risk
Breaking changes
  • Deprecate gather_groups, introduce gather_trajectories
  • API changes
Notable features
  • Unsloth KL divergence support
  • Async generator support in mp_actors