This release includes 2 breaking changes that platform teams should review before upgrading.
✓ No known CVEs patched in this version
Summary
AI summary: AutoResearchClaw v0.5.0 is a mixed release that adds multi-domain features and introduces two breaking changes.
Changes in this release
| Type | Severity | Summary | CVE | Source | Confidence |
|---|---|---|---|---|---|
| Breaking | Medium | Topic IDs renamed: T01-T25 → ML01-ML25 in ARC-Bench | None | granite4.1:8b-q6_K@2026-05-20 | high |
| Feature | Medium | Multi-Domain Architecture supports HEP Physics, Biology, Quantum Computing, and Statistics domains with profile-driven deployment | None | granite4.1:8b-q6_K@2026-05-20 | high |
| Feature | Medium | ARC-Bench Evaluation Framework includes 50+ topics across 5 domains with rubric-based judging and baseline adapters | None | granite4.1:8b-q6_K@2026-05-20 | high |
| Feature | Medium | ColliderAgent Integration provides full HEP physics simulation pipeline (MadGraph → Pythia → Delphes) with incremental experiment mode | None | granite4.1:8b-q6_K@2026-05-20 | high |
| Feature | Medium | Biology-Agent Integration offers metabolic modeling with COBRApy/Biopython, FBA simulation, and GSMM validation | None | granite4.1:8b-q6_K@2026-05-20 | high |
| Feature | Medium | Quantum-Qiskit Skill supports Qiskit-based quantum computing experiments for quantum topics | None | granite4.1:8b-q6_K@2026-05-20 | high |
| Feature | Medium | Statistics Domain Agent enables statistical method design, experiment evaluation, and theory analysis | None | granite4.1:8b-q6_K@2026-05-20 | high |
| Feature | Medium | Requirements Gate validates LLM capabilities before pipeline execution | None | granite4.1:8b-q6_K@2026-05-20 | high |
| Feature | Medium | Profile-Driven Deployment provides interactive CLI for domain profile creation and management | None | granite4.1:8b-q6_K@2026-05-20 | high |
| Feature | Medium | Incremental Experiment Mode allows resuming experiments at Stage-12 with delta-prompt assembly | None | granite4.1:8b-q6_K@2026-05-20 | high |
| Feature | Medium | Profile-Driven Deployment offers interactive CLI for creating and managing domain profiles | None | granite4.1:30b@2026-05-20-audit | high |
| Bugfix | Medium | Expanded Test Suite includes new tests for HEP prompt hygiene, incremental experiments, and domain integrations | None | granite4.1:8b-q6_K@2026-05-20 | low |
| Refactor | Medium | researchclaw/prompts.py refactored into researchclaw/prompts/ package (domain-aware prompt banks) | None | granite4.1:8b-q6_K@2026-05-20 | low |
Full changelog
AutoResearchClaw v0.5.0
Highlights
- Multi-Domain Architecture: Expanded beyond ML to support HEP Physics, Biology, Quantum Computing, and Statistics domains with profile-driven deployment
- ARC-Bench Evaluation Framework: Standardized benchmark suite with 50+ topics across 5 domains (ML01-ML25, P01-P10, Q01-Q10, B01-B07, S01-S03), rubric-based judging, and baseline adapters for AIDE, Agent Laboratory, and AI-Scientist-v2
- ColliderAgent Integration: Full HEP physics simulation pipeline support (MadGraph → Pythia → Delphes) with incremental experiment mode and Stage-12 re-entry
- Biology-Agent Integration: Metabolic modeling with COBRApy/Biopython skills, FBA simulation, and GSMM validation
- Quantum-Qiskit Skill: Qiskit-based quantum computing experiment support for quantum topics
- Statistics Domain Agent: Statistical method design, experiment evaluation, and theory analysis
- Requirements Gate: LLM capability validation before pipeline execution
- Profile-Driven Deployment: Interactive CLI for domain profile creation and management
- Incremental Experiment Mode: Resume experiments at Stage-12 with delta-prompt assembly
- Expanded Test Suite: New tests for HEP prompt hygiene, incremental experiments, and domain integrations
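The ARC-Bench topic ranges listed above can be expanded to check the "50+ topics" figure. This is a minimal sketch, assuming each range is inclusive and that IDs keep the two-digit zero padding shown in the highlights. The `expand_range` helper is hypothetical and is not part of AutoResearchClaw.

```python
import re

# Topic ranges exactly as listed in the v0.5.0 highlights.
ARC_BENCH_RANGES = ["ML01-ML25", "P01-P10", "Q01-Q10", "B01-B07", "S01-S03"]


def expand_range(id_range: str) -> list:
    """Expand an inclusive range such as 'P01-P10' into individual topic IDs."""
    match = re.fullmatch(r"([A-Z]+)(\d+)-([A-Z]+)(\d+)", id_range)
    if match is None or match.group(1) != match.group(3):
        raise ValueError(f"unrecognized topic range: {id_range!r}")
    prefix = match.group(1)
    first, last = int(match.group(2)), int(match.group(4))
    width = len(match.group(2))  # preserve the zero padding used in the range
    return [f"{prefix}{n:0{width}d}" for n in range(first, last + 1)]


all_topics = [tid for r in ARC_BENCH_RANGES for tid in expand_range(r)]
print(len(all_topics))  # 25 + 10 + 10 + 7 + 3 = 55
```

The five ranges add up to 55 topic IDs, which is consistent with the "50+" claim.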
Breaking Changes
- Topic IDs renamed: T01-T25 → ML01-ML25 in ARC-Bench
- researchclaw/prompts.py refactored into researchclaw/prompts/ package (domain-aware prompt banks)
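Downstream projects can check for both breaking changes before upgrading. The sketch below rests on two assumptions: old IDs appear as standalone tokens `T01` to `T25` in your own configs or scripts, and the rename keeps the number unchanged (T07 becomes ML07), which the release notes imply but do not state. `migrate_topic_ids` and `is_package` are hypothetical helpers, not AutoResearchClaw APIs.

```python
import importlib.util
import re
from typing import Optional

# Matches standalone T01..T25 only. T00, T26, and IDs embedded in a longer
# word (for example "T01_result") are deliberately left untouched.
_OLD_TOPIC_ID = re.compile(r"\bT(0[1-9]|1[0-9]|2[0-5])\b")


def migrate_topic_ids(text: str) -> str:
    """Rewrite old ARC-Bench topic IDs (T01-T25) to the new ML01-ML25 form."""
    return _OLD_TOPIC_ID.sub(r"ML\1", text)


def is_package(module_name: str) -> Optional[bool]:
    """Return True for a package, False for a single-file module,
    and None if the name cannot be resolved in this environment."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ImportError:
        # Raised when a parent package (e.g. 'researchclaw') is not installed.
        return None
    if spec is None:
        return None
    return spec.submodule_search_locations is not None


print(migrate_topic_ids("topics: [T01, T25, T26]"))  # topics: [ML01, ML25, T26]
print(is_package("researchclaw.prompts"))  # result depends on the installed version
```

A plain `import researchclaw.prompts` resolves for both a module and a package, so the layout check only reports which form is installed. Whether individual names inside the old module kept their import paths is not stated in the release notes and should be verified against the v0.5.0 source.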
Documentation
- Domain Integration Guide for adding new scientific domains
- Tester guides in English, Chinese, and Japanese
- ARC-Bench experiment design docs and run guides
- Showcase papers demonstrating pipeline outputs
Full Changelog: https://github.com/aiming-lab/AutoResearchClaw/compare/v0.4.0...v0.5.0
About AutoResearchClaw
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper.