AutoResearchClaw
AI Agents & Assistants
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper.
Features
- Autonomously generates research papers across multiple domains
- Supports Human‑in‑the‑Loop co‑pilot collaboration with configurable intervention modes
- Provides a multi‑domain experiment agent framework (e.g., physics, biology, statistics)
- Includes the ARC‑Bench benchmark dataset for evaluating autonomous research systems
Recent releases
- 6+ intervention modes
- Idea Workshop
- Paper Co-Writer
Full changelog
v0.4.0 — Human-in-the-Loop Co-Pilot System
AutoResearchClaw is no longer purely autonomous. The new HITL Co-Pilot system transforms the pipeline into a human-AI collaborative research engine.
Highlights
- 6+ Intervention Modes: full-auto, gate-only, checkpoint, step-by-step, co-pilot, custom, express
- Idea Workshop: Brainstorm and refine hypotheses collaboratively (Stages 7-8)
- Baseline Navigator: Review and customize experiment designs (Stage 9)
- Paper Co-Writer: Section-by-section collaborative drafting (Stages 16-19)
- SmartPause: Confidence-driven dynamic intervention
- ALHF Intervention Learning: Learns from your review patterns
- Claim Verification: Inline fact-checking against collected literature
- Cost Guardrails: Budget monitoring with threshold alerts
- Pipeline Branching: Fork to explore multiple research directions
- CLI Commands: attach, status, approve, reject, guide
- 3 Adapters: CLI, WebSocket, MCP
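To make the SmartPause and Cost Guardrails items concrete, here is a minimal sketch of confidence-driven pausing and budget threshold alerts. Everything in it is an assumption made for illustration: `StageResult`, `should_pause`, `budget_alerts`, the 0.7 confidence cutoff, and the alert fractions are hypothetical and are not taken from the AutoResearchClaw codebase.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StageResult:
    """Hypothetical record of one pipeline stage's outcome."""
    stage: int
    confidence: float  # assumed to lie in [0.0, 1.0]


def should_pause(result: StageResult, min_confidence: float = 0.7) -> bool:
    """Ask for human review when the stage's confidence is below the cutoff."""
    return result.confidence < min_confidence


def budget_alerts(spent: float, budget: float,
                  thresholds: tuple[float, ...] = (0.5, 0.8, 1.0)) -> list[float]:
    """Return every budget fraction that current spending has reached."""
    return [t for t in thresholds if spent >= t * budget]


if __name__ == "__main__":
    # A low-confidence stage triggers a pause; a confident one does not.
    print(should_pause(StageResult(stage=8, confidence=0.55)))
    print(should_pause(StageResult(stage=9, confidence=0.92)))
    # Spending 8.50 of a 10.00 budget crosses the 50% and 80% marks.
    print(budget_alerts(spent=8.5, budget=10.0))
```

In this sketch the pause decision depends only on a per-stage confidence score. A real system might also weigh stage type or past reviewer behavior, which the release notes do not detail.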
New Files
- researchclaw/hitl/: 34 modules (7,500+ lines)
- tests/test_hitl_*.py: 9 test files (242 tests)
- docs/HITL_GUIDE.md: 620-line guide
- 3 new built-in skills
Testing
- 2,753 tests passed, 0 failures
Full Changelog: https://github.com/aiming-lab/AutoResearchClaw/compare/v0.3.2...v0.4.0
- VerifiedRegistry ground-truth whitelist
- Experiment diagnosis & repair
- ACP-compatible agent backends
Full changelog
What's New
Cross-Platform Support
- ACP-compatible agent backends: Claude Code, Codex CLI, Copilot CLI, Gemini CLI, Kimi CLI
- OpenClaw bridge: messaging platform integration (Discord, Telegram, Lark, WeChat)
- CLI-agent code generation backend: delegates Stages 10 & 13 to external CLI agents with budget control and timeout management
Anti-Fabrication System
- VerifiedRegistry: ground-truth whitelist from experiment results with tolerance matching
- Experiment diagnosis & repair loop: 13 deficiency categories, auto-repair with best-result selection
- Always-on sanitization: unverified numbers replaced in paper tables
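The whitelist-with-tolerance idea behind the anti-fabrication items can be illustrated with a short sketch. It assumes a registry of numbers taken from experiment results, a relative tolerance for matching rounded values, and a placeholder for anything unmatched. `NumberRegistry`, `sanitize_row`, the tolerance of 1e-3, and the `--` placeholder are hypothetical names and values, not the project's actual API.

```python
import math
import re


class NumberRegistry:
    """Hypothetical whitelist of numbers that came from experiment results."""

    def __init__(self, values, rel_tol=1e-3):
        self.values = list(values)
        self.rel_tol = rel_tol

    def is_verified(self, x: float) -> bool:
        # A reported number counts as verified if it is close to any
        # recorded result, which tolerates rounding in the paper text.
        return any(math.isclose(x, v, rel_tol=self.rel_tol) for v in self.values)


# Matches decimal numbers such as 0.873 or -12.5 (integers are left alone).
DECIMAL = re.compile(r"-?\d+\.\d+")


def sanitize_row(row: str, registry: NumberRegistry, placeholder: str = "--") -> str:
    """Replace every decimal in a table row that the registry cannot verify."""
    def repl(match: re.Match) -> str:
        text = match.group(0)
        return text if registry.is_verified(float(text)) else placeholder
    return DECIMAL.sub(repl, row)


if __name__ == "__main__":
    registry = NumberRegistry([0.8731, 12.5])
    # 0.873 matches the recorded 0.8731 within tolerance; 0.912 has no source.
    print(sanitize_row("ours | 0.873 | 0.912", registry))
```

Matching with a relative tolerance lets a paper report 0.873 for a measured 0.8731. The choice of tolerance is a trade-off between accepting honest rounding and accepting nearby invented values.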
Stability & Quality
- 100+ bug fixes across 8 deep audit rounds
- Modular executor refactoring (10K → 400-line facade)
- --resume auto-detection for interrupted runs
- LLM retry hardening with exponential backoff
- Community-reported fixes (macOS M3, math/theoretical topics)
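Retry with exponential backoff, mentioned in the list above, is a general technique, and a generic sketch follows. The function name `retry_with_backoff`, its parameters, and the 10% jitter are assumptions for illustration; the release notes do not describe how AutoResearchClaw implements its retries.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call `call()` until it succeeds, doubling the wait after each failure.

    The delay is capped at `max_delay`, and up to 10% random jitter is added
    so that many clients do not retry at the same instant.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))


if __name__ == "__main__":
    attempts = {"n": 0}

    def flaky_llm_call():
        # Stand-in for an LLM request that fails twice before succeeding.
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise TimeoutError("simulated transient failure")
        return "ok"

    # Inject a no-op sleep so the demo finishes immediately.
    print(retry_with_backoff(flaky_llm_call, sleep=lambda seconds: None))
```

Passing `sleep` as a parameter is a design choice that keeps the function testable without real waiting. Production code would usually also restrict the caught exception types to transient errors.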
New Subsystems
- Assessor (paper quality scoring + venue recommendation)
- Calendar (conference deadline tracking)
- Collaboration (multi-user research coordination)
- Copilot (interactive steering modes)
- Dashboard (real-time metrics broadcasting)
- Knowledge Graph (entity extraction + visualization)
- Memory (cross-run experiment/ideation/writing memory)
- MCP (Model Context Protocol server)
- Overleaf (live sync with conflict resolution)
- Project Manager (multi-project scheduling)
- Remote Servers (SSH/SLURM/cloud execution)
- Skills Library (12 built-in domain/tooling skills)
- Trends (daily arXiv digest + opportunity finder)
- Voice (speech-to-text commands)
- Wizard (guided project setup)
Testing
- 1,935 tests passing
Full Changelog: https://github.com/aiming-lab/AutoResearchClaw/compare/v0.3.1...v0.3.2
- Beast Mode for complex code generation with 6-signal complexity scoring and CodeAgent fallback
- Cross-domain support for ML, physics, chemistry, economics, math, biology, and security
- Web integration with Google Scholar, PDF extraction, and crawling capabilities
- CodeAgent v2 with sequential file generation and hard validation gates
- MetaClaw cross-run learning integration with skill injection and +18.3% robustness improvement
- 50+ pipeline bug fixes covering metrics, citations, LaTeX escaping, and Docker sandbox