Release history

OpenMAIC releases

Open Multi-Agent Interactive Classroom — Get an immersive, multi-agent learning experience in just one click

All releases

v0.2.2 Breaking risk
Config change, Auth

MAIC Editor, Outline edit, Offline export

v0.2.1 Breaking risk
Notable features
  • VoxCPM2 TTS provider with voice cloning from reference audio and Auto Voice generation
  • Per-model thinking configuration mapped to provider-specific reasoning fields (Anthropic thinking, OpenAI reasoning, etc.)
  • End-of-course completion page with persistent quiz state and motion-respecting confetti animation
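
As an illustration of the per-model thinking mapping, here is a minimal sketch of how one setting could be translated into provider-specific request fields. The type names, the effort-to-budget table, and the function names are assumptions made for this example and are not taken from the OpenMAIC codebase. Only the request field shapes (Anthropic `thinking` with `budget_tokens`, OpenAI `reasoning` with `effort`) follow the providers' public APIs.

```typescript
// Hypothetical types for illustration; not OpenMAIC's actual schema.
type Effort = "low" | "medium" | "high";
type ThinkingSetting =
  | { kind: "off" }
  | { kind: "effort"; effort: Effort }
  | { kind: "budget"; budgetTokens: number };
type Provider = "anthropic" | "openai";

// Assumed effort-to-budget table; real values would be a product decision.
const EFFORT_TO_BUDGET: Record<Effort, number> = { low: 1024, medium: 4096, high: 16384 };

// Assumed inverse mapping for providers that only accept effort levels.
function budgetToEffort(budgetTokens: number): Effort {
  if (budgetTokens <= EFFORT_TO_BUDGET.low) return "low";
  if (budgetTokens <= EFFORT_TO_BUDGET.medium) return "medium";
  return "high";
}

// Translate one provider-neutral setting into the fields a request body needs.
function toProviderFields(provider: Provider, setting: ThinkingSetting): Record<string, unknown> {
  if (setting.kind === "off") return {}; // send no reasoning field at all
  if (provider === "anthropic") {
    const budget =
      setting.kind === "budget" ? setting.budgetTokens : EFFORT_TO_BUDGET[setting.effort];
    return { thinking: { type: "enabled", budget_tokens: budget } };
  }
  const effort =
    setting.kind === "effort" ? setting.effort : budgetToEffort(setting.budgetTokens);
  return { reasoning: { effort } };
}
```

The returned object would be spread into the request body, so chat and generation paths can share a single setting without knowing which provider they target.
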
Full changelog

Features

  • VoxCPM2 TTS provider with voice cloning — OpenMAIC adapts to user-managed VoxCPM backends (vLLM-Omni, Nano-VLLM, official Python API). Clone any voice from a reference audio clip you upload or record in the browser, or let Auto Voice generate a fitting voice from each agent's persona at synthesis time. Voice profiles are stored locally to keep the serverless setup model. The Agent Bar exposes a searchable, previewable voice picker that draws from the global VoxCPM voice pool #496
  • Per-model thinking configuration — First-class metadata for each model's reasoning capability (effort levels, on/off toggle, adjustable budget, or fixed thinking) flows through chat and all generation paths and is mapped to the right provider-specific request fields (Anthropic thinking, OpenAI reasoning, etc.). The model selector becomes a unified provider/model/thinking popover with compact search and a much smaller toolbar footprint #494
  • End-of-course completion page with persistent quiz state — When the outline is fully materialized, students see a course-complete view with quiz score card, scene-type stat cards, and a (motion-respecting) confetti celebration. Quiz answers persist on submit and grading results persist on completion, so navigating away and back restores the reviewing state with AI feedback intact instead of resetting #484
  • Add latest released models including GPT-5.5, DeepSeek-V4 (-pro, -flash), Xiaomi MiMo (mimo-v2.5-pro, mimo-v2.5), Tencent Hy3, and OpenRouter as a multi-provider gateway #481 #487
  • Add OpenAI image generation (GPT-Image-2) as a media provider #481
  • Refresh built-in model registries across Anthropic, DeepSeek, Kimi, Qwen, MiniMax, Grok, OpenAI, GLM, SiliconFlow, and Ollama; persisted local settings now rehydrate in registry order, so newly curated lists appear in a consistent order without users having to clear saved state #481
  • Add inline search for recent classrooms on the home page, with deferred filtering by name and description and keyboard-driven open/clear/collapse #476
  • Add Deep-Interactive badge on classroom thumbnails for sessions generated with Interactive Mode #478
  • Replace always-included media instruction blocks in generation prompts with conditional snippet includes gated on imageEnabled / videoEnabled — disabled capabilities are removed from the prompt entirely instead of relying on negative-override directives the model often ignored #490 (by @YizukiAme)
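
The conditional snippet includes in the last item can be sketched roughly as follows. This is a simplified illustration, not the project's prompt loader: the snippet texts and the function names are hypothetical, and only the `imageEnabled` / `videoEnabled` flags come from the release note.

```typescript
interface MediaFlags {
  imageEnabled: boolean;
  videoEnabled: boolean;
}

// Hypothetical snippet texts; the real ones live in the project's prompt files.
const IMAGE_SNIPPET = "You may request generated images for a scene.";
const VIDEO_SNIPPET = "You may request generated video clips for a scene.";

// Only enabled capabilities contribute text. A disabled capability is simply
// absent, so no "do NOT generate images" override is needed.
function buildMediaInstructions(flags: MediaFlags): string {
  const parts: string[] = [];
  if (flags.imageEnabled) parts.push(IMAGE_SNIPPET);
  if (flags.videoEnabled) parts.push(VIDEO_SNIPPET);
  return parts.join("\n\n");
}

function buildPrompt(base: string, flags: MediaFlags): string {
  const media = buildMediaInstructions(flags);
  return media ? `${base}\n\n${media}` : base;
}
```

With both flags off, `buildPrompt` returns the base prompt unchanged, which is the behavior the change describes: the model never sees instructions for a capability it cannot use.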

Bug Fixes

  • Fix language drift between outline and scene generation by unifying the languageDirective across the pipeline so the same target language flows from outline planning through every per-scene call #474
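
One way to picture this fix: the directive is computed once and carried in a shared context, so the outline and per-scene prompt builders cannot diverge. The sketch below is an assumption-laden illustration. `PipelineContext`, `createContext`, and the two prompt functions are hypothetical names; only `languageDirective` appears in the release note.

```typescript
// Hypothetical shared context; only the languageDirective name is from the changelog.
interface PipelineContext {
  languageDirective: string;
}

// Build the directive exactly once per generation run.
function createContext(targetLanguage: string): PipelineContext {
  return { languageDirective: `Write all learner-facing content in ${targetLanguage}.` };
}

// Both stages read the same directive instead of inferring the language again.
function outlinePrompt(topic: string, ctx: PipelineContext): string {
  return `Plan a course outline about ${topic}.\n${ctx.languageDirective}`;
}

function scenePrompt(sceneTitle: string, ctx: PipelineContext): string {
  return `Generate the scene "${sceneTitle}".\n${ctx.languageDirective}`;
}
```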

Other Changes

  • Refactor whiteboard role prompts to file-based markdown templates and add a geometry-conflict detector (overlap, line-through-bbox, canvas clipping) that surfaces problems back to the model. Eval (flash, repeat 3, gemini-3.1-pro scorer) shows overall quality 5.4 → 6.1 and overlap 6.3 → 8.1 from prompt + detector alone #485
  • Migrate orchestration prompt builders (buildStructuredPrompt, buildDirectorPrompt, buildPBLSystemPrompt) from inline TS template literals to file-based markdown templates under lib/prompts/, sharing the loader infrastructure with the generation pipeline. prompt-builder.ts shrinks from 890 to 314 lines, and future content tweaks land as markdown edits #459
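
The three checks named for the geometry-conflict detector (overlap, line-through-bbox, canvas clipping) can be sketched as below. This is a minimal illustration under assumed data shapes, not the detector that shipped: the `Box`, `Line`, `Canvas`, and `Conflict` types and all function names are hypothetical, and the line test uses standard Liang-Barsky segment clipping.

```typescript
// Hypothetical shapes for illustration; not OpenMAIC's actual element model.
interface Box { id: string; x: number; y: number; w: number; h: number; }
interface Line { id: string; x1: number; y1: number; x2: number; y2: number; }
interface Canvas { width: number; height: number; }
interface Conflict {
  kind: "overlap" | "line-through-bbox" | "canvas-clipping";
  ids: string[];
}

// Axis-aligned boxes overlap when their interiors intersect on both axes.
function boxesOverlap(a: Box, b: Box): boolean {
  return a.x < b.x + b.w && b.x < a.x + a.w && a.y < b.y + b.h && b.y < a.y + a.h;
}

// A box is clipped when any part of it lies outside the canvas.
function clipsCanvas(b: Box, c: Canvas): boolean {
  return b.x < 0 || b.y < 0 || b.x + b.w > c.width || b.y + b.h > c.height;
}

// Liang-Barsky clipping: true when a non-degenerate segment shares more than
// a single point with the box.
function lineCrossesBox(l: Line, b: Box): boolean {
  const dx = l.x2 - l.x1;
  const dy = l.y2 - l.y1;
  const edges: Array<[number, number]> = [
    [-dx, l.x1 - b.x],
    [dx, b.x + b.w - l.x1],
    [-dy, l.y1 - b.y],
    [dy, b.y + b.h - l.y1],
  ];
  let tEnter = 0;
  let tExit = 1;
  for (const [p, q] of edges) {
    if (p === 0) {
      if (q < 0) return false; // parallel to this edge and outside it
    } else if (p < 0) {
      tEnter = Math.max(tEnter, q / p);
    } else {
      tExit = Math.min(tExit, q / p);
    }
  }
  return tEnter < tExit;
}

function detectConflicts(boxes: Box[], lines: Line[], canvas: Canvas): Conflict[] {
  const conflicts: Conflict[] = [];
  boxes.forEach((a, i) => {
    if (clipsCanvas(a, canvas)) conflicts.push({ kind: "canvas-clipping", ids: [a.id] });
    for (const b of boxes.slice(i + 1)) {
      if (boxesOverlap(a, b)) conflicts.push({ kind: "overlap", ids: [a.id, b.id] });
    }
    for (const l of lines) {
      if (lineCrossesBox(l, a)) conflicts.push({ kind: "line-through-bbox", ids: [l.id, a.id] });
    }
  });
  return conflicts;
}
```

A list like this could be serialized into a short text report and appended to the next model turn, which is one plausible reading of "surfaces problems back to the model".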

Full Changelog: https://github.com/THU-MAIC/OpenMAIC/compare/v0.2.0...v0.2.1

v0.2.0 New feature
Notable features
  • Deep Interactive Mode with AI teacher operating UI for interactive scenes (3D visualization, simulation, game, mind map/diagram, online programming)
  • Code element support on whiteboard
  • Arabic (ar-SA) language interface support
Full changelog

[0.2.0] - 2026-04-20

Features

  • Deep Interactive Mode — Generate hands-on interactive scenes (3D visualization, simulation, game, mind map/diagram, online programming) with an AI teacher who operates the UI to guide students. Fully responsive across desktop, tablet, and mobile #461
  • Add code element support on the whiteboard — AI agents can write, display, and reference runnable code during lessons #385 (by @cosarah)
  • Add Arabic (ar-SA) interface language #431 (by @YizukiAme)
  • Add MinerU Cloud API as a PDF parsing provider, with a dedicated settings UI #438
  • Add latest OpenAI models to the default config #416 (by @donghch)
  • Add GLM-5.1 and GLM-5V-Turbo to GLM preset models #437
  • Add international base URL shortcuts for GLM, Kimi, and MiniMax in provider settings #449
  • Add anti-framing security headers (X-Frame-Options + CSP frame-ancestors) with an optional ALLOWED_FRAME_ANCESTORS override #430 (by @YizukiAme)
  • Add i18n key alignment check to CI so missing or extra translation keys fail the build #447 (by @KanameMadoka520)
  • Add whiteboard layout quality eval harness and unify it with the outline-language harness #425 #453
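
For the anti-framing headers listed above, a rough sketch of how the two headers and the optional override might fit together is shown here. The function name, the `SAMEORIGIN` / `'self'` defaults, and the assumption that `ALLOWED_FRAME_ANCESTORS` holds a space-separated list of origins are all illustrative guesses; the release note only confirms that the two headers and the override exist.

```typescript
// Build anti-framing response headers. The argument stands in for the value of
// the ALLOWED_FRAME_ANCESTORS override (format assumed: space-separated origins).
function antiFramingHeaders(allowedFrameAncestors?: string): Record<string, string> {
  const ancestors = allowedFrameAncestors?.trim();
  if (!ancestors) {
    // Assumed default: same-origin framing only, expressed in both headers.
    return {
      "X-Frame-Options": "SAMEORIGIN",
      "Content-Security-Policy": "frame-ancestors 'self'",
    };
  }
  // X-Frame-Options has no supported allow-list form, so in this sketch an
  // explicit allow-list is expressed through CSP frame-ancestors alone.
  return {
    "Content-Security-Policy": `frame-ancestors 'self' ${ancestors}`,
  };
}
```

For example, passing `"https://lms.example.edu"` (a placeholder origin) would allow embedding from that site while still blocking every other parent frame.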

Bug Fixes

  • Fix classroom ZIP export to use the latest classroom name from IndexedDB #435
  • Fix spotlight cutout for text elements and add element-content variant for image/video #457

Other Changes

  • Refresh the README with a Deep Interactive Mode showcase and visual assets #463 (by @Shirokumaaaa)
  • Update Discord invite links across README, CONTRIBUTING, and issue templates
v0.1.1 Breaking risk
Security fixes
  • Fix DNS rebinding bypass in SSRF validation
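
For context, DNS rebinding defeats SSRF checks that validate a hostname once and then let the HTTP client resolve it again: the second lookup can return an internal address. A common mitigation is to validate every resolved address and then connect to the validated IP directly. The sketch below illustrates that idea only; it is not the project's actual fix, the function names are hypothetical, and it handles IPv4 only (anything it cannot parse is rejected).

```typescript
// True for loopback, private, link-local, and "this network" IPv4 ranges.
// Anything that is not a well-formed dotted quad is also treated as blocked.
function isBlockedIPv4(ip: string): boolean {
  const parts = ip.split(".");
  const wellFormed =
    parts.length === 4 && parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!wellFormed) return true;
  const a = Number(parts[0]);
  const b = Number(parts[1]);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Given the addresses from a single DNS lookup, return one address to pin the
// connection to, or null if any answer points at a blocked range. The caller
// should connect to the returned IP rather than resolving the hostname again.
function pickSafeAddress(resolved: string[]): string | null {
  if (resolved.length === 0 || resolved.some(isBlockedIPv4)) return null;
  return resolved[0] ?? null;
}
```

Rejecting the whole answer set when any record is internal is a deliberately conservative choice here: a mixed response is a typical sign of a rebinding attempt.
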
Notable features
  • Inline language inference for outline and PBL generation
  • Custom OpenAI-compatible TTS/ASR provider support
  • Ollama as built-in provider with keyless activation
v0.1.0 Security relevant
Security fixes
  • SSRF vulnerability fixed
  • Credential forwarding vulnerability fixed
Notable features
  • Discussion TTS with per-agent voice assignment
  • Immersive Mode with full-screen view
  • Keyboard shortcuts for roundtable controls
