OpenMAIC releases - releaseport

Config change

v0.2.2 Breaking risk 1d

Auth

MAIC Editor, Outline edit, Offline export


v0.2.1 Breaking risk 1mo

Notable features

VoxCPM2 TTS provider with voice cloning from reference audio and Auto Voice generation
Per-model thinking configuration mapped to provider-specific reasoning fields (Anthropic thinking, OpenAI reasoning, etc.)
End-of-course completion page with persistent quiz state and motion-respecting confetti animation
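As a rough illustration of what "mapped to provider-specific reasoning fields" can look like, the sketch below turns one neutral thinking setting into either an Anthropic-style `thinking` object or an OpenAI-style `reasoning` object. The type names, the effort-to-budget table, and the `toProviderFields` helper are all hypothetical; the release notes do not show OpenMAIC's actual metadata shapes or mapping rules.

```typescript
// Hypothetical types: the release notes do not show OpenMAIC's real metadata shapes.
type Effort = "low" | "medium" | "high";

type ThinkingSetting =
  | { kind: "off" }
  | { kind: "effort"; effort: Effort }
  | { kind: "budget"; budgetTokens: number };

type Provider = "anthropic" | "openai";

// Assumed effort-to-budget table; the values are illustrative only.
const EFFORT_BUDGETS: Record<Effort, number> = { low: 1024, medium: 4096, high: 16384 };

// Assumed thresholds for turning a token budget back into an effort level.
function budgetToEffort(budgetTokens: number): Effort {
  if (budgetTokens <= EFFORT_BUDGETS.low) return "low";
  if (budgetTokens <= EFFORT_BUDGETS.medium) return "medium";
  return "high";
}

// Returns the extra request fields to merge into a provider call.
function toProviderFields(provider: Provider, setting: ThinkingSetting): Record<string, unknown> {
  if (setting.kind === "off") return {};
  if (provider === "anthropic") {
    // Anthropic-style requests take a token budget, so effort levels map to assumed budgets.
    const budget = setting.kind === "budget" ? setting.budgetTokens : EFFORT_BUDGETS[setting.effort];
    return { thinking: { type: "enabled", budget_tokens: budget } };
  }
  // OpenAI-style requests take an effort level, so budgets map to the nearest assumed level.
  const effort = setting.kind === "effort" ? setting.effort : budgetToEffort(setting.budgetTokens);
  return { reasoning: { effort } };
}
```

Keeping the neutral setting separate from the request fields means the UI only has to understand one shape, while each provider adapter owns its own translation.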

Full changelog

Features

VoxCPM2 TTS provider with voice cloning — OpenMAIC adapts to user-managed VoxCPM backends (vLLM-Omni, Nano-VLLM, official Python API). Clone any voice from a reference audio clip you upload or record in the browser, or let Auto Voice generate a fitting voice from each agent's persona at synthesis time. Voice profiles are stored locally to keep the serverless setup model. The Agent Bar exposes a searchable, previewable voice picker that draws from the global VoxCPM voice pool #496
Per-model thinking configuration — First-class metadata for each model's reasoning capability (effort levels, on/off toggle, adjustable budget, or fixed thinking) flows through chat and all generation paths and is mapped to the right provider-specific request fields (Anthropic thinking, OpenAI reasoning, etc.). The model selector becomes a unified provider/model/thinking popover with compact search and a much smaller toolbar footprint #494
End-of-course completion page with persistent quiz state — When the outline is fully materialized, students see a course-complete view with quiz score card, scene-type stat cards, and a (motion-respecting) confetti celebration. Quiz answers persist on submit and grading results persist on completion, so navigating away and back restores the reviewing state with AI feedback intact instead of resetting #484
Add latest released models including GPT-5.5, DeepSeek-V4 (-pro, -flash), Xiaomi MiMo (mimo-v2.5-pro, mimo-v2.5), Tencent Hy3, and OpenRouter as a multi-provider gateway #481 #487
Add OpenAI image generation (GPT-Image-2) as a media provider #481
Refresh built-in model registries across Anthropic, DeepSeek, Kimi, Qwen, MiniMax, Grok, OpenAI, GLM, SiliconFlow, and Ollama; persisted local settings now rehydrate in registry order, so newly curated lists appear in a consistent order without clearing saved state #481
Add inline search for recent classrooms on the home page, with deferred filtering by name and description and keyboard-driven open/clear/collapse #476
Add Deep-Interactive badge on classroom thumbnails for sessions generated with Interactive Mode #478
Replace always-included media instruction blocks in generation prompts with conditional snippet includes gated on imageEnabled / videoEnabled. Disabled capabilities are now removed from the prompt entirely, rather than suppressed with negative-override directives that the model often ignored #490 (by @YizukiAme)
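The conditional snippet includes described in the last item can be pictured roughly as follows. This is a minimal sketch: only the `imageEnabled` and `videoEnabled` flag names come from the changelog, while the snippet text and the `buildMediaInstructions` helper are invented for illustration.

```typescript
interface MediaFlags {
  imageEnabled: boolean;
  videoEnabled: boolean;
}

// Hypothetical snippet text; the real prompt snippets are not shown in the release notes.
const IMAGE_SNIPPET = "You may request generated images for a scene.";
const VIDEO_SNIPPET = "You may request generated video clips for a scene.";

// Only enabled capabilities contribute text. A disabled capability leaves no trace
// in the prompt, so there is no "do NOT use video" directive for the model to ignore.
function buildMediaInstructions(flags: MediaFlags): string {
  const parts: string[] = [];
  if (flags.imageEnabled) parts.push(IMAGE_SNIPPET);
  if (flags.videoEnabled) parts.push(VIDEO_SNIPPET);
  return parts.join("\n\n");
}
```

With both flags off the function returns an empty string, so the surrounding prompt template contains no media section at all.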

Bug Fixes

Fix language drift between outline and scene generation by unifying the languageDirective across the pipeline so the same target language flows from outline planning through every per-scene call #474

Other Changes

Refactor whiteboard role prompts to file-based markdown templates and add a geometry-conflict detector (overlap, line-through-bbox, canvas clipping) that surfaces problems back to the model. Eval (flash, repeat 3, gemini-3.1-pro scorer) shows overall quality 5.4 → 6.1 and overlap 6.3 → 8.1 from prompt + detector alone #485
Migrate orchestration prompt builders (buildStructuredPrompt, buildDirectorPrompt, buildPBLSystemPrompt) from inline TS template literals to file-based markdown templates under lib/prompts/, sharing the loader infrastructure with the generation pipeline. prompt-builder.ts 890 → 314 lines; future content tweaks land as markdown edits #459
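To make the geometry-conflict detector mentioned above more concrete, here is a small sketch of two of its three named checks (overlap and canvas clipping) on axis-aligned bounding boxes, plus a formatter that turns conflicts into feedback lines for the model. All types and function names are hypothetical, the line-through-bbox check is omitted, and the actual detector in #485 may work quite differently.

```typescript
// Hypothetical shapes for whiteboard elements and the canvas.
interface Box { id: string; x: number; y: number; width: number; height: number; }
interface Canvas { width: number; height: number; }

type Conflict =
  | { kind: "overlap"; a: string; b: string }
  | { kind: "clipped"; id: string };

// Two axis-aligned boxes overlap when their projections intersect on both axes.
function overlaps(a: Box, b: Box): boolean {
  return a.x < b.x + b.width && b.x < a.x + a.width &&
         a.y < b.y + b.height && b.y < a.y + a.height;
}

// A box is clipped when any edge falls outside the canvas.
function isClipped(box: Box, canvas: Canvas): boolean {
  return box.x < 0 || box.y < 0 ||
         box.x + box.width > canvas.width || box.y + box.height > canvas.height;
}

function detectConflicts(boxes: Box[], canvas: Canvas): Conflict[] {
  const conflicts: Conflict[] = [];
  for (let i = 0; i < boxes.length; i++) {
    const current = boxes[i]!;
    if (isClipped(current, canvas)) conflicts.push({ kind: "clipped", id: current.id });
    // Compare each pair once.
    for (let j = i + 1; j < boxes.length; j++) {
      const other = boxes[j]!;
      if (overlaps(current, other)) conflicts.push({ kind: "overlap", a: current.id, b: other.id });
    }
  }
  return conflicts;
}

// Plain-text feedback that could be appended to the next model turn.
function describeConflicts(conflicts: Conflict[]): string[] {
  return conflicts.map((c) =>
    c.kind === "overlap"
      ? `Elements ${c.a} and ${c.b} overlap; move or resize one of them.`
      : `Element ${c.id} extends beyond the canvas; move it inside the bounds.`
  );
}
```

Returning structured conflicts first and formatting them separately keeps the checks testable and leaves the wording of the feedback free to change.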

Full Changelog: https://github.com/THU-MAIC/OpenMAIC/compare/v0.2.0...v0.2.1


v0.2.0 New feature 1mo

Notable features

Deep Interactive Mode with AI teacher operating UI for interactive scenes (3D visualization, simulation, game, mind map/diagram, online programming)
Code element support on whiteboard
Arabic (ar-SA) language interface support

Full changelog

[0.2.0] - 2026-04-20

Features

Deep Interactive Mode — Generate hands-on interactive scenes (3D visualization, simulation, game, mind map/diagram, online programming) with an AI teacher who operates the UI to guide students. Fully responsive across desktop, tablet, and mobile #461
Add code element support on the whiteboard — AI agents can write, display, and reference runnable code during lessons #385 (by @cosarah)
Add Arabic (ar-SA) interface language #431 (by @YizukiAme)
Add MinerU Cloud API as a PDF parsing provider, with a dedicated settings UI #438
Add latest OpenAI models to the default config #416 (by @donghch)
Add GLM-5.1 and GLM-5V-Turbo to GLM preset models #437
Add international base URL shortcuts for GLM, Kimi, and MiniMax in provider settings #449
Add anti-framing security headers (X-Frame-Options + CSP frame-ancestors) with an optional ALLOWED_FRAME_ANCESTORS override #430 (by @YizukiAme)
Add i18n key alignment check to CI so missing or extra translation keys fail the build #447 (by @KanameMadoka520)
Add whiteboard layout quality eval harness and unify it with the outline-language harness #425 #453
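The anti-framing headers item above names X-Frame-Options, CSP frame-ancestors, and an optional ALLOWED_FRAME_ANCESTORS override, but not the exact values. The sketch below shows one plausible way to build them. The `SAMEORIGIN` / `'self'` defaults, the comma-or-space list format, and the choice to drop X-Frame-Options when an allowlist is set are assumptions, and `buildAntiFramingHeaders` is a hypothetical helper.

```typescript
// allowedFrameAncestors stands in for the ALLOWED_FRAME_ANCESTORS value;
// the comma/space separated format is an assumption.
function buildAntiFramingHeaders(allowedFrameAncestors?: string): Record<string, string> {
  const ancestors = (allowedFrameAncestors ?? "")
    .split(/[\s,]+/)
    .filter((origin) => origin.length > 0);

  if (ancestors.length === 0) {
    // Assumed default: only same-origin framing is allowed.
    return {
      "X-Frame-Options": "SAMEORIGIN",
      "Content-Security-Policy": "frame-ancestors 'self'",
    };
  }

  // X-Frame-Options cannot express an allowlist (ALLOW-FROM is obsolete),
  // so this sketch relies on CSP alone when extra origins are configured.
  return {
    "Content-Security-Policy": `frame-ancestors 'self' ${ancestors.join(" ")}`,
  };
}
```

For example, passing `"https://lms.example.edu"` (a made-up origin) would yield a single `Content-Security-Policy` header that permits framing from the app itself and from that origin.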

Bug Fixes

Fix classroom ZIP export to use the latest classroom name from IndexedDB #435
Fix spotlight cutout for text elements and add element-content variant for image/video #457

Other Changes

Renew the README with Deep Interactive Mode showcase and visual assets #463 (by @Shirokumaaaa)
Update Discord invite links across README, CONTRIBUTING, and issue templates


v0.1.1 Breaking risk 1mo

Security fixes

Fix DNS rebinding bypass in SSRF validation
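The release notes do not say how the bypass was closed. A common defence against DNS rebinding in SSRF checks is to validate the addresses a hostname actually resolves to and then connect to that same validated address instead of resolving again. The sketch below illustrates only that general idea for IPv4; every function name is hypothetical, and addresses that do not parse as IPv4 (including IPv6) are rejected so the check fails closed.

```typescript
// Parses a dotted-quad IPv4 string into four octets, or returns null if malformed.
function parseIPv4(ip: string): number[] | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((o) => o >= 0 && o <= 255) ? octets : null;
}

// True for loopback, private, link-local, and other non-public IPv4 ranges.
function isBlockedIPv4(ip: string): boolean {
  const octets = parseIPv4(ip);
  if (octets === null) return true; // fail closed on anything unparseable
  const a = octets[0]!;
  const b = octets[1]!;
  return (
    a === 0 ||                            // 0.0.0.0/8
    a === 10 ||                           // 10.0.0.0/8
    a === 127 ||                          // loopback
    (a === 100 && b >= 64 && b <= 127) || // 100.64.0.0/10 (carrier-grade NAT)
    (a === 169 && b === 254) ||           // link-local, includes cloud metadata endpoints
    (a === 172 && b >= 16 && b <= 31) ||  // 172.16.0.0/12
    (a === 192 && b === 168)              // 192.168.0.0/16
  );
}

// Given the addresses from a single DNS lookup, reject the request if any of them
// is blocked, otherwise return the one address the caller should connect to.
// Connecting to this pinned address avoids a second lookup that could return
// a different (internal) address.
function pinResolvedAddress(resolved: string[]): string {
  if (resolved.length === 0) throw new Error("hostname did not resolve");
  for (const ip of resolved) {
    if (isBlockedIPv4(ip)) throw new Error(`blocked address: ${ip}`);
  }
  return resolved[0]!;
}
```

The important property is that validation and connection use the same lookup result; checking a hostname and then letting the HTTP client resolve it again is what leaves room for rebinding.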

Notable features

Inline language inference for outline and PBL generation
Custom OpenAI-compatible TTS/ASR provider support
Ollama as built-in provider with keyless activation


v0.1.0 Security relevant 2mo

Security fixes

SSRF vulnerability fixed
Credential forwarding vulnerability fixed

Notable features

Discussion TTS with per-agent voice assignment
Immersive Mode with full-screen view
Keyboard shortcuts for roundtable controls

