This release includes 2 breaking changes for platform teams planning a safe upgrade.
✓ No known CVEs patched in this version
ReleasePort's take
Moderate signal
Release v1.4.0-beta introduces Auto Mode for agentic capability routing and a three-layer memory subsystem, while adding numerous chat tools and UI enhancements.
Why it matters: Plan to test the new Auto Mode and memory layers in development; update any code referencing removed agents/ or prompts/ directories before upgrading to avoid breakage. No immediate security patch is required.
Summary
AI summary
This release adds Auto Mode for agentic capability routing, a three-layer memory subsystem, and several new chat tools. The changes are described under Highlights, Chat Surface Features, and Tests in the full changelog.
Changes in this release
| Type | Severity | Summary | Source | Confidence | CVE |
|---|---|---|---|---|---|
| Breaking | Medium | Removes legacy `agents/` and `prompts/` directories for research, solve, question modes | llm_adapter@2026-05-21 | high | - |
| Breaking | Medium | Removes legacy `main.yaml` capability copy in favor of per-capability prompt files | llm_adapter@2026-05-21 | low | - |
| Breaking | Medium | Deletes the legacy `main.yaml` capability copy; each capability now uses its own prompt files | granite4.1:30b@2026-05-21-audit | low | - |
| Feature | Medium | Adds Auto Mode, a new agentic capability router choosing the right mode for each request | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Implements three-stage agent loop: `ANALYZING`, `DELEGATING`, `SYNTHESIZING` for Auto Mode | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Replaces flat memory with three-layer subsystem: L1 (raw traces), L2 (normalized), L3 (curated) | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Adds modular consolidator pipeline turning run traces into versioned line-oriented documents | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Introduces Memory Workbench UI with `/memory` routes (`graph`, `l1`, `l2`, `l3`, `resolve`) | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Exposes `read_memory` and `write_memory` as first-class agent tools for chat | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Adds `/settings/memory` page with run controls, mode toggles, and storage status | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Adds `ask_user` tool for 1-3 structured questions pausing turn until user answers | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Adds `web_fetch` tool with readable-content extraction and strict security guards | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Replaces `save_to_notebook` with `write_note` tool supporting append and edit modes | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Adds `list_notebook` read-only tool for notebook and records indexing | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Adds `github_query` read-only `gh` CLI wrapper for `pr`, `issue`, `run`, `repo`, `api` | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Adds delete chat turn functionality with message IDs and optimistic UI handling | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Adds quiz follow-up chat composer for direct chat from quiz questions | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Adds GeoGebra applet renderer for inline geometry/algebra visualization | llm_adapter@2026-05-21 | high | - |
| Feature | Medium | Moves capability status copy to `capabilities/prompts/{en,zh}/<name>.yaml` via `StatusI18n` | llm_adapter@2026-05-21 | low | - |
| Feature | Medium | Tracks token usage and cost via `UsageTracker`, exposed in `/settings/capabilities` | llm_adapter@2026-05-21 | low | - |
| Feature | Medium | Supports six render types: `svg`, `chartjs`, `mermaid`, `html`, `manim_video`, `manim_image` | llm_adapter@2026-05-21 | low | - |
| Feature | Medium | Adds vertically resizable quiz answer textarea and normalizes newlines to Markdown | llm_adapter@2026-05-21 | low | - |
| Feature | Low | Polishes the quiz UI: resizable answer textarea, newline normalization to Markdown paragraphs | granite4.1:30b@2026-05-21-audit | high | - |
| Feature | Low | Moves capability status strings to per-language YAML files via `StatusI18n` accessor, removing hard-coded English strings | granite4.1:30b@2026-05-21-audit | low | - |
| Feature | Low | Introduces `UsageTracker` for token usage and cost, displayed on the `/settings/capabilities` admin page | granite4.1:30b@2026-05-21-audit | low | - |
| Bugfix | Medium | Decouples multi-user identity resolution from middleware, fixing cross-user data bleed | llm_adapter@2026-05-21 | high | - |
| Refactor | Medium | Rewrites Deep Research as agentic-engine orchestrator with four phases and labeled steps | llm_adapter@2026-05-21 | high | - |
| Refactor | Medium | Rewrites Deep Solve as agentic-engine orchestrator with Pre-retrieve, Plan, Solve phases | llm_adapter@2026-05-21 | high | - |
| Refactor | Medium | Replaces Question/Quiz generator with coordinator and pipeline architecture | llm_adapter@2026-05-21 | high | - |
| Refactor | Medium | Rebuilds chat around session-cumulative source inventory with branch-isolated manifest | llm_adapter@2026-05-21 | high | - |
| Refactor | Medium | Splits LlamaIndex into `config.py`, `ingestion.py`, `retrievers.py`, `document_loader.py` | llm_adapter@2026-05-21 | high | - |
| Refactor | Medium | Consolidates chat tools, hints, and arg wrappers in `tools/builtin/__init__.py` | llm_adapter@2026-05-21 | high | - |
| Refactor | Medium | Unifies all capabilities through `emit_capability_result` helper with shared envelope | llm_adapter@2026-05-21 | low | - |
| Refactor | Medium | Merges Animator menu into Visualize capability with `render_type` discriminator | llm_adapter@2026-05-21 | low | - |
| Refactor | Medium | Unifies capability results via `emit_capability_result` helper with a shared envelope (label, summary, payload, render hints) | granite4.1:30b@2026-05-21-audit | low | - |
| Refactor | Low | Merges the standalone Animator menu into Visualize, using a `render_type` discriminator for six renderer types (`svg`, `chartjs`, `mermaid`, `html`, `manim_video`, `manim_image`) | granite4.1:30b@2026-05-21-audit | low | - |
Full changelog
DeepTutor v1.4.0-beta Release Notes
Release Date: 2026.05.21
v1.4.0-beta is the largest release since the agent-native rewrite. It folds an
end-to-end Auto Mode on top of the existing capabilities, ships a
three-layer memory subsystem (L1/L2/L3) with a dedicated workbench, rebuilds
Deep Research / Deep Solve / Question on the same agentic engine as Chat,
re-architects the chat capability + LlamaIndex RAG pipeline around a
session-cumulative source inventory, unifies the Capabilities infrastructure
and i18n, merges the Animator menu into Visualize, and reorganises
Settings, environment, and the local launcher. Several new chat tools
(ask_user, web_fetch, write_note, list_notebook, github_query) plus a
delete-chat-turn flow, quiz follow-up chat, and a GeoGebra viewer round out the
release.
Highlights
Auto Mode — Agentic Capability Router
A new auto capability sits on top of the existing modes and chooses the right
one for each request, instead of forcing the user to pick a mode up front.
- Three-stage agent loop: `ANALYZING` (single LLM call, streamed as
  thinking) → `DELEGATING` (up to `max_iterations` of router calls that emit
  `delegate_to_<cap>` tool calls or atomic tool calls) → `SYNTHESIZING` (final
  inline answer, either passed through from the loop or assembled by a closing
  LLM call).
- Routes to real capabilities: `deep_solve`, `deep_question`,
  `deep_research`, `math_animator`, `visualize`, plus the chat-level atomic
  tools (`web_search`, `web_fetch`, `rag`, …) live behind the same router so
  the LLM can mix retrieval and full sub-capability runs in one turn.
- Bounded retries and quotas: independent retry budgets for router-LLM
  errors, per-delegation failures, and arg-validation feedback; a configurable
  `max_same_capability_calls` quota keeps the loop from spinning on one mode.
- Clean conversation history: sub-capability events flow through a
  `forward_events` shim that tags every content event with a `call_id`, so the
  conversation turn-runtime filter keeps only Auto's own final synthesis in
  saved history. Sub-runs are still streamed live to the UI.
- `answer_now` fast-path: when the user asks to "answer now" the pipeline
  skips analysis + delegation and produces an immediate inline reply.
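
The bounded delegation loop described above can be sketched roughly as follows. This is an illustration only, not DeepTutor's implementation: `RouterDecision`, `AutoResult`, `run_auto`, and the `router` / `delegate` callables are hypothetical names, and only `max_iterations` and `max_same_capability_calls` are taken from the release notes.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, List, Optional


# Hypothetical types for illustration; they do not come from the DeepTutor codebase.
@dataclass
class RouterDecision:
    capability: Optional[str] = None  # e.g. "deep_solve"; None means the router is done
    answer: Optional[str] = None      # inline answer when the router finishes


@dataclass
class AutoResult:
    stages: List[str] = field(default_factory=list)
    answer: str = ""


def run_auto(
    request: str,
    router: Callable[[str, List[str]], RouterDecision],
    delegate: Callable[[str, str], str],
    max_iterations: int = 4,
    max_same_capability_calls: int = 2,
) -> AutoResult:
    result = AutoResult()
    result.stages.append("ANALYZING")  # a real pipeline would call the LLM here
    observations: List[str] = []
    calls: Counter = Counter()

    result.stages.append("DELEGATING")
    for _ in range(max_iterations):
        decision = router(request, observations)
        if decision.capability is None:
            result.answer = decision.answer or ""
            break
        if calls[decision.capability] >= max_same_capability_calls:
            # Quota hit: feed that back instead of running the capability again.
            observations.append(f"quota reached for {decision.capability}")
            continue
        calls[decision.capability] += 1
        observations.append(delegate(decision.capability, request))

    result.stages.append("SYNTHESIZING")
    if not result.answer:
        # Closing step: assemble an answer from whatever the loop gathered.
        result.answer = " | ".join(observations)
    return result
```

With a router that keeps asking for the same capability, the quota stops the third delegation and the iteration budget ends the loop, so the turn always reaches the synthesis stage.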
Three-Layer Memory Subsystem (Memory v2)
The previous flat memory page is replaced by a structured three-layer store
with an explicit consolidation pipeline and a dedicated workbench.
- L1 / L2 / L3 layout: L1 captures raw run traces, L2 holds normalised
  document records, L3 holds curated slots per surface (chat, notebook, book,
  TutorBot). Per-user paths flow through `PathService` so multi-user
  deployments stay isolated.
- Consolidator pipeline: modular `consolidator/` modules (chunker, guards,
  parse, references, runs, modes, line-doc, meta) turn run traces into
  versioned line-oriented documents with stable ids, references between
  layers, and a snapshot history.
- Memory Workbench UI: new `/memory` routes (`graph`, `l1`, `l2`, `l3`,
  `resolve`) ship as standalone pages with workbench, hub, graph viewer, run
  panel, and an archived-state banner. A reusable `MemorySection` component is
  embedded where the legacy memory panel used to live.
- First-class chat tools: `read_memory` and `write_memory` are exposed
  as agent tools (with i18n hints) so chat / Auto can recall and update memory
  inside a turn instead of needing a separate save step.
- Settings integration: Memory now has its own page under
  `/settings/memory` with run controls, mode toggles, and storage status.
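
To make the idea of "line-oriented documents with stable ids" more concrete, here is a minimal sketch of turning one raw L1 trace into L2 lines. The `Line` record, the hash-based `stable_id`, and the `consolidate` function are assumptions made for illustration; the release notes do not describe how the real consolidator derives ids or normalises text.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Line:
    line_id: str     # stays the same for as long as the normalised text is unchanged
    text: str
    source_run: str  # reference back to the L1 trace that produced the line


def stable_id(text: str) -> str:
    # Assumption: ids are derived from content, so re-running consolidation is idempotent.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:12]


def consolidate(run_id: str, raw_trace: str) -> List[Line]:
    """Normalise whitespace, drop blank lines, and merge duplicate lines."""
    seen = set()
    lines: List[Line] = []
    for raw in raw_trace.splitlines():
        text = " ".join(raw.split())
        if not text:
            continue
        line_id = stable_id(text)
        if line_id in seen:
            continue
        seen.add(line_id)
        lines.append(Line(line_id, text, run_id))
    return lines
```

Content-derived ids are only one possible design; they make cross-layer references survive re-consolidation, at the cost of an id change whenever a line is edited.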
Deep Research, Deep Solve, and Question on the Agentic Engine
The three multi-agent pipelines have been rewritten as orchestrators on top of
the shared agentic-engine primitives, deleting hundreds of bespoke prompt
files and per-agent classes.
- Deep Research → `agents/research/pipeline.py`: four phases (`Rephrase`,
  `Decompose`, `Research blocks`, `Reporting`) implemented as labeled steps
  (`THINK` / `TOOL` / `APPEND` / `OUTLINE` / `SECTION` / `FINISH`). The dynamic
  topic queue and `CitationManager` are preserved; the new `APPEND` label lets
  research blocks add follow-up topics to the queue without leaving the loop.
  `ask_user` v2 drives up to three rephrase rounds with multi-question cards.
- Deep Solve → `agents/solve/pipeline.py`: `Pre-retrieve` (KB-only),
  `Plan`, `Solve` (per-step `THINK` / `TOOL` / `FINISH` / `REPLAN` loop with a
  back-edge from solve to plan), and a final `Synthesize` step. Each step's
  `FINISH` flows into the next step's prompt context so the answer reads as
  one continuous narrative.
- Question / Quiz: coordinator + pipeline replace the old `generator` /
  `idea_agent` / `models` modules; the old prompt directories have been
  removed entirely.
- All three drop the legacy `agents/` and `prompts/` directories for their
  respective modes, leaving one pipeline file and shared labeled-step prompts.
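
A per-step loop over labeled outputs, like the Deep Solve `THINK` / `TOOL` / `FINISH` / `REPLAN` loop above, might look like the sketch below. The `LABEL: body` text format, the `parse_step` and `solve_step` functions, and the `llm` / `tools` parameters are hypothetical; the release notes name the labels but not the wire format.

```python
from typing import Callable, Dict, List, Tuple

LABELS = {"THINK", "TOOL", "FINISH", "REPLAN"}


def parse_step(raw: str) -> Tuple[str, str]:
    """Split an assumed 'LABEL: body' line into its label and body."""
    label, _, body = raw.partition(":")
    label = label.strip().upper()
    if label not in LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return label, body.strip()


def solve_step(
    llm: Callable[[List[str]], str],
    tools: Dict[str, Callable[[str], str]],
    max_turns: int = 8,
) -> Tuple[str, str]:
    """Run one plan step; return ("FINISH", answer) or ("REPLAN", reason)."""
    context: List[str] = []
    for _ in range(max_turns):
        label, body = parse_step(llm(context))
        if label == "THINK":
            context.append(f"THINK: {body}")
        elif label == "TOOL":
            name, _, arg = body.partition(" ")
            context.append(f"RESULT: {tools[name](arg)}")
        else:
            # FINISH ends the step; REPLAN is the back-edge to the planner.
            return label, body
    return "REPLAN", "turn budget exhausted"
```

The caller would append each `FINISH` body to the next step's context and re-enter planning whenever a step returns `REPLAN`.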
Chat Capability & LlamaIndex RAG Refactor
The agentic chat pipeline has been rebuilt around a session-cumulative
"Attached Sources" manifest and a cleaner LlamaIndex pipeline.
- Branch-isolated source inventory: `services/session/source_inventory.py`
  materialises every source attached on the active branch's ancestor chain.
  Fresh sources from the current turn show a full preview; historical sources
  show a one-line row with id, name, kind, size, and the turn ordinal where
  they first appeared. The LLM calls `read_source(id)` to expand the full
  text on demand. Sibling branches never leak sources into each other.
- LlamaIndex pipeline split-out: dedicated `config.py`, `ingestion.py`,
  `retrievers.py`, and `document_loader.py` replace the previous monolithic
  pipeline module. Storage stays backward-compatible with v1.3 versioned
  indexes.
- Lean agentic chat prompt: `agentic_chat.yaml` (EN/ZH) was rewritten to
  match the new tool surface and the source-inventory contract; the old
  parallel-tool prompt scaffolding is gone.
- Builtin tools registry: `tools/builtin/__init__.py` is the single place
  where chat-mounted tools, hint prompts, and arg-augmentation wrappers are
  registered.
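
Branch isolation falls out naturally if the inventory is built by walking only the active turn's ancestor chain. The sketch below assumes a simple parent-pointer turn tree; `Turn`, `source_inventory`, and the row fields are illustrative names, not the contents of `services/session/source_inventory.py`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Turn:
    turn_id: str
    parent_id: Optional[str]
    ordinal: int
    sources: List[str]  # ids of sources attached on this turn


def source_inventory(turns: Dict[str, Turn], active_turn_id: str) -> List[dict]:
    """Collect sources along the active branch only, oldest turn first."""
    chain: List[Turn] = []
    cursor: Optional[str] = active_turn_id
    while cursor is not None:
        turn = turns[cursor]
        chain.append(turn)
        cursor = turn.parent_id
    chain.reverse()

    inventory: List[dict] = []
    seen = set()
    for turn in chain:
        for source_id in turn.sources:
            if source_id in seen:
                continue
            seen.add(source_id)
            inventory.append({
                "id": source_id,
                "first_seen": turn.ordinal,
                # Fresh sources get a full preview; older ones a one-line row.
                "fresh": turn.turn_id == active_turn_id,
            })
    return inventory
```

Because sibling turns are never on each other's ancestor chain, their sources cannot appear in the result.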
Capabilities Infrastructure Unification
Every capability now goes through one shared envelope, one status-i18n loader,
and one cost-tracking surface.
- `emit_capability_result` helper: every capability emits its final
  result through one helper that fills the result envelope (label, summary,
  payload, render hints) and the trailing usage-tracker totals consistently.
- `StatusI18n`: capability status copy lives in
  `capabilities/prompts/{en,zh}/<name>.yaml` and is loaded via a shared
  `StatusI18n` accessor. Hard-coded English status strings have been removed
  from the pipelines.
- `UsageTracker` cost surface: token usage and cost are tracked through
  one tracker per capability run, exposed to the result envelope, and shown
  on the new `/settings/capabilities` admin page (live list, defaults,
  per-capability override toggles).
- Deprecated `main.yaml` keys removed: the legacy `main.yaml` capability
  copy has been deleted in favor of per-capability prompt files.
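
As a rough picture of a shared result envelope with trailing usage totals, consider the sketch below. The field names follow the envelope parts listed above (label, summary, payload, render hints), but `UsageTotals`, `record`, and `build_envelope` are hypothetical stand-ins; the actual signatures of `emit_capability_result` and `UsageTracker` are not given in the release notes.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass
class UsageTotals:
    """Hypothetical per-run accumulator for token usage and cost."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    cost_usd: float = 0.0

    def record(self, prompt_tokens: int, completion_tokens: int, cost_usd: float) -> None:
        self.prompt_tokens += prompt_tokens
        self.completion_tokens += completion_tokens
        self.cost_usd += cost_usd


def build_envelope(
    label: str,
    summary: str,
    payload: Dict[str, Any],
    render_hints: Dict[str, Any],
    usage: UsageTotals,
) -> Dict[str, Any]:
    # One shape for every capability, with usage totals attached at the end.
    return {
        "label": label,
        "summary": summary,
        "payload": payload,
        "render_hints": render_hints,
        "usage": asdict(usage),
    }
```

Keeping the usage totals inside the envelope means a settings page can aggregate cost per capability without each pipeline reporting it separately.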
Visualize: Animator Folded Into One Capability
The standalone Animator menu has been merged into Visualize so the user picks a
visualization once and the system chooses the renderer.
- `render_type` discriminator: `AnalysisAgent` picks one of six render
  types: `svg`, `chartjs`, `mermaid`, `html` (text-emitting, three-stage
  pipeline) or `manim_video` / `manim_image` (Manim subprocess pipeline). The
  result envelope carries `render_type` so the frontend delegates to the
  right viewer.
- Single sidebar entry: the old `Animator` menu entry is gone; users now
  go through `Visualize` for both static charts and Manim videos. The
  fullscreen viewer / config panel handle all render types.
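
Dispatching on a discriminator field like this is straightforward. In the sketch below the six `render_type` values come from the release notes, while `pipeline_for` and the `"text"` / `"manim"` pipeline names are assumptions used only to show the split.

```python
TEXT_RENDER_TYPES = frozenset({"svg", "chartjs", "mermaid", "html"})
MANIM_RENDER_TYPES = frozenset({"manim_video", "manim_image"})


def pipeline_for(render_type: str) -> str:
    """Map a render_type to the pipeline family that should produce it."""
    if render_type in TEXT_RENDER_TYPES:
        return "text"   # text-emitting, three-stage pipeline
    if render_type in MANIM_RENDER_TYPES:
        return "manim"  # Manim subprocess pipeline
    raise ValueError(f"unsupported render_type: {render_type!r}")
```

Rejecting unknown values explicitly keeps a typo in the discriminator from silently falling through to the wrong renderer.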
New Chat Tools
- `ask_user`: packages 1–3 structured questions into a single payload that
  pauses the same turn until the user answers. The frontend renders a card
  letting the user navigate questions and submit answers in one batch; the
  pipeline resumes the turn with the answers wired back as the tool result.
  Used by Deep Research's Rephrase phase and available to chat / Auto.
- `web_fetch`: URL fetch with readable-content extraction, strict scheme
  / private-IP / size guards (applied both pre-flight and post-redirect),
  and `…[truncated]` markers when output exceeds the cap.
- `write_note`: replaces the old `save_to_notebook` tool. Two modes:
  `append` creates a new record (default body is the rendered transcript,
  optional agent-authored body) and `edit` updates an existing record by
  `record_id`.
- `list_notebook`: read-only index / drill-down listing of the active
  user's notebooks and records. Only mounted when the user actually has
  notebooks, so empty runs are impossible by construction.
- `github_query`: read-only `gh` CLI wrapper covering `pr`, `issue`,
  `run`, `repo`, and a GET-only `api` fallback. No mutation verbs are
  reachable through the tool surface. Returns a clean "tool unavailable"
  outcome when `gh` is not installed.
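
The scheme, private-IP, and size guards mentioned for `web_fetch` could be approximated with the standard library as below. This is a sketch under stated assumptions, not the tool's actual code: `check_url`, `cap_output`, and the 1,000-character cap are hypothetical, and the sketch only inspects literal IP hosts, whereas a production guard would also resolve hostnames and check every returned address.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}
MAX_CHARS = 1_000  # illustrative cap, not the tool's real limit


def check_url(url: str) -> None:
    """Raise ValueError for disallowed schemes and private or loopback IP hosts."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    host = parts.hostname
    if not host:
        raise ValueError("URL has no host")
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return  # a hostname; this sketch does not resolve it
    if ip.is_private or ip.is_loopback or ip.is_link_local:
        raise ValueError(f"private address not allowed: {host}")


def cap_output(text: str, limit: int = MAX_CHARS) -> str:
    """Truncate long output and mark the cut."""
    if len(text) <= limit:
        return text
    return text[:limit] + "…[truncated]"
```

To mirror the "pre-flight and post-redirect" behavior, the same `check_url` call would be repeated on the final URL after any redirect, since a public URL can redirect to an internal address.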
Chat Surface Features
- Delete chat turn (#443): message items now carry a stable `id`, the
  session API exposes `deleteMessage`, the chat reducer adds a `DELETE_TURN`
  action, and a 409 vs 404 check rejects deletion of a still-running turn.
  Optimistic temp ids are resolved before deletion to avoid orphaned UI rows.
- Quiz follow-up chat composer: `FollowupChatComposer` and
  `QuizFollowupContext` let the user start a chat thread directly from a quiz
  question. The composer reuses the main `ChatComposer` (look, @space
  pickers, KB picker, attachments, LLM selector) but routes sends through a
  dedicated follow-up controller. Companion `quiz-judge.ts` helper supports
  judging follow-up answers inline.
- Quiz UI polish: quiz answer textarea is vertically resizable (#478);
  question content normalises single newlines to Markdown paragraphs (#441).
- GeoGebra viewer: `Geogebra.tsx`, `GeogebraOpenCTA.tsx`, and
  `GeogebraTabContext` add a GeoGebra applet renderer (loaded via the
  official GGB applet script) so geometry / algebra snippets can be opened
  inline alongside chat answers.
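
The 409 vs 404 distinction for deleting a turn can be illustrated server-side as follows. The `delete_turn_status` function, the in-memory `turns` mapping, and the 204 success code are assumptions for the sake of the example; the release notes only state that a still-running turn is rejected.

```python
from typing import Dict


def delete_turn_status(turns: Dict[str, dict], message_id: str) -> int:
    """Return the HTTP status a delete request would get, mutating `turns` on success."""
    turn = turns.get(message_id)
    if turn is None:
        return 404  # unknown id: nothing to delete
    if turn.get("status") == "running":
        return 409  # conflict: the turn is still streaming
    del turns[message_id]
    return 204      # assumed success code for this sketch
```

Separating the two failure codes lets the client retry after a 409 once the turn finishes, while treating a 404 as already deleted.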
Multi-User Data Isolation
Several regressions and gaps from the v1.3.x multi-user introduction were
fixed in a focused pass (#474, #465).
- Auth decoupled from middleware: multi-user identity resolution no
  longer relies on global middleware state, fixing rebase regressions that
  caused cross-user data bleed under specific routing orders.
- Legacy session manager path capture: the older session manager now
  inherits the active user scope correctly, so its file paths land inside
  the per-user workspace instead of the shared default.
- Frontend uses `apiFetch` everywhere: every authenticated client call
  now goes through `apiFetch()` so the auth header is attached consistently.
- SSL bypass sweep: `DISABLE_SSL_VERIFY` now reaches the codex provider
  and four embedding adapters that were still missing it after v1.3.10.
Environment Settings, Installer, and Local Launcher
The install + launch story has been rewritten to remove the .env parsing
maze and make deeptutor start / deeptutor init first-class.
- `runtime_settings.py`: system / auth / launch settings now live in
  one typed module with explicit defaults (`backend_port`, `frontend_port`,
  `cors_origins`, `disable_ssl_verify`, `chat_attachment_dir`, …) and JSON
  storage under `data/user/settings/`. The 280+ line legacy `env_store.py`
  and the two `.env.example` files have been deleted.
- `runtime/launcher.py`: single async launcher that owns the
  backend + frontend lifecycle, port discovery, readiness probes, and
  cleanup. Generates `web/.env.local` so the Next.js frontend always picks
  up the resolved backend port.
- `deeptutor/runtime/banner.py`: localized startup banner shared
  between `deeptutor start` and `deeptutor init`; reads the language
  preference from interface settings so the banner matches the UI locale.
- `init_wizard.py`: interactive `deeptutor init` wizard with provider
  menu, env-var auto-detect for API keys, live `GET {base_url}/models`
  fetch, curated fallback list, and an optional connectivity probe before
  save.
- `model_catalog.py` trimmed: the catalog file shrank by ~400 lines as
  per-provider boilerplate moved into `provider_registry` and adapter
  modules.
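
A typed settings module backed by JSON, in the spirit of `runtime_settings.py`, might be structured like the sketch below. The field names are taken from the list above, but the default values, the `RuntimeSettings` class, and the `load_settings` / `save_settings` helpers are assumptions rather than the module's real contents.

```python
import json
from dataclasses import asdict, dataclass, field, fields
from pathlib import Path
from typing import List


@dataclass
class RuntimeSettings:
    # Defaults here are placeholders, not DeepTutor's actual defaults.
    backend_port: int = 8001
    frontend_port: int = 3000
    cors_origins: List[str] = field(default_factory=list)
    disable_ssl_verify: bool = False


def load_settings(path: Path) -> RuntimeSettings:
    """Read settings from JSON, falling back to defaults and ignoring unknown keys."""
    if not path.exists():
        return RuntimeSettings()
    data = json.loads(path.read_text(encoding="utf-8"))
    known = {f.name for f in fields(RuntimeSettings)}
    return RuntimeSettings(**{k: v for k, v in data.items() if k in known})


def save_settings(path: Path, settings: RuntimeSettings) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(asdict(settings), indent=2), encoding="utf-8")
```

Ignoring unknown keys on load is a deliberate choice in this sketch: it lets an older build read a settings file written by a newer one without crashing.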
Settings UI Reorganization
The single /settings page has been split into focused tabs.
- New routes: `/settings/appearance`, `/settings/capabilities`,
  `/settings/embedding`, `/settings/llm`, `/settings/mcp`,
  `/settings/memory`, `/settings/search`, `/settings/status`,
  `/settings/tools`, with a shared layout and items index.
- Tools page: lists every chat-mountable tool, surfaces availability
  (e.g. `gh` for `github_query`), and exposes per-tool toggles.
- Capabilities page: pairs the new `UsageTracker` cost surface with
  per-capability defaults and override toggles described above.
Zulip Channel Integration
The TutorBot Zulip channel (added in v1.3.9) gets a follow-up sweep of fixes
and a self-subscribe feature (#480).
- Auto-subscribe channels for @mentions: Bot can subscribe itself to
  any channel where it gets @mentioned so it actually receives the message
  in topics. Subscribed-channel warnings are downgraded to info-level so
  startup logs stop misleadingly flagging the success path.
- All mention flag types supported: `mentioned`, `wildcard_mentioned`,
  `topic_wildcard_mentioned`, and `stream_wildcard_mentioned` all trigger
  the bot, fixing channel-@-mention silence.
- Attachment send fixes: re-sent attachments no longer treat the Zulip
  upload path as a local file, the upload helper no longer crashes on
  `'str' object has no attribute 'name'`, and missing routing metadata is
  rebuilt from `_recipient_map` so `Message must have recipients` errors
  are eliminated.
- Progress message dedup: internal `_tool_hint` progress events are
  filtered out of channel sends so the user no longer sees duplicate "tool
  starting…" lines.
- Test coverage: new unit tests for attachment upload + send recovery
  and channel-subscription behavior.
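
Checking all four mention flags amounts to a set intersection over the message's `flags` list. In the sketch below the flag names come from the release notes; `is_addressed_to_bot` and the plain-dict message shape are illustrative assumptions rather than the TutorBot channel code.

```python
MENTION_FLAGS = frozenset({
    "mentioned",
    "wildcard_mentioned",
    "topic_wildcard_mentioned",
    "stream_wildcard_mentioned",
})


def is_addressed_to_bot(message: dict) -> bool:
    """True when the message carries any of the mention flags."""
    return bool(MENTION_FLAGS.intersection(message.get("flags", [])))
```

Checking only `mentioned` would miss wildcard mentions, which matches the "channel-@-mention silence" symptom described above.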
Tests
- New tests for the Auto pipeline, delegation, schemas, and the
  `auto` capability surface: 1100+ lines of new coverage including
  end-to-end agent-loop behavior.
- Full test coverage for the new memory subsystem: chunker, consolidator,
  document, ids, line-doc, merge, meta settings, modes, ops, references,
  runs, store.
- Per-tool unit tests for `ask_user`, `github_query`, `list_notebook`,
  `web_fetch`, and `write_note`, plus ask-user UI state helpers.
- Refit chat / research / solve / question pipeline tests against the
  agentic-engine labels (`THINK` / `TOOL` / `APPEND` / `FINISH` / …).
- New session / source-inventory tests covering branch isolation and
  cumulative manifest behavior.
- Frontend tests cover the message-branches helper, version surface, and
  ask-user state machine.
Upgrade Notes
- Settings file relocation: first launch will migrate any
  `.env`-based settings into the new JSON files under
  `data/user/settings/`. The legacy `env_store` shim is gone; if you
  scripted `.env` writes externally, point them at
  `runtime_settings.py` or the `/settings` API instead.
- `deeptutor start` is the recommended launcher: `start_web.py` /
  `start_tour.py` continue to work but are now thin wrappers around the
  new `runtime/launcher.py`. Run `deeptutor init` once to seed providers
  and credentials on a fresh machine.
- Animator menu users: point at Visualize instead. The
  capability now picks Manim automatically when the user asks for a
  video / animation; existing Manim-rendered records are unaffected.
- Memory data migration: the legacy single-blob memory format is
  read by the consolidator on first access and written back as L2 / L3
  records. No manual step is required; old snapshots remain on disk.
- Capability authors: emit results via
  `capabilities/_shared.emit_capability_result` and put status copy in
  `capabilities/prompts/{en,zh}/<name>.yaml`. Hard-coded English status
  strings will fail review.
- Beta scope: this release ships substantial new surfaces (Auto,
  Memory v2, settings split). Pin to `v1.4.0-beta` for production until
  the GA cut; bug reports against any of the new modules are welcome.
Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.10...v1.4.0-beta
Breaking Changes
- Removal of legacy `main.yaml` capability copy; capabilities must now use per‑capability prompt files and `emit_capability_result` helper.
- Animator menu eliminated; all visualizations must be requested through the unified **Visualize** capability with a `render_type` discriminator.