DeepTutor

v1.4.0-beta Breaking

This release includes 2 breaking changes for platform teams planning a safe upgrade.

✓ No known CVEs patched

Topics

ai-agents ai-tutor clawdbot cli-tool deepresearch interactive-learning large-language-models multi-agent-systems llm

Affected surfaces

auth rbac deps breaking_upgrade

ReleasePort's take

Moderate signal

Release v1.4.0‑beta introduces Auto Mode for agentic capability routing and a three‑layer memory subsystem, while adding numerous chat tools and UI enhancements.

Why it matters: Plan to test the new Auto Mode and memory layers in development; update any code referencing removed agents/ or prompts/ directories before upgrading to avoid breakage. No immediate security patch is required.

Summary

AI summary

This release adds Auto Mode for agentic capability routing, a three‑layer memory subsystem, and several new chat tools, along with chat surface features and expanded test coverage.

Changes in this release

Breaking Medium

Removes legacy agents/ and prompts/ directories for research, solve, question modes

Source: llm_adapter@2026-05-21

Confidence: high

Breaking Medium

Removes legacy main.yaml capability copy in favor of per-capability prompt files

Source: llm_adapter@2026-05-21

Confidence: low

Breaking Medium

Deletes the legacy main.yaml capability copy; each capability now uses its own prompt files

Source: granite4.1:30b@2026-05-21-audit

Confidence: low

Feature Medium

Adds Auto Mode, a new agentic capability router choosing the right mode for each request

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Implements three-stage agent loop: ANALYZING, DELEGATING, SYNTHESIZING for Auto Mode

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Replaces flat memory with three-layer subsystem: L1 (raw traces), L2 (normalized), L3 (curated)

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Adds modular consolidator pipeline turning run traces into versioned line-oriented documents

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Introduces Memory Workbench UI with /memory routes (graph, l1, l2, l3, resolve)

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Exposes read_memory and write_memory as first-class agent tools for chat

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Adds /settings/memory page with run controls, mode toggles, and storage status

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Adds ask_user tool for 1-3 structured questions pausing turn until user answers

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Adds web_fetch tool with readable-content extraction and strict security guards

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Replaces save_to_notebook with write_note tool supporting append and edit modes

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Adds list_notebook read-only tool for notebook and records indexing

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Adds github_query read-only gh CLI wrapper for pr, issue, run, repo, api

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Adds delete chat turn functionality with message IDs and optimistic UI handling

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Adds quiz follow-up chat composer for direct chat from quiz questions

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Adds GeoGebra applet renderer for inline geometry/algebra visualization

Source: llm_adapter@2026-05-21

Confidence: high

Feature Medium

Moves capability status copy to capabilities/prompts/{en,zh}/<name>.yaml via StatusI18n

Source: llm_adapter@2026-05-21

Confidence: low

Feature Medium

Tracks token usage and cost via UsageTracker, exposed in /settings/capabilities

Source: llm_adapter@2026-05-21

Confidence: low

Feature Medium

Supports six render types: svg, chartjs, mermaid, html, manim_video, manim_image

Source: llm_adapter@2026-05-21

Confidence: low

Feature Medium

Adds vertically resizable quiz answer textarea and normalizes newlines to Markdown

Source: llm_adapter@2026-05-21

Confidence: low

Feature Low

Polishes the quiz UI: resizable answer textarea, newline normalization to Markdown paragraphs

Source: granite4.1:30b@2026-05-21-audit

Confidence: high

Feature Low

Moves capability status strings to per‑language YAML files via StatusI18n accessor, removing hard‑coded English strings

Source: granite4.1:30b@2026-05-21-audit

Confidence: low

Feature Low

Introduces UsageTracker for token usage and cost, displayed on the /settings/capabilities admin page

Source: granite4.1:30b@2026-05-21-audit

Confidence: low

Bugfix Medium

Decouples multi-user identity resolution from middleware, fixing cross-user data bleed

Source: llm_adapter@2026-05-21

Confidence: high

Refactor Medium

Rewrites Deep Research as agentic-engine orchestrator with four phases and labeled steps

Source: llm_adapter@2026-05-21

Confidence: high

Refactor Medium

Rewrites Deep Solve as agentic-engine orchestrator with Pre-retrieve, Plan, Solve phases

Source: llm_adapter@2026-05-21

Confidence: high

Refactor Medium

Replaces Question/Quiz generator with coordinator and pipeline architecture

Source: llm_adapter@2026-05-21

Confidence: high

Refactor Medium

Rebuilds chat around session-cumulative source inventory with branch-isolated manifest

Source: llm_adapter@2026-05-21

Confidence: high

Refactor Medium

Splits LlamaIndex into config.py, ingestion.py, retrievers.py, document_loader.py

Source: llm_adapter@2026-05-21

Confidence: high

Refactor Medium

Consolidates chat tools, hints, and arg wrappers in tools/builtin/__init__.py

Source: llm_adapter@2026-05-21

Confidence: high

Refactor Medium

Unifies all capabilities through emit_capability_result helper with shared envelope

Source: llm_adapter@2026-05-21

Confidence: low

Refactor Medium

Merges Animator menu into Visualize capability with render_type discriminator

Source: llm_adapter@2026-05-21

Confidence: low

Refactor Medium

Unifies capability results via emit_capability_result helper with a shared envelope (label, summary, payload, render hints)

Source: granite4.1:30b@2026-05-21-audit

Confidence: low

Refactor Low

Merges the standalone Animator menu into Visualize, using a render_type discriminator for six renderer types (svg, chartjs, mermaid, html, manim_video, manim_image)

Source: granite4.1:30b@2026-05-21-audit

Confidence: low

Full changelog

DeepTutor v1.4.0-beta Release Notes

Release Date: 2026.05.21

v1.4.0-beta is the largest release since the agent-native rewrite. It folds an
end-to-end Auto Mode on top of the existing capabilities, ships a
three-layer memory subsystem (L1/L2/L3) with a dedicated workbench, rebuilds
Deep Research / Deep Solve / Question on the same agentic engine as Chat,
re-architects the chat capability + LlamaIndex RAG pipeline around a
session-cumulative source inventory, unifies the Capabilities infrastructure
and i18n, merges the Animator menu into Visualize, and reorganises
Settings, environment, and the local launcher. Several new chat tools
(ask_user, web_fetch, write_note, list_notebook, github_query) plus a
delete-chat-turn flow, quiz follow-up chat, and a GeoGebra viewer round out the
release.

Highlights

Auto Mode — Agentic Capability Router

A new auto capability sits on top of the existing modes and chooses the right
one for each request, instead of forcing the user to pick a mode up front.

  • Three-stage agent loop — ANALYZING (single LLM call, streamed as
    thinking) → DELEGATING (up to max_iterations of router calls that emit
    delegate_to_<cap> tool calls or atomic tool calls) → SYNTHESIZING (final
    inline answer, either passed through from the loop or assembled by a closing
    LLM call).
  • Routes to real capabilities — deep_solve, deep_question,
    deep_research, math_animator, visualize, plus the chat-level atomic
    tools (web_search, web_fetch, rag, …) live behind the same router so
    the LLM can mix retrieval and full sub-capability runs in one turn.
  • Bounded retries and quotas — independent retry budgets for router-LLM
    errors, per-delegation failures, and arg-validation feedback; a configurable
    max_same_capability_calls quota keeps the loop from spinning on one mode.
  • Clean conversation history — sub-capability events flow through a
    forward_events shim that tags every content event with a call_id, so the
    conversation turn-runtime filter keeps only Auto's own final synthesis in
    saved history. Sub-runs are still streamed live to the UI.
  • answer_now fast-path — when the user asks to "answer now" the pipeline
    skips analysis + delegation and produces an immediate inline reply.
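
The loop described above can be sketched roughly as follows. This is a minimal illustration, not DeepTutor's implementation: `run_auto`, the `router` callable, and the capability table are hypothetical names, and only `max_iterations`, `max_same_capability_calls`, and `answer_now` come from the release notes.

```python
from collections import Counter
from typing import Callable, Optional

# Hypothetical type: a capability takes the request and returns text.
Capability = Callable[[str], str]


def run_auto(
    request: str,
    router: Callable[[str, list[str]], Optional[str]],
    capabilities: dict[str, Capability],
    max_iterations: int = 4,
    max_same_capability_calls: int = 2,
    answer_now: bool = False,
) -> str:
    """ANALYZING -> DELEGATING -> SYNTHESIZING with a per-capability quota."""
    if answer_now:
        # Fast path: skip analysis and delegation entirely.
        return f"[immediate] {request}"

    # ANALYZING: a single LLM call in the real system; a stub string here.
    analysis = f"analysis of: {request}"

    # DELEGATING: the router names a capability, or returns None to stop.
    results: list[str] = []
    calls = Counter()
    for _ in range(max_iterations):
        choice = router(analysis, results)
        if choice is None:
            break
        if calls[choice] >= max_same_capability_calls:
            # Quota reached: report it back instead of running again.
            results.append(f"{choice}: quota exhausted")
            continue
        calls[choice] += 1
        results.append(capabilities[choice](request))

    # SYNTHESIZING: assemble the final inline answer from sub-results.
    return " | ".join(results) if results else analysis
```

The quota check is what keeps a router that repeatedly picks the same capability from consuming every iteration on real sub-runs.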

Three-Layer Memory Subsystem (Memory v2)

The previous flat memory page is replaced by a structured three-layer store
with an explicit consolidation pipeline and a dedicated workbench.

  • L1 / L2 / L3 layout — L1 captures raw run traces, L2 holds normalised
    document records, L3 holds curated slots per surface (chat, notebook, book,
    TutorBot). Per-user paths flow through PathService so multi-user
    deployments stay isolated.
  • Consolidator pipeline — modular consolidator/ modules (chunker, guards,
    parse, references, runs, modes, line-doc, meta) turn run traces into
    versioned line-oriented documents with stable ids, references between
    layers, and a snapshot history.
  • Memory Workbench UI — new /memory routes (graph, l1, l2, l3,
    resolve) ship as standalone pages with workbench, hub, graph viewer, run
    panel, and an archived-state banner. A reusable MemorySection component is
    embedded where the legacy memory panel used to live.
  • First-class chat tools — read_memory and write_memory are exposed
    as agent tools (with i18n hints) so chat / Auto can recall and update memory
    inside a turn instead of needing a separate save step.
  • Settings integration — Memory now has its own page under
    /settings/memory with run controls, mode toggles, and storage status.
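
To make the layering concrete, here is a small sketch of how raw traces might flow from L1 into L2 records and L3 slots. Everything in it is an assumption for illustration: the `MemoryStore` class, its field names, and the content-hash id scheme are not taken from the DeepTutor source.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical three-layer store; names are illustrative only."""
    l1: list[dict] = field(default_factory=list)            # raw run traces
    l2: dict[str, dict] = field(default_factory=dict)       # normalised records by id
    l3: dict[str, list[str]] = field(default_factory=dict)  # surface -> L2 ids

    def record_trace(self, surface: str, text: str) -> None:
        self.l1.append({"surface": surface, "text": text})

    def consolidate(self) -> None:
        """Normalise every L1 trace into L2 and reference it from L3."""
        for trace in self.l1:
            normalised = " ".join(trace["text"].split()).lower()
            # A content hash gives a stable id, so re-running is idempotent.
            doc_id = hashlib.sha256(normalised.encode()).hexdigest()[:12]
            self.l2[doc_id] = {"text": normalised, "surface": trace["surface"]}
            slot = self.l3.setdefault(trace["surface"], [])
            if doc_id not in slot:
                slot.append(doc_id)
```

Keying L2 by a stable id lets L3 hold references rather than copies, which is one way to get the "references between layers" the notes mention.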

Deep Research, Deep Solve, and Question on the Agentic Engine

The three multi-agent pipelines have been rewritten as orchestrators on top of
the shared agentic-engine primitives, deleting hundreds of bespoke prompt
files and per-agent classes.

  • Deep Research → agents/research/pipeline.py — four phases (Rephrase,
    Decompose, Research blocks, Reporting) implemented as labeled steps
    (THINK / TOOL / APPEND / OUTLINE / SECTION / FINISH). The dynamic
    topic queue and CitationManager are preserved; the new APPEND label lets
    research blocks add follow-up topics to the queue without leaving the loop.
    ask_user v2 drives up to three rephrase rounds with multi-question cards.
  • Deep Solve → agents/solve/pipeline.py — Pre-retrieve (KB-only),
    Plan, Solve (per-step THINK / TOOL / FINISH / REPLAN loop with a
    back-edge from solve to plan), and a final Synthesize step. Each step's
    FINISH flows into the next step's prompt context so the answer reads as
    one continuous narrative.
  • Question / Quiz — coordinator + pipeline replace the old generator /
    idea_agent / models modules; the old prompt directories have been
    removed entirely.
  • All three drop the legacy agents/ and prompts/ directories for their
    respective modes, leaving one pipeline file and shared labeled-step prompts.
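
A labeled-step loop of the kind described for Deep Solve might look like the sketch below. The label names (THINK, TOOL, FINISH, REPLAN) come from the notes; the `solve` function, its callback signatures, and the `max_turns` bound are hypothetical.

```python
from typing import Callable

Action = tuple[str, str]  # (label, text)


def solve(
    plan: list[str],
    next_action: Callable[[str, list[str], list[Action]], Action],
    replan: Callable[[list[str], str], list[str]],
    max_turns: int = 20,
) -> str:
    """Run each plan step until FINISH; REPLAN jumps back to planning."""
    finished: list[str] = []    # FINISH text of completed steps
    remaining = list(plan)
    scratch: list[Action] = []  # THINK / TOOL notes for the current step
    for _ in range(max_turns):
        if not remaining:
            break
        label, text = next_action(remaining[0], finished, scratch)
        if label in ("THINK", "TOOL"):
            scratch.append((label, text))
        elif label == "FINISH":
            finished.append(text)  # becomes context for the next step
            remaining.pop(0)
            scratch = []
        elif label == "REPLAN":
            remaining = replan(remaining, text)  # back-edge to the Plan phase
            scratch = []
        else:
            raise ValueError(f"unknown label: {label}")
    return "\n".join(finished)
```

Passing `finished` into every call is the part that mirrors the notes: each step sees the earlier FINISH outputs, so the combined answer can read as one narrative.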

Chat Capability & LlamaIndex RAG Refactor

The agentic chat pipeline has been rebuilt around a session-cumulative
"Attached Sources" manifest and a cleaner LlamaIndex pipeline.

  • Branch-isolated source inventory — services/session/source_inventory.py
    materialises every source attached on the active branch's ancestor chain.
    Fresh sources from the current turn show a full preview; historical sources
    show a one-line row with id, name, kind, size, and the turn ordinal where
    they first appeared. The LLM calls read_source(id) to expand the full
    text on demand. Sibling branches never leak sources into each other.
  • LlamaIndex pipeline split-out — dedicated config.py, ingestion.py,
    retrievers.py, and document_loader.py replace the previous monolithic
    pipeline module. Storage stays backward-compatible with v1.3 versioned
    indexes.
  • Lean agentic chat prompt — agentic_chat.yaml (EN/ZH) was rewritten to
    match the new tool surface and the source-inventory contract; the old
    parallel-tool prompt scaffolding is gone.
  • Builtin tools registry — tools/builtin/__init__.py is the single place
    where chat-mounted tools, hint prompts, and arg-augmentation wrappers are
    registered.
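
The branch isolation described for the source inventory follows from walking only the active turn's ancestor chain. The sketch below shows that idea under stated assumptions: the `Turn` record, `build_manifest`, and the row format are hypothetical and much simpler than the real manifest (which also carries kind and size).

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """Hypothetical turn record: ordinal, parent link, attached sources."""
    ordinal: int
    parent: "Turn | None"
    sources: list[tuple[str, str, str]]  # (id, name, full text)


def build_manifest(active: Turn, preview_chars: int = 40) -> list[str]:
    """List sources on the active turn's ancestor chain, oldest first."""
    chain = []
    node = active
    while node is not None:
        chain.append(node)
        node = node.parent
    rows = []
    for turn in reversed(chain):
        for sid, name, text in turn.sources:
            if turn is active:
                # Fresh source from the current turn: include a preview.
                rows.append(f"{sid} {name}: {text[:preview_chars]}")
            else:
                # Historical source: one-line row with its first turn.
                rows.append(f"{sid} {name} (turn {turn.ordinal})")
    return rows
```

Because sibling turns are never on each other's parent chain, a source attached on one branch cannot appear in the other branch's manifest.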

Capabilities Infrastructure Unification

Every capability now goes through one shared envelope, one status-i18n loader,
and one cost-tracking surface.

  • emit_capability_result helper — every capability emits its final
    result through one helper that fills the result envelope (label, summary,
    payload, render hints) and the trailing usage-tracker totals consistently.
  • StatusI18n — capability status copy lives in
    capabilities/prompts/{en,zh}/<name>.yaml and is loaded via a shared
    StatusI18n accessor. Hard-coded English status strings have been removed
    from the pipelines.
  • UsageTracker cost surface — token usage and cost are tracked through
    one tracker per capability run, exposed to the result envelope, and shown
    on the new /settings/capabilities admin page (live list, defaults,
    per-capability override toggles).
  • Deprecated main.yaml keys removed — the legacy main.yaml capability
    copy has been deleted in favor of per-capability prompt files.
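
As a rough picture of the shared envelope, the sketch below builds a result dict with the four envelope fields named above plus usage totals. The real signature of emit_capability_result and the real UsageTracker API are not shown in the notes, so `build_envelope`, `UsageTotals`, and all field names here are assumptions.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class UsageTotals:
    """Hypothetical stand-in for per-run usage tracking."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    cost_usd: float = 0.0

    def add(self, prompt: int, completion: int, cost: float) -> None:
        self.prompt_tokens += prompt
        self.completion_tokens += completion
        self.cost_usd += cost


def build_envelope(
    label: str,
    summary: str,
    payload: dict[str, Any],
    usage: UsageTotals,
    render_hints: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Fill one result envelope and attach the trailing usage totals."""
    return {
        "label": label,
        "summary": summary,
        "payload": payload,
        "render_hints": render_hints or {},
        "usage": asdict(usage),
    }
```

Routing every capability through one builder like this is what makes the usage totals and render hints show up in the same place for every result.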

Visualize: Animator Folded Into One Capability

The standalone Animator menu has been merged into Visualize so the user picks a
visualization once and the system chooses the renderer.

  • render_type discriminator — AnalysisAgent picks one of six render
    types — svg, chartjs, mermaid, html (text-emitting, three-stage
    pipeline) or manim_video / manim_image (Manim subprocess pipeline). The
    result envelope carries render_type so the frontend delegates to the
    right viewer.
  • Single sidebar entry — the old Animator menu entry is gone; users now
    go through Visualize for both static charts and Manim videos. The
    fullscreen viewer / config panel handle all render types.
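
The discriminator amounts to a lookup from render_type to a pipeline family. In the sketch below, the six type names and their grouping come from the notes, while `pipeline_for` and its return values are hypothetical.

```python
# Grouping follows the release notes; names below are illustrative.
TEXT_RENDER_TYPES = frozenset({"svg", "chartjs", "mermaid", "html"})
MANIM_RENDER_TYPES = frozenset({"manim_video", "manim_image"})


def pipeline_for(render_type: str) -> str:
    """Map a render_type to the pipeline family that produces it."""
    if render_type in TEXT_RENDER_TYPES:
        return "text_three_stage"
    if render_type in MANIM_RENDER_TYPES:
        return "manim_subprocess"
    raise ValueError(f"unknown render_type: {render_type!r}")
```

A frontend could use the same field in the same way to choose a viewer, rejecting values it does not recognise rather than guessing.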

New Chat Tools

  • ask_user — packages 1–3 structured questions into a single payload that
    pauses the same turn until the user answers. The frontend renders a card
    letting the user navigate questions and submit answers in one batch; the
    pipeline resumes the turn with the answers wired back as the tool result.
    Used by Deep Research's Rephrase phase and available to chat / Auto.
  • web_fetch — URL fetch with readable-content extraction, strict scheme
    / private-IP / size guards (applied both pre-flight and post-redirect),
    and …[truncated] markers when output exceeds the cap.
  • write_note — replaces the old save_to_notebook tool. Two modes:
    append creates a new record (default body is the rendered transcript,
    optional agent-authored body) and edit updates an existing record by
    record_id.
  • list_notebook — read-only index / drill-down listing of the active
    user's notebooks and records. Only mounted when the user actually has
    notebooks, so empty runs are impossible by construction.
  • github_query — read-only gh CLI wrapper covering pr, issue,
    run, repo, and a GET-only api fallback. No mutation verbs are
    reachable through the tool surface. Returns a clean "tool unavailable"
    outcome when gh is not installed.
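
The web_fetch guards can be illustrated with the standard library alone. This is a sketch of the general technique, not the tool's code: `check_url`, `cap_output`, and the 20,000-character cap are assumptions, and the caller is assumed to resolve the hostname itself and to re-run the check after every redirect.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}
MAX_CHARS = 20_000  # hypothetical cap
TRUNCATION_MARKER = "…[truncated]"


def check_url(url: str, resolved_ip: str) -> None:
    """Raise ValueError if the scheme or resolved address is disallowed."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    ip = ipaddress.ip_address(resolved_ip)
    if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
        raise ValueError(f"non-public address: {ip}")


def cap_output(text: str, limit: int = MAX_CHARS) -> str:
    """Cut text at the cap and mark the cut so the model can see it."""
    if len(text) <= limit:
        return text
    return text[:limit] + TRUNCATION_MARKER
```

Checking the resolved address rather than the hostname string matters because a public-looking name can resolve, or redirect, to an internal address.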

Chat Surface Features

  • Delete chat turn (#443) — message items now carry a stable id, the
    session API exposes deleteMessage, the chat reducer adds a DELETE_TURN
    action, and a 409 vs 404 check rejects deletion of a still-running turn.
    Optimistic temp ids are resolved before deletion to avoid orphaned UI rows.
  • Quiz follow-up chat composer — FollowupChatComposer and
    QuizFollowupContext let the user start a chat thread directly from a quiz
    question. The composer reuses the main ChatComposer (look, @space
    pickers, KB picker, attachments, LLM selector) but routes sends through a
    dedicated follow-up controller. Companion quiz-judge.ts helper supports
    judging follow-up answers inline.
  • Quiz UI polish — quiz answer textarea is vertically resizable (#478);
    question content normalises single newlines to Markdown paragraphs (#441).
  • GeoGebra viewer — Geogebra.tsx, GeogebraOpenCTA.tsx, and
    GeogebraTabContext add a GeoGebra applet renderer (loaded via the
    official GGB applet script) so geometry / algebra snippets can be opened
    inline alongside chat answers.
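
The 409 versus 404 distinction in the delete-turn flow can be sketched as a small server-side check. The notes only say the two cases are distinguished, so the `delete_message` function, the in-memory `turns` dict, the `status` field, and the 204 success code are all assumptions.

```python
def delete_message(turns: dict[str, dict], message_id: str) -> int:
    """Return an HTTP-style status code for a delete request."""
    turn = turns.get(message_id)
    if turn is None:
        return 404  # unknown id: nothing to delete
    if turn["status"] == "running":
        return 409  # conflict: the turn is still streaming
    del turns[message_id]
    return 204  # deleted, no body (assumed success code)
```

Separating the two failures lets a client treat 404 as "already gone" and 409 as "try again once the turn finishes".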

Multi-User Data Isolation

Several regressions and gaps from the v1.3.x multi-user introduction were
fixed in a focused pass (#474, #465).

  • Auth decoupled from middleware — multi-user identity resolution no
    longer relies on global middleware state, fixing rebase regressions that
    caused cross-user data bleed under specific routing orders.
  • Legacy session manager path capture — the older session manager
    now inherits the active user scope correctly, so its file paths land
    inside the per-user workspace instead of the shared default.
  • Frontend uses apiFetch everywhere — every authenticated client call
    now goes through apiFetch() so the auth header is attached consistently.
  • SSL bypass sweep — DISABLE_SSL_VERIFY now reaches the codex provider
    and four embedding adapters that were still missing it after v1.3.10.

Environment Settings, Installer, and Local Launcher

The install + launch story has been rewritten to remove the .env parsing
maze and make deeptutor start / deeptutor init first-class.

  • runtime_settings.py — system / auth / launch settings now live in
    one typed module with explicit defaults (backend_port, frontend_port,
    cors_origins, disable_ssl_verify, chat_attachment_dir, …) and JSON
    storage under data/user/settings/. The 280+ line legacy env_store.py
    and the two .env.example files have been deleted.
  • runtime/launcher.py — single async launcher that owns the
    backend + frontend lifecycle, port discovery, readiness probes, and
    cleanup. Generates web/.env.local so the Next.js frontend always picks
    up the resolved backend port.
  • deeptutor/runtime/banner.py — localized startup banner shared
    between deeptutor start and deeptutor init; reads the language
    preference from interface settings so the banner matches the UI locale.
  • init_wizard.py — interactive deeptutor init wizard with provider
    menu, env-var auto-detect for API keys, live GET {base_url}/models
    fetch, curated fallback list, and an optional connectivity probe before
    save.
  • model_catalog.py trimmed — the catalog file shrank by ~400 lines as
    per-provider boilerplate moved into provider_registry and adapter
    modules.
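
A typed settings module with JSON storage, as described for runtime_settings.py, might be shaped like the sketch below. The four field names come from the notes; the default values, the file name, and the `load_settings` / `save_settings` helpers are assumptions.

```python
import json
from dataclasses import asdict, dataclass, field, fields
from pathlib import Path


@dataclass
class RuntimeSettings:
    """Field names from the release notes; defaults are assumed."""
    backend_port: int = 8001
    frontend_port: int = 3000
    cors_origins: list[str] = field(default_factory=list)
    disable_ssl_verify: bool = False


def load_settings(path: Path) -> RuntimeSettings:
    """Read settings from JSON, falling back to defaults if absent."""
    if not path.exists():
        return RuntimeSettings()
    data = json.loads(path.read_text(encoding="utf-8"))
    known = {f.name for f in fields(RuntimeSettings)}
    # Ignore unknown keys so older or newer files still load.
    return RuntimeSettings(**{k: v for k, v in data.items() if k in known})


def save_settings(settings: RuntimeSettings, path: Path) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(asdict(settings), indent=2), encoding="utf-8")
```

Keeping explicit defaults in one dataclass means a missing or partial file still yields a complete, typed settings object.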

Settings UI Reorganization

The single /settings page has been split into focused tabs.

  • New routes — /settings/appearance, /settings/capabilities,
    /settings/embedding, /settings/llm, /settings/mcp,
    /settings/memory, /settings/search, /settings/status,
    /settings/tools, with a shared layout and items index.
  • Tools page — lists every chat-mountable tool, surfaces availability
    (e.g. gh for github_query), and exposes per-tool toggles.
  • Capabilities page — pairs the new UsageTracker cost surface with
    per-capability defaults and override toggles described above.

Zulip Channel Integration

The TutorBot Zulip channel (added in v1.3.9) gets a follow-up sweep of fixes
and a self-subscribe feature (#480).

  • Auto-subscribe channels for @mentions — the bot can subscribe itself to
    any channel where it is @mentioned so it actually receives the message
    in topics. Subscribed-channel warnings are downgraded to info-level so
    startup logs stop misleadingly flagging the success path.
  • All mention flag types supported — mentioned, wildcard_mentioned,
    topic_wildcard_mentioned, and stream_wildcard_mentioned all trigger
    the bot, fixing channel-@-mention silence.
  • Attachment send fixes — re-sent attachments no longer treat the Zulip
    upload path as a local file, the upload helper no longer crashes on
    'str' object has no attribute 'name', and missing routing metadata is
    rebuilt from _recipient_map so Message must have recipients errors
    are eliminated.
  • Progress message dedup — internal _tool_hint progress events are
    filtered out of channel sends so the user no longer sees duplicate "tool
    starting…" lines.
  • Test coverage — new unit tests for attachment upload + send recovery
    and channel-subscription behavior.
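
The mention-flag fix reduces to checking a message's flags against the full set of mention types. The four flag names come from the notes; `should_respond` and the assumption that the message dict carries a `flags` list are illustrative.

```python
# Flag names as listed in the release notes.
MENTION_FLAGS = frozenset({
    "mentioned",
    "wildcard_mentioned",
    "topic_wildcard_mentioned",
    "stream_wildcard_mentioned",
})


def should_respond(message: dict) -> bool:
    """True if any mention-type flag is set on the message."""
    return bool(MENTION_FLAGS.intersection(message.get("flags", [])))
```

Checking only for `mentioned` would miss the wildcard variants, which matches the channel @-mention silence the notes describe.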

Tests

  • New tests for the Auto pipeline, delegation, schemas, and the
    auto capability surface — 1100+ lines of new coverage including
    end-to-end agent-loop behavior.
  • Full test coverage for the new memory subsystem — chunker, consolidator,
    document, ids, line-doc, merge, meta settings, modes, ops, references,
    runs, store.
  • Per-tool unit tests for ask_user, github_query, list_notebook,
    web_fetch, and write_note, plus ask-user UI state helpers.
  • Refit chat / research / solve / question pipeline tests against the
    agentic-engine labels (THINK / TOOL / APPEND / FINISH / …).
  • New session / source-inventory tests covering branch isolation and
    cumulative manifest behavior.
  • Frontend tests cover the message-branches helper, version surface, and
    ask-user state machine.

Upgrade Notes

  • Settings file relocation — first launch will migrate any
    .env-based settings into the new JSON files under
    data/user/settings/. The legacy env_store shim is gone; if you
    scripted .env writes externally, point them at
    runtime_settings.py or the /settings API instead.
  • deeptutor start is the recommended launcher — start_web.py /
    start_tour.py continue to work but are now thin wrappers around the
    new runtime/launcher.py. Run deeptutor init once to seed providers
    and credentials on a fresh machine.
  • Animator menu users — point at Visualize instead. The
    capability now picks Manim automatically when the user asks for a
    video / animation; existing Manim-rendered records are unaffected.
  • Memory data migration — the legacy single-blob memory format is
    read by the consolidator on first access and written back as L2 / L3
    records. No manual step is required; old snapshots remain on disk.
  • Capability authors — emit results via
    capabilities/_shared.emit_capability_result and put status copy in
    capabilities/prompts/{en,zh}/<name>.yaml. Hard-coded English status
    strings will fail review.
  • Beta scope — this release ships substantial new surfaces (Auto,
    Memory v2, settings split). Pin to v1.4.0-beta for production until
    the GA cut; bug reports against any of the new modules are welcome.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.10...v1.4.0-beta

Breaking Changes

  • Removal of the legacy main.yaml capability copy; capabilities must now use per‑capability prompt files and the emit_capability_result helper.
  • Animator menu eliminated; all visualizations must be requested through the unified Visualize capability with a render_type discriminator.

About DeepTutor

"DeepTutor: Agent-Native Personalized Learning Assistant"
