Release history

DeepTutor releases

"DeepTutor: Agent-Native Personalized Learning Assistant"

Upgrade now
v1.4.2 Mixed
Auth

Upgrade Notes, Tests, ver1-4-1.md

Upgrade now
v1.4.1 Breaking risk
RCE / SSRF Auth RBAC

Shell exec disabled + resource isolation

Review required
v1.4.0 Breaking risk
Auth Breaking upgrade

Reasoning effort normalization + turn recovery

Review required
v1.4.0-beta Breaking risk
Auth RBAC Dependencies +1 more

Auto Mode + Memory v2 + Chat tools

v1.3.10 Breaking risk
⚠ Upgrade required
  • If using Matrix with E2EE, install the `matrix-e2e` extra or its requirements file and ensure libolm is available.
  • `DISABLE_SSL_VERIFY=true` is allowed only in non‑production environments; it remains blocked when ENVIRONMENT=prod or production.
Breaking changes
  • Matrix no longer installs E2EE by default; `deeptutor[matrix-e2e]` or `requirements/matrix-e2e.txt` must be used to enable encrypted rooms.
Notable features
  • Remote single‑user Docker works out of the box again when AUTH_ENABLED=false without extra CORS settings.
  • `DISABLE_SSL_VERIFY` now propagates to all OpenAI SDK paths for self‑signed LLM endpoints (blocked in prod).
  • Code blocks are protected from citation rewrite, preserving array indexes and other code content.
Full changelog

DeepTutor v1.3.10 Release Notes

Release Date: 2026.05.10

v1.3.10 is a focused reliability release for the issues reported after v1.3.9.
It restores smoother remote Docker access, makes self-signed LLM endpoints work
consistently across SDK-backed providers, protects code snippets from citation
rewrites, and splits Matrix E2EE into an explicit opt-in dependency.

Highlights

Remote Docker and CORS Recovery

  • Remote single-user Docker works out of the box again - when
    AUTH_ENABLED=false, DeepTutor now accepts browser origins over HTTP/HTTPS so
    LAN or remote-server frontends no longer hit the v1.3.8/v1.3.9 CORS
    regression reported in #463.
  • Authenticated deployments stay explicit - when AUTH_ENABLED=true, CORS
    still requires a concrete allowlist through CORS_ORIGIN or CORS_ORIGINS,
    preserving the credentialed-auth safety boundary.
  • Multiple deployment origins are supported - CORS_ORIGINS accepts comma
    or newline separated values, and both Docker Compose files pass the setting
    through to the backend container.
  • Settings no longer drop network flags - CORS_ORIGIN, CORS_ORIGINS, and
    DISABLE_SSL_VERIFY are part of the canonical .env write order.
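
The origin handling described above can be sketched roughly as follows. This is a minimal illustration, not DeepTutor's actual code: the function names and the returned dictionary shape (keys modeled on common CORS middleware options) are assumptions made for the example.

```python
import re


def parse_cors_origins(raw: str) -> list[str]:
    """Split a CORS_ORIGINS-style value on commas or newlines, dropping blanks."""
    parts = re.split(r"[,\n]", raw or "")
    return [part.strip() for part in parts if part.strip()]


def resolve_cors_policy(auth_enabled: bool, raw_origins: str) -> dict:
    """Hypothetical helper: pick a CORS policy from the auth mode."""
    origins = parse_cors_origins(raw_origins)
    if auth_enabled:
        # Authenticated mode: only the explicit allowlist is accepted.
        return {"allow_origins": origins, "allow_origin_regex": None}
    # Unauthenticated single-user mode: accept any HTTP/HTTPS browser origin.
    return {"allow_origins": origins, "allow_origin_regex": r"https?://.*"}
```

For example, `parse_cors_origins("https://a.example.com, https://b.example.com")` yields two separate origins, and an authenticated deployment with an empty allowlist ends up accepting no cross-origin requests.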

Provider TLS and Rendering Fixes

  • DISABLE_SSL_VERIFY now reaches OpenAI SDK paths - OpenAI-compatible,
    Azure OpenAI, executor, TutorBot, and legacy embedding SDK clients all receive
    a shared httpx.AsyncClient(verify=False) when the flag is enabled, fixing
    self-signed HTTPS LLM endpoints reported in #464.
  • Production still blocks unsafe TLS bypasses - ENVIRONMENT=prod or
    ENVIRONMENT=production rejects DISABLE_SSL_VERIFY, with a single warning
    logged in non-production use.
  • Code blocks keep array indexes intact - Markdown citation linkification now
    masks fenced and inline code before rewriting references, so values[0] stays
    code instead of becoming a #references citation link (#468).
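
The masking approach behind the citation fix can be illustrated with a short sketch. It assumes citations look like `[1]` and are rewritten to `[1](#references)`; the real link format and function names in DeepTutor may differ.

```python
import re

# Fenced blocks first, then single-line inline code spans.
CODE_SPAN = re.compile(r"```.*?```|`[^`\n]+`", re.DOTALL)
CITATION = re.compile(r"\[(\d+)\]")
PLACEHOLDER = re.compile(r"\x00(\d+)\x00")


def linkify_citations(markdown: str) -> str:
    """Rewrite [N] citations in prose while leaving code untouched."""
    masked: list[str] = []

    def stash(match: re.Match) -> str:
        # Swap each code span for a placeholder that contains no brackets.
        masked.append(match.group(0))
        return f"\x00{len(masked) - 1}\x00"

    text = CODE_SPAN.sub(stash, markdown)
    text = CITATION.sub(r"[\1](#references)", text)
    # Restore the original code spans.
    return PLACEHOLDER.sub(lambda m: masked[int(m.group(1))], text)
```

With this ordering, `values[0]` inside backticks never reaches the citation regex, so array indexes survive the rewrite.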

Matrix Install Compatibility

  • Matrix no longer installs E2EE by default - the standard matrix extra and
    requirements/matrix.txt now use plain matrix-nio, avoiding the
    python-olm / libolm build failures seen on macOS Python 3.14 and Apple
    Clang 21 (#462).
  • Encrypted rooms are an explicit add-on - install deeptutor[matrix-e2e]
    or requirements/matrix-e2e.txt when E2EE support is needed and libolm is
    available.
  • Runtime failures are clearer - Matrix defaults to non-E2EE mode, and
    enabling E2EE without crypto dependencies now raises an actionable install
    message instead of failing at import time.
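
A deferred dependency check along these lines would produce the actionable message described above. This is a sketch under assumptions: the helper name is hypothetical, and `olm` is assumed to be the import name of the crypto binding.

```python
import importlib.util


def check_e2ee_support(e2ee_enabled: bool, crypto_module: str = "olm") -> bool:
    """Return True when E2EE can be used, False when it is switched off."""
    if not e2ee_enabled:
        # Default path: plain, non-encrypted Matrix messaging.
        return False
    if importlib.util.find_spec(crypto_module) is None:
        # Fail with install guidance rather than an ImportError at import time.
        raise RuntimeError(
            "Matrix E2EE is enabled but crypto dependencies are missing. "
            "Install deeptutor[matrix-e2e] or requirements/matrix-e2e.txt "
            "and make sure libolm is available."
        )
    return True
```

Because the check runs only when E2EE is requested, default installs never touch the native dependency.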

Multi-User Runtime Compatibility

  • Default workspace paths stay stable outside user scope - when no current
    multi-user context is active, path resolution falls back to the default data
    workspace rather than forcing an admin scope.
  • Legacy test and monkeypatch hooks remain available - session and settings
    routers keep compatibility shims used by tests and older integrations.
  • Local agent artifacts are ignored - .claude/ is now excluded from Git so
    local worktrees and agent metadata do not accidentally enter releases.

Tests

  • Added CORS setting tests for unauthenticated remote origins and authenticated
    explicit allowlists.
  • Added shared OpenAI SDK HTTP-client tests across provider-core, Azure,
    executors, TutorBot, and embedding adapters.
  • Added Markdown display tests for prose citations, fenced code, inline code,
    and explicit backticked citations.
  • Added Matrix dependency split tests to keep default installs free of
    matrix-nio[e2e].
  • Re-ran targeted Python tests, web node tests, Ruff checks, and diff whitespace
    validation for the release patch.

Upgrade Notes

  • If you run remote Docker with AUTH_ENABLED=false, no extra CORS setting is
    required for normal HTTP/HTTPS browser origins.
  • If you run a shared or authenticated deployment with AUTH_ENABLED=true, set
    CORS_ORIGIN or CORS_ORIGINS to the exact frontend origin(s), for example
    https://learn.example.com.
  • Use DISABLE_SSL_VERIFY=true only for local, self-signed, or air-gapped test
    LLM endpoints. It remains blocked in ENVIRONMENT=prod and
    ENVIRONMENT=production.
  • Matrix installs are now non-E2EE by default. For encrypted Matrix rooms,
    install .[matrix-e2e] or requirements/matrix-e2e.txt, ensure libolm is
    present, and set e2ee_enabled=true in the Matrix channel config.
  • If you previously installed .[matrix] only to get non-encrypted Matrix
    messaging, reinstalling after this release should no longer require native
    libolm build tooling.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.9...v1.3.10

v1.3.9 Breaking risk
⚠ Upgrade required
  • Install or refresh the `.[tutorbot]` extra (or `requirements/tutorbot.txt`) to include `zulip>=0.8.0,<1.0.0`. Configure Zulip bots with site, email, apiKey, allowFrom, and groupPolicy; use mention for safer deployments.
  • If using `LLM_REASONING_EFFORT=minimal` with DeepSeek, DashScope, VolcEngine, BytePlus, or MiniMax, keep the setting; v1.3.9 translates it to the provider-specific disable payload.
  • Verify provider limits and expected prompt costs: large configured context windows may now be honored instead of being capped at 65,536 tokens.
Breaking changes
  • Context-window safety ceiling raised to 1,000,000 tokens; explicitly configured windows are no longer clamped to 65,536 tokens, which remains the large-model fallback.
Notable features
  • Zulip added as a TutorBot channel with mention/open policies and LaTeX/KaTeX conversion.
  • NVIDIA NIM provider support integrated into TutorBot configuration and registry detection.
Full changelog

DeepTutor v1.3.9 Release Notes

Release Date: 2026.05.09

v1.3.9 builds on the v1.3.8 multi-user foundation with broader TutorBot
deployment options, safer provider routing for thinking models, and a smoother
web onboarding path. It adds Zulip and NVIDIA NIM support, improves startup
ergonomics, and folds in the main issue fixes reported after the last release.

Highlights

TutorBot Channel and Provider Expansion

  • Zulip is now a TutorBot channel - bots can listen to private messages and
    stream topics, enforce allow_from, choose mention-only or open stream
    replies, and bridge Zulip's event queue into the async TutorBot bus.
  • Math and files work better in Zulip - LaTeX is converted to Zulip-friendly
    KaTeX markup, upload/download calls use configurable retry behavior, and
    attachment filenames include upload-path digests to avoid collisions.
  • Zulip topics keep conversations separated - stream topics now become part
    of the chat/session key, with a stable (no topic) fallback for empty topics.
  • TutorBot supports NVIDIA NIM - nvidia_nim is available in TutorBot
    provider config and registry detection, including NIM's streaming behavior
    that omits unsupported stream_options.
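
Topic-scoped session keys could be built roughly like this. The key layout and function name are illustrative assumptions; only the `(no topic)` fallback comes from the notes above.

```python
from typing import Optional

NO_TOPIC = "(no topic)"


def zulip_session_key(stream: str, topic: Optional[str]) -> str:
    """Build a per-topic session key (hypothetical format)."""
    # Empty or whitespace-only topics share one stable fallback bucket.
    normalized = (topic or "").strip() or NO_TOPIC
    return f"zulip:{stream}:{normalized}"
```

Two topics in the same stream then map to different sessions, while messages without a topic always land in the same one.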

Model and Runtime Reliability

  • Configured context windows are respected - the safety ceiling is raised to
    1,000,000 tokens while the large-model fallback remains 65,536, so explicit
    128K-style model settings are no longer silently clamped.
  • Qwen vision detection is fixed - Qwen VL models are treated as
    vision-capable across DashScope, OpenAI-compatible, and custom bindings.
  • Minimal thinking mode is provider-safe - DeepSeek, DashScope, VolcEngine,
    BytePlus, and MiniMax no longer receive a rejected top-level
    reasoning_effort=minimal; DeepTutor sends the provider-specific disable
    signal instead.
  • DeepSeek v4 costs are tracked - research token accounting includes
    deepseek-v4-flash and deepseek-v4-pro pricing entries.
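
The context-window behavior can be summarized in a few lines. This is a simplified sketch: the function name is hypothetical, and it assumes the fallback applies only when no window is configured.

```python
from typing import Optional

CONTEXT_WINDOW_CEILING = 1_000_000  # safety ceiling as of v1.3.9
LARGE_MODEL_FALLBACK = 65_536       # used when nothing is configured


def effective_context_window(configured: Optional[int]) -> int:
    """Honor an explicit setting up to the ceiling, else use the fallback."""
    if configured is None or configured <= 0:
        return LARGE_MODEL_FALLBACK
    return min(configured, CONTEXT_WINDOW_CEILING)
```

Under these assumptions a 128K setting (`131072`) passes through unchanged, whereas it would previously have been clamped to 65,536.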

Web and CLI Polish

  • deeptutor start launches the full web stack - the CLI now delegates to
    scripts/start_web.py so backend and frontend can be started from one
    command, and launcher failures propagate through the CLI exit code.
  • Sidebar onboarding is clearer - primary navigation icons now expose
    scoped, localized tooltips with descriptions and keyboard focus support.
  • Multi-line user messages stay readable - chat message rendering preserves
    Shift+Enter line breaks, fixing code blocks and structured prompts that were
    previously collapsed into one line.
  • Assigned resources are easier to understand - model-selection summaries
    and read-only knowledge-base actions now present clearer labels for
    non-admin, grant-scoped sessions.

Multi-User and Session Store Parity

  • Assigned model options match the selector contract - non-admin LLM choices
    now return profile names, model names, labels, and active/default metadata in
    the same shape expected by the web model selector.
  • PocketBase sessions support more chat flows - message metadata can be
    persisted, last-message lookup is available, and message deletion works with
    PocketBase string IDs as well as SQLite integer IDs.
  • Regenerate remains storage-neutral - turn retry logic can remove the last
    assistant message without assuming the backing session store uses integer
    primary keys.

Tests

  • Added Zulip channel coverage for config parsing, permission checks, duplicate
    filtering, mentions, stream topic scoping, attachment extraction, retry
    behavior, LaTeX conversion, typing status, sending, uploads, and startup
    failures.
  • Added TutorBot NVIDIA NIM provider tests for registry detection, schema
    acceptance, and streaming request compatibility.
  • Added LLM regression tests for Qwen vision capability, explicit context-window
    budgets, and minimal-thinking provider kwargs.
  • Added CLI coverage so deeptutor start propagates the launcher exit code.
  • Added research token-pricing coverage for the DeepSeek v4 model entries.

Upgrade Notes

  • Install or refresh the .[tutorbot] extra, or requirements/tutorbot.txt, to
    include the new zulip>=0.8.0,<1.0.0 dependency before enabling Zulip bots.
  • Configure Zulip bots with site, email, apiKey, allowFrom, and
    groupPolicy; use mention for safer stream deployments and open only
    when every stream message should reach the bot.
  • If you use LLM_REASONING_EFFORT=minimal with DeepSeek, DashScope,
    VolcEngine, BytePlus, or MiniMax, keep the setting as-is; v1.3.9 translates it
    to the correct provider-specific disable payload.
  • Large configured context windows may now be honored instead of capped at
    65,536 tokens, so verify provider limits and expected prompt-cost behavior.
  • Optional PocketBase deployments should ensure the messages collection has a
    metadata_json JSON field before relying on regenerate/session metadata
    parity.

What's Changed

  • fix: raise context_window ceiling and add qwen vision support by @wedone in https://github.com/HKUDS/DeepTutor/pull/442
  • fix: add deepseek-v4-flash and deepseek-v4-pro to model pricing table by @Starfie1d1272 in https://github.com/HKUDS/DeepTutor/pull/447
  • fix(llm): stop sending reasoning_effort=minimal as top-level param to providers that reject it by @Starfie1d1272 in https://github.com/HKUDS/DeepTutor/pull/453
  • feat: add deeptutor start command to launch backend and frontend together by @Starfie1d1272 in https://github.com/HKUDS/DeepTutor/pull/445
  • fix(web): preserve newlines in user chat messages by @kagura-agent in https://github.com/HKUDS/DeepTutor/pull/449
  • feat(tutorbot): add Zulip channel support by @wedone in https://github.com/HKUDS/DeepTutor/pull/452
  • feat: tooltips for sidebar by @philliplagoc in https://github.com/HKUDS/DeepTutor/pull/457
  • fix: add TutorBot NVIDIA NIM provider support by @Bortlesboat in https://github.com/HKUDS/DeepTutor/pull/455

New Contributors

  • @philliplagoc made their first contribution in https://github.com/HKUDS/DeepTutor/pull/457
  • @Bortlesboat made their first contribution in https://github.com/HKUDS/DeepTutor/pull/455

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.8...v1.3.9

v1.3.8 New feature
⚠ Upgrade required
  • Existing single‑user installs remain unchanged unless `AUTH_ENABLED=true` is set.
  • For shared deployments set `AUTH_ENABLED=true`, leave `POCKETBASE_URL` blank, register the first admin via `/register`, and assign models before regular users start chats.
  • Back up both `data/` and the new `multi-user/` directory before upgrading shared instances; multi-worker setups must bootstrap the first admin carefully because first-user promotion is protected by an in-process lock.
Notable features
  • Authenticated multi-user deployments with isolated per‑user workspaces
  • Admin‑managed access, model profile grants, and scoped knowledge bases
  • Integrated frontend auth routes (/login, /register) and comprehensive multi‑user documentation
Full changelog

DeepTutor v1.3.8 Release Notes

Release Date: 2026.05.08

v1.3.8 brings DeepTutor's optional multi-user mode into the main release line.
It keeps local single-user installs unchanged while adding authenticated shared
deployments with isolated user workspaces, admin-managed access, and clearer
deployment guidance.

Highlights

Multi-User Workspaces

  • Authentication can gate shared deployments - enabling AUTH_ENABLED
    adds login, registration, JWT sessions, and a first-user admin flow.
  • Each user gets isolated data - ordinary users work under
    multi-user/<uid>/ with separate chat history, memory, notebooks, and
    knowledge bases, while admins keep the main workspace.
  • Admin grants control access - /admin/users lets admins create users and
    assign allowed model profiles, knowledge bases, skills, and copied spaces
    without exposing API keys.

Safer Runtime Boundaries

  • Knowledge and RAG stay scoped - assigned knowledge bases are visible with
    badges, and non-admin RAG calls no longer fall back silently to admin data.
  • Model routing honors grants - non-admin chat turns use an assigned model
    profile and fail early if no LLM is available.
  • Settings are redacted for users - non-admin settings show theme, language,
    and model summaries, while provider secrets and endpoints remain admin-only.

Deployment and UI

  • Frontend auth routes are included - /login, /register, auth-aware
    middleware, logout controls, and admin navigation are wired into the web app.
  • Multi-user docs are now first-class - README and translated READMEs
    document setup, workspace layout, audit logs, env vars, and production
    caveats.
  • Optional PocketBase remains documented - PocketBase can still be used as a
    sidecar path, but true multi-user deployments should leave POCKETBASE_URL
    unset and use the built-in JSON/SQLite backend.

Tests

  • Added multi-user tests for identity migration, first-admin registration,
    grants, settings restrictions, scoped interface preferences, skill access, and
    RAG fallback prevention.
  • Added status-redaction coverage so non-admin users do not receive provider
    model or search endpoint details.

Upgrade Notes

  • Existing local installs stay in single-user mode unless AUTH_ENABLED=true.
  • For real multi-user deployments, set AUTH_ENABLED=true, keep
    POCKETBASE_URL blank, create the first admin through /register, and assign
    models before ordinary users start chat turns.
  • New deployment state is stored under multi-user/; back up both data/ and
    multi-user/ before upgrading shared instances.
  • Multi-worker deployments should bootstrap the first admin carefully because
    first-user promotion is protected by an in-process lock.
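
The multi-worker caveat follows from how an in-process lock works. The sketch below uses a hypothetical in-memory registry (not DeepTutor's storage layer) to show the pattern.

```python
import threading


class UserRegistry:
    """Toy registry: the first registered user is promoted to admin."""

    def __init__(self) -> None:
        # The lock lives in this process's memory only.
        self._lock = threading.Lock()
        self._users: list[tuple[str, str]] = []

    def register(self, name: str) -> str:
        with self._lock:
            # Check-and-promote is atomic across threads in ONE process.
            role = "admin" if not self._users else "user"
            self._users.append((name, role))
            return role
```

Each worker process would hold its own copy of such a lock, so two simultaneous first registrations handled by different workers are not serialized against each other. Creating the first admin before opening registration, or while running a single worker, avoids that window.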

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.7...v1.3.8

v1.3.7 New feature
⚠ Upgrade required
  • Set `LLM_REASONING_EFFORT` in `.env` for global thinking control; leave empty to auto-detect.
  • Knowledge-base metadata may now include `last_indexed_at`, `last_indexed_count`, and `last_indexed_action`.
  • Co-Writer clear/template actions are recoverable through undo until the user leaves the current draft.
Notable features
  • Thinking-model compatibility: reasoning output kept separate, DeepSeek effort configurable via `LLM_REASONING_EFFORT`, custom gateway headers preserved, and structured generation more tolerant.
  • Knowledge index visibility: activity recorded with timestamps, counts, actions; UI shows history panels.
  • Co-Writer editing safety: confirmation dialogs before clearing or replacing a non-empty draft, more dependable undo/redo (Ctrl/Cmd+Z, Shift+Cmd+Z, Ctrl/Cmd+Y), clearer toolbar controls.
Full changelog

DeepTutor v1.3.7 Release Notes

Release Date: 2026.05.04

v1.3.7 focuses on thinking-model compatibility, clearer knowledge-base index
history, and safer Co-Writer editing. It keeps provider-specific reasoning
output under control while making index activity easier to understand in the UI.

Highlights

Thinking-Model and Gateway Compatibility

  • Reasoning output stays separate - OpenAI-compatible and TutorBot providers
    keep reasoning_content out of visible answer text, and streaming avoids
    replaying internal scratchpad as final content.
  • DeepSeek thinking can be configured from .env - LLM_REASONING_EFFORT
    is documented and applied through the resolver path. Use minimal to disable
    DeepSeek thinking, or high / max to enable it.
  • Custom gateway headers are preserved - chat and explicit LLM calls inherit
    profile extra_headers, fixing gateways that require custom headers such as
    a User-Agent override.
  • Structured generation is more tolerant - book blocks and question ideation
    now handle fenced, repaired, list-shaped, or otherwise imperfect JSON outputs
    more reliably.
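
Tolerant parsing of structured model output might look like the sketch below. It covers only the fenced and list-shaped cases, not JSON repair, and the helper name is an assumption for illustration.

```python
import json
import re

FENCE = re.compile(r"```(?:json)?\s*(.*?)```", re.DOTALL)


def parse_items(text: str) -> list:
    """Extract JSON from raw or fenced model output as a list of items."""
    match = FENCE.search(text)
    # Prefer the fenced payload when the model wrapped its answer.
    payload = match.group(1) if match else text
    data = json.loads(payload.strip())
    if isinstance(data, list):
        return data
    if isinstance(data, dict):
        # Normalize a single object to a one-element list.
        return [data]
    raise ValueError("expected a JSON object or array")
```

Callers can then iterate over the result without caring whether the model returned one object, a list, or a fenced block with surrounding prose.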

Knowledge Index Visibility

  • Index activity is recorded - create, upload, and re-index flows now store
    last_indexed_at, indexed document count, and the index action in knowledge
    metadata.
  • Progress payloads describe real index changes - backend status updates can
    distinguish metadata-only completion from an actual vector-index update.
  • The Knowledge UI shows index history - detail, settings, and index-version
    panels display the latest index time and document count when available.
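
Recording index activity only when the vector index really changed could be sketched as follows. The function name, the `index_changed` flag, and the ISO timestamp format are assumptions; the three metadata keys come from the upgrade notes.

```python
from datetime import datetime, timezone


def record_index_activity(
    metadata: dict, action: str, document_count: int, index_changed: bool
) -> dict:
    """Return updated knowledge-base metadata without mutating the input."""
    updated = dict(metadata)
    if not index_changed:
        # Metadata-only completion: leave the last_indexed_* fields alone.
        return updated
    updated["last_indexed_at"] = datetime.now(timezone.utc).isoformat()
    updated["last_indexed_count"] = document_count
    updated["last_indexed_action"] = action
    return updated
```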

Co-Writer Editing Safety

  • Clear and template actions ask first - replacing a non-empty draft now
    opens a confirmation dialog before the editor is cleared or overwritten.
  • Undo is more dependable - pending typing snapshots are committed before
    toolbar edits, and editor shortcuts support Ctrl/Cmd+Z, Shift+Cmd+Z, and
    Ctrl/Cmd+Y.
  • Toolbar controls are clearer - destructive and template actions now have
    distinct tones, focus states, labels, and accessible tooltips.

Tests

  • Added OpenAI-compatible provider tests to keep reasoning_content separate
    from visible response content in both service and TutorBot paths.
  • Expanded LLM factory tests for inherited extra_headers, inherited
    reasoning_effort, and reasoning-only streaming behavior.
  • Added knowledge manager coverage for recording last_indexed_* metadata only
    when the index actually changes.

Upgrade Notes

  • Set LLM_REASONING_EFFORT in .env if you need global thinking control.
    Leave it empty to let DeepTutor auto-detect behavior from the active model.
  • Knowledge-base metadata may now include last_indexed_at,
    last_indexed_count, and last_indexed_action.
  • Co-Writer clear/template actions are recoverable through undo until the user
    leaves the current draft.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.6...v1.3.7

v1.3.6 New feature
⚠ Upgrade required
  • Send `llm_selection` as `{"profile_id":"...","model_id":"..."}` for explicit model routing; omission falls back to system default.
  • TutorBot configs may include `llm_selection`; legacy `model` overrides remain supported.
  • Configure launch ports via `.env` or environment variables (`BACKEND_PORT`, `FRONTEND_PORT`); old `data/user/settings/env.json` port block is ignored.
Notable features
  • Unified chat turns now carry `profile_id`/`model_id` via WebSocket payload and session preferences for explicit model targeting.
  • Settings endpoint returns safe, credential‑omitted provider/model options; shared UI selector used by Chat and TutorBot.
  • TutorBot can persist its LLM selection and reload it without a full restart; bot history assembly is also steadier.
Full changelog

DeepTutor v1.3.6 Release Notes

Release Date: 2026.05.03

v1.3.6 focuses on making model routing explicit across DeepTutor. Users can
choose configured LLM profiles from chat and TutorBot flows, runtime services
resolve those choices without leaking provider secrets, and RAG/knowledge-base
index handling is more defensive when persisted embeddings are invalid.

Highlights

Catalog-Based Model Selection

  • Chat can target a configured model - unified chat turns now carry a
    profile_id and model_id selection through the WebSocket payload, session
    preferences, turn snapshots, and regenerate flows.
  • Settings exposes safe LLM options - the new settings options endpoint
    returns display-ready provider/model choices while omitting credentials and
    connection secrets from the response.
  • Runtime model overrides are scoped per turn - selected profiles are
    resolved through the provider catalog for the active request without writing
    temporary choices back to disk or changing global defaults.
  • Model-selector UI is shared - chat and TutorBot screens use the same
    configured-model selector, with localized labels and system-default handling.
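
Validating a per-turn selection against a catalog can be illustrated like this. The catalog shape (profile ID mapped to a list of model IDs) and the function name are hypothetical; only the `profile_id` / `model_id` fields come from the release notes.

```python
from typing import Optional


def resolve_llm_selection(
    selection: Optional[dict], catalog: dict
) -> Optional[tuple]:
    """Return (profile_id, model_id), or None to use the system default."""
    if not selection:
        return None
    profile_id = selection.get("profile_id")
    model_id = selection.get("model_id")
    # Reject anything that is not a configured profile/model pair.
    if model_id not in catalog.get(profile_id, ()):
        raise ValueError(f"Unknown LLM selection: {profile_id!r}/{model_id!r}")
    # Nothing is written back: the choice applies to this turn only.
    return profile_id, model_id
```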

TutorBot Model Control

  • Bots can persist model selections - TutorBot create/update flows now accept
    llm_selection, validate it against the configured catalog, and store it with
    each bot.
  • Running bots can reload their LLM - changing a bot's model updates the
    active agent loop instead of requiring a full bot restart.
  • Recent bot history is steadier - TutorBot history assembly now sorts by
    message timestamp with stable tie-breaking before taking the latest context.
  • Bot chat route changes are cleaner - the web chat page cancels in-flight
    bot requests and resets transient reasoning state when switching bots.
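
Sorting history by timestamp with a stable tie-breaker can be sketched in a few lines. The message shape (a dict with a `timestamp` key) and the use of arrival order as the tie-breaker are assumptions for the example.

```python
def latest_messages(messages: list, limit: int) -> list:
    """Return the most recent `limit` messages in chronological order."""
    if limit <= 0:
        return []
    indexed = list(enumerate(messages))
    # Ties on timestamp fall back to arrival order, so the result is
    # deterministic even when several messages share a timestamp.
    indexed.sort(key=lambda pair: (pair[1]["timestamp"], pair[0]))
    return [message for _, message in indexed[-limit:]]
```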

RAG and Knowledge Reliability

  • Invalid vectors trigger rebuilds - re-indexing no longer treats a matching
    document signature as reusable when the existing vector store fails embedding
    validation.
  • Full rebuilds use fresh version directories - complete knowledge-base
    rebuilds write to a new flat index version while leaving failed old storage
    available for inspection.
  • RAG tool logs can stream to clients - retrieval runs can forward captured
    INFO-level process logs as raw tool events when an event sink is available.
  • Knowledge health checks recognize bad embeddings - invalid persisted
    vectors are surfaced earlier instead of producing opaque search failures.

Provider and Launch Fixes

  • OpenAI Responses token limits are normalized - Responses API calls now map
    chat-style max_completion_tokens and max_tokens to max_output_tokens,
    fixing the SDK error reported for newer OpenAI models in #437.
  • Azure and OpenAI-compatible paths share the mapping - both streaming and
    non-streaming Responses API routes use the same conversion helper.
  • Launch ports come from .env and environment variables - setup and launch
    helpers now keep backend/frontend port behavior aligned around the project
    .env file instead of the older runtime settings JSON.
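
The token-limit normalization can be sketched as a small converter. The precedence order used here (`max_output_tokens`, then `max_completion_tokens`, then `max_tokens`) is an assumption, as is the helper name; the target parameter `max_output_tokens` is the one the Responses API expects.

```python
TOKEN_LIMIT_ALIASES = ("max_output_tokens", "max_completion_tokens", "max_tokens")


def to_responses_kwargs(kwargs: dict) -> dict:
    """Map chat-style token limits to max_output_tokens on a new dict."""
    converted = {
        key: value
        for key, value in kwargs.items()
        if key not in TOKEN_LIMIT_ALIASES and value is not None
    }
    for alias in TOKEN_LIMIT_ALIASES:
        value = kwargs.get(alias)
        if value is not None:
            # The first non-None alias wins; the rest are dropped.
            converted["max_output_tokens"] = value
            break
    return converted
```

Building a fresh dictionary keeps the caller's kwargs untouched, which matters when the same kwargs are reused for a chat-completions fallback.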

Web UX Polish

  • Skill names validate before save - the Skills editor slugifies names,
    flags invalid input inline, and prevents silent API failures for uppercase
    letters, spaces, underscores, or other unsupported characters.
  • Skill editor modals are opaque across themes - the editor now uses the
    page background token, avoiding text bleed-through in translucent themes.
  • Space navigation is easier to scan - Space mini-navigation, notebook,
    question-bank, skills, and session-list spacing were tightened with clearer
    card and divider treatment.
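
Skill-name normalization could follow the pattern below, using the rule stated in this release: lowercase slugs of up to 64 characters made of letters, numbers, and hyphens. The real editor runs in the web frontend; this Python sketch, its function names, and the ban on leading or trailing hyphens are illustrative assumptions.

```python
import re

MAX_SLUG_LENGTH = 64
SLUG = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def slugify_skill_name(name: str) -> str:
    """Lowercase the name and collapse unsupported characters to hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.strip().lower()).strip("-")
    # Truncate, then trim a hyphen that truncation may have left at the end.
    return slug[:MAX_SLUG_LENGTH].strip("-")


def is_valid_skill_name(name: str) -> bool:
    """Check a name against the slug rule without modifying it."""
    return len(name) <= MAX_SLUG_LENGTH and SLUG.fullmatch(name) is not None
```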

Tests

  • Added model-selection service tests for safe option listing, active markers,
    invalid profile/model rejection, and non-mutating catalog overrides.
  • Added unified WebSocket turn-runtime tests for persisted LLM selections,
    invalid selections, model switching, snapshots, and regenerate behavior.
  • Added TutorBot API and manager tests for llm_selection persistence,
    validation, runtime reload, and default-model behavior.
  • Added settings, provider-runtime, and LLM-config tests for scoped catalog
    selection and per-turn config precedence.
  • Added RAG and knowledge-router tests for invalid vector stores, re-index
    rebuild decisions, and storage version resolution.
  • Added OpenAI Responses converter tests for token-limit aliases, precedence,
    None filtering, and input immutability.
  • Added frontend slug tests for skill-name normalization and validation.

Upgrade Notes

  • Chat and TutorBot clients that want explicit model routing should send
    llm_selection as { "profile_id": "...", "model_id": "..." }. Omitting it
    continues to use the configured system default.
  • TutorBot configuration files may now contain llm_selection. Existing bot
    configs without that field continue to load, and legacy model values remain
    usable as model-name overrides.
  • Launch ports should be configured in .env or process environment variables
    (BACKEND_PORT / FRONTEND_PORT). The old data/user/settings/env.json
    port block is no longer used as a launch-port source.
  • Knowledge bases with stale or invalid persisted vectors may rebuild on the
    next re-index even when document signatures have not changed.
  • Skill names are now normalized and validated as lowercase slugs of up to 64
    characters using letters, numbers, and hyphens.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.5...v1.3.6

v1.3.5 Breaking risk
⚠ Upgrade required
  • Update Node.js installation to version 20.9 or newer for local web installs.
  • `start_web.py` and setup helpers now read `data/user/settings/env.json`/`interface.json` first; adjust settings via the Settings page or rerun `start_tour.py` when changing ports.
  • Local OpenAI-compatible embedding servers should use an empty API key unless a real key is required; do not rely on the placeholder `sk-no-key-required`, which is no longer injected or sent as an auth header.
Breaking changes
  • Minimum Node.js version increased to 20.9
  • `start_web.py` now prioritizes `data/user/settings/env.json` and `interface.json` over `.env` for runtime settings
Notable features
  • Setup Tour writes backend/frontend ports into `data/user/settings/env.json` for consistent later launches
  • RAG tool calls reject empty queries early, and the chat pipeline can fall back to the user's original question when the model omits the query
Full changelog

DeepTutor v1.3.5 Release Notes

Release Date: 2026.05.02

v1.3.5 focuses on making local setup and knowledge-base chat more reliable. The
launcher now follows the same runtime settings users configure in the web app,
RAG tool calls are stricter about real search queries, and local embedding
servers no longer receive placeholder auth headers.

Highlights

Smoother Local Launch

  • Setup Tour writes launch ports - the guided installer now records backend
    and frontend ports in data/user/settings/env.json, so later launches can use
    the same choices.
  • start_web.py reads runtime settings first - backend/frontend ports and UI
    language come from web settings when available, with .env kept as fallback.
  • Cleaner process handling - the launcher records started processes, detects
    port conflicts, waits for readiness, and exposes scripts/stop_web.py for
    cleaning up recorded backend/frontend processes.
  • Setup requirements are clearer - README and environment examples now align
    around Node.js 20.9+, install profiles, complete embedding endpoint URLs, and
    optional attachment storage.

More Reliable RAG Tool Calls

  • RAG queries must be non-empty - tool schemas, prompts, and built-in checks
    now reject blank queries early instead of passing empty input into retrieval.
  • Chat-side fallback is safer - when a model omits the RAG query, the agentic
    pipeline can reuse the user's actual question as the retrieval query.
  • ReAct calls accept simple string input - rag actions that provide a
    string are normalized to {"query": ...}, reducing fragile tool-call failures.
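
The three behaviors above (string normalization, fallback to the user's question, and early rejection) can be combined in one small helper. The function name and error message are hypothetical.

```python
from typing import Union


def normalize_rag_args(action_input: Union[str, dict, None], user_question: str) -> dict:
    """Return RAG tool arguments with a guaranteed non-empty query."""
    if isinstance(action_input, str):
        # Plain-string actions become {"query": ...}.
        args = {"query": action_input}
    else:
        args = dict(action_input or {})
    # Fall back to the user's actual question when the model omitted the query.
    query = str(args.get("query") or "").strip() or user_question.strip()
    if not query:
        raise ValueError("RAG query must be non-empty")
    args["query"] = query
    return args
```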

Local Embedding Compatibility

  • No fake API key for local embedding providers - runtime config no longer
    injects sk-no-key-required for local embedding servers.
  • Placeholder keys are not sent as auth headers - OpenAI-compatible
    embedding requests suppress Authorization and api-key when the configured
    key is the local placeholder, which helps LM Studio, Ollama, vLLM, and similar
    servers.
  • Embedding examples are easier to follow - English and Chinese sample env
    files now explain that EMBEDDING_HOST is the exact endpoint DeepTutor calls.
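
Header suppression for local embedding servers might be implemented along these lines. The helper name and the exact header values are assumptions; the placeholder string and the two header names are taken from the notes above.

```python
from typing import Optional

LOCAL_PLACEHOLDER_KEY = "sk-no-key-required"


def embedding_auth_headers(api_key: Optional[str]) -> dict:
    """Build auth headers, sending nothing for empty or placeholder keys."""
    key = (api_key or "").strip()
    if not key or key == LOCAL_PLACEHOLDER_KEY:
        # Local servers such as Ollama or LM Studio get no auth headers.
        return {}
    return {"Authorization": f"Bearer {key}", "api-key": key}
```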

Web UX Polish

  • Dark-mode provider dropdown is readable - the Settings provider selector
    now uses the theme background token, fixing the white native dropdown popover
    reported on Edge/Chromium.
  • Settings controls are more consistent - select fields and setup tour
    spotlight behavior were tightened for a steadier settings experience.
  • Book reference payloads are normalized more defensively - selected book
    references keep the same behavior with cleaner filtering and deduplication.

Tests

  • Added launch settings tests for runtime settings precedence, .env fallback,
    and invalid-port handling.
  • Added start_web.py tests for translation, state persistence, and recorded
    process matching.
  • Added Setup Tour coverage for dependency profiles, Math Animator selection,
    Node.js version validation, and saved launch ports.
  • Added RAG/tool tests for non-empty query schemas, blank-query rejection, and
    fallback query behavior.
  • Added embedding runtime and adapter tests for local providers, placeholder API
    keys, and auth header suppression.

Upgrade Notes

  • Local web installs now require Node.js 20.9 or newer.
  • start_web.py and setup helpers prefer data/user/settings/env.json and
    interface.json over .env; edit the web Settings page or rerun
    start_tour.py when changing launch ports.
  • Local OpenAI-compatible embedding servers should use an empty API key unless a
    real key is required. Avoid relying on sk-no-key-required as a transmitted
    credential.
  • Custom RAG callers should always provide a non-empty query; blank queries now
    fail fast by design.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.4...v1.3.5

v1.3.4 New feature
⚠ Upgrade required
  • Refresh dependencies after upgrading; CLI extra now requires defusedxml>=0.7.1 for Office document extraction.
  • Custom WebSocket clients should pass `book_references` and `language` on turn start messages.
  • Set `LLM_REASONING_EFFORT` to tune global reasoning effort if using reasoning models.
Notable features
  • Book page chat sessions persist per page via the new page-chat-session API
  • Books can be rebuilt from an approved spine, clearing content while keeping the outline
  • Regular chat turns can attach and cite selected book pages as context
Full changelog

DeepTutor v1.3.4 Release Notes

Release Date: 2026.05.01

v1.3.4 turns the Book Engine and chat workspace into a tighter learning loop:
book pages can now carry their own persistent chat sessions, books can be
rebuilt from an existing spine, and regular chat turns can cite selected book
pages alongside Space context. This release also improves language consistency,
DeepSeek-style reasoning output handling, document extraction for RAG, logging
infrastructure, and the public documentation around DeepTutor's arXiv paper.

Highlights

Book Engine, Page Chat, and Book References

Book generation and reading now preserve more of the user's context and make it
easier to iterate on a generated book without starting over.

  • Book page chat uses the unified stream protocol - the page chat panel now
    uses the shared WebSocket client and stream-event renderer used by the main
    chat workspace, so tool output, assistant events, attachments, and restored
    session history behave consistently.
  • Page chat sessions are persisted per book page - each page can be bound to
    a chat session_id through the new page-chat-session API, and reopening the
    reader restores the page's conversation when available.
  • Books can be rebuilt from the approved spine - the new rebuild flow clears
    generated page content and progress while keeping the confirmed outline, then
    restarts compilation from that structure.
  • Single-page regeneration keeps learner notes - forced recompilation can
    reset generated content while preserving user-authored note blocks and key
    transition metadata.
  • Regular chat can cite book pages - the chat composer can attach selected
    books and pages as request context, persist them in the turn snapshot, restore
    them when sessions hydrate, and show them as removable context chips.
  • Book context is cleaner for reasoning models - selected book pages are
    converted into bounded text references with thinking tags stripped before they
    are injected into chat or page-side conversations.

Chat Language and Reasoning-Model Behavior

Chat turns now follow the user's current language setting more reliably and are
more tolerant of providers that return reasoning content differently.

  • Language is part of each chat turn - WebSocket requests can carry the
    current language, and both agentic chat and classic chat append explicit
    language instructions so answers match the active UI language.
  • Regenerate and Answer Now use the current app language - new turns and
    regenerated turns read the latest stored language instead of relying only on
    older session preferences.
  • DeepSeek-style empty-content responses recover better - OpenAI-compatible
    providers can fall back to reasoning_content when a model returns an empty
    visible content field.
  • Book block writing can tune reasoning effort - LLM-backed book block
    generation now passes reasoning_effort, and structured JSON retries can
    lower effort when reasoning-heavy models fail to return parseable JSON.

RAG, Documents, and Knowledge Base Recovery

Document ingestion now uses the same extraction path across more file types and
keeps re-indexing available in more recovery states.

  • Office files route through parser extraction - .xlsx and .pptx files
    now join PDF and DOCX in the parser-backed routing path, with spreadsheet and
    presentation categories available to downstream RAG logic.
  • LlamaIndex loading uses shared document extraction - parser-routed files
    are read through extract_text_from_path(), use file-type-specific size
    limits, and avoid unnecessary character truncation during indexing.
  • DOCX extraction has a safer fallback path - when python-docx cannot read
    a file, the extractor can parse OOXML content through defusedxml instead of
    failing immediately.
  • Knowledge Base re-index controls are less fragile - the web UI can expose
    re-index actions for error and mismatch states without requiring an already
    initialized RAG runtime, as long as source documents are available.
  • Scanned or empty documents fail more clearly - extraction and validation
    now distinguish byte limits, character limits, empty parsed content, and
    unsupported parser results more consistently.

Settings, Runtime State, and Logging

This release continues the infrastructure cleanup needed for long-running local
and server deployments.

  • Settings shows clearer runtime state - backend, LLM, embedding, and search
    status are displayed as service cards with online state, timestamps, runtime
    model details, and pending-apply indicators.
  • LLM_REASONING_EFFORT is configurable - reasoning effort can be supplied
    through environment configuration and is included in runtime summaries.
  • Logging uses the standard logger surface - routers, agents, providers, RAG
    code, and TutorBot integrations move away from the old custom logger module
    toward standard logging.getLogger(__name__) usage plus a focused Loguru
    bridge.
  • Raw RAG debug log forwarding is quieter - the RAG service no longer
    forwards low-level logging handler output into user-facing event streams by
    default.
  • CI and lint coverage were refreshed - workflow and test changes cover the
    logging configuration path, process-log streaming, and lint consistency.

Documentation and Localization

The project documentation now reflects the paper release and keeps localized
README files aligned with the latest release cadence.

  • The arXiv paper is linked from the README - the main README badge and News
    section now point to 2604.26962.
  • Localized READMEs were refreshed - translated README files include the
    latest release list, arXiv/news updates, and the expanded language navigation.
  • Book, chat, settings, and rebuild copy is localized - English and Chinese
    app strings now cover the new Book chat, rebuild, language, attachment, and
    runtime-state surfaces.

Tests

  • Added chat-language prompt coverage for per-turn language directives and
    language-aware agentic chat behavior.
  • Added Book Engine coverage for book context extraction, page-chat session
    binding, rebuild controls, forced page recompilation, and LLM JSON writing.
  • Added RAG and document-loader coverage for parser-routed files, Office
    extraction paths, file-size limits, and re-index eligibility helpers.
  • Added provider/runtime coverage for LLM_REASONING_EFFORT, OpenAI-compatible
    reasoning fallback behavior, and provider runtime summaries.
  • Added logging tests for configuration, context propagation, Loguru bridging,
    process-log extraction, and task log streaming.
  • Updated frontend tests for document attachment handling, version reporting,
    and Knowledge Base re-index helper behavior.

Upgrade Notes

  • CLI and server installs should refresh dependencies after upgrading. The CLI
    extra and requirements/cli.txt now include defusedxml>=0.7.1 for safer
    XML parsing during Office document extraction.
  • Custom WebSocket clients can pass book_references and language on turn
    start messages. Clients that persist request snapshots should store book
    references alongside notebooks, history, skills, memory, and attachments.
  • Deployments that use reasoning models can set LLM_REASONING_EFFORT to tune
    reasoning effort globally; per-profile and per-model values remain available
    as lower-priority fallbacks.
  • Integrations that consumed raw RAG debug log events should rely on structured
    status and tool events instead of low-level forwarded logger output.
  • Book clients should call the new page-chat-session and rebuild APIs when they
    need page-level conversation persistence or spine-preserving regeneration.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.3...v1.3.4

v1.3.3 New feature
⚠ Upgrade required
  • SQLite session databases are migrated with a new `messages.metadata_json` column on first open
  • Custom WebSocket clients must now explicitly send `memory_references: ["summary"]`, `[\
Notable features
  • NVIDIA NIM becomes a first‑class LLM provider with auto‑detection and usage‑metering disabled
  • Gemini embeddings are fully integrated end‑to‑end (model gemini-embedding-001, 3072 dimensions)
  • Space unifies chat history, notebooks, question‑bank items, skills, and memory into a single context picker
Full changelog

DeepTutor v1.3.3 Release Notes

Release Date: 2026.04.30

v1.3.3 is a fast follow-up release after v1.3.2. It expands provider coverage
with NVIDIA NIM and Gemini embeddings, makes Space the unified place to attach
chat history, notebooks, question-bank items, skills, and memory to a turn, and
continues the stability work around RAG re-indexing, thinking-model cleanup,
TutorBot history, and persisted session context.

Highlights

Provider and Embedding Coverage

DeepTutor now covers more hosted provider setups out of the box and keeps the
runtime configuration path aligned with the Setup Tour and .env examples.

  • NVIDIA NIM is a first-class LLM provider - provider auto-detection now
    recognizes nvapi- keys and NVIDIA API bases, defaults to
    https://integrate.api.nvidia.com/v1, and avoids sending
    stream_options.include_usage because NIM can hang when that option is
    present.
  • Gemini embeddings are available end to end - embedding runtime metadata,
    endpoint validation, Setup Tour choices, model suggestions, and .env
    examples now include Gemini, with gemini-embedding-001, 3072 dimensions, and
    GEMINI_API_KEY fallback support.
  • Provider-specific embedding keys survive Settings writes - .env writes
    preserve keys such as SiliconFlow, DashScope, Cohere, Jina, and Gemini instead
    of only preserving the older core provider set.
  • Dependency resolution is less fragile - the NumPy upper bound was relaxed
    to support current Manim installs in deeptutor[all], and Windows setup docs
    now call out the Visual Studio Build Tools / C++ workload prerequisite.

Space, Chat Context, Skills, and Memory

The chat composer now treats all learning context as Space context, instead of
splitting references, skills, and memory across separate controls.

  • Space opens on Chat History - the Space entry point now lands on the new
    Chat History page, where previous conversations can be searched, refreshed,
    renamed, deleted, and reopened directly from the Space workspace.
  • One Space menu powers toolbar and @ mentions - the old inline
    AtMentionPopup was replaced by a shared Space menu for chat history,
    notebooks, question-bank items, skills, and memory, whether opened from the
    toolbar or by typing @.
  • Skills selection is clearer - skills now open in a full picker with
    search, tags, explicit multi-select, and Auto mode, instead of a small inline
    dropdown beside the composer.
  • Memory can be attached per turn - users can select the running summary,
    profile, or both through the new Memory picker. The request sends
    memory_references, and the backend only injects the selected memory files.
  • Context chips show the full turn setup - selected history, notebooks,
    question-bank items, skills, and memory all appear as removable chips before
    send; sent user messages also show matching request-snapshot badges.
  • Answer Now and session hydration keep context - replayed turns and loaded
    sessions now hydrate notebooks, history references, question-bank references,
    skills, memory references, and attachments from persisted message metadata.

Session Persistence and Message Normalization

Conversation state now records more of the user's actual send-time context and
handles non-text message content more defensively.

  • Message metadata is persisted - the SQLite session store adds a
    metadata_json column and stores a request_snapshot for user messages,
    including capability, tools, selected KBs, language, config, attachments,
    Space references, skills, and memory selections.
  • WebSocket turns accept memory and skills explicitly - incoming payloads
    normalize memory_references to summary / profile, normalize skills into
    a string list, and materialize both into message metadata.
  • TutorBot history handles multimodal content - bot history and recent bot
    previews normalize string, array, object, and image-style content into safe
    display text, while internal reasoning_content is stripped from API
    responses.
  • Frontend message previews are safer - shared message-content utilities
    now accept unknown content, stringify custom objects, render image parts as
    [image], and truncate previews consistently across chat and session lists.

Memory, Notebook, and Thinking-Model Cleanup

The thinking-output cleanup introduced in v1.3.2 now reaches more durable
storage surfaces and rejects malformed memory rewrites before they can corrupt
profile or summary files.

  • Memory rewrites must match the expected shape - profile and summary
    refreshes now verify allowed section headings before writing. If a thinking
    model answers the user instead of returning structured memory, the write is
    rejected rather than persisted.
  • Memory context is explicit - build_memory_context() now only includes
    summary and/or profile when those files are requested, matching the new
    per-turn Memory picker and avoiding accidental default memory injection.
  • Notebook summaries are cleaned and repaired - notebook writes, streaming
    summary saves, and notebook loads strip thinking tags from summaries; older
    notebook records are repaired on read when possible.
  • Streaming summary chunks are cleaner - generated notebook summaries are
    assembled, cleaned, and emitted after cleanup, so empty or scratchpad-only
    chunks are not streamed to clients.

RAG and Knowledge Base Resilience

RAG validation now catches more invalid persisted indexes before retrieval and
returns clearer events when the user needs to re-index.

  • Stale processing KBs recover when an index is ready - if kb_config.json
    is stuck at processing or initializing but a ready LlamaIndex version is
    already on disk, Knowledge Base info reports ready and hides the stale
    progress bar instead of leaving the UI in a perpetual processing state.
  • More vector stores are validated - LlamaIndex storage now checks the
    default vector store, storage_context.vector_stores, and persisted
    *vector_store.json embedding dictionaries for null, dropped, non-numeric,
    non-finite, or inconsistent vectors.
  • Invalid-index failures emit user-facing status events - RAG search now
    sends a structured error status with needs_reindex through the tool event
    stream and avoids treating known invalid-index failures as successful
    retrieval attempts.
  • Low-level vector errors are less exposed - known invalid embedding/index
    failures are logged and surfaced as re-index guidance instead of raw
    NoneType * float style tracebacks in user-facing logs.

Tests

  • Added Knowledge Manager coverage for promoting stale processing /
    initializing status to ready when a valid index version already exists.
  • Added provider coverage for NVIDIA NIM registry metadata, stream-option
    behavior, Gemini embedding runtime defaults, endpoint validation, Setup Tour
    provider choices, and .env key preservation.
  • Added session and WebSocket coverage for metadata_json, request snapshots,
    skills normalization, memory reference parsing, and turn materialization.
  • Added memory and notebook coverage for thinking-tag stripping, invalid memory
    rewrite rejection, selective memory-context injection, and summary repair on
    read.
  • Added RAG/LlamaIndex coverage for multi-vector-store validation,
    disk-persisted invalid vectors, needs_reindex status events, and sanitized
    raw logs.
  • Added TutorBot and frontend message-content coverage for non-string,
    multimodal, object, image, and truncated message content.

Upgrade Notes

  • Existing SQLite session databases are migrated in place with a new
    messages.metadata_json column the first time the session store opens them.
  • Custom WebSocket clients that relied on implicit memory injection should now
    pass memory_references: ["summary"], ["profile"], or both. Empty or absent
    memory references intentionally mean "do not attach long-term memory".
  • Knowledge bases that still report invalid persisted vectors should be
    re-indexed after confirming the active embedding provider, model, dimension,
    and endpoint URL.
  • Notebook summary streaming clients should expect cleaned summary output after
    assembly rather than relying on every raw model chunk being forwarded.
  • NVIDIA NIM users should configure an OpenAI-compatible model under the new
    provider and keep stream_options.include_usage disabled for this gateway.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.2...v1.3.3

v1.3.2 Breaking risk
Breaking changes
  • Embedding configuration changed from base URLs to full endpoints; auto-migrated for known providers, custom gateways require manual configuration review
Security fixes
  • Prevents reasoning model scratchpad output from leaking into long-term memory
Notable features
  • Explicit embedding endpoint URLs with auto-migration for known providers
  • Invalid index detection with re-index guidance for RAG
  • Thinking tag removal from long-term memory
Full changelog

DeepTutor v1.3.2 Release Notes

Release Date: 2026.04.29

v1.3.2 is a focused stability release after v1.3.1. It tightens the embedding
endpoint contract, makes LlamaIndex RAG recover more cleanly from stale or
invalid indexes, and prevents reasoning-model scratchpad output from leaking
into long-term memory.

Highlights

Transparent Embedding Endpoint URLs

Embedding configuration is now explicit about the exact URL DeepTutor will call.
This removes the hidden "base URL vs endpoint URL" ambiguity that could make a
successful Settings test behave differently from a Knowledge Base re-index.

  • Settings now shows Endpoint URL for embeddings - the Web settings page
    labels embedding URLs as endpoint URLs and explains that DeepTutor posts to
    the visible URL exactly, without appending /embeddings or /api/embed at
    request time.
  • Provider defaults are full endpoints - OpenAI, OpenRouter, Jina, vLLM/LM
    Studio, and SiliconFlow default to /embeddings; Ollama defaults to
    /api/embed; Cohere defaults to /embed; DashScope keeps its native
    multimodal embedding endpoint.
  • Legacy base URLs are migrated safely - saved embedding profiles using
    old-style bases such as https://api.openai.com/v1,
    https://openrouter.ai/api/v1, or http://localhost:11434 are normalized to
    the full endpoint form and persisted back to the model catalog. Custom
    OpenAI-compatible URLs are left untouched.
  • Misconfigured endpoints fail early - the embedding client now rejects
    known-provider URLs that point to a root/base path instead of the real
    embedding endpoint, with an actionable message before indexing starts.
  • OpenRouter embedding uses exact-URL HTTP - public embedding providers no
    longer route through the OpenAI SDK's hidden path-appending behavior.
    custom_openai_sdk remains available for legacy configs, but is hidden from
    the Settings provider dropdown.
  • Connection-test diagnostics match runtime behavior - embedding tests now
    report "POSTed exactly as shown in Settings", matching the adapter behavior
    used by RAG indexing and retrieval.

RAG Re-index and Retrieval Resilience

The LlamaIndex pipeline now refreshes embedding state more aggressively and
turns invalid persisted vectors into clear re-index guidance instead of raw
Python or NumPy errors.

  • Cached pipelines pick up Settings changes - initialize, search, and
    incremental add paths reconfigure LlamaIndex before use, so a long-lived
    pipeline does not keep embedding model, dimension, or endpoint settings from
    an older Settings session.
  • Embedding clients refresh when config changes - the shared embedding
    client is recreated when the resolved runtime config changes, and the
    LlamaIndex CustomEmbedding adapter fingerprints the active config before
    reusing a cached client.
  • Persisted index vectors are validated before retrieval - LlamaIndex
    storage now checks the saved vector store for null, non-numeric, non-finite,
    dropped, or inconsistent vectors before running similarity search.
  • Invalid indexes return a re-index hint - known failures such as
    unsupported operand type(s) for *: 'NoneType' and 'float', vector shape
    mismatches, and newly detected invalid persisted vectors now return
    needs_reindex: true with a user-facing explanation.
  • Embedding connectivity checks use the same validation path - the
    pre-index smoke test validates provider output with the same batch validator
    used during indexing and retrieval.
  • RAG error logs are quieter when the fix is known - classified invalid
    embedding/index failures are logged as actionable warnings instead of noisy
    full tracebacks.

Memory Cleanup for Thinking Models

Memory refresh now strips private reasoning blocks before they can become
durable user memory.

  • Thinking tags are removed before writes - profile and summary rewrites run
    through the shared clean_thinking_tags() helper after code-fence cleanup, so
    <think> / <thinking> blocks from reasoning models are not saved into
    PROFILE.md or SUMMARY.md.
  • Existing memory files self-repair on read - if an older memory file
    already contains closed or unclosed thinking tags, reading the snapshot cleans
    the content and writes the repaired version back to disk when possible.
  • Manual memory edits use the same cleanup - direct memory writes also pass
    through the cleaner, keeping UI edits, refreshes, and runtime reads aligned.

Settings and Runtime Polish

  • Embedding provider choices are less confusing - Settings no longer offers
    the legacy custom_openai_sdk provider in the public dropdown, while existing
    saved profiles continue to resolve for backwards compatibility.
  • Model catalog normalization is persisted - catalog loads now save when
    normalization changes active profile/model IDs or embedding endpoint URLs,
    preventing the same migration from repeating on every startup.
  • OpenAI-compatible embedding errors are clearer - non-JSON or HTML
    embedding responses now point to wrong endpoint/model pairings without
    incorrectly suggesting only one gateway-specific cause.
  • Deep Solve ReAct calls are aligned again - the solver loop no longer
    passes a stale attachments keyword into SolverAgent.process(), avoiding a
    runtime TypeError while keeping attachment forwarding on the planner and
    replan calls where it is supported.

Tests

  • Added endpoint migration coverage for OpenAI, OpenRouter, Ollama, and custom
    embedding profiles.
  • Added Settings API coverage for full endpoint provider choices and hidden
    custom_openai_sdk.
  • Added embedding client coverage for endpoint validation, OpenRouter's raw HTTP
    adapter path, client refresh on config changes, and exact URL transparency.
  • Added LlamaIndex coverage for stale embedding-client refresh, repeated
    settings reconfiguration, invalid persisted vector detection, and re-index
    hints for invalid indexes.
  • Added memory coverage for closed and unclosed thinking tags, plus repair of
    existing memory files during reads.
  • Ran targeted Deep Solve/RAG capability tests covering solver runtime wiring
    after the stale attachments argument fix.

Upgrade Notes

  • Embedding URLs in Settings should now be full endpoint URLs. Existing known
    provider profiles are migrated automatically, but custom gateways should be
    reviewed manually if they use non-standard paths.
  • If Knowledge Base search still reports invalid embedding vectors, re-index the
    affected KB after confirming the active embedding provider, model, dimension,
    and endpoint URL.
  • Memory files containing old <think> blocks will be cleaned the next time the
    Memory page or memory service reads them; this read can update the underlying
    PROFILE.md or SUMMARY.md file to persist the cleaned version.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.1...v1.3.2

v1.3.1 Breaking risk
Notable features
  • Safer RAG routing: LLM can't override selected knowledge base, no silent fallbacks
  • Embedding validation: responses checked for null values, invalid dimensions, and shape mismatches before use
  • Docker improvements: memory volume mounting, memory migration safety, TutorBot auto-start preservation, IME-safe messaging
Full changelog

DeepTutor v1.3.1 Release Notes

Release Date: 2026.04.28

v1.3.1 is a stability release after v1.3.0. It focuses on safer RAG routing,
stronger embedding validation, more reliable Docker/TutorBot restarts, and a
set of small but important Web UX fixes.

Highlights

Safer RAG and Knowledge Base Routing

  • Selected KB is now trusted system state - chat RAG calls no longer let the
    LLM invent or override kb_name; the model only sees a query, and DeepTutor
    routes it to the KB selected in the UI/session.
  • No silent fallback when RAG has no KB - if RAG is enabled but no KB is
    selected, the turn skips KB retrieval with a clear progress message instead
    of accidentally querying stale/default state.
  • Default KB aliases are handled consistently - default, current,
    selected, and Chinese equivalents resolve to the configured default KB for
    tool calls, file listing, and re-index APIs, while a real KB named default
    still wins over the alias.
  • Incremental document adds use the service layer - adding files now goes
    through RAGService.add_documents, keeping add/search/re-index behavior on
    the same provider and index-version path.
  • RAG internals are easier to maintain - the LlamaIndex pipeline was split
    into focused modules for loading, embedding, storage, errors, and orchestration;
    smart multi-query retrieval now lives in SmartRetriever.

Embedding and Index Reliability

  • Embedding responses are validated before use - DeepTutor now rejects
    dropped vectors, null values, non-numeric values, non-finite values, and
    inconsistent dimensions before they reach LlamaIndex.
  • Connection tests probe batch behavior - embedding smoke tests now send a
    tiny batch, catching providers that only return one vector or change
    dimensions between inputs.
  • Bad indexes fail with actionable messages - null-vector or shape mismatch
    retrieval failures now return a re-index hint instead of exposing low-level
    NoneType * float style errors.
  • Index status is less noisy during writes - empty in-progress version
    directories no longer mark brand-new KBs as needs_reindex, and failed empty
    version folders are cleaned up when possible.
  • Embedding examples were clarified - services/embedding/.env.example
    now documents full endpoint URL semantics and concrete provider examples for
    OpenAI, SiliconFlow, Ollama, Cohere, Jina, vLLM, Azure OpenAI, DashScope, and
    OpenAI-SDK-style gateways.

Docker, Memory, and TutorBot Runtime

  • Docker images can expose the app version - APP_VERSION is passed through
    to both backend and frontend runtime environment variables.
  • Shared memory has its own Docker volume - compose files now mount
    ./data/memory:/app/data/memory; README persistence docs were updated.
  • Memory migration is safer - legacy SUMMARY.md and PROFILE.md files are
    copied into data/memory even when the target directory already exists, while
    existing target files are preserved.
  • Memory refreshes are serialized - concurrent profile/summary rewrites now
    run under a lock to avoid racing writes.
  • TutorBot restart intent is preserved - graceful server shutdown keeps each
    bot's auto_start flag intact for the next Docker/host restart, while manual
    stops still disable auto-start.
  • TutorBot shares long-term memory - started bot agents now receive the
    shared memory directory instead of running without it.

Web and Authoring UX Fixes

  • IME-safe message sending - chat composers no longer submit when Enter is
    being used to confirm Chinese/Japanese/Korean IME candidates.
  • Knowledge list stays fresh in chat - the chat page reloads KB metadata on
    focus, page show, and visibility changes, so newly created or re-indexed KBs
    appear without a full refresh.
  • Markdown preview protects pseudo-tags - unknown HTML-like tags such as
    <think> are escaped for display while preserving source line counts for
    editor/preview sync.
  • Co-Writer output strips reasoning tags - closed, aliased, attributed, and
    unclosed <think> / <thinking> blocks are removed from final edit output.
  • Theme initialization runs before hydration - the theme script is rendered
    from the server so dark/light preference is applied before React hydrates.
  • Knowledge UI feedback is clearer - index version chips have cleaner labels,
    failed empty active indexes are explained, and 404s from newer UI vs older
    Docker backend now suggest pulling/recreating the container.
  • Memory page feedback is clearer - refresh calls distinguish "checked, no
    long-term updates" from real failures.

Startup and Windows Robustness

  • CLI/server streams tolerate legacy code pages - startup scripts and API
    runners use replacement-safe text streams to avoid Unicode crashes on Windows
    locales such as GBK/CP936.
  • Child processes inherit safer encoding - web/tour startup paths set
    PYTHONIOENCODING=utf-8:replace.
  • Generated frontend env files are UTF-8 - start_web.py now writes
    web/.env.local with explicit UTF-8 encoding.
  • GHCR compose pulls fresher images - docker-compose.ghcr.yml now uses
    pull_policy: always for the published image.
  • Docs/locales were refreshed - README persistence notes were updated, the
    Polish README link was added, and English/Chinese UI copy gained missing
    labels for knowledge, memory, TutorBot soul templates, and Co-Writer drafts.

Tests

  • Added/updated coverage for chat RAG KB routing, default KB aliases, knowledge
    file/re-index APIs, embedding batch validation, LlamaIndex invalid-vector
    failures, in-progress index-version status, incremental add storage layout,
    TutorBot auto-start/shared memory, memory migration, Windows CLI encoding,
    IME keyboard handling, markdown tag escaping, and reasoning-tag cleanup.

Upgrade Notes

  • Docker users should pull and recreate the container so the Web UI and backend
    knowledge APIs stay in sync.
  • If a KB was indexed with a broken or changed embedding provider, use
    Re-index from the Knowledge page after fixing the embedding settings.
  • If you persist data manually, add the new shared memory mount:
    ./data/memory:/app/data/memory.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.0...v1.3.1

v1.3.0 Breaking risk
Notable features
  • Versioned knowledge base indexes with embedding-aware access patterns and background re-indexing API
  • Embedding adapters for OpenAI SDK, Aliyun DashScope, SiliconFlow, OpenRouter, and multimodal support
  • Knowledge management page split into dedicated tabs with file browser, re-index controls, and live task status
Full changelog

DeepTutor v1.3.0 Release Notes

Release Date: 2026.04.27

Highlights

Versioned Knowledge Base Indexes and Re-index Workflow

Knowledge bases now keep vector indexes per embedding configuration instead of treating one llamaindex_storage/ directory as the whole truth. Each new index version records an embedding signature and metadata, so switching models no longer has to overwrite the previous index, and switching back can reuse a ready version when the signature matches.

  • Versioned storage layout — new indexes are written as flat version-N/ directories with meta.json; legacy root and nested llamaindex_storage layouts remain readable for existing installs.
  • Embedding-aware reads and writes — RAG search, document additions, manager statistics, and delete/cleanup paths now resolve storage through the active embedding signature. If no ready version matches the current embedding model, retrieval returns a clear needs_reindex signal instead of failing later with an opaque storage error.
  • Background re-index API — POST /api/v1/knowledge/{kb_name}/reindex rebuilds a KB against the active embedding configuration, streams logs through the existing task channel, and returns noop: true when a matching ready index already exists.
  • Index status surfaced to the UI — KB summaries now include index-version metadata, active-match state, embedding mismatch flags, progress, and re-index readiness so the frontend can explain why a KB needs rebuilding.
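
As a rough sketch of the lookup described above, the snippet below resolves a ready index version by embedding signature. The meta.json field names ("signature", "status"), the hashing scheme, and both function names are assumptions made for illustration; the release notes only state that each version-N/ directory carries a meta.json and that a missing match yields a needs_reindex signal.

```python
import hashlib
import json
from pathlib import Path


def embedding_signature(provider: str, model: str, dimension: int) -> str:
    """Derive a short, stable signature for an embedding config (assumed scheme)."""
    payload = json.dumps(
        {"provider": provider, "model": model, "dimension": dimension},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def _version_number(path: Path) -> int:
    suffix = path.name.removeprefix("version-")
    return int(suffix) if suffix.isdigit() else -1


def resolve_index(kb_dir: Path, signature: str) -> dict:
    """Return the newest ready version matching the signature, else needs_reindex."""
    versions = sorted(kb_dir.glob("version-*"), key=_version_number, reverse=True)
    for version_dir in versions:
        meta_path = version_dir / "meta.json"
        if not meta_path.is_file():
            continue
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
        # "signature" and "status" are hypothetical field names.
        if meta.get("signature") == signature and meta.get("status") == "ready":
            return {"path": version_dir, "needs_reindex": False}
    return {"path": None, "needs_reindex": True}
```

Switching back to an earlier model would then find the older version-N/ directory again, because its stored signature matches the recomputed one.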

Knowledge Management Page Rebuild

The Knowledge page has been split from a monolithic screen into a focused master-detail workspace for day-to-day KB operations.

  • Dedicated KB detail tabs — Files, Add documents, Index versions, and Settings now live as separate sections with a compact header showing provider, embedding model, default status, update time, and live task status.
  • Raw file browser and inline preview — the new Files tab lists documents from the KB raw/ directory and opens them in an inline preview pane, reusing the chat preview pipeline for PDFs, images, Markdown, code/text, and fallback downloads. The file list can collapse to reclaim preview space.
  • Re-index controls and logs — the Index versions section shows active, stale, legacy, and inactive versions, with a one-click Re-index action and live process logs for rebuild tasks.
  • Progress and history hooks — useKnowledgeBases, useKnowledgeProgress, and useKnowledgeHistory merge server state with live WebSocket/SSE progress, auto-refresh active work, and keep recent create/upload/re-index outcomes visible.

Embedding Runtime, Provider Coverage, and Dimension Discovery

The embedding stack was tightened around provider-specific behavior instead of assuming every service behaves like OpenAI's default endpoint.

  • No more hard-coded 3072 default — embedding dimension now starts as unknown/empty and is auto-filled from the provider response after a successful test connection. The test probe deliberately avoids sending dimensions so it measures the model's native vector length before any Matryoshka truncation.
  • Full endpoint URL semantics — httpx-based embedding adapters now treat EMBEDDING_HOST / catalog URLs as the exact endpoint to call, while the new openai_sdk adapter keeps SDK-style /v1 base URL behavior. .env.example documents concrete endpoint examples for OpenAI, Cohere, Jina, Ollama, SiliconFlow, and Aliyun DashScope.
  • New embedding adapters and bindings — added official OpenAI SDK embedding support, Aliyun DashScope native multimodal embeddings, SiliconFlow presets, OpenRouter/custom OpenAI-SDK profiles, provider batch limits, multimodal provider flags, and per-provider API-key fallbacks.
  • Multimodal embedding requests — EmbeddingRequest now accepts structured contents plus DashScope enable_fusion; Cohere v2, Jina, OpenAI-compatible gateways, and DashScope handle multimodal payloads through their own adapter rules.
  • Better provider errors — embedding failures now preserve provider status, body, model, and URL context, with clearer handling for 4xx responses and non-JSON/HTML gateway replies.

LLM Reasoning Streams and Vision Attachment Robustness

Reasoning traces and image attachments now flow through more provider paths with less provider-specific surprise.

  • Reasoning deltas — on_reasoning_delta is wired through the base LLM provider contract, OpenAI-compatible streaming, Azure SDK streaming, Anthropic paths, and OpenAI Responses parsing. Streaming output wraps reasoning text in <think>...</think> before normal content resumes.
  • DeepSeek reasoning defaults — DeepSeek reasoning model patterns can auto-inject a high reasoning effort when the caller has not specified one, matching providers that require an explicit switch to surface thinking output.
  • Vision URL capability flags — provider/model capabilities now distinguish "supports vision" from "accepts image URLs." Moonshot/Kimi vision models and Anthropic-style adapters force local attachment URLs into inline base64 when possible.
  • Local attachment URL resolution — /api/attachments/... image URLs can be resolved back through the attachment store and sent as base64 to providers that reject the remote URL form; unresolved external URLs are counted as dropped image inputs instead of silently pretending they were sent.
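
The <think>...</think> wrapping mentioned in the reasoning-delta item can be sketched as a small generator. The event shape (a "reasoning" or "content" kind paired with text) and the function name `merge_stream` are assumptions for illustration, not the provider contract itself.

```python
from typing import Iterable, Iterator, Tuple


def merge_stream(events: Iterable[Tuple[str, str]]) -> Iterator[str]:
    """Wrap consecutive reasoning chunks in <think>...</think> (illustrative only)."""
    in_reasoning = False
    for kind, text in events:
        if kind == "reasoning":
            if not in_reasoning:
                in_reasoning = True
                yield "<think>"
            yield text
        else:
            if in_reasoning:
                # Close the reasoning block before normal content resumes.
                in_reasoning = False
                yield "</think>"
            yield text
    if in_reasoning:
        # The stream ended while still reasoning; close the tag anyway.
        yield "</think>"
```

Closing the tag at end of stream keeps the output well-formed even if a provider stops after emitting only reasoning text.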

Space Hub, Skills Tags, and Personal Library UX

Personal learning artifacts have been gathered under a new Space area in the sidebar.

  • Space navigation — /space redirects to /space/notebooks and the new mini-nav groups Notebooks, Question Bank, Skills, and Memory with a shared section header style.
  • Notebooks section — notebooks can be created, deleted, searched, opened, and inspected with rendered record previews. Saved TutorBot, chat, research, and Co-Writer outputs receive distinct badges and can link back to the original chat session when metadata is available.
  • Question Bank section — quiz entries can be filtered by all/bookmarked/wrong-only, grouped with categories, renamed, removed from categories, bookmarked, deleted, and opened back in their source context.
  • Memory section — the Memory page moves into Space with edit/preview modes for summary and profile, manual save, refresh-from-session, clear actions, unsaved-change status, and localized feedback.
  • Skills section — user-authored skills now support tags in frontmatter plus a .tags.json vocabulary. The API adds tag list/create/rename/delete endpoints, and the UI can filter by tag, manage tags, rename skills, and edit tag assignments while preserving the existing SKILL.md workflow.

Dependency Layers, TutorBot Debugging, and Windows Startup

The install story has been reorganized around pyproject extras, with requirements files kept as mirrors for Docker/CI environments. This also makes TutorBot setup and channel debugging less ambiguous: the agent engine, channel SDKs, Matrix native dependencies, and core server provider imports now live in clearly separated layers.

  • Extras hierarchy — .[cli] now includes RAG, document parsing, and built-in provider SDKs; .[server] builds on CLI with FastAPI/uvicorn; .[tutorbot], .[matrix], .[math-animator], .[dev], and .[all] layer optional capabilities explicitly.
  • TutorBot dependency boundary — requirements/tutorbot.txt now mirrors .[tutorbot], depends on server.txt, and keeps channel/agent dependencies such as cron, MCP, Telegram, Feishu/Lark, DingTalk, Slack, QQ, Socket.IO, socks, and message-pack tooling in the TutorBot layer instead of mixing them into the base install.
  • Matrix channel split-out — Matrix / Element support has its own .[matrix] extra and requirements/matrix.txt, with matrix-nio[e2e], Markdown sanitization helpers, and explicit libolm setup notes for native encryption dependencies.
  • Runtime dependency fix (#391) — loguru and json-repair moved into the server dependency layer because provider-core imports need them before TutorBot is involved. This fixes clean server installs that previously crashed on missing modules.
  • Windows launcher robustness (#391, #398) — scripts/start_web.py now reads backend/frontend subprocess output as UTF-8 with replacement, avoiding UnicodeDecodeError on Windows locales such as GBK.
  • Docs and CLI hints — README, Chinese README, CLI README, and CLI error messages now point users to pip install -e ".[cli]" / pip install -e ".[server]" instead of older requirements-first commands. A new requirements/matrix.txt mirrors the Matrix extra and documents the native libolm prerequisite.

Bug Fixes

  • Knowledge upload/create diagnostics (#392, #405) — KB initialization, upload, and re-index tasks now propagate failure details and stack traces through task logs; the UI can show richer errors instead of appearing to do nothing when background ingestion fails.
  • KB name validation — HTTP and CLI creation paths now reject path-like or URL-reserved characters while preserving Unicode-friendly names, preventing invalid KB folders and unsafe routes.
  • Case-insensitive document discovery — KB directory scanning and CLI document collection now use the shared file router, so uppercase extensions such as .PDF and .MD are accepted consistently.
  • Safer document filenames — uploaded filenames are normalized, path fragments are stripped, and extensions are lowercased before validation and storage.
  • Raw file serving safety — KB raw-file endpoints resolve paths strictly under the raw/ directory and reject traversal attempts.
  • Model catalog environment overlay — .env values are only synced into the catalog while it still looks pristine, avoiding accidental overwrites once users have multiple custom profiles.
  • Research reporting fallback (#404) — the reporting agent's JSON-fallback warning now uses an f-string so loggers that do not apply % formatting still include the section title cleanly.
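
For the raw-file serving fix listed above, a traversal check might look like the following sketch. The function name `resolve_raw_file` is hypothetical and the snippet is not the project's actual endpoint code; it assumes Python 3.9+ for Path.is_relative_to.

```python
from pathlib import Path


def resolve_raw_file(raw_dir: Path, requested: str) -> Path:
    """Resolve a requested filename strictly under raw_dir (hypothetical helper)."""
    base = raw_dir.resolve()
    # resolve() collapses ".." segments and symlinks before the containment check.
    candidate = (base / requested).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes raw directory: {requested!r}")
    return candidate
```

Resolving both paths first matters: comparing unresolved strings would let "raw/../secrets" pass a naive prefix test.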

Test Suite Expansion

  • Knowledge/RAG — new coverage for KB naming, index-version allocation and read priority, legacy-to-flat storage compatibility, LlamaIndex storage layout, raw directory initialization, case-insensitive file routing, KB deletion, and API upload edge cases.
  • Embedding/config — new tests cover dimension auto-fill from test probes, catalog .env overlay behavior, DashScope and OpenAI SDK adapters, Qwen3 send_dimensions, URL transparency, non-JSON provider responses, and multimodal embedding requests.
  • LLM/multimodal — new tests validate reasoning/vision capability behavior and local attachment URL conversion for providers that require inline base64 images.
  • CLI and validation — CLI KB collection and document validator tests cover uppercase extensions, Chinese filenames, and Windows-style path stripping.

Community Contributions

  • @jonathanzhan1975 — Fix Windows server startup and missing server runtime dependencies affecting clean Web/TutorBot installs (#391)
  • @kagura-agent — Clean up reporting-agent fallback logging for JSON parse failures (#404)

Recent open discussions after v1.2.5 also shaped this release window, especially KB upload/create failures (#392, #405), TutorBot Agent restart/state reports (#385), Windows startup reports (#398), dependency-installation pain (#402), JSON robustness feedback (#400), and the next wave of Space/Memory/project organization requests (#397, #401, #403).

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.2.5...v1.3.0

v1.2.5 New feature
Notable features
  • Persistent attachment store with safe /api/attachments/ serving
  • File preview drawer for PDFs, images, code, Office docs
  • Broadened text/code support (JSONC, Vue, Kotlin, Solidity)
Full changelog

DeepTutor v1.2.5 Release Notes

Release Date: 2026.04.25

Highlights

Attachment Store and File Preview Drawer

Chat attachments are now persisted as first-class session artifacts instead of living only as inline base64 blobs in message rows. The turn runtime writes every uploaded file to a new attachment store before document extraction, records a stable URL on the message, and drops the bulky base64 payload from persisted chat history.

  • Backend attachment store — deeptutor/services/storage/attachment_store.py introduces a pluggable AttachmentStore protocol and a default LocalDiskAttachmentStore rooted at data/user/workspace/chat/attachments. The path can be overridden with CHAT_ATTACHMENT_DIR, documented in .env.example.
  • Safe attachment serving — a new /api/attachments/{session_id}/{attachment_id}/{filename} router serves stored files with traversal-safe path resolution, atomic writes on upload, inline Content-Disposition, UTF-8 filename handling, and private no-cache headers.
  • Stable attachment metadata — Attachment now carries an id plus extracted_text; TurnRuntimeManager generates missing IDs, persists original bytes, stores the preview URL, and keeps extracted text from Office/text documents so the UI can show exactly what the assistant read.
  • Right-side preview drawer — FilePreviewDrawer adds a Claude-style side panel for chat files. On desktop it squeezes the chat column; on smaller screens it overlays. The shell stays mounted for instant open/close, while heavy preview bodies are deferred until after the slide animation to avoid jank.
  • Preview renderers — PDFs render in the browser viewer, images and SVGs render with native <img>, Markdown reuses the main Markdown renderer, code/text files use RichCodeBlock with syntax highlighting, Office files show backend-extracted plain text, and unsupported/legacy files fall back to a download card.
  • Attachment actions — pending composer chips and sent message cards are clickable, with keyboard focus rings, download, copy-link, Escape-to-close, and graceful "legacy file not stored" messaging for old sessions.
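
The "atomic writes on upload" behavior can be approximated with a write-to-temp-then-rename pattern. This is a generic sketch with a hypothetical `atomic_write` helper; it is not the LocalDiskAttachmentStore implementation, whose internals the notes do not show.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(target: Path, data: bytes) -> None:
    """Write data so readers never observe a partially written file (sketch)."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=".upload-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.replace(tmp_name, target)  # atomic swap into place
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```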

Broader Code and Text Attachment Coverage

The v1.2.4 text/code attachment support has been widened substantially, keeping the backend RAG router, chat extractor, frontend upload allowlist, icons, and preview highlighter aligned.

  • More accepted formats — FileTypeRouter.TEXT_EXTENSIONS and TEXT_LIKE_EXTS now include JSONC/JSON5, MJS/CJS/MTS/CTS, Vue/Svelte, Kotlin scripts, Groovy/Gradle, C#/Zig/Nim, Objective-C, Perl/Lua/Julia/Dart, Haskell/Clojure/Elixir/Erlang/OCaml/F#, Lisp/Scheme/Racket, Solidity, fish/vim, GraphQL/protobuf, CMake/Makefile, Terraform/HCL, nginx config, and Dockerfile-style files.
  • Central syntax mapping — web/lib/code-languages.ts maps extensions and special filenames (Dockerfile, Makefile, CMakeLists.txt, dotfiles, etc.) to Prism language names so preview classification and code highlighting stay in sync.
  • Frontend helpers exported — extOf() is now exported from web/lib/doc-attachments.ts for the preview pipeline, and document icons/categories were expanded to match the new extension set.

Attachment-Aware Deep Capabilities

Uploaded attachments now flow into more agent pipelines, not just the default chat stage.

  • Base agent parity — non-streaming LLM calls now use the same prepare_multimodal_messages() path as streaming calls when attachments are present, including image stripping/logging for non-vision models.
  • Deep Solve — instead of extracting only the first image URL, DeepSolveCapability forwards image attachments through MainSolver into planner, solver, and replan calls, so multimodal problems remain visible throughout the Plan-ReAct-Write loop.
  • Deep Question — topic generation and follow-up answering pass attachments to the underlying agents. Mimic mode can now use [Attached Documents] text directly when uploaded PDFs have already been extracted and stripped from base64 storage.
  • Deep Research — research planning accepts attachments and forwards them into the first planning LLM call, whether that is rephrase or decompose, without duplicating the same image/file context in later planning turns.
  • Visualize — visualization analysis now receives chat attachments, enabling diagrams or screenshots to influence render-type selection and data extraction.

TutorBot Export and Notebook Capture

TutorBot chat sessions now have the same capture paths as regular chat:

  • The Agent chat page adds Save to Notebook and Download Markdown actions in the header.
  • SaveToNotebookModal, notebook-api, backend notebook request types, and RecordType now recognize a new tutorbot record type.
  • The Knowledge page displays TutorBot notebook entries with their own violet badge and bot icon.
  • Restored TutorBot chat history now re-snaps to the bottom across multiple frames so Markdown/KaTeX growth after first paint does not leave the user above the latest message.

Setup Tour Diagnostics and Install Robustness

The guided setup tour now explains dependency failures instead of failing silently:

  • Bootstrap dependency installation captures stdout/stderr and prints the real pip error plus a manual retry command.
  • uv resolution checks common install locations (~/.local/bin, ~/.cargo/bin, Homebrew paths) before attempting installation, and reports clear next steps if uv installs successfully but is still not on PATH.
  • uv install failures now show localized English/Chinese hints for Python wheel availability, stale shell PATH, PyPI mirror issues, and direct installer options.
  • Node.js/npm version checks resolve executables through shutil.which() before running --version, improving Windows .cmd/.bat compatibility after Python subprocess hardening.

Chat and Knowledge UX Fixes

  • Auto-scroll reliability — useChatAutoScroll now waits for message content before attaching its mutation observer, fixing missed bottom-scroll behavior when reopening sessions whose message container was initially empty.
  • Preview animation polish — globals.css adds a chat-preview-shell transition so the drawer slide and chat-column squeeze move together at the same 220 ms timing.
  • Knowledge upload picker — KB file inputs no longer rely on the browser/OS accept filter, which could hide valid files on some systems; in-app validation still enforces the supported-file policy after selection.
  • Localized preview copy — English and Chinese strings were added for preview loading, copy link, unavailable previews, legacy files, remove attachment, and Office extracted-text explanation.

Test Suite Expansion

  • Capability attachment forwarding — tests/core/test_capabilities_runtime.py adds coverage for Deep Solve, Deep Question, Deep Research, and Visualize attachment propagation, including mimic mode with extracted document text.
  • Research planning edge case — tests/agents/research/test_research_pipeline_rag.py verifies that attachments are forwarded to decompose when rephrase is enabled but performs zero iterations, ensuring the first actual planning LLM call still sees the uploaded context.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.2.4...v1.2.5

v1.2.4 Breaking risk
Notable features
  • Text/code attachment support (Markdown, JSON, YAML, CSV, LaTeX, HTML, code files)
  • One-command setup tour with dependency installation
  • Chat Markdown export
Full changelog

DeepTutor v1.2.4 Release Notes

Release Date: 2026.04.25

Highlights

Support More Attachment Formats in Chat

The v1.2.3 document-attachment pipeline has been expanded beyond Office files. Chat attachments now accept the same text-like formats as the Knowledge Base ingestion router: Markdown, plain text, logs, JSON/YAML/TOML, CSV/TSV, LaTeX/BibTeX, HTML/XML/SVG, stylesheets, scripts, and common source-code files (.py, .js, .ts, .tsx, .java, .cpp, .go, .rs, .sql, .sh, and more).

  • Backend parity with KB routing — deeptutor/utils/document_extractor.py imports FileTypeRouter.TEXT_EXTENSIONS, so chat attachments and KB uploads share one source of truth for text/code formats. FileTypeRouter.decode_bytes() centralizes the UTF-8 / UTF-8-BOM / GBK / GB18030 / Latin-1 / CP1252 fallback chain.
  • SVG as readable source — .svg was added to the RAG text extension set and is treated as a document attachment instead of a vision image, letting the LLM inspect the XML source while the frontend still renders a safe thumbnail preview.
  • Typed attachment UX — web/lib/doc-attachments.ts now exposes OFFICE_EXTS, TEXT_LIKE_EXTS, SVG detection, richer MIME/extension fallback, and category-specific icons for code, shell, JSON, config, data, markup, stylesheets, plain text, Office docs, and SVG.
  • Composer copy updated — the drag-and-drop hint now advertises "Images, Office docs, code & text" instead of the older Office-only list.
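
An encoding fallback chain of the kind described above could be sketched as follows. The function name `decode_with_fallback` is hypothetical, and the ordering is a simplification rather than the real FileTypeRouter.decode_bytes(): "utf-8-sig" is tried first because it handles both plain UTF-8 and BOM-prefixed input, and CP1252 is left out of this sketch because Latin-1 accepts every byte sequence, so nothing placed after it would be reached.

```python
def decode_with_fallback(data: bytes) -> str:
    """Decode bytes by trying a fixed list of encodings in order (illustrative)."""
    for encoding in ("utf-8-sig", "gbk", "gb18030", "latin-1"):
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Not reached in practice: latin-1 maps every byte to a code point.
    return data.decode("utf-8", errors="replace")
```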

One-Command Setup Tour

scripts/start_tour.py has evolved from a pure configuration wizard into a 7-step fresh-install path that can detect the local environment, install dependencies, and then guide provider configuration.

  • Dependency installation step — the tour checks Python, uv, Node.js, and npm; installs backend requirements, installs DeepTutor in editable mode, and runs frontend npm install with live terminal output.
  • uv pip compatibility (#376) — Python dependency installation now prefers uv pip when available, and binds --python <current-interpreter> so packages land in the interpreter running the tour instead of an unrelated .venv / $VIRTUAL_ENV.
  • Windows npm detection (#381) — npm invocations use npm.cmd on Windows, fixing version checks and install commands on systems where npm is exposed as a command shim rather than an executable.
  • Provider registry-driven LLM choices — the wizard now reads from the full PROVIDERS registry, groups providers by mode, highlights common services, and includes newer options such as custom_anthropic, OpenRouter, SiliconFlow, Volcengine / BytePlus coding providers, GitHub Copilot, OpenAI Codex, llama.cpp, OVMS, MiniMax, Mistral, Qianfan, Step Fun, and Xiaomi MiMo.
  • Search and config hints — Serper is available in the search-provider list; Azure/API-version and proxy prompts now show inline guidance in both English and Chinese.
  • Long-list terminal UX — scripts/_cli_kit.py renders long select menus inside a scrolling window with "more above/below" indicators, preventing stale terminal rows when the provider list exceeds the screen height.
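
The uv pip and npm.cmd fixes in this list can be illustrated with two small command builders. Both function names are hypothetical and the snippet is a sketch of the idea, not the code in scripts/start_tour.py; the caller is assumed to pass the result of shutil.which("uv").

```python
import sys
from typing import List, Optional


def pip_install_command(packages: List[str], uv: Optional[str]) -> List[str]:
    """Build an install command bound to the current interpreter (sketch)."""
    if uv:
        # --python pins uv to the interpreter running this script, so packages
        # do not land in an unrelated .venv or $VIRTUAL_ENV.
        return [uv, "pip", "install", "--python", sys.executable, *packages]
    return [sys.executable, "-m", "pip", "install", *packages]


def npm_executable(platform: str = sys.platform) -> str:
    """Return the npm launcher name; Windows exposes npm as a .cmd shim."""
    return "npm.cmd" if platform.startswith("win") else "npm"
```

A call such as `pip_install_command(["-e", "."], shutil.which("uv"))` would then fall back to `python -m pip` automatically when uv is absent.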

Chat Markdown Export

The chat page now has a Download Markdown action next to "Save to Notebook" and "New Chat". web/lib/chat-export.ts serializes the current conversation as Markdown with a title, ISO export timestamp, role headings, capability labels, and attachment metadata, then downloads it with a sanitized date-stamped filename. This gives users a lightweight local export path for sharing, archiving, or moving a conversation into external notes.

Knowledge Base Management UI Polish

The Knowledge page was tightened up for denser day-to-day management:

  • Creation and upload drop zones are now simpler dashed cards with concise inline upload-policy summaries.
  • KB cards moved from four large stat panels to compact rows showing document count, index readiness, last-updated time, provider, embedding model, live progress, and storage path.
  • Default/delete actions were simplified visually, reducing card height and making large KB lists easier to scan.

Documentation and Localization Refresh

The README family was updated to match the new install story and provider surface:

  • Polish README — assets/README/README_PL.md adds a complete Polish translation of the project README (#379).
  • Setup docs — the main README and multilingual READMEs now describe the guided tour as the recommended fresh-clone path: create a Python environment, run python scripts/start_tour.py, then use python scripts/start_web.py for daily launch.
  • Provider docs — provider tables were refreshed for the v1.2.3 provider registry, including custom_anthropic, MiniMax's canonical endpoints, coding-plan providers, local providers, and clarified authentication notes for OpenAI Codex / GitHub Copilot.
  • Release/news sections — multilingual READMEs now include v1.2.2 and v1.2.3 summaries, a contributing-guide callout, and the 20k-star community milestone.

UI and Theme Fixes

  • Theme-aware popovers — composer capability/tool/reference/skill menus now use --popover plus backdrop blur instead of --card, improving contrast in the Glass theme and dark popover contexts.
  • Native color-scheme hints — light, dark, Snow, and Glass theme roots declare the correct color-scheme, improving native controls and browser-rendered surfaces.
  • Smooth-scroll hydration — global smooth scrolling is now gated behind html[data-scroll-behavior="smooth"], with the attribute set by the root layout to avoid applying it unintentionally.
  • Sidebar logo sizing — sidebar logo images now pin explicit rendered width/height classes to prevent small layout shifts.

Cleanup

Removed the stale nanobot submodule pointer and the deprecated scripts/extract_numbered_items.sh stub from the repository. The v1.2.x codebase now relies on the in-tree document extraction and RAG routing paths instead of that legacy helper.

Test Suite Expansion

  • Backend document extraction — tests/utils/test_document_extractor.py now covers plain text, Python source, JSON, CSV, Markdown, UTF-8-BOM input, GBK fallback decoding, and SVG source extraction.
  • Frontend attachment classification — web/tests/doc-attachments.test.ts now covers text/code acceptance, SVG-as-document routing, case-insensitive SVG filename detection, and the new icon categories for code, JSON, config, shell, data, markup, stylesheets, and SVG.

What's Changed

  • fix: use npm.cmd for version detection on Windows by @jonathanzhan1975 in https://github.com/HKUDS/DeepTutor/pull/381
  • fix(scripts): prefer uv pip over python -m pip in start_tour.py by @rogercsi in https://github.com/HKUDS/DeepTutor/pull/376
  • docs: add Polish translation of README by @kKamUL in https://github.com/HKUDS/DeepTutor/pull/379

New Contributors

  • @jonathanzhan1975 made their first contribution in https://github.com/HKUDS/DeepTutor/pull/381
  • @rogercsi made their first contribution in https://github.com/HKUDS/DeepTutor/pull/376
  • @kKamUL made their first contribution in https://github.com/HKUDS/DeepTutor/pull/379

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.2.3...v1.2.4

v1.2.3 Breaking risk
Notable features
  • Document attachments (PDF, DOCX, XLSX, PPTX) in chat with preview cards
  • Model thinking block display with collapsible ModelThinkingCard
  • LLM provider core refactor with dedicated modules per provider family
Full changelog

DeepTutor v1.2.3 Release Notes

Release Date: 2026.04.24

Highlights

Document Attachments in Chat

Chat now accepts non-image file attachments (PDF, DOCX, XLSX, PPTX) alongside images. A new paperclip button in the composer opens a system file picker, and drag-and-drop / paste have been extended from images-only to all supported document types. Attached documents are rendered as typed preview cards (colour-coded icon + filename + size label) in both the pending-attachment bar and the message history. On the backend, a new document_extractor module extracts plain text from the uploaded bytes using PyMuPDF / pypdf / python-docx / openpyxl / python-pptx (all optional, graceful fallback) and injects the content into the [Attached Documents] section of the effective user message, so the LLM can read the file without a separate RAG call. File-type classification, per-file and total size limits, and duplicate-name detection are handled by web/lib/doc-attachments.ts on the frontend and DocumentValidator on the backend.

Model Thinking Block Display

Responses from reasoning models (DeepSeek-R1, Claude with extended thinking, QwQ, etc.) often contain <think> / <|thinking|> scratchpad blocks that were previously rendered as raw text. A new parseModelThinkingSegments parser (web/lib/think-segments.ts) splits assistant content into alternating text and thinking segments, and AssistantResponse now renders each thinking segment as a collapsible ModelThinkingCard — collapsed by default — so users can peek at the model's chain-of-thought without the scratchpad overwhelming the conversation. Incomplete (still-streaming) thinking blocks show a pulsing indicator and expand automatically.

Tri-State send_dimensions for Embeddings (#368)

Some OpenAI-compatible embedding providers reject the dimensions parameter that OpenAI's text-embedding-3-* models require. The fix has two parts. First, the adapter now only sends dimensions when the model name matches text-embedding-3* (PR #368 by @jefflv). Second, a new Send Dimensions tri-state toggle (Auto / On / Off) was added to the Settings page, the .env store (EMBEDDING_SEND_DIMENSIONS), and the model catalog, giving operators explicit control. Auto (the default) preserves the #368 heuristic; On forces the parameter for providers that accept it on custom models; Off suppresses it unconditionally. The toggle is reflected end-to-end: catalog → provider_runtime.ResolvedEmbeddingConfig.send_dimensions → OpenAICompatibleEmbeddingAdapter.
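
The Auto / On / Off decision can be written down as a small function. This is a sketch under assumptions: the tri-state is modeled as Optional[bool] with None meaning Auto, and `should_send_dimensions` is a hypothetical name rather than the adapter's real method.

```python
from typing import Optional


def should_send_dimensions(model: str, send_dimensions: Optional[bool]) -> bool:
    """Decide whether to include the dimensions parameter (illustrative)."""
    if send_dimensions is None:
        # Auto: only send for models matching text-embedding-3*.
        return model.startswith("text-embedding-3")
    # On (True) forces the parameter; Off (False) suppresses it.
    return send_dimensions
```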

LLM Provider Core Refactor

The monolithic factory.py has been rewritten around a new provider_core/ package (~3 000 lines) that gives each provider family its own module: openai_compat_provider, anthropic_provider, azure_openai_provider, github_copilot_provider, and openai_codex_provider, all inheriting from a shared BaseProvider. A provider_factory.py resolves the correct runtime provider from config, and factory.py itself shrank from ~600 to ~490 lines by deriving presets dynamically from the PROVIDERS registry instead of maintaining hard-coded dictionaries. A new context_window.py module exposes resolve_effective_context_window(), used by ContextBuilder to base the history-budget calculation on the model's true context window rather than the max_tokens output cap — improving long-conversation history recall for models with large windows.

Soul Template Editor (PR #373)

TutorBot's "Soul" (system personality) can now be authored directly inside the Agents page. The new inline editor lets users create, preview, and save SOUL.md templates with YAML frontmatter, replacing the previous workflow of hand-editing files on disk. Contributed by @srinivasrk.

Co-Writer Save to Notebook

The Co-Writer toolbar gains a Save to Notebook button (📓 icon). Clicking it opens the existing SaveToNotebookModal with a new co_writer record type, so Co-Writer drafts appear alongside chat and quiz entries in the Knowledge page's Notebooks tab. Backend: co_writer was added to the notebook record type enum, the summarize agent prompts, and the API Literal type.

Knowledge Base Management Improvements

  • Drag-and-drop upload — the Knowledge page now accepts file drops directly onto KB cards (or the creation form) with real-time extension/size validation, duplicate-name detection, and a typed file-selection list before upload starts.
  • /supported-file-types endpoint — a new REST endpoint returns the server's current upload policy (accepted extensions, per-file and per-PDF size caps) so the web client stays in sync without hard-coding.
  • Richer KB cards — each card now displays creation / last-updated timestamps, embedding model name, dimension, reindex-needed badge, and live progress bars with percent readout during indexing.
  • Delete resilience (#370) — shutil.rmtree now uses an onerror handler that clears the read-only bit and retries, preventing Docker bind-mount and Windows permission errors from leaving a KB stuck in the list.
  • Progress persistence — ProgressTracker now writes a snapshot file alongside the kb_config.json entry and emits task-stream events, so WebSocket subscribers and page reloads can recover live indexing state without relying on in-memory callbacks. When a KB reaches ready, the progress blob is removed from config to keep the card looking like a stable resource.
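
The delete-resilience item follows a well-known shutil.rmtree pattern, sketched below. The helper names are hypothetical. Python 3.12 deprecates onerror in favor of onexc, so the sketch picks the keyword by interpreter version; the same three-argument handler works with either.

```python
import os
import shutil
import stat
import sys


def _clear_readonly_and_retry(func, path, _exc):
    """Clear the read-only bit on the failing path, then retry the operation."""
    os.chmod(path, stat.S_IWRITE)
    func(path)


def remove_tree(path: str) -> None:
    """Delete a directory tree, retrying entries that are marked read-only."""
    if sys.version_info >= (3, 12):
        shutil.rmtree(path, onexc=_clear_readonly_and_retry)
    else:
        shutil.rmtree(path, onerror=_clear_readonly_and_retry)
```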

Question Generation Language Fidelity

Extracted the language-directive system from the Book engine's _language.py into a shared services/prompt/language.py module. The Deep Question agents — Generator, IdeaAgent, and FollowupAgent — now call append_language_directive() on their system prompts, ensuring that quiz questions, answer choices, and follow-up answers respect the user's configured language instead of occasionally drifting to English.

Settings Page Tour Redesign

The Run Tour button moved from the page bottom to the top toolbar, sitting alongside Save Draft and Apply. The guided tour itself was expanded to walk through the full save-and-test cycle (Save → Diagnostics → Apply). The former bottom area was replaced with a concise configuration note explaining the runtime priority of model_catalog.json over .env.

CLI: Notebook add-md and replace-md

Two new deeptutor notebook sub-commands let users add or replace Markdown content in a notebook record directly from the terminal, useful for scripted workflows and CI pipelines.

Bug Fixes

  • TutorBot pure-CJK bot name crash — creating a bot whose name contains only non-ASCII characters produced an empty slug, breaking the API route. A stable ASCII fallback (bot-<hash>) is now generated, and the frontend surfaces a creation error toast instead of silently failing.
  • React 19 I18nProvider render-time setState warning — i18n.init() was being called synchronously during the first render of I18nProvider, triggering a React 19 warning about state updates in the render phase. Initialization is now performed at module-load time; the provider body only syncs document.documentElement.lang via useEffect.
  • Skills default to off — skillsAutoMode now defaults to false so new users aren't surprised by automatic skill injection until they intentionally enable it.
  • Moonshot default Base URL — changed from https://api.moonshot.ai/v1 to https://api.moonshot.cn/v1 to match the provider's current canonical endpoint, with all multilingual README files updated.
  • Version badge display — fixed a minor rendering issue in the sidebar version badge introduced during the v1.2.2 merge.
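
The slug fallback for pure-CJK bot names could look roughly like the sketch below. The function name bot_slug, the choice of SHA-1, and the 8-character digest length are all assumptions for illustration; the release notes only state that a stable bot-<hash> fallback is generated.

```python
import hashlib
import re


def bot_slug(name: str) -> str:
    """Return an ASCII slug, falling back to a stable hash-based one."""
    # Collapse every run of non [a-z0-9] characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if slug:
        return slug
    # Nothing ASCII survived (e.g. a pure-CJK name): derive a stable fallback.
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:8]
    return f"bot-{digest}"
```

Because the fallback is derived from the name itself, the same name always maps to the same slug, so routes built from it stay stable across restarts.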

Provider Registry Enhancements

  • custom_anthropic — a new provider spec for Anthropic-API-compatible endpoints (e.g. AWS Bedrock proxies) that routes through the Anthropic backend instead of OpenAI-compat.
  • thinking_style — new ProviderSpec field allowing providers like Volcengine to advertise how they signal extended-thinking mode.
  • Alias expansion — lm-studio, anthropic-compatible, openai-compatible (hyphenated) are now recognized; canonical_provider_name() uses to_snake() for more robust normalization.

Test Suite Expansion

Net +1,900 test lines across 15 new or expanded files: test_document_extractor.py (245), test_language_prompts.py (156), think-segments.test.ts (122), test_provider_runtime.py (103), test_llm_probe_config.py (116), test_manager_delete.py (79), doc-attachments.test.ts (75), test_context_window_detection.py (60), test_progress_tracker.py (58), quiz-question-type.test.ts (58), plus expansions to test_factory_provider_exec.py, test_context_builder.py, test_chat_params_config.py, test_config_module.py, test_knowledge_router.py, and test_start_tour.py.

What's Changed

  • fix(embedding): gate dimensions for text-embedding-3 models by @Jeff-Lv in https://github.com/HKUDS/DeepTutor/pull/368
  • feat: Revamp soul template creation and usage by @srinivasrk in https://github.com/HKUDS/DeepTutor/pull/373
  • feat(cli): add add-md and replace-md commands to notebook group by @zbinxp in https://github.com/HKUDS/DeepTutor/pull/371

New Contributors

  • @Jeff-Lv made their first contribution in https://github.com/HKUDS/DeepTutor/pull/368
  • @zbinxp made their first contribution in https://github.com/HKUDS/DeepTutor/pull/371

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.2.2...v1.2.3

v1.2.2 Breaking risk
Notable features
  • User-authored skills system with SKILL.md
  • Chat input performance overhaul for long conversations
  • Auto-fallback for response_format rejection
Full changelog

DeepTutor v1.2.2 Release Notes

Release Date: 2026.04.22

Highlights

User-Authored Skills System

Introduced a full Skills subsystem that lets users create, edit, and activate custom SKILL.md files from the web UI. Each skill lives under data/user/workspace/skills/<name>/SKILL.md with YAML frontmatter (name, description, optional triggers) and a Markdown body that is injected verbatim into the chat system prompt when active.

  • Backend — SkillService (deeptutor/services/skill/service.py) provides CRUD + listing + selection with strict name validation (^[a-z0-9][a-z0-9-]{0,63}$), frontmatter parsing, and directory-level isolation. REST API mounted at /api/v1/skills with GET /list, GET /{name}, POST /create, PUT /{name}, DELETE /{name}.
  • Frontend — a new Skills tab in the Knowledge management page (web/app/(utility)/knowledge/page.tsx) with inline SKILL.md editor, and a skill picker menu in the chat composer. Skills can be toggled individually or set to auto mode (sends ["auto"] to the backend). web/lib/skills-api.ts provides a cached client-side API layer. UnifiedChatContext carries skills through the WebSocket payload.
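
As a rough illustration of the name validation and directory layout described above, the sketch below checks a skill name against the documented pattern and resolves its SKILL.md path. The helper name skill_path is hypothetical, and using fullmatch is a choice of this sketch (in Python, a bare `$` also matches just before a trailing newline).

```python
import re
from pathlib import Path

# Pattern and root directory are taken from the release notes.
SKILL_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9-]{0,63}$")
SKILLS_ROOT = Path("data/user/workspace/skills")


def skill_path(name: str, root: Path = SKILLS_ROOT) -> Path:
    """Validate a skill name and return the path of its SKILL.md file."""
    # fullmatch rejects names with a trailing newline, which `$` alone allows.
    if not SKILL_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid skill name: {name!r}")
    return root / name / "SKILL.md"
```

Because the pattern admits only lowercase letters, digits, and hyphens, a name can never contain a path separator or "..", which is what keeps each skill confined to its own directory.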

Chat Input Performance Overhaul (#351, #360, #362)

Eliminated input lag in long conversations through deep state colocation:

  • ComposerInput — extracted from ChatComposer into its own memo'd component so frequent keystroke re-renders are isolated from the rest of the composer (capability panels, reference chips, tool menus). Exposes an imperative ComposerInputHandle (clear(), getValue()) to avoid lifting input state.
  • SimpleComposerInput — a lightweight variant for the TutorBot chat page that strips @-mention and capability overhead entirely, fixing residual lag reported after the initial refactor.
  • React.memo on config panels — QuizConfigPanel, MathAnimatorConfigPanel, ResearchConfigPanel, and VisualizeConfigPanel are now wrapped in React.memo; parent callbacks in ChatPage stabilized with useCallback.
  • @-mention helpers exported — shouldOpenAtPopup and stripTrailingAtMention moved to ComposerInput.tsx as named exports for cross-component reuse.

Auto-Fallback for response_format Rejection

Extended the v1.2.1 static supports_response_format guard with a runtime auto-recovery path. When a provider unexpectedly returns HTTP 400 for response_format={"type":"json_object"} (common with LM Studio + Gemma/Qwen-style models), both LLM execution paths now detect the error, drop the field, and retry once:

  • aiohttp path (cloud_provider.py) — _looks_like_unsupported_response_format heuristic detects the rejection in _openai_complete and _openai_stream; on match, the request is retried without response_format and the (binding, model) pair is cached.
  • OpenAI SDK path (executors.py) — _create_with_format_fallback wraps client.chat.completions.create, catching BadRequestError and applying the same retry + cache logic.
  • Runtime cache (capabilities.py) — disable_response_format_at_runtime / is_response_format_disabled_at_runtime record discovered incompatibilities in a module-level set so subsequent calls skip response_format upfront without paying the retry cost.
  • _answer_now.py — stream_synthesis now checks supports_response_format before attaching response_format, preventing the 400 in the first place for fast-path answer flows.

LAN / Remote Access Fix (#340)

When another machine on the local network opens the web UI, the build-time NEXT_PUBLIC_API_BASE (typically http://localhost:8001) would resolve to the remote browser's own loopback instead of the server. A new resolveBase() helper in web/lib/api.ts detects this mismatch and swaps the hostname for window.location.hostname at runtime, so apiUrl() and wsUrl() reach the correct backend regardless of which machine opened the page.

Sidebar Version Badge & GitHub Link

The sidebar now displays the current build version alongside a status indicator:

  • /api/version route — a Next.js ISR endpoint that queries api.github.com/repos/HKUDS/DeepTutor/releases/latest (hourly revalidation, optional GITHUB_TOKEN for rate-limit headroom) and returns the latest tag, name, URL, and publish date.
  • VersionBadge — compares NEXT_PUBLIC_APP_VERSION (injected at build time via next.config.js / git describe --tags) against the fetched latest release. Shows a green dot when up-to-date, amber when outdated, and neutral when unknown. Clicking navigates to the release page.
  • Dockerfile — new APP_VERSION build arg piped into the Next.js env so Docker-based deployments also get accurate version display.
  • GitHub icon — a direct link to the repository added to both collapsed and expanded sidebar states.

Deep Solve Image Attachment Support

DeepSolveCapability now extracts the first image attachment (data-URI or URL) from context.attachments via a new _first_image_url helper and passes it to the planner and solver agents. Internally, PlannerAgent and SolverAgent were refactored to use the Attachment dataclass through BaseAgent's unified multimodal pipeline instead of manually constructing multimodal message arrays — removing two identical _build_multimodal_messages helper functions. BaseAgent.stream_llm also gained logic to construct messages from system_prompt + user_prompt when attachments are provided without explicit messages, and logs when images are stripped for non-vision models.

Agentic Chat Pipeline Attachment Passthrough

The _stage_responding and _stage_acting methods in AgenticChatPipeline now call _prepare_messages_with_attachments after building messages, ensuring that user-uploaded images are forwarded to the LLM in every chat stage — not just the initial thinking stage.

TutorBot WebSocket Resilience (#354)

Hardened the /api/v1/tutorbot/{bot_id}/ws endpoint:

  • Auto-start — if the bot is configured but not running when a WebSocket connects, the endpoint now calls mgr.start_bot() automatically instead of immediately closing with 4004. Unknown bot IDs still receive a JSON error payload before close.
  • _safe_send wrapper — all ws.send_json calls go through a helper that catches WebSocketDisconnect / RuntimeError, preventing unhandled exceptions when the client drops mid-stream.
  • Graceful disconnect — _handle_user_messages catches WebSocketDisconnect on receive_text and breaks cleanly.

Settings Page: API Key Masking (#355)

The API Key field on the Settings page is now rendered as type="password" by default with an Eye / EyeOff toggle button. The visibility state resets when switching between services or profiles.

Book Library UI Overhaul

Replaced the minimal "Select a book" placeholder with a full BookLibrary component (web/app/(workspace)/book/components/BookLibrary.tsx) featuring search, status-filtered cards with chapter/page counts, creation and deletion actions, and status badges (Draft / Outline / Compiling / Ready). BookSidebar was simplified to a single-book reader view with a "← All books" back button, and the book page now conditionally shows either the library or the sidebar based on the current view.

Visualization Fullscreen Mode

SVG, Mermaid, and ChartJS visualizations now have a Maximize button (top-right) that opens the graphic in a fullscreen overlay with Escape-to-close. HTML iframe visualizations are excluded since they already provide their own "Open in new tab" affordance.

Bug Fixes

  • Embedding adapter ConnectError not retried (#353) — added httpx.ConnectError to the retry exception tuple in OpenAICompatibleEmbeddingAdapter, so transient connection failures during embedding are retried with exponential backoff instead of raising immediately.
  • RAG None-embedding hardening — CustomEmbedding._aget_query_embedding and _aget_text_embedding now raise a clear ValueError when the provider returns None instead of crashing later in similarity computation. The batch method _aget_text_embeddings now determines the fallback zero-vector dimension from sibling vectors or the configured dim, and raises if neither is available (preventing silent persistence of zero-length vectors).
  • KB delete crash on missing directory — KnowledgeBaseManager.delete_knowledge_base now resolves the path directly via self.base_dir / name and handles the case where the on-disk folder was already removed, cleaning up the orphaned kb_config.json entry instead of raising FileNotFoundError.
  • CLI parse_json_object whitespace — the function now strips leading/trailing whitespace before parsing, so trailing newlines from shell piping no longer cause json.JSONDecodeError.
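
The batch fallback rule from the None-embedding item (take the dimension from a sibling vector, else from the configured dimension, else fail) can be sketched as follows. The function name fill_missing_embeddings and its signature are hypothetical; only the rule itself comes from the notes above.

```python
from typing import List, Optional, Sequence


def fill_missing_embeddings(
    vectors: Sequence[Optional[Sequence[float]]],
    configured_dim: Optional[int] = None,
) -> List[List[float]]:
    """Replace None slots with zero vectors of a known dimension."""
    if all(v is not None for v in vectors):
        return [list(v) for v in vectors]
    # Prefer the length of any non-empty sibling vector, then the configured dim.
    dim = next((len(v) for v in vectors if v), None) or configured_dim
    if not dim:
        # Refuse to invent a zero-length vector that would be persisted silently.
        raise ValueError("cannot determine embedding dimension for None slots")
    return [list(v) if v is not None else [0.0] * dim for v in vectors]
```

Raising when no dimension is available is the important part: a zero-length placeholder would be stored without complaint and only fail much later, during similarity search.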

Test Suite Expansion

  • TutorBot WebSocket — 2 new tests covering auto-start of stopped-but-configured bots and JSON error payload for unknown bot IDs (tests/api/test_tutorbot_router.py).
  • CLI helpers — 2 tests for parse_json_object whitespace handling and invalid JSON (tests/cli/test_common.py).
  • Route params — 3 tests for the new firstParam utility (web/tests/route-params.test.ts).
  • API resolve base — tests for the LAN hostname swap logic (web/tests/api-resolve-base.test.ts).

What's Changed

  • fix: add ConnectError to embedding retry exceptions by @S-A-D-4 in https://github.com/HKUDS/DeepTutor/pull/353
  • feat: Hide the API key on settings page by @srinivasrk in https://github.com/HKUDS/DeepTutor/pull/355
  • fix : Tutorbot websocket resilience and CLI config parsing edge case by @srinivasrk in https://github.com/HKUDS/DeepTutor/pull/354
  • Fix #296: [Bug]:LLM 0, EMBEDDING 0, SEARCH 0 by @JiwaniZakir in https://github.com/HKUDS/DeepTutor/pull/340
  • perf(web): decouple chat input state to resolve lag in long conversat… by @jiakeboge in https://github.com/HKUDS/DeepTutor/pull/360
  • Perf/optimize chat input by @jiakeboge in https://github.com/HKUDS/DeepTutor/pull/362

New Contributors

  • @S-A-D-4 made their first contribution in https://github.com/HKUDS/DeepTutor/pull/353
  • @JiwaniZakir made their first contribution in https://github.com/HKUDS/DeepTutor/pull/340

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.2.1...v1.2.2

v1.2.1 New feature
Notable features
  • Regenerate last response across CLI (/regenerate), WebSocket (type: regenerate), and Web UI
  • Per-stage token limits (responding, answer_now, thinking, observing, acting, react_fallback) and temperature configurable via agents.yaml
  • Fixed dark code blocks unreadable on light theme, None embeddings crash in LlamaIndex, and Gemma model json_object response format rejection
Full changelog

DeepTutor v1.2.1 Release Notes

Release Date: 2026.04.21

Highlights

Per-Stage Token Limits & Temperature for Chat (#348)

Promoted the agentic chat pipeline to a first-class config citizen in agents.yaml. A new capabilities.chat block exposes per-stage max_tokens (responding, answer_now, thinking, observing, acting, react_fallback) and a shared temperature, deep-merged over baked-in defaults via services/config/loader.py::get_chat_params(). The responding and answer_now budgets jump from the previous hard-coded 1800 to 8000, eliminating the mid-sentence truncation that was clipping long answers. Internally, _ChatLimits.from_config coerces every legacy shape (missing keys, scalar instead of dict, partial overrides) into a stable dataclass so existing installs keep working without touching their YAML. 10 new unit tests cover loader resolution, deep-merge precedence, and dataclass coercion.
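
A minimal sketch of the deep-merge behaviour described above, assuming a plain nested-dict config. Only the 8000-token defaults for responding and answer_now come from the release notes; the other default values, the temperature, and the deep_merge helper itself are placeholders for illustration and do not reproduce get_chat_params() or _ChatLimits.from_config.

```python
from copy import deepcopy
from typing import Any, Dict

DEFAULT_CHAT_PARAMS: Dict[str, Any] = {
    "temperature": 0.7,  # placeholder value, not from the release notes
    "max_tokens": {
        "responding": 8000,  # from the release notes
        "answer_now": 8000,  # from the release notes
        "thinking": 1000,  # placeholder value
        "observing": 1000,  # placeholder value
        "acting": 1000,  # placeholder value
        "react_fallback": 1000,  # placeholder value
    },
}


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `base` with `override` applied key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse so that a partial override keeps the sibling defaults.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# A user who overrides one stage keeps the defaults for all the others.
user_config = {"max_tokens": {"responding": 4000}}
chat_params = deep_merge(DEFAULT_CHAT_PARAMS, user_config)
```

The recursion is what distinguishes this from a shallow dict update, which would replace the whole max_tokens block and silently drop the five stages the user did not mention.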

Regenerate Last Response — CLI, WebSocket, Web UI (#349)

Added a real regenerate flow that re-runs the previous user message in place, working uniformly across every entry point:

  • CLI — /regenerate (alias /retry) inside the deeptutor chat REPL.
  • WebSocket — type: "regenerate" message on /api/v1/ws, with optional overrides for capability, tools, knowledge_bases, language, config.
  • Web UI — a per-message Regenerate button (RefreshCcw icon) on the last assistant turn for chat-capability replies.
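
For scripted clients, the WebSocket message could be assembled along these lines. The sketch uses only the field names listed above; the helper build_regenerate_message is hypothetical, and a real payload may require additional fields (for example a session identifier) that these notes do not describe.

```python
import json
from typing import Any

# Override keys named in the release notes.
ALLOWED_OVERRIDES = {"capability", "tools", "knowledge_bases", "language", "config"}


def build_regenerate_message(**overrides: Any) -> str:
    """Serialize a regenerate request, rejecting unknown override keys."""
    unknown = set(overrides) - ALLOWED_OVERRIDES
    if unknown:
        raise ValueError(f"unsupported override(s): {sorted(unknown)}")
    return json.dumps({"type": "regenerate", **overrides})


# Re-run the last user message against a different knowledge base.
message = build_regenerate_message(knowledge_bases=["physics-notes"], language="en")
```

The resulting string would then be sent as a text frame on /api/v1/ws by whichever WebSocket client the script uses.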

On the backend, TurnRuntimeManager.regenerate_last_turn rolls back the trailing assistant via the new SQLiteSessionStore.delete_message / get_last_message helpers, then reuses start_turn with _persist_user_message=False and _regenerate=True so the user row isn't duplicated and memory_service.refresh_from_turn isn't run a second time. Pre-flight checks raise non-fatal regenerate_busy (another turn is running) or nothing_to_regenerate (no prior user message) errors instead of silently failing. The _stage_responding LLM call also gained empty-response diagnostics that surface a structured warning when the model returns no content. 18 new tests cover all three reject paths, the delete-then-restart flow, the memory-refresh-skip contract, and the WebSocket round-trip.

UI Harmony Polish for Regenerate

Two follow-up tweaks so the new button matches the rest of the chat UI and behaves predictably under server rejection:

  • i18n — added Regenerate keys to web/locales/{en,zh}/app.json (Regenerate / 重新生成) and switched ChatMessages.tsx from a hardcoded "Regenerate" string to t("Regenerate"), matching the existing t("Copy") pattern in the same row.
  • Optimistic-pop rollback — when the server rejects a regenerate request pre-flight (regenerate_busy / nothing_to_regenerate), the optimistic POP_LAST_ASSISTANT + STREAM_START placeholder is now restored via a new RESTORE_ASSISTANT reducer action. The popped message is held in a per-key pendingRegenerateRef and cleared on done or any terminal result, so the transcript never silently loses the user's last reply.

Bug Fixes

  • Dark code blocks unreadable on light theme (#352) — the hard-coded #1f2937 / #292524 code-block background combined with .prose pre (forcing #D6D3D1) and .prose code:not(.md-code-block__code):not(.md-inline-code) (overriding to var(--foreground)) was producing near-black text on near-black backgrounds in light mode. Added a higher-specificity .md-renderer .md-code-block { ,pre,code } rule that pins #e5e7eb regardless of theme, and tagged the <code> elements in RichMarkdownRenderer and SimpleMarkdownRenderer fallbacks with the existing md-code-block__code class so the :not() guard kicks in. Thanks @DarkGenius.
  • None embeddings crashed LlamaIndex pipeline (#347, fixes #346) — when an embedding provider returns null for a chunk's vector, the None ended up in the vector index and blew up np.dot(NoneType) during similarity computation. Two-layer fix: _extract_embeddings_from_response now uses or [] instead of get(key, default) so explicit None values are caught, and CustomEmbedding._get_text_embeddings validates the batch result and substitutes a zero vector for any None slot. Thanks @kagura-agent.
  • Gemma models rejected json_object response_format (#345, fixes #344) — Gemma served through LM Studio (and similar local OpenAI-compatible servers) responds 400 "'response_format.type' must be 'json_schema' or 'text'" when handed response_format={"type": "json_object"}. Added supports_response_format: False to the existing gemma MODEL_OVERRIDES entry so the json_object path is skipped; the existing extract_json_object utilities in the visualize and math-animator agents already parse JSON from plain text, so all callers continue to work without further changes. Thanks @octo-patch.
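
The first layer of the None-embedding fix rests on a Python detail worth spelling out: dict.get(key, default) only falls back when the key is absent, not when it is present with a None value. The snippet below demonstrates the difference with a made-up response dict.

```python
# A provider response where the key exists but the vector is null.
response = {"embedding": None}

# The default is ignored because the key is present: the result is None.
with_default = response.get("embedding", [])

# `or []` also covers the explicit-None case: the result is an empty list.
with_or = response.get("embedding") or []

# When the key is missing entirely, both forms agree.
missing_default = {}.get("embedding", [])
missing_or = {}.get("embedding") or []
```

Note that `or []` treats every falsy value as missing, so an empty list from the provider is normalized to an empty list as well, which is harmless here.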

Test Suite Expansion

Net +575 test lines: 10 cases for the chat-params loader / _ChatLimits coercion (tests/services/config/test_chat_params_config.py), 18 cases for the regenerate flow including all three reject paths, the in-place delete + restart, the memory-refresh skip, and the end-to-end no-duplicate-user contract (tests/services/session/test_regenerate.py), 14 cases for supports_response_format model overrides (tests/services/llm/test_capabilities.py), and a regression test for the None-embedding extraction path (tests/services/embedding/test_extract_embeddings.py).

What's Changed

  • fix(rag): guard against None embeddings in LlamaIndex pipeline by @kagura-agent in https://github.com/HKUDS/DeepTutor/pull/347
  • fix: disable json_object response_format for gemma models by @octo-patch in https://github.com/HKUDS/DeepTutor/pull/345
  • fix(web): ensure readable text in dark code blocks on light theme by @DarkGenius in https://github.com/HKUDS/DeepTutor/pull/352
  • feat(chat): make per-stage token limits and temperature configurable via agents.yaml by @DarkGenius in https://github.com/HKUDS/DeepTutor/pull/348
  • feat(chat): regenerate last response (CLI, WebSocket, Web UI) by @DarkGenius in https://github.com/HKUDS/DeepTutor/pull/349

Community Contributions

  • @DarkGenius — Make per-stage chat token limits configurable via agents.yaml (#348)
  • @DarkGenius — Regenerate last response across CLI / WebSocket / Web UI (#349)
  • @DarkGenius — Ensure readable text in dark code blocks on light theme (#352)
  • @kagura-agent — Guard against None embeddings in the LlamaIndex pipeline (#347)
  • @octo-patch — Disable json_object response_format for Gemma models (#345)

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.2.0...v1.2.1

v1.2.0 Breaking risk
Breaking changes
  • Guided Learning module (deeptutor/agents/guide/) and entire /guide web UI removed
Notable features
  • Book Engine: multi-agent pipeline (Ideation, Source Exploration, Spine Synthesis, Page Planning, Block Compilation) with 14 block types
  • Multi-document Co-Writer workspace with persistent per-document storage
  • Interactive HTML visualization support for stateful content
Full changelog

DeepTutor v1.2.0 Release Notes

Release Date: 2026.04.20

Highlights

Book Engine — Multi-Agent "Living Book" Compiler

Introduced a brand-new Book Engine (deeptutor/book/) that compiles user inputs — chat history, notebooks, knowledge bases, and free-form intent — into structured, block-based, interactive "living books". The engine sits parallel to ChatOrchestrator and drives a five-stage multi-agent pipeline:

  1. Ideation — an LLM proposes a book outline from the user's intent and source material.
  2. Source exploration — a SourceExplorer agent performs deep RAG retrieval and knowledge-base health checks (kb_health.py) to surface the most relevant passages.
  3. Spine synthesis — a SpineSynthesizer agent merges the proposal with explored sources into a chapter/page tree (Spine → Chapter → Page).
  4. Page planning — a PagePlanner agent designs each page as an ordered sequence of typed blocks.
  5. Block compilation — a BookCompiler dispatches each block to its dedicated generator.

14 block types ship in Phase 1: text, callout, quiz, flash cards, code, figure, deep dive, animation, interactive, timeline, concept graph, section, user note, and a placeholder for blocks still compiling. Each generator has its own bilingual (en/zh) YAML prompts and can call RAG helpers for grounded content.

The web UI (web/app/(workspace)/book/) includes a BookCreator wizard (intent → proposal → spine confirmation), a SpineEditor for drag-and-drop chapter reordering, a PageReader with an outline nav rail and per-block renderers (concept graphs rendered as interactive force-directed diagrams, quizzes with inline grading, flash cards with flip animations, etc.), a BookProgressTimeline for real-time compilation tracking, a BookChatPanel for in-context Q&A, and a BookHealthBanner that warns when the underlying KB is unhealthy. A per-book WebSocket stream fans out compilation events to all connected clients.

Backend: POST /api/v1/book/create, POST /confirm-proposal, POST /confirm-spine, POST /compile-page, GET /list, GET /{book_id}, DELETE /{book_id}, plus WS /ws/{book_id}. CLI: deeptutor book create, deeptutor book list, deeptutor book show, deeptutor book delete.

Legacy Guided Learning Removed

The deeptutor/agents/guide/ module (guide manager, 4 agents, 8 prompt YAMLs) and the entire /guide web UI (components, hooks, types, API router — ~5,300 lines) have been removed. The Book Engine supersedes Guided Learning with a richer, more extensible architecture.

Multi-Document Co-Writer Workspace

Co-Writer is no longer a single-document scratchpad. Each document now gets its own persistent directory under data/user/workspace/co-writer/documents/ with atomic manifest writes (CoWriterStorage). The web UI routes to per-document pages (/co-writer/[docId]) and the sidebar shows a CoWriterRecent section for quick access. New API endpoints handle full document CRUD (GET /list, POST /create, GET /{doc_id}, PATCH /{doc_id}, DELETE /{doc_id}).
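
The notes do not say how CoWriterStorage makes its manifest writes atomic. One common technique, shown here purely as an assumption, is to write a temporary file in the same directory and swap it into place with os.replace; the function name write_manifest_atomic and the JSON manifest format are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def write_manifest_atomic(path: Path, manifest: Dict[str, Any]) -> None:
    """Write JSON so readers see either the old file or the new one, never a partial."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem for os.replace to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(manifest, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A crash mid-write then leaves at worst a stray .tmp file, while the previous manifest stays intact.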

Interactive HTML Visualizations

The Visualize capability now supports a fourth render type — html — alongside svg, chartjs, and mermaid. When the LLM determines that the user request requires "user interaction + state changes + mixed text/graphics" (e.g. draggable demos, step-by-step walkthroughs, clickable practice exercises), it produces a self-contained single-file HTML page that renders inside an iframe. A local validation pass (is_valid_html_document) checks the output before serving; if the model returns something unrenderable, a styled fallback template is injected instead. The LLM review stage is skipped for HTML pages (saving 30–60s of latency with negligible quality loss). A new figure constraint mode restricts the LLM to svg/chartjs/mermaid only — used internally by the Book Engine's figure block.

Question Bank @-Mention in Chat

A new QuestionBankPicker component lets users @-mention individual Question Bank entries directly in the chat composer. Selected entries are resolved by the turn runtime (_build_question_bank_context) into structured Markdown context — including question text, options, user/reference answers, and explanations — and injected alongside notebook and history references so the LLM can reason over specific past quiz performance.

Prompt Externalization — Phase 2

Continued the migration of hard-coded LLM strings into editable YAML files:

  • answer_now — question, research, solve, and visualize capabilities now load their fast-path prompts from prompts/{en,zh}/answer_now.yaml.
  • notebook agents — analysis_agent and summarize_agent prompts moved to YAML; the Python modules now delegate to the prompt manager.
  • Capability modules (_answer_now.py, deep_question.py, deep_research.py, deep_solve.py, visualize.py) slimmed down accordingly.

Co-Writer Module Restructured

Moved deeptutor/agents/co_writer/ to deeptutor/co_writer/ (top-level service module) to reflect its standalone nature alongside deeptutor/book/.

Sidebar Overhaul

  • "Guided Learning" nav entry replaced with Book (Library icon).
  • Added BookRecent and CoWriterRecent sidebar sections with per-item navigation.
  • Sidebar collapsed/expanded state lifted into AppShellContext so it persists across route transitions.
  • Collapsed sidebar refined: logo + expand toggle layered with hover reveal, circular "New Chat" button, subtle dividers, and consistent spacing.

Capability & Config Panel Refresh

Updated shared type definitions and config panels across Quiz, Research, Visualize, and Math Animator to align with new capability options (e.g. html render mode, expanded locale keys). TracePanels in the chat UI received layout and styling improvements.

README & Localization Update

All nine localized README files (AR, CN, ES, FR, HI, JA, PT, RU, TH) updated to reflect the Book Engine, multi-document Co-Writer, and other v1.2.0 features.

Bug Fixes

  • Channel manager import-time crash — loguru.logger was imported at module scope, causing ImportError when TutorBot dependencies were absent. Replaced with a lazy _logger() factory function.
  • Obsolete test_app_facade.py — removed a stale test module that referenced deleted code paths.
  • Missing __init__.py in test packages — added init files across tests/agents/, tests/api/, tests/cli/, tests/knowledge/, tests/scripts/, tests/services/llm/, tests/services/memory/, tests/services/search/, tests/services/session/, and tests/tools/ to fix import resolution.

Test Suite & CI

  • CI now installs requirements/tutorbot.txt and caches it alongside server/cli requirements.
  • Removed the obsolete test_app_facade from the CI test manifest.
  • Updated conftest.py for RAG pipeline tests; refreshed prompt parity, capabilities runtime, LLM probe config, factory provider, model catalog, config loader, and notebook service tests to match restructured imports.
  • Added favicon and apple-touch-icon assets for PWA metadata.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.1.2...v1.2.0

v1.1.2 Breaking risk
Breaking changes
  • RAG providers other than llamaindex removed; legacy rag_provider values auto-coerced to llamaindex
  • RAG scaffolding modules removed: chunkers, embedders, indexers, parsers, retrievers, pipeline orchestrator
  • PATCH /tutorbot/{bot_id} now returns HTTP 422 for invalid channel payloads instead of silently persisting
Security fixes
  • Channel secrets no longer exposed in API responses; tokens and passwords masked by default
Notable features
  • Schema-driven channels tab auto-discovers all channel types (Telegram, Slack, Discord, Matrix, Email, Feishu) from Pydantic schema with token reveal toggle
  • Channel secrets masked in API responses by default; plaintext available via ?include_secrets=true query parameter
  • Chat prompts externalized to editable YAML files per language
Full changelog

DeepTutor v1.1.2 Release Notes

Release Date: 2026.04.18

Highlights

Schema-Driven Channels Tab with Token Reveal (#338)

The Channels tab in the Agents page is no longer hard-coded for Telegram. It now auto-discovers every channel (Telegram, Slack, Discord, Matrix, Email, Feishu, …) and renders a form directly from each channel's Pydantic config schema — no per-channel front-end code required. Secret fields (tokens, passwords, API keys) render as masked inputs with an eye-toggle for explicit reveal. A last_reload_error banner warns when live listeners failed to restart after a config change.

Channel Secret Masking

API responses no longer expose raw channel secrets. Tokens and passwords are replaced with *** by default; the admin edit form uses ?include_secrets=true to fetch plaintext when needed. Create and update responses are likewise masked.
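
A rough sketch of mask-by-default behaviour, assuming channel configs are nested dicts. The set of secret key names and the function mask_secrets are hypothetical; only the *** placeholder and the include_secrets opt-in come from the notes above.

```python
from typing import Any

MASK = "***"
# Hypothetical key names; the real service derives secret fields from its schemas.
SECRET_KEYS = {"token", "password", "api_key"}


def mask_secrets(config: Any, include_secrets: bool = False) -> Any:
    """Return a copy of `config` with non-empty secret values replaced by ***."""
    if include_secrets:
        return config
    if isinstance(config, dict):
        return {
            key: MASK if key in SECRET_KEYS and value else mask_secrets(value)
            for key, value in config.items()
        }
    if isinstance(config, list):
        return [mask_secrets(item) for item in config]
    return config


channels = {"telegram": {"token": "123:abc", "enabled": True}}
public_view = mask_secrets(channels)
admin_view = mask_secrets(channels, include_secrets=True)
```

Empty secret values are left untouched in this sketch so that a form can still tell "not set" apart from "set but hidden".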

Channel Config Validation & Reload Hardening

PATCH /tutorbot/{bot_id} now validates channel payloads upfront and returns a 422 with structured errors instead of silently persisting bad config. reload_channels is serialised with a per-instance lock to prevent duplicate listeners, and any failure is recorded in last_reload_error so the UI can surface it.
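
The reload hardening can be pictured with a small asyncio sketch: a per-instance lock serialises reloads, and a failed reload records its error and leaves no half-built listener set. The class ChannelHost and its _start_listener stub are hypothetical stand-ins, not the TutorBot implementation.

```python
import asyncio
from typing import Dict, List, Optional


class ChannelHost:
    def __init__(self) -> None:
        self._reload_lock = asyncio.Lock()  # one lock per bot instance
        self.listeners: List[str] = []
        self.last_reload_error: Optional[str] = None

    async def _start_listener(self, cfg: Dict[str, str]) -> str:
        """Stub listener start-up; fails on configs without a 'type'."""
        await asyncio.sleep(0)
        if "type" not in cfg:
            raise ValueError("channel config missing 'type'")
        return cfg["type"]

    async def reload_channels(self, configs: List[Dict[str, str]]) -> None:
        # Concurrent callers queue here instead of racing each other.
        async with self._reload_lock:
            self.listeners = []
            try:
                started = [await self._start_listener(cfg) for cfg in configs]
            except Exception as exc:
                # Leave the bot channel-less and record why.
                self.last_reload_error = str(exc)
                return
            self.listeners = started
            self.last_reload_error = None


async def demo() -> ChannelHost:
    host = ChannelHost()
    config = [{"type": "telegram"}]
    # Two overlapping reloads still end with exactly one listener.
    await asyncio.gather(host.reload_channels(config), host.reload_channels(config))
    return host


host = asyncio.run(demo())
```

Without the lock, the two overlapping reloads could interleave at each await and both append listeners, which is the duplicate-listener symptom the fix targets.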

RAG Simplified to a Single Pipeline

Removed ~2,600 lines of unused RAG scaffolding (chunkers, embedders, indexers, parsers, retrievers, pipeline orchestrator, type definitions) that existed as placeholders for never-shipped backends. The RAG service is now a thin wrapper over the single LlamaIndex pipeline. Legacy rag_provider values (e.g. lightrag) are silently coerced to llamaindex and the KB is flagged for re-indexing.
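
The provider coercion amounts to a small config migration, sketched below under stated assumptions: the function name migrate_kb_config and the needs_reindex flag name are hypothetical, and treating a missing rag_provider as "nothing to migrate" is a choice of this sketch rather than documented behaviour.

```python
from typing import Any, Dict


def migrate_kb_config(entry: Dict[str, Any]) -> Dict[str, Any]:
    """Coerce a legacy rag_provider to llamaindex and flag the KB for re-indexing."""
    migrated = dict(entry)
    provider = migrated.get("rag_provider")
    if provider is not None and provider != "llamaindex":
        migrated["rag_provider"] = "llamaindex"
        # An index built by another backend cannot be reused as-is.
        migrated["needs_reindex"] = True
    return migrated


legacy = migrate_kb_config({"name": "physics", "rag_provider": "lightrag"})
current = migrate_kb_config({"name": "physics", "rag_provider": "llamaindex"})
```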

Centralized File Type Routing

Consolidated file-type classification into a single FileTypeRouter module with a flat API (get_document_type, classify_files, get_supported_extensions, etc.). The old per-provider extension helpers are gone — there's only one provider. Unknown extensions still fall through to content sniffing before being rejected.

No More Phantom Knowledge Bases

Closed every code path that could silently call RAG against a non-existent KB:

  • deep_solve — strips the rag tool when no KB is attached and warns the user.
  • deep_research — drops kb from sources, warns, and aborts if no sources remain.
  • SolveToolRuntime — returns a graceful "no KB selected" observation instead of crashing, keeping the ReAct loop alive.
  • ResearchPipeline — returns a structured "skipped" event instead of falling back to the old DE-all placeholder.
  • DecomposeAgent — no longer defaults to ai_textbook; disables RAG when no KB is provided.

Externalized Chat Prompts

Moved all hard-coded zh/en strings out of AgenticChatPipeline into editable YAML files (agentic_chat.yaml for each language). Stage labels, system prompts, user templates, and UI notices are now configurable without code changes. Falls back gracefully if the YAML is missing.

Thai README (#337)

Added README_TH.md with Thai-language documentation.

Bug Fixes

  • Research pipeline crashed without a KB — the DE-all fallback KB no longer exists in most installs; now short-circuits with a structured skip event.
  • Decompose agent tried RAG against ai_textbook — replaced the hard-coded default with None and a defensive guard.
  • Bad channel config persisted silently — now rejected at the API boundary with a 422 before reaching disk.
  • Concurrent reload_channels created duplicate listeners — serialised via an asyncio lock; failure leaves the bot channel-less with a clear error instead of half-rebuilt.
  • Channel tokens leaked in API responses — now masked by default across all endpoints.
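
The reload_channels fix above relies on serialising rebuilds behind an asyncio lock. The sketch below shows why that prevents duplicate listeners; ChannelReloader and the listener strings are stand-ins invented for this example, not the TutorBot classes.

```python
import asyncio
from typing import List


class ChannelReloader:
    """Hypothetical stand-in for the component that rebuilds channel listeners."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self.listeners: List[str] = []

    async def reload_channels(self, names: List[str]) -> None:
        # Only one rebuild runs at a time; a second caller waits for the first to finish.
        async with self._lock:
            self.listeners.clear()
            for name in names:
                await asyncio.sleep(0)  # stands in for asynchronous channel start-up
                self.listeners.append(f"listener:{name}")


async def demo() -> List[str]:
    reloader = ChannelReloader()
    # Two overlapping reloads of the same channel set.
    await asyncio.gather(
        reloader.reload_channels(["telegram"]),
        reloader.reload_channels(["telegram"]),
    )
    return reloader.listeners
```

Without the lock, both calls could clear the list before either appends, which would leave two listeners for the same channel.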

Test Suite

Added 6 new test modules (1,042 lines total): file-type routing, KB config migration, channel schema introspection, channel secret masking, RAG/KB consistency at the capability layer, and research pipeline RAG safety. Extended existing tests for the tool runtime, knowledge router, TutorBot router, and RAG pipeline modules.

What's Changed

  • feat(tutorbot): Channels tab, Telegram UI, API channel reload, token … by @srinivasrk in https://github.com/HKUDS/DeepTutor/pull/338
  • docs: add Thai README documentation by @DoctorNasa in https://github.com/HKUDS/DeepTutor/pull/337
  • release: v1.1.2 — CI fix & release notes cleanup by @pancacake in https://github.com/HKUDS/DeepTutor/pull/341

New Contributors

  • @DoctorNasa made their first contribution in https://github.com/HKUDS/DeepTutor/pull/337

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.1.1...v1.1.2

v1.1.1 Breaking risk
Notable features
  • Answer Now fast-paths added to all capabilities (chat, deep_solve, deep_question, deep_research, math_animator, visualize)
  • Co-Writer editor with resizable splitter and bidirectional scroll sync via source line mapping
  • Save-to-Notebook now supports message selection mode with quick presets and role-based icons
Full changelog

DeepTutor v1.1.1 Release Notes

Release Date: 2026.04.17

Highlights

Universal "Answer Now" Escape Hatch — Per-Capability Fast Paths

Promoted "Answer now" from a chat-only affordance to a universal interrupt that respects each capability's output shape. A new shared helper deeptutor/capabilities/_answer_now.py provides the gate (extract_answer_now_context) and the prompt-friendly trace summary, and every built-in capability now owns its own fast-path branch at the top of run():

  • chat — synthesize the final markdown answer from the partial trace (existing behavior).
  • deep_solve — skip planning + reasoning, jump straight into the writer.
  • deep_question — skip ideation/templates, emit the full quiz JSON in one structured call (still rendered by QuizViewer).
  • deep_research — skip rephrase/decompose/research, write the report directly from accumulated evidence.
  • math_animator — skip analysis + design + summary but keep code generation + render, so the user still gets a real animation.
  • visualize — skip analysis + review, emit the final renderable code in one structured call.

Each fast-path preserves the same result envelope as the normal pipeline (so the Quiz / MathAnimator / Visualization viewers render unchanged) and prepends a > ⚡ Skipped X stage(s) notice so users know it was a best-effort early exit. The orchestrator no longer re-routes answer_now to chat; it now keeps active_capability and only falls back to chat if the originally selected capability has been removed from the registry. The frontend matches: handleAnswerNow no longer overrides the snapshot's capability, and a single-shot guarded AnswerNowRow component renders the action inline below the streaming trace panel.
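
The shape of a per-capability fast path can be sketched as follows. The helper names extract_answer_now_context and make_skip_notice appear in these notes, but their signatures, the request layout, and the result envelope below are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def extract_answer_now_context(request: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Hypothetical gate: return the partial-trace payload if the user asked to answer now."""
    ctx = request.get("answer_now_context")
    return ctx if isinstance(ctx, dict) else None


def make_skip_notice(skipped: int) -> str:
    """Hypothetical notice builder matching the blockquote style described above."""
    return f"> ⚡ Skipped {skipped} stage(s)"


def run(request: Dict[str, Any]) -> Dict[str, str]:
    """Toy capability: the fast-path branch sits at the top of run()."""
    ctx = extract_answer_now_context(request)
    if ctx is not None:
        # Fast path: skip the earlier stages and write directly from the partial trace.
        body = "Answer based on: " + "; ".join(ctx.get("trace", []))
        return {"capability": "deep_solve", "response": make_skip_notice(2) + "\n\n" + body}
    # Normal path: the full pipeline would run here.
    return {"capability": "deep_solve", "response": "full pipeline result"}
```

Both branches return the same envelope keys, which is what lets the existing viewers render a fast-path result unchanged.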

Co-Writer — Resizable Split & Line-Anchored Scroll Sync

The Co-Writer page picked up a draggable splitter between the editor and preview panes (with a persisted ratio in localStorage) and a true bidirectional scroll-sync that survives soft-wrapped lines. Each preview block now carries a data-source-line attribute pointing back at its starting line in the markdown source (provided by remark's AST positions); on the editor side a hidden mirror element mimics the textarea's wrap geometry so we can read the real pixel-y of every source line. With both sides expressed as pixel coordinates the sync becomes a single piecewise-linear interpolation in either direction, with a per-source-line cache that invalidates only when the content or wrap width changes.
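
The interpolation step can be illustrated with a short sketch. The actual implementation lives in the web frontend; this version uses Python only for illustration, and the function name and sample offsets are made up. It assumes both sides expose the pixel offset of the same source lines as strictly increasing sequences of equal length.

```python
from bisect import bisect_right
from typing import Sequence


def interpolate(src: Sequence[float], dst: Sequence[float], y: float) -> float:
    """Map a scroll offset y on the src axis to the dst axis.

    src[i] and dst[i] are the pixel offsets of the same source line on each side.
    """
    if y <= src[0]:
        return dst[0]
    if y >= src[-1]:
        return dst[-1]
    i = bisect_right(src, y) - 1                # segment containing y
    t = (y - src[i]) / (src[i + 1] - src[i])    # fractional position inside the segment
    return dst[i] + t * (dst[i + 1] - dst[i])


# Made-up offsets: three source lines measured in the editor and in the preview.
editor_y = [0.0, 20.0, 60.0]
preview_y = [0.0, 100.0, 140.0]
```

Because the same function works with the two sequences swapped, one routine covers both editor-to-preview and preview-to-editor sync.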

Save-to-Notebook — Message Selection Mode

SaveToNotebookModal now accepts an optional messages prop that flips the modal into "selection mode": the user picks exactly which user/assistant turns to include, and the transcript + userQuery shipped to the backend are rebuilt from the selected subset. Quick presets ("Select all", "Last turn", "Last 3 turns") and an auto-derived title that tracks the first selected user message keep the flow fast for the common cases. The modal also now uses Check / MessageSquare / User icons to distinguish roles at a glance, and reports loading state for the notebook list separately from the save spinner.

Real Notebook System Adoption Across the Stack

Migrated every remaining Notebook surface off the legacy quiz-category API onto the real /api/v1/notebook/* endpoints. A new web/lib/notebook-api.ts block exports typed helpers — listNotebooks, getNotebook, createNotebook, updateNotebook, deleteNotebook, deleteNotebookRecord — alongside the preserved quiz-only helpers. The Knowledge → Notebooks tab, the Guide page's notebook reference picker (useNotebookSelection), and the Save-to-Notebook modal all now resolve UUIDs end-to-end, so records saved from Co-Writer, Chat, or Guided Learning are immediately discoverable as references everywhere.

Unified Collapsible Settings Panel

Extracted the collapsible "Settings" section that previously only existed in ResearchConfigPanel into a shared CollapsibleConfigSection component. The Quiz, Math Animator, and Visualize panels now share the exact same chevron + summary header, and each form ships a summarizeXxxConfig helper so the collapsed state shows a meaningful one-liner (e.g. Custom · 5q · Hard · MCQ or Mimic · paper.pdf · max 10). The chat page now keeps a single panelCollapsed state for whichever capability is active, auto-expands on capability switch, and auto-collapses after sending a message so the composer stays compact during conversation.

Streaming Stop Button & Composer Polish

Replaced the spinner-inside-the-Send-button with a dedicated Stop button that appears in place of Send while a turn is streaming. A faint ring slowly rotates around the rim to signal "still working — click to cancel", with a white square front-and-center as the click target. The header above the messages (capability label + Save / New chat buttons) is now always rendered, and the messages container picks up a soft mask gradient at the top and bottom so streaming content fades in/out instead of clipping at the scroll edge. In Deep Research mode, sources moved into a dropdown with a compact summary line of the active picks, matching the pattern used by the tool selector.

TutorBot Config Manager Refactor

Rewrote TutorBotManager's config persistence into a small public API (load_bot_config, save_bot_config, merge_bot_config) with three meaningful improvements: writes are now atomic (write-temp + Path.replace) so a killed process never leaves a half-written config.yaml; merges have explicit-clear semantics (None means "leave as-is", while an empty string or empty dict is an intentional clear), so clients can deliberately wipe a description or channels list; and the API endpoint forwards only model_dump(exclude_unset=True) so omitted fields fall through to the on-disk value. New regression tests cover the atomic-write contract, the corrupt-yaml fallback, and the four merge-semantics cases.
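
The atomic-write and merge rules can be sketched in a few lines. The function names match the API listed above, but the signatures are assumptions, and JSON stands in for YAML here purely to keep the example within the standard library.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_bot_config(path: Path, config: Dict[str, Any]) -> None:
    """Write to a temp file in the same directory, then swap it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    tmp = Path(tmp_name)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(config, fh)
        tmp.replace(path)  # readers see either the old file or the new one
    except BaseException:
        tmp.unlink(missing_ok=True)  # do not leave a stray temp file behind
        raise


def merge_bot_config(existing: Dict[str, Any], updates: Dict[str, Any]) -> Dict[str, Any]:
    """Overlay updates on existing: None keeps the old value, "" or {} clears it."""
    merged = dict(existing)
    for key, value in updates.items():
        if value is None:
            continue  # None means "leave as-is"
        merged[key] = value  # empty string or empty dict is an explicit clear
    return merged
```

Creating the temp file in the target directory keeps the final rename on one filesystem, which is what makes the swap atomic on POSIX systems.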

Markdown Renderer Refinements

The MarkdownRenderer family gained a trackSourceLines prop that propagates data-source-line attributes through every block element (headings, lists, paragraphs, etc.) and bypasses the line-shifting normalization passes (processMarkdownContent, normalizeMarkdownForDisplay) so AST positions stay faithful for editor/preview sync consumers. RichCodeBlock now skips react-syntax-highlighter entirely for unlabeled / text / plaintext fences (eliminating Prism "unknown language" warnings) and renders them as a tidy plain-monospace block. Mermaid detection was also extended to recognize editor.md style ```flow, ```seq, and ```sequence fences that get rewritten to mermaid by the preprocessor.

Theme & Guide UI Refresh

Tightened the default light and Snow themes with deeper foregrounds, warmer borders, and a slightly more saturated --primary (#B0501E) for better legibility against the new card surfaces. The Guided Learning page (/guide) was migrated off hardcoded slate-* / indigo-* palettes onto the design tokens (var(--card), var(--primary), var(--muted-foreground), etc.) so it now respects the theme switcher. HistorySessionPicker got the same treatment, plus a fix for session timestamps that were being treated as milliseconds instead of seconds (which produced 1970 dates).
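
The timestamp bug is easy to reproduce. The fix itself is in the frontend; the sketch below uses Python only to show the arithmetic, with a sample value chosen for this example (midnight UTC on the release date).

```python
from datetime import datetime, timezone

ts_seconds = 1776384000  # sample epoch timestamp in seconds (2026-04-17 UTC)

# Correct: the value is already in seconds.
correct = datetime.fromtimestamp(ts_seconds, tz=timezone.utc)

# Buggy: treating the value as milliseconds shrinks it by a factor of 1000.
buggy = datetime.fromtimestamp(ts_seconds / 1000, tz=timezone.utc)
```

Here correct falls in 2026, while buggy lands in January 1970, which matches the symptom described above.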

System Message Rendering Fix

Backend system messages (e.g. quiz follow-up grounding context written by the turn runtime) are now filtered out at the UnifiedChatContext.hydrateMessages boundary and again defensively in ChatMessageList, so they never surface as ghost chat bubbles in the UI while still flowing into the LLM context as intended.

Bug Fixes

  • TutorBot channel config wiped on every server restart (#332): create_and_start_bot was constructing a fresh BotConfig with empty defaults on every call, which _save_bot_config then persisted over the existing config.yaml, wiping user-configured channels (e.g. Telegram). The endpoint now loads the existing config first and overlays only client-supplied fields.
  • selective_access_log middleware crash on every non-200 response (#334 / #335) — the middleware passed four args to uvicorn's AccessFormatter which expects five (omitting http_version), raising ValueError: not enough values to unpack on every error response. Now reads http_version from the ASGI scope with a 1.1 fallback.
  • 15 npm security vulnerabilities (#330) — bumped jspdf 4.0.0 → 4.2.0 (9 CVEs incl. critical PDF injection), next 16.1.1 → 16.2.3 (8 CVEs incl. HTTP smuggling, CSRF bypass), mermaid 11.12.2 → 11.14.0, and the matching eslint-config-next. npm audit fix swept up the indirect chain (flatted, lodash-es, minimatch, picomatch, dompurify, ajv, brace-expansion). End state: 0 vulnerabilities, no breaking changes.
  • .env.example_CN — removed an accidental // README suffix on the provider-list comment.
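
For the selective_access_log fix above, the five-value argument tuple can be built from the ASGI scope roughly as follows. build_access_args is a hypothetical helper written for this example, not the middleware's actual code. The scope keys follow the ASGI HTTP connection scope, and the tuple order is the one uvicorn's access formatter unpacks.

```python
from typing import Any, Dict, Tuple


def build_access_args(scope: Dict[str, Any], status_code: int) -> Tuple[str, str, str, str, int]:
    """Return (client_addr, method, full_path, http_version, status_code)."""
    client = scope.get("client") or ("-", 0)
    full_path = scope.get("path", "")
    query = scope.get("query_string", b"")
    if query:
        full_path += "?" + query.decode("ascii")
    return (
        f"{client[0]}:{client[1]}",
        scope.get("method", "-"),
        full_path,
        scope.get("http_version", "1.1"),  # fall back to 1.1 when the scope omits it
        status_code,
    )
```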

Test Suite Expansion

Added a new tests/services/tutorbot/test_manager_config.py module covering load/save round-trips, the corrupt-yaml fallback, atomic temp-file writes, failure-recovery after a mid-write OSError, and all four merge_bot_config semantics (no existing config, omitted-field passthrough, None-as-not-provided, and empty-value-as-explicit-clear). Extended tests/api/test_tutorbot_router.py with the explicit-clear test class. Rewrote the orchestrator answer-now routing tests to pin the new contract — active_capability is preserved when answer_now_context is set, the orchestrator falls back to chat only when the original capability is missing, and emits a clear error when neither is registered. Added a new tests/capabilities/test_answer_now.py module with 28 cases covering the shared helpers (extract_answer_now_context, format_trace_summary truncation/i18n, make_skip_notice, labeled_block, join_chunks) and every per-capability fast-path: chat, deep_solve, deep_question (including the unparseable-JSON fallback), deep_research, visualize (including code-fence stripping and invalid render_type recovery), and math_animator (verifying that run_analysis/run_design/run_summary are not invoked while run_code_generation/run_render are). A reverse test confirms that capabilities still take their normal pipeline when no answer_now_context is present.

What's Changed

  • Fix: TutorBot channel config wiped on every server restart by @srinivasrk in https://github.com/HKUDS/DeepTutor/pull/332
  • fix(api): add missing http_version arg in selective_access_log middleware by @kagura-agent in https://github.com/HKUDS/DeepTutor/pull/335
  • fix(web): resolve 15 npm security vulnerabilities by @srinivasrk in https://github.com/HKUDS/DeepTutor/pull/330

Community Contributions

  • @srinivasrk — Resolve 15 npm security vulnerabilities (#330)
  • @srinivasrk — Preserve existing TutorBot config when starting bot via API (#332)
  • @kagura-agent — Fix selective_access_log middleware unpack crash (#335)

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.1.0...v1.1.1

v1.1.0 Breaking risk
Breaking changes
  • LLM_TEST_MAX_TOKENS environment variable removed; use agents.yaml diagnostics.llm_probe.max_tokens instead
Notable features
  • LaTeX block math parsing improved with automatic delimiter promotion for multiline content
  • Extra headers forwarding in llm_complete and llm_stream functions
  • SaveToNotebookModal migrated to UUID-based notebook list API
v1.1.0-beta Breaking risk
Notable features
  • Resolved keystroke lag via message list virtualization and scroll debouncing
  • URL-based chat routing for bookmarkable and shareable sessions
  • WebSocket heartbeat and auto-reconnect with exponential backoff recovery
v1.0.3 Breaking risk
Notable features
  • Question Notebook with bookmarking and category-based organization
  • Mermaid diagram support in Visualize
  • Embedding model mismatch detection for knowledge bases
v1.0.2 Breaking risk
Notable features
  • Automatic consolidation for any provider without template
  • SearXNG generic fallback formatter
v1.0.1 New feature
Notable features
  • Visualize capability with Chart.js/SVG pipeline
  • Explicit Reference picker in chat composer
  • Quiz duplicate prevention
v1.0.0-beta.4 New feature
Notable features
  • Embedding progress tracking with batch reporting
  • HTTP 429 retry with exponential back-off
  • Cross-platform dependency auto-installation
v1.0.0-beta.3 Mixed
Notable features
  • Native openai and anthropic SDK integration
  • Windows Math Animator subprocess compatibility with asyncio.Queue
  • Full UI internationalization (English, Chinese)
v1.0.0-beta.2 Breaking risk
Breaking changes
  • Python 3.10 support dropped; Python 3.11+ required
Notable features
  • Hot settings reload without restart
  • MinerU nested output directory support
v1.0.0-beta.1 Breaking risk
Breaking changes
  • Complete package restructure: src/→deeptutor/+deeptutor_cli/
  • Package renamed from ai-tutor to deeptutor
  • LightRAG and RAG-Anything pipelines temporarily removed
Notable features
  • Agent-native runtime with two-layer plugin model (Tools + Capabilities)
  • Three unified entry points: CLI, WebSocket API, Python SDK
  • TutorBot multi-channel system supporting 12 messaging platforms
v0.6.0 New feature
Notable features
  • Frontend session persistence across refreshes
  • Incremental document upload to knowledge bases
  • Full Chinese localization with i18n
Full changelog

DeepTutor v0.6.0 Release Notes

Release Date: 2026.01.23

Highlights

Frontend State Persistence

Implemented robust session persistence across the application:

  • Solver, Guide, and other sessions now persist across browser refreshes
  • Improved state management with dedicated persistence layer
  • Better user experience with session continuity

Incremental Document Upload

Enhanced knowledge base with incremental document processing:

  • Add new documents to existing knowledge bases without full re-indexing
  • Significant performance improvement for large document collections
  • Smarter document change detection

Flexible RAG Pipeline Import

Refactored RAG initialization for better compatibility:

  • On-demand loading of RAG libraries (RAG-Anything, LlamaIndex)
  • Reduced startup time and memory footprint
  • Graceful fallback when optional dependencies are unavailable

Full Chinese Localization (i18n)

Added complete Chinese language support for the web interface:

  • Comprehensive translation across all pages and components
  • Dynamic language switching without page reload
  • i18n audit tools for translation consistency

Bug Fixes & Improvements

  • Enhanced LLM retry mechanism for complex agent operations
  • Fixed temperature parameter handling issues
  • Docker build optimizations and npm compatibility fixes
  • Added api_version parameter for Azure OpenAI support

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v0.5.2...v0.6.0

v0.5.2 New feature
Notable features
  • Docling alternative for RAG-Anything initialization
  • Logging system refactoring
Full changelog

DeepTutor v0.5.2 Release Notes

Release Date: 2026.01.18

Highlights

Docling Support for RAG-Anything

Added alternative RAG-Anything initialization using Docling as the document parser:

  • For users whose local environment is not suitable for MinerU
  • Provides a lightweight alternative for document processing
  • Same multimodal graph capabilities with different backend

Logging System Optimization

Refactored the logging system for better management:

  • Improved log output control across all modules
  • Better structured logging adapters
  • Enhanced console, file, and WebSocket handlers

Bug Fixes & Code Improvements

  • Optimized code structure across multiple modules
  • Fixed several bugs affecting user experience
  • Improved CI/CD workflows with Python 3.10/3.11 matrix testing

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v0.5.1...v0.5.2

v0.5.1 New feature
Notable features
  • Docling support for RAG-Anything document parsing
  • Logging system optimization for better control
  • Enhanced CI/CD workflows with Python 3.10/3.11 matrix
v0.5.0 Breaking risk
Notable features
  • Unified configuration system with environment variable references
  • Per-knowledge-base RAG pipeline selection (LlamaIndex, LightRAG, RAG-Anything)
  • Question generation overhaul with specialized agent architecture
v0.4.1 Breaking risk
Breaking changes
  • src/core module removed; migrate imports to src/services (load_config_with_main → src.services.config, llm_factory → src.services.llm, prompt_manager → src.services.prompt, logging → src.logging)
Notable features
  • LLM provider system overhaul with three deployment modes
  • Provider presets for OpenAI, Anthropic, DeepSeek, Ollama, LM Studio, vLLM, llama.cpp
  • Question generation JSON parsing robustness
v0.4.0 Breaking risk
Breaking changes
  • Environment variables renamed: OPENAI_API_KEY→LLM_API_KEY, OPENAI_API_BASE→LLM_HOST, EMBEDDING_DIM→EMBEDDING_DIMENSION
  • New required variables: LLM_BINDING, EMBEDDING_BINDING
  • Removed cloud providers from local settings page
Notable features
  • Multi-provider LLM support (OpenAI, Anthropic, Azure, Ollama, Groq, DeepSeek, Gemini)
  • Multi-provider embedding support (OpenAI, Jina, Cohere, Ollama, HuggingFace)
  • Dark mode with theme toggle and localStorage persistence
v0.3.0 Breaking risk
Notable features
  • Centralized PromptManager singleton with global caching and language fallback
  • GitHub Actions workflows for testing, dependencies, and Docker publishing
  • Pre-built Docker images via GitHub Container Registry
v0.2.0 Security relevant
Security fixes
  • Path traversal and injection vulnerabilities
  • RCE and LFI vulnerabilities
Notable features
  • Docker multi-stage deployment
  • Next.js 16 and React 19 upgrade
