No immediate action

v1.5.5 New feature 15h

ChatGPT sign‑in + Eden AI + KB inventory

Open

No immediate action

v1.5.4 New feature 2d

Chat responsiveness + table fidelity + parsing

Open

No immediate action

v1.5.3 Breaking risk 3d

Code themes + CLI agents + provider fixes

Open

No immediate action

v1.5.2 Mixed 7d

Configurable attachments + PageIndex reasoning + expanded model support

Open

No immediate action

v1.5.1 New feature 17d

Drop single failing doc

Open

Upgrade now

v1.5.0 Mixed 22d

RCE / SSRF

Partners, Agentic Chat Engine, Knowledge Center

Open

No immediate action

v1.4.15 Bug fix 26d

Guided Learning grading fix

Open

Config change

v1.4.14 New feature 27d

Auth

Partial‑result flags + session isolation

Open

No immediate action

v1.4.13 New feature 29d

Non-Latin partners + logo rendering + small KB retrieval

Open

No immediate action

v1.4.12 New feature 1mo

LightRAG, PyMuPDF4LLM, FAISS

Open

No immediate action

v1.4.11 New feature 1mo

Native tool calling, Users UI, LaTeX quizzes, container binding

Open

Config change

v1.4.10 Breaking risk 1mo

Auth RBAC

MCP tool access denial

Open

No immediate action

v1.4.9 New feature 1mo

Search adaptation, profile renaming, mastery recording

Open

No immediate action

v1.4.8 New feature 1mo

Partner memory + consultation

Open

No immediate action

v1.4.7 New feature 1mo

Agent consulting, My Agents move, Partner trace improvements

Open

Config change

v1.4.6 Mixed 1mo

Auth RBAC

Dashboard, Retrieval Engines, Configurable Services

Open

No immediate action

v1.4.5 New feature 1mo

Guided Learning rebuild + partner exports

Open

No immediate action

v1.4.4 Breaking risk 1mo

Skill hub integration + KB previews + Markdown images

Open

Config change

v1.4.3 Breaking risk 1mo

Breaking upgrade

Partners, single Chat loop, multi‑user isolation

Open

Upgrade now

v1.4.2 Mixed 1mo

Auth

Upgrade Notes, Tests, ver1-4-1.md

Open

Upgrade now

v1.4.1 Breaking risk 2mo

RCE / SSRF Auth RBAC

Shell exec disabled + resource isolation

Open

Review required

v1.4.0 Breaking risk 2mo

Auth Breaking upgrade

Reasoning effort normalization + turn recovery

Open

Review required

v1.4.0-beta Breaking risk 2mo

Auth RBAC Dependencies +1 more

Auto Mode + Memory v2 + Chat tools

Open

v1.3.10 Breaking risk 2mo

⚠ Upgrade required

If using Matrix with E2EE, install the `matrix-e2e` extra or its requirements file and ensure libolm is available.
`DISABLE_SSL_VERIFY=true` is allowed only in non‑production environments; it remains blocked when ENVIRONMENT=prod or production.

Breaking changes

Matrix no longer installs E2EE by default; `deeptutor[matrix-e2e]` or `requirements/matrix-e2e.txt` must be used to enable encrypted rooms.

Notable features

Remote single‑user Docker works out of the box again when AUTH_ENABLED=false without extra CORS settings.
`DISABLE_SSL_VERIFY` now propagates to all OpenAI SDK paths for self‑signed LLM endpoints (blocked in prod).
Code blocks are protected from citation rewrite, preserving array indexes and other code content.

Full changelog

DeepTutor v1.3.10 Release Notes

Release Date: 2026.05.10

v1.3.10 is a focused reliability release for the issues reported after v1.3.9.
It restores smoother remote Docker access, makes self-signed LLM endpoints work
consistently across SDK-backed providers, protects code snippets from citation
rewrites, and splits Matrix E2EE into an explicit opt-in dependency.

Highlights

Remote Docker and CORS Recovery

Remote single-user Docker works out of the box again - when
AUTH_ENABLED=false, DeepTutor now accepts browser origins over HTTP/HTTPS so
LAN or remote-server frontends no longer hit the v1.3.8/v1.3.9 CORS
regression reported in #463.
Authenticated deployments stay explicit - when AUTH_ENABLED=true, CORS
still requires a concrete allowlist through CORS_ORIGIN or CORS_ORIGINS,
preserving the credentialed-auth safety boundary.
Multiple deployment origins are supported - CORS_ORIGINS accepts comma
or newline separated values, and both Docker Compose files pass the setting
through to the backend container.
Settings no longer drop network flags - CORS_ORIGIN, CORS_ORIGINS, and
DISABLE_SSL_VERIFY are part of the canonical .env write order.

Provider TLS and Rendering Fixes

DISABLE_SSL_VERIFY now reaches OpenAI SDK paths - OpenAI-compatible,
Azure OpenAI, executor, TutorBot, and legacy embedding SDK clients all receive
a shared httpx.AsyncClient(verify=False) when the flag is enabled, fixing
self-signed HTTPS LLM endpoints reported in #464.
Production still blocks unsafe TLS bypasses - ENVIRONMENT=prod or
ENVIRONMENT=production rejects DISABLE_SSL_VERIFY, with a single warning
logged in non-production use.
Code blocks keep array indexes intact - Markdown citation linkification now
masks fenced and inline code before rewriting references, so values[0] stays
code instead of becoming a #references citation link (#468).

Matrix Install Compatibility

Matrix no longer installs E2EE by default - the standard matrix extra and
requirements/matrix.txt now use plain matrix-nio, avoiding the
python-olm / libolm build failures seen on macOS Python 3.14 and Apple
Clang 21 (#462).
Encrypted rooms are an explicit add-on - install deeptutor[matrix-e2e]
or requirements/matrix-e2e.txt when E2EE support is needed and libolm is
available.
Runtime failures are clearer - Matrix defaults to non-E2EE mode, and
enabling E2EE without crypto dependencies now raises an actionable install
message instead of failing at import time.

Multi-User Runtime Compatibility

Default workspace paths stay stable outside user scope - when no current
multi-user context is active, path resolution falls back to the default data
workspace rather than forcing an admin scope.
Legacy test and monkeypatch hooks remain available - session and settings
routers keep compatibility shims used by tests and older integrations.
Local agent artifacts are ignored - .claude/ is now excluded from Git so
local worktrees and agent metadata do not accidentally enter releases.

Tests

Added CORS setting tests for unauthenticated remote origins and authenticated
explicit allowlists.
Added shared OpenAI SDK HTTP-client tests across provider-core, Azure,
executors, TutorBot, and embedding adapters.
Added Markdown display tests for prose citations, fenced code, inline code,
and explicit backticked citations.
Added Matrix dependency split tests to keep default installs free of
matrix-nio[e2e].
Re-ran targeted Python tests, web node tests, Ruff checks, and diff whitespace
validation for the release patch.

Upgrade Notes

If you run remote Docker with AUTH_ENABLED=false, no extra CORS setting is
required for normal HTTP/HTTPS browser origins.
If you run a shared or authenticated deployment with AUTH_ENABLED=true, set
CORS_ORIGIN or CORS_ORIGINS to the exact frontend origin(s), for example
https://learn.example.com.
Use DISABLE_SSL_VERIFY=true only for local, self-signed, or air-gapped test
LLM endpoints. It remains blocked in ENVIRONMENT=prod and
ENVIRONMENT=production.
Matrix installs are now non-E2EE by default. For encrypted Matrix rooms,
install .[matrix-e2e] or requirements/matrix-e2e.txt, ensure libolm is
present, and set e2ee_enabled=true in the Matrix channel config.
If you previously installed .[matrix] only to get non-encrypted Matrix
messaging, reinstalling after this release should no longer require native
libolm build tooling.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.9...v1.3.10

View release on GitHub

v1.3.9 Breaking risk 2mo

⚠ Upgrade required

Install or refresh the `.[tutorbot]` extra (or `requirements/tutorbot.txt`) to include `zulip>=0.8.0,<1.0.0`. Configure Zulip bots with site, email, apiKey, allowFrom, and groupPolicy; use mention for safer deployments.
If using `LLM_REASONING_EFFORT=minimal` with DeepSeek, DashScope, VolcEngine, BytePlus, or MiniMax, keep the setting; v1.3.9 translates it to provider‑specific disable payload.
Verify provider limits after raising context window ceiling; large configured windows may now be honored instead of being capped at 65,536 tokens.

Breaking changes

Maximum context window raised to 1,000,000 tokens; previously capped at 65,536 tokens for large‑model fallback.

Notable features

Zulip added as a TutorBot channel with mention/open policies and LaTeX/KaTeX conversion.
NVIDIA NIM provider support integrated into TutorBot configuration and registry detection.

Full changelog

DeepTutor v1.3.9 Release Notes

Release Date: 2026.05.09

v1.3.9 builds on the v1.3.8 multi-user foundation with broader TutorBot
deployment options, safer provider routing for thinking models, and a smoother
web onboarding path. It adds Zulip and NVIDIA NIM support, improves startup
ergonomics, and folds in the main issue fixes reported after the last release.

Highlights

TutorBot Channel and Provider Expansion

Zulip is now a TutorBot channel - bots can listen to private messages and
stream topics, enforce allow_from, choose mention-only or open stream
replies, and bridge Zulip's event queue into the async TutorBot bus.
Math and files work better in Zulip - LaTeX is converted to Zulip-friendly
KaTeX markup, upload/download calls use configurable retry behavior, and
attachment filenames include upload-path digests to avoid collisions.
Zulip topics keep conversations separated - stream topics now become part
of the chat/session key, with a stable (no topic) fallback for empty topics.
TutorBot supports NVIDIA NIM - nvidia_nim is available in TutorBot
provider config and registry detection, including NIM's streaming behavior
that omits unsupported stream_options.

Model and Runtime Reliability

Configured context windows are respected - the safety ceiling is raised to
1,000,000 tokens while the large-model fallback remains 65,536, so explicit
128K-style model settings are no longer silently clamped.
Qwen vision detection is fixed - Qwen VL models are treated as
vision-capable across DashScope, OpenAI-compatible, and custom bindings.
Minimal thinking mode is provider-safe - DeepSeek, DashScope, VolcEngine,
BytePlus, and MiniMax no longer receive a rejected top-level
reasoning_effort=minimal; DeepTutor sends the provider-specific disable
signal instead.
DeepSeek v4 costs are tracked - research token accounting includes
deepseek-v4-flash and deepseek-v4-pro pricing entries.

Web and CLI Polish

deeptutor start launches the full web stack - the CLI now delegates to
scripts/start_web.py so backend and frontend can be started from one
command, and launcher failures propagate through the CLI exit code.
Sidebar onboarding is clearer - primary navigation icons now expose
scoped, localized tooltips with descriptions and keyboard focus support.
Multi-line user messages stay readable - chat message rendering preserves
Shift+Enter line breaks, fixing code blocks and structured prompts that were
previously collapsed into one line.
Assigned resources are easier to understand - model-selection summaries
and read-only knowledge-base actions now present clearer labels for
non-admin, grant-scoped sessions.

Multi-User and Session Store Parity

Assigned model options match the selector contract - non-admin LLM choices
now return profile names, model names, labels, and active/default metadata in
the same shape expected by the web model selector.
PocketBase sessions support more chat flows - message metadata can be
persisted, last-message lookup is available, and message deletion works with
PocketBase string IDs as well as SQLite integer IDs.
Regenerate remains storage-neutral - turn retry logic can remove the last
assistant message without assuming the backing session store uses integer
primary keys.

Tests

Added Zulip channel coverage for config parsing, permission checks, duplicate
filtering, mentions, stream topic scoping, attachment extraction, retry
behavior, LaTeX conversion, typing status, sending, uploads, and startup
failures.
Added TutorBot NVIDIA NIM provider tests for registry detection, schema
acceptance, and streaming request compatibility.
Added LLM regression tests for Qwen vision capability, explicit context-window
budgets, and minimal-thinking provider kwargs.
Added CLI coverage so deeptutor start propagates the launcher exit code.
Added research token-pricing coverage for the DeepSeek v4 model entries.

Upgrade Notes

Install or refresh the .[tutorbot] extra, or requirements/tutorbot.txt, to
include the new zulip>=0.8.0,<1.0.0 dependency before enabling Zulip bots.
Configure Zulip bots with site, email, apiKey, allowFrom, and
groupPolicy; use mention for safer stream deployments and open only
when every stream message should reach the bot.
If you use LLM_REASONING_EFFORT=minimal with DeepSeek, DashScope,
VolcEngine, BytePlus, or MiniMax, keep the setting as-is; v1.3.9 translates it
to the correct provider-specific disable payload.
Large configured context windows may now be honored instead of capped at
65,536 tokens, so verify provider limits and expected prompt-cost behavior.
Optional PocketBase deployments should ensure the messages collection has a
metadata_json JSON field before relying on regenerate/session metadata
parity.

What's Changed

fix: raise context_window ceiling and add qwen vision support by @wedone in https://github.com/HKUDS/DeepTutor/pull/442
fix: add deepseek-v4-flash and deepseek-v4-pro to model pricing table by @Starfie1d1272 in https://github.com/HKUDS/DeepTutor/pull/447
fix(llm): stop sending reasoning_effort=minimal as top-level param to providers that reject it by @Starfie1d1272 in https://github.com/HKUDS/DeepTutor/pull/453
feat: add deeptutor start command to launch backend and frontend together by @Starfie1d1272 in https://github.com/HKUDS/DeepTutor/pull/445
fix(web): preserve newlines in user chat messages by @kagura-agent in https://github.com/HKUDS/DeepTutor/pull/449
feat(tutorbot): add Zulip channel support by @wedone in https://github.com/HKUDS/DeepTutor/pull/452
feat: tooltips for sidebar by @philliplagoc in https://github.com/HKUDS/DeepTutor/pull/457
fix: add TutorBot NVIDIA NIM provider support by @Bortlesboat in https://github.com/HKUDS/DeepTutor/pull/455

New Contributors

@philliplagoc made their first contribution in https://github.com/HKUDS/DeepTutor/pull/457
@Bortlesboat made their first contribution in https://github.com/HKUDS/DeepTutor/pull/455

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.8...v1.3.9

View release on GitHub

v1.3.8 New feature 2mo

⚠ Upgrade required

Existing single‑user installs remain unchanged unless `AUTH_ENABLED=true` is set.
For shared deployments set `AUTH_ENABLED=true`, leave `POCKETBASE_URL` blank, register the first admin via `/register`, and assign models before regular users start chats.
Upgrade backs up both `data/` and new `multi-user/` directories; multi‑worker setups must carefully bootstrap the first admin due to an in‑process promotion lock.

Notable features

Authenticated multi-user deployments with isolated per‑user workspaces
Admin‑managed access, model profile grants, and scoped knowledge bases
Integrated frontend auth routes (/login, /register) and comprehensive multi‑user documentation

Full changelog

DeepTutor v1.3.8 Release Notes

Release Date: 2026.05.08

v1.3.8 brings DeepTutor's optional multi-user mode into the main release line.
It keeps local single-user installs unchanged while adding authenticated shared
deployments with isolated user workspaces, admin-managed access, and clearer
deployment guidance.

Highlights

Multi-User Workspaces

Authentication can gate shared deployments - enabling AUTH_ENABLED
adds login, registration, JWT sessions, and a first-user admin flow.
Each user gets isolated data - ordinary users work under
multi-user/<uid>/ with separate chat history, memory, notebooks, and
knowledge bases, while admins keep the main workspace.
Admin grants control access - /admin/users lets admins create users and
assign allowed model profiles, knowledge bases, skills, and copied spaces
without exposing API keys.

Safer Runtime Boundaries

Knowledge and RAG stay scoped - assigned knowledge bases are visible with
badges, and non-admin RAG calls no longer fall back silently to admin data.
Model routing honors grants - non-admin chat turns use an assigned model
profile and fail early if no LLM is available.
Settings are redacted for users - non-admin settings show theme, language,
and model summaries, while provider secrets and endpoints remain admin-only.

Deployment and UI

Frontend auth routes are included - /login, /register, auth-aware
middleware, logout controls, and admin navigation are wired into the web app.
Multi-user docs are now first-class - README and translated READMEs
document setup, workspace layout, audit logs, env vars, and production
caveats.
Optional PocketBase remains documented - PocketBase can still be used as a
sidecar path, but true multi-user deployments should leave POCKETBASE_URL
unset and use the built-in JSON/SQLite backend.

Tests

Added multi-user tests for identity migration, first-admin registration,
grants, settings restrictions, scoped interface preferences, skill access, and
RAG fallback prevention.
Added status-redaction coverage so non-admin users do not receive provider
model or search endpoint details.

Upgrade Notes

Existing local installs stay in single-user mode unless AUTH_ENABLED=true.
For real multi-user deployments, set AUTH_ENABLED=true, keep
POCKETBASE_URL blank, create the first admin through /register, and assign
models before ordinary users start chat turns.
New deployment state is stored under multi-user/; back up both data/ and
multi-user/ before upgrading shared instances.
Multi-worker deployments should bootstrap the first admin carefully because
first-user promotion is protected by an in-process lock.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.7...v1.3.8

View release on GitHub

v1.3.7 New feature 2mo

⚠ Upgrade required

Set `LLM_REASONING_EFFORT` in `.env` for global thinking control; leave empty to auto-detect.
Knowledge-base metadata may now include `last_indexed_at`, `last_indexed_count`, and `last_indexed_action`.
Co-Writer clear/template actions are recoverable through undo until the user leaves the current draft.

Notable features

Thinking-model compatibility: reasoning output kept separate, DeepSeek effort configurable via `LLM_REASONING_EFFORT`, custom gateway headers preserved, and structured generation more tolerant.
Knowledge index visibility: activity recorded with timestamps, counts, actions; UI shows history panels.
Co-Writer editing safety: confirmation dialogs before clearing/non‑empty replace, enhanced undo (Ctrl/Cmd+Z/Y), clearer toolbar controls.

Full changelog

DeepTutor v1.3.7 Release Notes

Release Date: 2026.05.04

v1.3.7 focuses on thinking-model compatibility, clearer knowledge-base index
history, and safer Co-Writer editing. It keeps provider-specific reasoning
output under control while making index activity easier to understand in the UI.

Highlights

Thinking-Model and Gateway Compatibility

Reasoning output stays separate - OpenAI-compatible and TutorBot providers
keep reasoning_content out of visible answer text, and streaming avoids
replaying internal scratchpad as final content.
DeepSeek thinking can be configured from .env - LLM_REASONING_EFFORT
is documented and applied through the resolver path. Use minimal to disable
DeepSeek thinking, or high / max to enable it.
Custom gateway headers are preserved - chat and explicit LLM calls inherit
profile extra_headers, fixing gateways that require custom headers such as
a User-Agent override.
Structured generation is more tolerant - book blocks and question ideation
now handle fenced, repaired, list-shaped, or otherwise imperfect JSON outputs
more reliably.

Knowledge Index Visibility

Index activity is recorded - create, upload, and re-index flows now store
last_indexed_at, indexed document count, and the index action in knowledge
metadata.
Progress payloads describe real index changes - backend status updates can
distinguish metadata-only completion from an actual vector-index update.
The Knowledge UI shows index history - detail, settings, and index-version
panels display the latest index time and document count when available.

Co-Writer Editing Safety

Clear and template actions ask first - replacing a non-empty draft now
opens a confirmation dialog before the editor is cleared or overwritten.
Undo is more dependable - pending typing snapshots are committed before
toolbar edits, and editor shortcuts support Ctrl/Cmd+Z, Shift+Cmd+Z, and
Ctrl/Cmd+Y.
Toolbar controls are clearer - destructive and template actions now have
distinct tones, focus states, labels, and accessible tooltips.

Tests

Added OpenAI-compatible provider tests to keep reasoning_content separate
from visible response content in both service and TutorBot paths.
Expanded LLM factory tests for inherited extra_headers, inherited
reasoning_effort, and reasoning-only streaming behavior.
Added knowledge manager coverage for recording last_indexed_* metadata only
when the index actually changes.

Upgrade Notes

Set LLM_REASONING_EFFORT in .env if you need global thinking control.
Leave it empty to let DeepTutor auto-detect behavior from the active model.
Knowledge-base metadata may now include last_indexed_at,
last_indexed_count, and last_indexed_action.
Co-Writer clear/template actions are recoverable through undo until the user
leaves the current draft.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.6...v1.3.7

View release on GitHub

v1.3.6 New feature 2mo

⚠ Upgrade required

Send `llm_selection` as `{"profile_id":"...","model_id":"..."}` for explicit model routing; omission falls back to system default.
TutorBot configs may include `llm_selection`; legacy `model` overrides remain supported.
Configure launch ports via `.env` or environment variables (`BACKEND_PORT`, `FRONTEND_PORT`); old `data/user/settings/env.json` port block is ignored.

Notable features

Unified chat turns now carry `profile_id`/`model_id` via WebSocket payload and session preferences for explicit model targeting.
Settings endpoint returns safe, credential‑omitted provider/model options; shared UI selector used by Chat and TutorBot.
TutorBot can persist and reload its LLM selection without full restart, improving stability of bot history assembly.

Full changelog

DeepTutor v1.3.6 Release Notes

Release Date: 2026.05.03

v1.3.6 focuses on making model routing explicit across DeepTutor. Users can
choose configured LLM profiles from chat and TutorBot flows, runtime services
resolve those choices without leaking provider secrets, and RAG/knowledge-base
index handling is more defensive when persisted embeddings are invalid.

Highlights

Catalog-Based Model Selection

Chat can target a configured model - unified chat turns now carry a
profile_id and model_id selection through the WebSocket payload, session
preferences, turn snapshots, and regenerate flows.
Settings exposes safe LLM options - the new settings options endpoint
returns display-ready provider/model choices while omitting credentials and
connection secrets from the response.
Runtime model overrides are scoped per turn - selected profiles are
resolved through the provider catalog for the active request without writing
temporary choices back to disk or changing global defaults.
Model-selector UI is shared - chat and TutorBot screens use the same
configured-model selector, with localized labels and system-default handling.

TutorBot Model Control

Bots can persist model selections - TutorBot create/update flows now accept
llm_selection, validate it against the configured catalog, and store it with
each bot.
Running bots can reload their LLM - changing a bot's model updates the
active agent loop instead of requiring a full bot restart.
Recent bot history is steadier - TutorBot history assembly now sorts by
message timestamp with stable tie-breaking before taking the latest context.
Bot chat route changes are cleaner - the web chat page cancels in-flight
bot requests and resets transient reasoning state when switching bots.

RAG and Knowledge Reliability

Invalid vectors trigger rebuilds - re-indexing no longer treats a matching
document signature as reusable when the existing vector store fails embedding
validation.
Full rebuilds use fresh version directories - complete knowledge-base
rebuilds write to a new flat index version while leaving failed old storage
available for inspection.
RAG tool logs can stream to clients - retrieval runs can forward captured
INFO-level process logs as raw tool events when an event sink is available.
Knowledge health checks recognize bad embeddings - invalid persisted
vectors are surfaced earlier instead of producing opaque search failures.

Provider and Launch Fixes

OpenAI Responses token limits are normalized - Responses API calls now map
chat-style max_completion_tokens and max_tokens to max_output_tokens,
fixing the SDK error reported for newer OpenAI models in #437.
Azure and OpenAI-compatible paths share the mapping - both streaming and
non-streaming Responses API routes use the same conversion helper.
Launch ports come from .env and environment variables - setup and launch
helpers now keep backend/frontend port behavior aligned around the project
.env file instead of the older runtime settings JSON.

Web UX Polish

Skill names validate before save - the Skills editor slugifies names,
flags invalid input inline, and prevents silent API failures for uppercase
letters, spaces, underscores, or other unsupported characters.
Skill editor modals are opaque across themes - the editor now uses the
page background token, avoiding text bleed-through in translucent themes.
Space navigation is easier to scan - Space mini-navigation, notebook,
question-bank, skills, and session-list spacing were tightened with clearer
card and divider treatment.

Tests

Added model-selection service tests for safe option listing, active markers,
invalid profile/model rejection, and non-mutating catalog overrides.
Added unified WebSocket turn-runtime tests for persisted LLM selections,
invalid selections, model switching, snapshots, and regenerate behavior.
Added TutorBot API and manager tests for llm_selection persistence,
validation, runtime reload, and default-model behavior.
Added settings, provider-runtime, and LLM-config tests for scoped catalog
selection and per-turn config precedence.
Added RAG and knowledge-router tests for invalid vector stores, re-index
rebuild decisions, and storage version resolution.
Added OpenAI Responses converter tests for token-limit aliases, precedence,
None filtering, and input immutability.
Added frontend slug tests for skill-name normalization and validation.

Upgrade Notes

Chat and TutorBot clients that want explicit model routing should send
llm_selection as { "profile_id": "...", "model_id": "..." }. Omitting it
continues to use the configured system default.
TutorBot configuration files may now contain llm_selection. Existing bot
configs without that field continue to load, and legacy model values remain
usable as model-name overrides.
Launch ports should be configured in .env or process environment variables
(BACKEND_PORT / FRONTEND_PORT). The old data/user/settings/env.json
port block is no longer used as a launch-port source.
Knowledge bases with stale or invalid persisted vectors may rebuild on the
next re-index even when document signatures have not changed.
Skill names are now normalized and validated as lowercase slugs of up to 64
characters using letters, numbers, and hyphens.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.5...v1.3.6

View release on GitHub

v1.3.5 Breaking risk 2mo

⚠ Upgrade required

Update Node.js installation to version 20.9 or newer for local web installs.
`start_web.py` and setup helpers now read `data/user/settings/env.json`/`interface.json` first; adjust settings via the Settings page or rerun `start_tour.py` when changing ports.
Local OpenAI‑compatible embedding servers should use an empty API key; avoid relying on the removed placeholder `sk-no-key-required` which is no longer transmitted as an auth header.

Breaking changes

Minimum Node.js version increased to 20.9
`start_web.py` now prioritizes `data/user/settings/env.json` and `interface.json` over `.env` for runtime settings

Notable features

Setup Tour writes backend/frontend ports into `data/user/settings/env.json` for consistent later launches
RAG tool calls strictly reject empty queries, providing a safer fallback to the user's original question

Full changelog

DeepTutor v1.3.5 Release Notes

Release Date: 2026.05.02

v1.3.5 focuses on making local setup and knowledge-base chat more reliable. The
launcher now follows the same runtime settings users configure in the web app,
RAG tool calls are stricter about real search queries, and local embedding
servers no longer receive placeholder auth headers.

Highlights

Smoother Local Launch

Setup Tour writes launch ports - the guided installer now records backend
and frontend ports in data/user/settings/env.json, so later launches can use
the same choices.
start_web.py reads runtime settings first - backend/frontend ports and UI
language come from web settings when available, with .env kept as fallback.
Cleaner process handling - the launcher records started processes, detects
port conflicts, waits for readiness, and exposes scripts/stop_web.py for
cleaning up recorded backend/frontend processes.
Setup requirements are clearer - README and environment examples now align
around Node.js 20.9+, install profiles, complete embedding endpoint URLs, and
optional attachment storage.

More Reliable RAG Tool Calls

RAG queries must be non-empty - tool schemas, prompts, and built-in checks
now reject blank queries early instead of passing empty input into retrieval.
Chat-side fallback is safer - when a model omits the RAG query, the agentic
pipeline can reuse the user's actual question as the retrieval query.
ReAct calls accept simple string input - rag actions that provide a
string are normalized to {"query": ...}, reducing fragile tool-call failures.

Local Embedding Compatibility

No fake API key for local embedding providers - runtime config no longer
injects sk-no-key-required for local embedding servers.
Placeholder keys are not sent as auth headers - OpenAI-compatible
embedding requests suppress Authorization and api-key when the configured
key is the local placeholder, which helps LM Studio, Ollama, vLLM, and similar
servers.
Embedding examples are easier to follow - English and Chinese sample env
files now explain that EMBEDDING_HOST is the exact endpoint DeepTutor calls.

Web UX Polish

Dark-mode provider dropdown is readable - the Settings provider selector
now uses the theme background token, fixing the white native dropdown popover
reported on Edge/Chromium.
Settings controls are more consistent - select fields and setup tour
spotlight behavior were tightened for a steadier settings experience.
Book reference payloads are normalized more defensively - selected book
references keep the same behavior with cleaner filtering and deduplication.

Tests

Added launch settings tests for runtime settings precedence, .env fallback,
and invalid-port handling.
Added start_web.py tests for translation, state persistence, and recorded
process matching.
Added Setup Tour coverage for dependency profiles, Math Animator selection,
Node.js version validation, and saved launch ports.
Added RAG/tool tests for non-empty query schemas, blank-query rejection, and
fallback query behavior.
Added embedding runtime and adapter tests for local providers, placeholder API
keys, and auth header suppression.

Upgrade Notes

Local web installs now require Node.js 20.9 or newer.
start_web.py and setup helpers prefer data/user/settings/env.json and
interface.json over .env; edit the web Settings page or rerun
start_tour.py when changing launch ports.
Local OpenAI-compatible embedding servers should use an empty API key unless a
real key is required. Avoid relying on sk-no-key-required as a transmitted
credential.
Custom RAG callers should always provide a non-empty query; blank queries now
fail fast by design.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.4...v1.3.5

View release on GitHub

v1.3.4 New feature 2mo

⚠ Upgrade required

Refresh dependencies after upgrading; CLI extra now requires defusedxml>=0.7.1 for Office document extraction.
Custom WebSocket clients should pass `book_references` and `language` on turn start messages.
Set `LLM_REASONING_EFFORT` to tune global reasoning effort if using reasoning models.

Notable features

Book page chat sessions persist per page via the new page-chat-session API
Books can be rebuilt from an approved spine, clearing content while keeping the outline
Regular chat turns can attach and cite selected book pages as context

Full changelog

DeepTutor v1.3.4 Release Notes

Release Date: 2026.05.01

v1.3.4 turns the Book Engine and chat workspace into a tighter learning loop:
book pages can now carry their own persistent chat sessions, books can be
rebuilt from an existing spine, and regular chat turns can cite selected book
pages alongside Space context. This release also improves language consistency,
DeepSeek-style reasoning output handling, document extraction for RAG, logging
infrastructure, and the public documentation around DeepTutor's arXiv paper.

Highlights

Book Engine, Page Chat, and Book References

Book generation and reading now preserve more of the user's context and make it
easier to iterate on a generated book without starting over.

Book page chat uses the unified stream protocol - the page chat panel now
uses the shared WebSocket client and stream-event renderer used by the main
chat workspace, so tool output, assistant events, attachments, and restored
session history behave consistently.
Page chat sessions are persisted per book page - each page can be bound to
a chat session_id through the new page-chat-session API, and reopening the
reader restores the page's conversation when available.
Books can be rebuilt from the approved spine - the new rebuild flow clears
generated page content and progress while keeping the confirmed outline, then
restarts compilation from that structure.
Single-page regeneration keeps learner notes - forced recompilation can
reset generated content while preserving user-authored note blocks and key
transition metadata.
Regular chat can cite book pages - the chat composer can attach selected
books and pages as request context, persist them in the turn snapshot, restore
them when sessions hydrate, and show them as removable context chips.
Book context is cleaner for reasoning models - selected book pages are
converted into bounded text references with thinking tags stripped before they
are injected into chat or page-side conversations.

Chat Language and Reasoning-Model Behavior

Chat turns now follow the user's current language setting more reliably and are
more tolerant of providers that return reasoning content differently.

Language is part of each chat turn - WebSocket requests can carry the
current language, and both agentic chat and classic chat append explicit
language instructions so answers match the active UI language.
Regenerate and Answer Now use the current app language - new turns and
regenerated turns read the latest stored language instead of relying only on
older session preferences.
DeepSeek-style empty-content responses recover better - OpenAI-compatible
providers can fall back to reasoning_content when a model returns an empty
visible content field.
Book block writing can tune reasoning effort - LLM-backed book block
generation now passes reasoning_effort, and structured JSON retries can
lower effort when reasoning-heavy models fail to return parseable JSON.

RAG, Documents, and Knowledge Base Recovery

Document ingestion now uses the same extraction path across more file types and
keeps re-indexing available in more recovery states.

Office files route through parser extraction - .xlsx and .pptx files
now join PDF and DOCX in the parser-backed routing path, with spreadsheet and
presentation categories available to downstream RAG logic.
LlamaIndex loading uses shared document extraction - parser-routed files
are read through extract_text_from_path(), use file-type-specific size
limits, and avoid unnecessary character truncation during indexing.
DOCX extraction has a safer fallback path - when python-docx cannot read
a file, the extractor can parse OOXML content through defusedxml instead of
failing immediately.
Knowledge Base re-index controls are less fragile - the web UI can expose
re-index actions for error and mismatch states without requiring an already
initialized RAG runtime, as long as source documents are available.
Scanned or empty documents fail more clearly - extraction and validation
now distinguish byte limits, character limits, empty parsed content, and
unsupported parser results more consistently.

Settings, Runtime State, and Logging

This release continues the infrastructure cleanup needed for long-running local
and server deployments.

Settings shows clearer runtime state - backend, LLM, embedding, and search
status are displayed as service cards with online state, timestamps, runtime
model details, and pending-apply indicators.
LLM_REASONING_EFFORT is configurable - reasoning effort can be supplied
through environment configuration and is included in runtime summaries.
Logging uses the standard logger surface - routers, agents, providers, RAG
code, and TutorBot integrations move away from the old custom logger module
toward standard logging.getLogger(__name__) usage plus a focused Loguru
bridge.
Raw RAG debug log forwarding is quieter - the RAG service no longer
forwards low-level logging handler output into user-facing event streams by
default.
CI and lint coverage were refreshed - workflow and test changes cover the
logging configuration path, process-log streaming, and lint consistency.

Documentation and Localization

The project documentation now reflects the paper release and keeps localized
README files aligned with the latest release cadence.

The arXiv paper is linked from the README - the main README badge and News
section now point to 2604.26962.
Localized READMEs were refreshed - translated README files include the
latest release list, arXiv/news updates, and the expanded language navigation.
Book, chat, settings, and rebuild copy is localized - English and Chinese
app strings now cover the new Book chat, rebuild, language, attachment, and
runtime-state surfaces.

Tests

Added chat-language prompt coverage for per-turn language directives and
language-aware agentic chat behavior.
Added Book Engine coverage for book context extraction, page-chat session
binding, rebuild controls, forced page recompilation, and LLM JSON writing.
Added RAG and document-loader coverage for parser-routed files, Office
extraction paths, file-size limits, and re-index eligibility helpers.
Added provider/runtime coverage for LLM_REASONING_EFFORT, OpenAI-compatible
reasoning fallback behavior, and provider runtime summaries.
Added logging tests for configuration, context propagation, Loguru bridging,
process-log extraction, and task log streaming.
Updated frontend tests for document attachment handling, version reporting,
and Knowledge Base re-index helper behavior.

Upgrade Notes

CLI and server installs should refresh dependencies after upgrading. The CLI
extra and requirements/cli.txt now include defusedxml>=0.7.1 for safer
XML parsing during Office document extraction.
Custom WebSocket clients can pass book_references and language on turn
start messages. Clients that persist request snapshots should store book
references alongside notebooks, history, skills, memory, and attachments.
Deployments that use reasoning models can set LLM_REASONING_EFFORT to tune
reasoning effort globally; per-profile and per-model values remain available
as lower-priority fallbacks.
Integrations that consumed raw RAG debug log events should rely on structured
status and tool events instead of low-level forwarded logger output.
Book clients should call the new page-chat-session and rebuild APIs when they
need page-level conversation persistence or spine-preserving regeneration.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.3...v1.3.4

View release on GitHub

v1.3.3 New feature 2mo

⚠ Upgrade required

SQLite session databases are migrated with a new `messages.metadata_json` column on first open
Custom WebSocket clients must now explicitly send `memory_references: ["summary"]`, `[\

Notable features

NVIDIA NIM becomes a first‑class LLM provider with auto‑detection and usage‑metering disabled
Gemini embeddings are fully integrated end‑to‑end (model gemini-embedding-001, 3072 dimensions)
Space unifies chat history, notebooks, question‑bank items, skills, and memory into a single context picker

Full changelog

DeepTutor v1.3.3 Release Notes

Release Date: 2026.04.30

v1.3.3 is a fast follow-up release after v1.3.2. It expands provider coverage
with NVIDIA NIM and Gemini embeddings, makes Space the unified place to attach
chat history, notebooks, question-bank items, skills, and memory to a turn, and
continues the stability work around RAG re-indexing, thinking-model cleanup,
TutorBot history, and persisted session context.

Highlights

Provider and Embedding Coverage

DeepTutor now covers more hosted provider setups out of the box and keeps the
runtime configuration path aligned with the Setup Tour and .env examples.

NVIDIA NIM is a first-class LLM provider - provider auto-detection now
recognizes nvapi- keys and NVIDIA API bases, defaults to
https://integrate.api.nvidia.com/v1, and avoids sending
stream_options.include_usage because NIM can hang when that option is
present.
Gemini embeddings are available end to end - embedding runtime metadata,
endpoint validation, Setup Tour choices, model suggestions, and .env
examples now include Gemini, with gemini-embedding-001, 3072 dimensions, and
GEMINI_API_KEY fallback support.
Provider-specific embedding keys survive Settings writes - .env writes
preserve keys such as SiliconFlow, DashScope, Cohere, Jina, and Gemini instead
of only preserving the older core provider set.
Dependency resolution is less fragile - the NumPy upper bound was relaxed
to support current Manim installs in deeptutor[all], and Windows setup docs
now call out the Visual Studio Build Tools / C++ workload prerequisite.

Space, Chat Context, Skills, and Memory

The chat composer now treats all learning context as Space context, instead of
splitting references, skills, and memory across separate controls.

Space opens on Chat History - the Space entry point now lands on the new
Chat History page, where previous conversations can be searched, refreshed,
renamed, deleted, and reopened directly from the Space workspace.
One Space menu powers toolbar and @ mentions - the old inline
AtMentionPopup was replaced by a shared Space menu for chat history,
notebooks, question-bank items, skills, and memory, whether opened from the
toolbar or by typing @.
Skills selection is clearer - skills now open in a full picker with
search, tags, explicit multi-select, and Auto mode, instead of a small inline
dropdown beside the composer.
Memory can be attached per turn - users can select the running summary,
profile, or both through the new Memory picker. The request sends
memory_references, and the backend only injects the selected memory files.
Context chips show the full turn setup - selected history, notebooks,
question-bank items, skills, and memory all appear as removable chips before
send; sent user messages also show matching request-snapshot badges.
Answer Now and session hydration keep context - replayed turns and loaded
sessions now hydrate notebooks, history references, question-bank references,
skills, memory references, and attachments from persisted message metadata.

Session Persistence and Message Normalization

Conversation state now records more of the user's actual send-time context and
handles non-text message content more defensively.

Message metadata is persisted - the SQLite session store adds a
metadata_json column and stores a request_snapshot for user messages,
including capability, tools, selected KBs, language, config, attachments,
Space references, skills, and memory selections.
WebSocket turns accept memory and skills explicitly - incoming payloads
normalize memory_references to summary / profile, normalize skills into
a string list, and materialize both into message metadata.
TutorBot history handles multimodal content - bot history and recent bot
previews normalize string, array, object, and image-style content into safe
display text, while internal reasoning_content is stripped from API
responses.
Frontend message previews are safer - shared message-content utilities
now accept unknown content, stringify custom objects, render image parts as
[image], and truncate previews consistently across chat and session lists.

Memory, Notebook, and Thinking-Model Cleanup

The thinking-output cleanup introduced in v1.3.2 now reaches more durable
storage surfaces and rejects malformed memory rewrites before they can corrupt
profile or summary files.

Memory rewrites must match the expected shape - profile and summary
refreshes now verify allowed section headings before writing. If a thinking
model answers the user instead of returning structured memory, the write is
rejected rather than persisted.
Memory context is explicit - build_memory_context() now only includes
summary and/or profile when those files are requested, matching the new
per-turn Memory picker and avoiding accidental default memory injection.
Notebook summaries are cleaned and repaired - notebook writes, streaming
summary saves, and notebook loads strip thinking tags from summaries; older
notebook records are repaired on read when possible.
Streaming summary chunks are cleaner - generated notebook summaries are
assembled, cleaned, and emitted after cleanup, so empty or scratchpad-only
chunks are not streamed to clients.

RAG and Knowledge Base Resilience

RAG validation now catches more invalid persisted indexes before retrieval and
returns clearer events when the user needs to re-index.

Stale processing KBs recover when an index is ready - if kb_config.json
is stuck at processing or initializing but a ready LlamaIndex version is
already on disk, Knowledge Base info reports ready and hides the stale
progress bar instead of leaving the UI in a perpetual processing state.
More vector stores are validated - LlamaIndex storage now checks the
default vector store, storage_context.vector_stores, and persisted
*vector_store.json embedding dictionaries for null, dropped, non-numeric,
non-finite, or inconsistent vectors.
Invalid-index failures emit user-facing status events - RAG search now
sends a structured error status with needs_reindex through the tool event
stream and avoids treating known invalid-index failures as successful
retrieval attempts.
Low-level vector errors are less exposed - known invalid embedding/index
failures are logged and surfaced as re-index guidance instead of raw
NoneType * float style tracebacks in user-facing logs.

Tests

Added Knowledge Manager coverage for promoting stale processing /
initializing status to ready when a valid index version already exists.
Added provider coverage for NVIDIA NIM registry metadata, stream-option
behavior, Gemini embedding runtime defaults, endpoint validation, Setup Tour
provider choices, and .env key preservation.
Added session and WebSocket coverage for metadata_json, request snapshots,
skills normalization, memory reference parsing, and turn materialization.
Added memory and notebook coverage for thinking-tag stripping, invalid memory
rewrite rejection, selective memory-context injection, and summary repair on
read.
Added RAG/LlamaIndex coverage for multi-vector-store validation,
disk-persisted invalid vectors, needs_reindex status events, and sanitized
raw logs.
Added TutorBot and frontend message-content coverage for non-string,
multimodal, object, image, and truncated message content.

Upgrade Notes

Existing SQLite session databases are migrated in place with a new
messages.metadata_json column the first time the session store opens them.
Custom WebSocket clients that relied on implicit memory injection should now
pass memory_references: ["summary"], ["profile"], or both. Empty or absent
memory references intentionally mean "do not attach long-term memory".
Knowledge bases that still report invalid persisted vectors should be
re-indexed after confirming the active embedding provider, model, dimension,
and endpoint URL.
Notebook summary streaming clients should expect cleaned summary output after
assembly rather than relying on every raw model chunk being forwarded.
NVIDIA NIM users should configure an OpenAI-compatible model under the new
provider and keep stream_options.include_usage disabled for this gateway.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.2...v1.3.3

View release on GitHub

v1.3.2 Breaking risk 2mo

Breaking changes

Embedding configuration changed from base URLs to full endpoints; auto-migrated for known providers, custom gateways require manual configuration review

Security fixes

Prevents reasoning model scratchpad output from leaking into long-term memory

Notable features

Explicit embedding endpoint URLs with auto-migration for known providers
Invalid index detection with re-index guidance for RAG
Thinking tag removal from long-term memory

Full changelog

DeepTutor v1.3.2 Release Notes

Release Date: 2026.04.29

v1.3.2 is a focused stability release after v1.3.1. It tightens the embedding
endpoint contract, makes LlamaIndex RAG recover more cleanly from stale or
invalid indexes, and prevents reasoning-model scratchpad output from leaking
into long-term memory.

Highlights

Transparent Embedding Endpoint URLs

Embedding configuration is now explicit about the exact URL DeepTutor will call.
This removes the hidden "base URL vs endpoint URL" ambiguity that could make a
successful Settings test behave differently from a Knowledge Base re-index.

Settings now shows Endpoint URL for embeddings - the Web settings page
labels embedding URLs as endpoint URLs and explains that DeepTutor posts to
the visible URL exactly, without appending /embeddings or /api/embed at
request time.
Provider defaults are full endpoints - OpenAI, OpenRouter, Jina, vLLM/LM
Studio, and SiliconFlow default to /embeddings; Ollama defaults to
/api/embed; Cohere defaults to /embed; DashScope keeps its native
multimodal embedding endpoint.
Legacy base URLs are migrated safely - saved embedding profiles using
old-style bases such as https://api.openai.com/v1,
https://openrouter.ai/api/v1, or http://localhost:11434 are normalized to
the full endpoint form and persisted back to the model catalog. Custom
OpenAI-compatible URLs are left untouched.
Misconfigured endpoints fail early - the embedding client now rejects
known-provider URLs that point to a root/base path instead of the real
embedding endpoint, with an actionable message before indexing starts.
OpenRouter embedding uses exact-URL HTTP - public embedding providers no
longer route through the OpenAI SDK's hidden path-appending behavior.
custom_openai_sdk remains available for legacy configs, but is hidden from
the Settings provider dropdown.
Connection-test diagnostics match runtime behavior - embedding tests now
report "POSTed exactly as shown in Settings", matching the adapter behavior
used by RAG indexing and retrieval.

RAG Re-index and Retrieval Resilience

The LlamaIndex pipeline now refreshes embedding state more aggressively and
turns invalid persisted vectors into clear re-index guidance instead of raw
Python or NumPy errors.

Cached pipelines pick up Settings changes - initialize, search, and
incremental add paths reconfigure LlamaIndex before use, so a long-lived
pipeline does not keep embedding model, dimension, or endpoint settings from
an older Settings session.
Embedding clients refresh when config changes - the shared embedding
client is recreated when the resolved runtime config changes, and the
LlamaIndex CustomEmbedding adapter fingerprints the active config before
reusing a cached client.
Persisted index vectors are validated before retrieval - LlamaIndex
storage now checks the saved vector store for null, non-numeric, non-finite,
dropped, or inconsistent vectors before running similarity search.
Invalid indexes return a re-index hint - known failures such as
unsupported operand type(s) for *: 'NoneType' and 'float', vector shape
mismatches, and newly detected invalid persisted vectors now return
needs_reindex: true with a user-facing explanation.
Embedding connectivity checks use the same validation path - the
pre-index smoke test validates provider output with the same batch validator
used during indexing and retrieval.
RAG error logs are quieter when the fix is known - classified invalid
embedding/index failures are logged as actionable warnings instead of noisy
full tracebacks.

Memory Cleanup for Thinking Models

Memory refresh now strips private reasoning blocks before they can become
durable user memory.

Thinking tags are removed before writes - profile and summary rewrites run
through the shared clean_thinking_tags() helper after code-fence cleanup, so
<think> / <thinking> blocks from reasoning models are not saved into
PROFILE.md or SUMMARY.md.
Existing memory files self-repair on read - if an older memory file
already contains closed or unclosed thinking tags, reading the snapshot cleans
the content and writes the repaired version back to disk when possible.
Manual memory edits use the same cleanup - direct memory writes also pass
through the cleaner, keeping UI edits, refreshes, and runtime reads aligned.

Settings and Runtime Polish

Embedding provider choices are less confusing - Settings no longer offers
the legacy custom_openai_sdk provider in the public dropdown, while existing
saved profiles continue to resolve for backwards compatibility.
Model catalog normalization is persisted - catalog loads now save when
normalization changes active profile/model IDs or embedding endpoint URLs,
preventing the same migration from repeating on every startup.
OpenAI-compatible embedding errors are clearer - non-JSON or HTML
embedding responses now point to wrong endpoint/model pairings without
incorrectly suggesting only one gateway-specific cause.
Deep Solve ReAct calls are aligned again - the solver loop no longer
passes a stale attachments keyword into SolverAgent.process(), avoiding a
runtime TypeError while keeping attachment forwarding on the planner and
replan calls where it is supported.

Tests

Added endpoint migration coverage for OpenAI, OpenRouter, Ollama, and custom
embedding profiles.
Added Settings API coverage for full endpoint provider choices and hidden
custom_openai_sdk.
Added embedding client coverage for endpoint validation, OpenRouter's raw HTTP
adapter path, client refresh on config changes, and exact URL transparency.
Added LlamaIndex coverage for stale embedding-client refresh, repeated
settings reconfiguration, invalid persisted vector detection, and re-index
hints for invalid indexes.
Added memory coverage for closed and unclosed thinking tags, plus repair of
existing memory files during reads.
Ran targeted Deep Solve/RAG capability tests covering solver runtime wiring
after the stale attachments argument fix.

Upgrade Notes

Embedding URLs in Settings should now be full endpoint URLs. Existing known
provider profiles are migrated automatically, but custom gateways should be
reviewed manually if they use non-standard paths.
If Knowledge Base search still reports invalid embedding vectors, re-index the
affected KB after confirming the active embedding provider, model, dimension,
and endpoint URL.
Memory files containing old <think> blocks will be cleaned the next time the
Memory page or memory service reads them; this read can update the underlying
PROFILE.md or SUMMARY.md file to persist the cleaned version.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.1...v1.3.2

View release on GitHub

v1.3.1 Breaking risk 3mo

Notable features

Safer RAG routing: LLM can't override selected knowledge base, no silent fallbacks
Embedding validation: responses checked for null values, invalid dimensions, and shape mismatches before use
Docker improvements: memory volume mounting, memory migration safety, TutorBot auto-start preservation, IME-safe messaging

Full changelog

DeepTutor v1.3.1 Release Notes

Release Date: 2026.04.28

v1.3.1 is a stability release after v1.3.0. It focuses on safer RAG routing,
stronger embedding validation, more reliable Docker/TutorBot restarts, and a
set of small but important Web UX fixes.

Highlights

Safer RAG and Knowledge Base Routing

Selected KB is now trusted system state - chat RAG calls no longer let the
LLM invent or override kb_name; the model only sees a query, and DeepTutor
routes it to the KB selected in the UI/session.
No silent fallback when RAG has no KB - if RAG is enabled but no KB is
selected, the turn skips KB retrieval with a clear progress message instead
of accidentally querying stale/default state.
Default KB aliases are handled consistently - default, current,
selected, and Chinese equivalents resolve to the configured default KB for
tool calls, file listing, and re-index APIs, while a real KB named default
still wins over the alias.
Incremental document adds use the service layer - adding files now goes
through RAGService.add_documents, keeping add/search/re-index behavior on
the same provider and index-version path.
RAG internals are easier to maintain - the LlamaIndex pipeline was split
into focused modules for loading, embedding, storage, errors, and orchestration;
smart multi-query retrieval now lives in SmartRetriever.

Embedding and Index Reliability

Embedding responses are validated before use - DeepTutor now rejects
dropped vectors, null values, non-numeric values, non-finite values, and
inconsistent dimensions before they reach LlamaIndex.
Connection tests probe batch behavior - embedding smoke tests now send a
tiny batch, catching providers that only return one vector or change
dimensions between inputs.
Bad indexes fail with actionable messages - null-vector or shape mismatch
retrieval failures now return a re-index hint instead of exposing low-level
NoneType * float style errors.
Index status is less noisy during writes - empty in-progress version
directories no longer mark brand-new KBs as needs_reindex, and failed empty
version folders are cleaned up when possible.
Embedding examples were clarified - services/embedding/.env.example
now documents full endpoint URL semantics and concrete provider examples for
OpenAI, SiliconFlow, Ollama, Cohere, Jina, vLLM, Azure OpenAI, DashScope, and
OpenAI-SDK-style gateways.

Docker, Memory, and TutorBot Runtime

Docker images can expose the app version - APP_VERSION is passed through
to both backend and frontend runtime environment variables.
Shared memory has its own Docker volume - compose files now mount
./data/memory:/app/data/memory; README persistence docs were updated.
Memory migration is safer - legacy SUMMARY.md and PROFILE.md files are
copied into data/memory even when the target directory already exists, while
existing target files are preserved.
Memory refreshes are serialized - concurrent profile/summary rewrites now
run under a lock to avoid racing writes.
TutorBot restart intent is preserved - graceful server shutdown keeps each
bot's auto_start flag intact for the next Docker/host restart, while manual
stops still disable auto-start.
TutorBot shares long-term memory - started bot agents now receive the
shared memory directory instead of running without it.

Web and Authoring UX Fixes

IME-safe message sending - chat composers no longer submit when Enter is
being used to confirm Chinese/Japanese/Korean IME candidates.
Knowledge list stays fresh in chat - the chat page reloads KB metadata on
focus, page show, and visibility changes, so newly created or re-indexed KBs
appear without a full refresh.
Markdown preview protects pseudo-tags - unknown HTML-like tags such as
<think> are escaped for display while preserving source line counts for
editor/preview sync.
Co-Writer output strips reasoning tags - closed, aliased, attributed, and
unclosed <think> / <thinking> blocks are removed from final edit output.
Theme initialization runs before hydration - the theme script is rendered
from the server so dark/light preference is applied before React hydrates.
Knowledge UI feedback is clearer - index version chips have cleaner labels,
failed empty active indexes are explained, and 404s from newer UI vs older
Docker backend now suggest pulling/recreating the container.
Memory page feedback is clearer - refresh calls distinguish "checked, no
long-term updates" from real failures.

Startup and Windows Robustness

CLI/server streams tolerate legacy code pages - startup scripts and API
runners use replacement-safe text streams to avoid Unicode crashes on Windows
locales such as GBK/CP936.
Child processes inherit safer encoding - web/tour startup paths set
PYTHONIOENCODING=utf-8:replace.
Generated frontend env files are UTF-8 - start_web.py now writes
web/.env.local with explicit UTF-8 encoding.
GHCR compose pulls fresher images - docker-compose.ghcr.yml now uses
pull_policy: always for the published image.
Docs/locales were refreshed - README persistence notes were updated, the
Polish README link was added, and English/Chinese UI copy gained missing
labels for knowledge, memory, TutorBot soul templates, and Co-Writer drafts.

Tests

Added/updated coverage for chat RAG KB routing, default KB aliases, knowledge
file/re-index APIs, embedding batch validation, LlamaIndex invalid-vector
failures, in-progress index-version status, incremental add storage layout,
TutorBot auto-start/shared memory, memory migration, Windows CLI encoding,
IME keyboard handling, markdown tag escaping, and reasoning-tag cleanup.

Upgrade Notes

Docker users should pull and recreate the container so the Web UI and backend
knowledge APIs stay in sync.
If a KB was indexed with a broken or changed embedding provider, use
Re-index from the Knowledge page after fixing the embedding settings.
If you persist data manually, add the new shared memory mount:
./data/memory:/app/data/memory.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.3.0...v1.3.1

View release on GitHub

v1.3.0 Breaking risk 3mo

Notable features

Versioned knowledge base indexes with embedding-aware access patterns and background re-indexing API
Embedding adapters for OpenAI SDK, Aliyun DashScope, SiliconFlow, OpenRouter, and multimodal support
Knowledge management page split into dedicated tabs with file browser, re-index controls, and live task status

Full changelog

DeepTutor v1.3.0 Release Notes

Release Date: 2026.04.27

Highlights

Versioned Knowledge Base Indexes and Re-index Workflow

Knowledge bases now keep vector indexes per embedding configuration instead of treating one llamaindex_storage/ directory as the whole truth. Each new index version records an embedding signature and metadata, so switching models no longer has to overwrite the previous index, and switching back can reuse a ready version when the signature matches.

Versioned storage layout — new indexes are written as flat version-N/ directories with meta.json; legacy root and nested llamaindex_storage layouts remain readable for existing installs.
Embedding-aware reads and writes — RAG search, document additions, manager statistics, and delete/cleanup paths now resolve storage through the active embedding signature. If no ready version matches the current embedding model, retrieval returns a clear needs_reindex signal instead of failing later with an opaque storage error.
Background re-index API — POST /api/v1/knowledge/{kb_name}/reindex rebuilds a KB against the active embedding configuration, streams logs through the existing task channel, and returns noop: true when a matching ready index already exists.
Index status surfaced to the UI — KB summaries now include index-version metadata, active-match state, embedding mismatch flags, progress, and re-index readiness so the frontend can explain why a KB needs rebuilding.

Knowledge Management Page Rebuild

The Knowledge page has been split from a monolithic screen into a focused master-detail workspace for day-to-day KB operations.

Dedicated KB detail tabs — Files, Add documents, Index versions, and Settings now live as separate sections with a compact header showing provider, embedding model, default status, update time, and live task status.
Raw file browser and inline preview — the new Files tab lists documents from the KB raw/ directory and opens them in an inline preview pane, reusing the chat preview pipeline for PDFs, images, Markdown, code/text, and fallback downloads. The file list can collapse to reclaim preview space.
Re-index controls and logs — the Index versions section shows active, stale, legacy, and inactive versions, with a one-click Re-index action and live process logs for rebuild tasks.
Progress and history hooks — useKnowledgeBases, useKnowledgeProgress, and useKnowledgeHistory merge server state with live WebSocket/SSE progress, auto-refresh active work, and keep recent create/upload/re-index outcomes visible.

Embedding Runtime, Provider Coverage, and Dimension Discovery

The embedding stack was tightened around provider-specific behavior instead of assuming every service behaves like OpenAI's default endpoint.

No more hard-coded 3072 default — embedding dimension now starts as unknown/empty and is auto-filled from the provider response after a successful test connection. The test probe deliberately avoids sending dimensions so it measures the model's native vector length before any Matryoshka truncation.
Full endpoint URL semantics — httpx-based embedding adapters now treat EMBEDDING_HOST / catalog URLs as the exact endpoint to call, while the new openai_sdk adapter keeps SDK-style /v1 base URL behavior. .env.example documents concrete endpoint examples for OpenAI, Cohere, Jina, Ollama, SiliconFlow, and Aliyun DashScope.
New embedding adapters and bindings — added official OpenAI SDK embedding support, Aliyun DashScope native multimodal embeddings, SiliconFlow presets, OpenRouter/custom OpenAI-SDK profiles, provider batch limits, multimodal provider flags, and per-provider API-key fallbacks.
Multimodal embedding requests — EmbeddingRequest now accepts structured contents plus DashScope enable_fusion; Cohere v2, Jina, OpenAI-compatible gateways, and DashScope handle multimodal payloads through their own adapter rules.
Better provider errors — embedding failures now preserve provider status, body, model, and URL context, with clearer handling for 4xx responses and non-JSON/HTML gateway replies.

LLM Reasoning Streams and Vision Attachment Robustness

Reasoning traces and image attachments now flow through more provider paths with less provider-specific surprise.

Reasoning deltas — on_reasoning_delta is wired through the base LLM provider contract, OpenAI-compatible streaming, Azure SDK streaming, Anthropic paths, and OpenAI Responses parsing. Streaming output wraps reasoning text in <think>...</think> before normal content resumes.
DeepSeek reasoning defaults — DeepSeek reasoning model patterns can auto-inject a high reasoning effort when the caller has not specified one, matching providers that require an explicit switch to surface thinking output.
Vision URL capability flags — provider/model capabilities now distinguish "supports vision" from "accepts image URLs." Moonshot/Kimi vision models and Anthropic-style adapters force local attachment URLs into inline base64 when possible.
Local attachment URL resolution — /api/attachments/... image URLs can be resolved back through the attachment store and sent as base64 to providers that reject remote URL form; unresolved external URLs are counted as dropped image inputs instead of silently pretending they were sent.

Space Hub, Skills Tags, and Personal Library UX

Personal learning artifacts have been gathered under a new Space area in the sidebar.

Space navigation — /space redirects to /space/notebooks and the new mini-nav groups Notebooks, Question Bank, Skills, and Memory with a shared section header style.
Notebooks section — notebooks can be created, deleted, searched, opened, and inspected with rendered record previews. Saved TutorBot, chat, research, and Co-Writer outputs receive distinct badges and can link back to the original chat session when metadata is available.
Question Bank section — quiz entries can be filtered by all/bookmarked/wrong-only, grouped with categories, renamed, removed from categories, bookmarked, deleted, and opened back in their source context.
Memory section — the Memory page moves into Space with edit/preview modes for summary and profile, manual save, refresh-from-session, clear actions, unsaved-change status, and localized feedback.
Skills section — user-authored skills now support tags in frontmatter plus a .tags.json vocabulary. The API adds tag list/create/rename/delete endpoints, and the UI can filter by tag, manage tags, rename skills, and edit tag assignments while preserving the existing SKILL.md workflow.

Dependency Layers, TutorBot Debugging, and Windows Startup

The install story has been reorganized around pyproject extras, with requirements files kept as mirrors for Docker/CI environments. This also makes TutorBot setup and channel debugging less ambiguous: the agent engine, channel SDKs, Matrix native dependencies, and core server provider imports now live in clearly separated layers.

Extras hierarchy — .[cli] now includes RAG, document parsing, and built-in provider SDKs; .[server] builds on CLI with FastAPI/uvicorn; .[tutorbot], .[matrix], .[math-animator], .[dev], and .[all] layer optional capabilities explicitly.
TutorBot dependency boundary — requirements/tutorbot.txt now mirrors .[tutorbot], depends on server.txt, and keeps channel/agent dependencies such as cron, MCP, Telegram, Feishu/Lark, DingTalk, Slack, QQ, Socket.IO, socks, and message-pack tooling in the TutorBot layer instead of mixing them into the base install.
Matrix channel split-out — Matrix / Element support has its own .[matrix] extra and requirements/matrix.txt, with matrix-nio[e2e], Markdown sanitization helpers, and explicit libolm setup notes for native encryption dependencies.
Runtime dependency fix (#391) — loguru and json-repair moved into the server dependency layer because provider-core imports need them before TutorBot is involved. This fixes clean server installs that previously crashed on missing modules.
Windows launcher robustness (#391, #398) — scripts/start_web.py now reads backend/frontend subprocess output as UTF-8 with replacement, avoiding UnicodeDecodeError on Windows locales such as GBK.
Docs and CLI hints — README, Chinese README, CLI README, and CLI error messages now point users to pip install -e ".[cli]" / pip install -e ".[server]" instead of older requirements-first commands. A new requirements/matrix.txt mirrors the Matrix extra and documents the native libolm prerequisite.

Bug Fixes

Knowledge upload/create diagnostics (#392, #405) — KB initialization, upload, and re-index tasks now propagate failure details and stack traces through task logs; the UI can show richer errors instead of appearing to do nothing when background ingestion fails.
KB name validation — HTTP and CLI creation paths now reject path-like or URL-reserved characters while preserving Unicode-friendly names, preventing invalid KB folders and unsafe routes.
Case-insensitive document discovery — KB directory scanning and CLI document collection now use the shared file router, so uppercase extensions such as .PDF and .MD are accepted consistently.
Safer document filenames — uploaded filenames are normalized, path fragments are stripped, and extensions are lowercased before validation and storage.
Raw file serving safety — KB raw-file endpoints resolve paths strictly under the raw/ directory and reject traversal attempts.
Model catalog environment overlay — .env values are only synced into the catalog while it still looks pristine, avoiding accidental overwrites once users have multiple custom profiles.
Research reporting fallback (#404) — the reporting agent's JSON-fallback warning now uses an f-string so loggers that do not apply % formatting still include the section title cleanly.

Test Suite Expansion

Knowledge/RAG — new coverage for KB naming, index-version allocation and read priority, legacy-to-flat storage compatibility, LlamaIndex storage layout, raw directory initialization, case-insensitive file routing, KB deletion, and API upload edge cases.
Embedding/config — new tests cover dimension auto-fill from test probes, catalog .env overlay behavior, DashScope and OpenAI SDK adapters, Qwen3 send_dimensions, URL transparency, non-JSON provider responses, and multimodal embedding requests.
LLM/multimodal — new tests validate reasoning/vision capability behavior and local attachment URL conversion for providers that require inline base64 images.
CLI and validation — CLI KB collection and document validator tests cover uppercase extensions, Chinese filenames, and Windows-style path stripping.

Community Contributions

@jonathanzhan1975 — Fix Windows server startup and missing server runtime dependencies affecting clean Web/TutorBot installs (#391)
@kagura-agent — Clean up reporting-agent fallback logging for JSON parse failures (#404)

Recent open discussions after v1.2.5 also shaped this release window, especially KB upload/create failures (#392, #405), TutorBot Agent restart/state reports (#385), Windows startup reports (#398), dependency-installation pain (#402), JSON robustness feedback (#400), and the next wave of Space/Memory/project organization requests (#397, #401, #403).

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.2.5...v1.3.0

View release on GitHub

v1.2.5 New feature 3mo

Notable features

Persistent attachment store with safe /api/attachments/ serving
File preview drawer for PDFs, images, code, Office docs
Broadened text/code support (JSONC, Vue, Kotlin, Solidity)

Full changelog

DeepTutor v1.2.5 Release Notes

Release Date: 2026.04.25

Highlights

Attachment Store and File Preview Drawer

Chat attachments are now persisted as first-class session artifacts instead of living only as inline base64 blobs in message rows. The turn runtime writes every uploaded file to a new attachment store before document extraction, records a stable URL on the message, and drops the bulky base64 payload from persisted chat history.

Backend attachment store — deeptutor/services/storage/attachment_store.py introduces a pluggable AttachmentStore protocol and a default LocalDiskAttachmentStore rooted at data/user/workspace/chat/attachments. The path can be overridden with CHAT_ATTACHMENT_DIR, documented in .env.example.
Safe attachment serving — a new /api/attachments/{session_id}/{attachment_id}/{filename} router serves stored files with traversal-safe path resolution, atomic writes on upload, inline Content-Disposition, UTF-8 filename handling, and private no-cache headers.
Stable attachment metadata — Attachment now carries an id plus extracted_text; TurnRuntimeManager generates missing IDs, persists original bytes, stores the preview URL, and keeps extracted text from Office/text documents so the UI can show exactly what the assistant read.
Right-side preview drawer — FilePreviewDrawer adds a Claude-style side panel for chat files. On desktop it squeezes the chat column; on smaller screens it overlays. The shell stays mounted for instant open/close, while heavy preview bodies are deferred until after the slide animation to avoid jank.
Preview renderers — PDFs render in the browser viewer, images and SVGs render with native <img>, Markdown reuses the main Markdown renderer, code/text files use RichCodeBlock with syntax highlighting, Office files show backend-extracted plain text, and unsupported/legacy files fall back to a download card.
Attachment actions — pending composer chips and sent message cards are clickable, with keyboard focus rings, download, copy-link, Escape-to-close, and graceful "legacy file not stored" messaging for old sessions.

Broader Code and Text Attachment Coverage

The v1.2.4 text/code attachment support has been widened substantially, keeping the backend RAG router, chat extractor, frontend upload allowlist, icons, and preview highlighter aligned.

More accepted formats — FileTypeRouter.TEXT_EXTENSIONS and TEXT_LIKE_EXTS now include JSONC/JSON5, MJS/CJS/MTS/CTS, Vue/Svelte, Kotlin scripts, Groovy/Gradle, C#/Zig/Nim, Objective-C, Perl/Lua/Julia/Dart, Haskell/Clojure/Elixir/Erlang/OCaml/F#, Lisp/Scheme/Racket, Solidity, fish/vim, GraphQL/protobuf, CMake/Makefile, Terraform/HCL, nginx config, and Dockerfile-style files.
Central syntax mapping — web/lib/code-languages.ts maps extensions and special filenames (Dockerfile, Makefile, CMakeLists.txt, dotfiles, etc.) to Prism language names so preview classification and code highlighting stay in sync.
Frontend helpers exported — extOf() is now exported from web/lib/doc-attachments.ts for the preview pipeline, and document icons/categories were expanded to match the new extension set.

Attachment-Aware Deep Capabilities

Uploaded attachments now flow into more agent pipelines, not just the default chat stage.

Base agent parity — non-streaming LLM calls now use the same prepare_multimodal_messages() path as streaming calls when attachments are present, including image stripping/logging for non-vision models.
Deep Solve — instead of extracting only the first image URL, DeepSolveCapability forwards image attachments through MainSolver into planner, solver, and replan calls, so multimodal problems remain visible throughout the Plan-ReAct-Write loop.
Deep Question — topic generation and follow-up answering pass attachments to the underlying agents. Mimic mode can now use [Attached Documents] text directly when uploaded PDFs have already been extracted and stripped from base64 storage.
Deep Research — research planning accepts attachments and forwards them into the first planning LLM call, whether that is rephrase or decompose, without duplicating the same image/file context in later planning turns.
Visualize — visualization analysis now receives chat attachments, enabling diagrams or screenshots to influence render-type selection and data extraction.

TutorBot Export and Notebook Capture

TutorBot chat sessions now have the same capture paths as regular chat:

The Agent chat page adds Save to Notebook and Download Markdown actions in the header.
SaveToNotebookModal, notebook-api, backend notebook request types, and RecordType now recognize a new tutorbot record type.
The Knowledge page displays TutorBot notebook entries with their own violet badge and bot icon.
Restored TutorBot chat history now re-snaps to the bottom across multiple frames so Markdown/KaTeX growth after first paint does not leave the user above the latest message.

Setup Tour Diagnostics and Install Robustness

The guided setup tour now explains dependency failures instead of failing silently:

Bootstrap dependency installation captures stdout/stderr and prints the real pip error plus a manual retry command.
uv resolution checks common install locations (~/.local/bin, ~/.cargo/bin, Homebrew paths) before attempting installation, and reports clear next steps if uv installs successfully but is still not on PATH.
uv install failures now show localized English/Chinese hints for Python wheel availability, stale shell PATH, PyPI mirror issues, and direct installer options.
Node.js/npm version checks resolve executables through shutil.which() before running --version, improving Windows .cmd/.bat compatibility after Python subprocess hardening.

Chat and Knowledge UX Fixes

Auto-scroll reliability — useChatAutoScroll now waits for message content before attaching its mutation observer, fixing missed bottom-scroll behavior when reopening sessions whose message container was initially empty.
Preview animation polish — globals.css adds a chat-preview-shell transition so the drawer slide and chat-column squeeze move together at the same 220 ms timing.
Knowledge upload picker — KB file inputs no longer rely on the browser/OS accept filter, which could hide valid files on some systems; in-app validation still enforces the supported-file policy after selection.
Localized preview copy — English and Chinese strings were added for preview loading, copy link, unavailable previews, legacy files, remove attachment, and Office extracted-text explanation.

Test Suite Expansion

Capability attachment forwarding — tests/core/test_capabilities_runtime.py adds coverage for Deep Solve, Deep Question, Deep Research, and Visualize attachment propagation, including mimic mode with extracted document text.
Research planning edge case — tests/agents/research/test_research_pipeline_rag.py verifies that attachments are forwarded to decompose when rephrase is enabled but performs zero iterations, ensuring the first actual planning LLM call still sees the uploaded context.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.2.4...v1.2.5

View release on GitHub

v1.2.4 Breaking risk 3mo

Notable features

Text/code attachment support (Markdown, JSON, YAML, CSV, LaTeX, HTML, code files)
One-command setup tour with dependency installation
Chat Markdown export

Full changelog

DeepTutor v1.2.4 Release Notes

Release Date: 2026.04.25

Highlights

Support More Attachment Formats in Chat

The v1.2.3 document-attachment pipeline has been expanded beyond Office files. Chat attachments now accept the same text-like formats as the Knowledge Base ingestion router: Markdown, plain text, logs, JSON/YAML/TOML, CSV/TSV, LaTeX/BibTeX, HTML/XML/SVG, stylesheets, scripts, and common source-code files (.py, .js, .ts, .tsx, .java, .cpp, .go, .rs, .sql, .sh, and more).

Backend parity with KB routing — deeptutor/utils/document_extractor.py imports FileTypeRouter.TEXT_EXTENSIONS, so chat attachments and KB uploads share one source of truth for text/code formats. FileTypeRouter.decode_bytes() centralizes the UTF-8 / UTF-8-BOM / GBK / GB18030 / Latin-1 / CP1252 fallback chain.
SVG as readable source — .svg was added to the RAG text extension set and is treated as a document attachment instead of a vision image, letting the LLM inspect the XML source while the frontend still renders a safe thumbnail preview.
Typed attachment UX — web/lib/doc-attachments.ts now exposes OFFICE_EXTS, TEXT_LIKE_EXTS, SVG detection, richer MIME/extension fallback, and category-specific icons for code, shell, JSON, config, data, markup, stylesheets, plain text, Office docs, and SVG.
Composer copy updated — the drag-and-drop hint now advertises "Images, Office docs, code & text" instead of the older Office-only list.

One-Command Setup Tour

scripts/start_tour.py has evolved from a pure configuration wizard into a 7-step fresh-install path that can detect the local environment, install dependencies, and then guide provider configuration.

Dependency installation step — the tour checks Python, uv, Node.js, and npm; installs backend requirements, installs DeepTutor in editable mode, and runs frontend npm install with live terminal output.
uv pip compatibility (#376) — Python dependency installation now prefers uv pip when available, and binds --python <current-interpreter> so packages land in the interpreter running the tour instead of an unrelated .venv / $VIRTUAL_ENV.
Windows npm detection (#381) — npm invocations use npm.cmd on Windows, fixing version checks and install commands on systems where npm is exposed as a command shim rather than an executable.
Provider registry-driven LLM choices — the wizard now reads from the full PROVIDERS registry, groups providers by mode, highlights common services, and includes newer options such as custom_anthropic, OpenRouter, SiliconFlow, Volcengine / BytePlus coding providers, GitHub Copilot, OpenAI Codex, llama.cpp, OVMS, MiniMax, Mistral, Qianfan, Step Fun, and Xiaomi MiMo.
Search and config hints — Serper is available in the search-provider list; Azure/API-version and proxy prompts now show inline guidance in both English and Chinese.
Long-list terminal UX — scripts/_cli_kit.py renders long select menus inside a scrolling window with "more above/below" indicators, preventing stale terminal rows when the provider list exceeds the screen height.

Chat Markdown Export

The chat page now has a Download Markdown action next to "Save to Notebook" and "New Chat". web/lib/chat-export.ts serializes the current conversation as Markdown with a title, ISO export timestamp, role headings, capability labels, and attachment metadata, then downloads it with a sanitized date-stamped filename. This gives users a lightweight local export path for sharing, archiving, or moving a conversation into external notes.

Knowledge Base Management UI Polish

The Knowledge page was tightened up for denser day-to-day management:

Creation and upload drop zones are now simpler dashed cards with concise inline upload-policy summaries.
KB cards moved from four large stat panels to compact rows showing document count, index readiness, last-updated time, provider, embedding model, live progress, and storage path.
Default/delete actions were simplified visually, reducing card height and making large KB lists easier to scan.

Documentation and Localization Refresh

The README family was updated to match the new install story and provider surface:

Polish README — assets/README/README_PL.md adds a complete Polish translation of the project README (#379).
Setup docs — the main README and multilingual READMEs now describe the guided tour as the recommended fresh-clone path: create a Python environment, run python scripts/start_tour.py, then use python scripts/start_web.py for daily launch.
Provider docs — provider tables were refreshed for the v1.2.3 provider registry, including custom_anthropic, MiniMax's canonical endpoints, coding-plan providers, local providers, and clarified authentication notes for OpenAI Codex / GitHub Copilot.
Release/news sections — multilingual READMEs now include v1.2.2 and v1.2.3 summaries, a contributing-guide callout, and the 20k-star community milestone.

UI and Theme Fixes

Theme-aware popovers — composer capability/tool/reference/skill menus now use --popover plus backdrop blur instead of --card, improving contrast in the Glass theme and dark popover contexts.
Native color-scheme hints — light, dark, Snow, and Glass theme roots declare the correct color-scheme, improving native controls and browser-rendered surfaces.
Smooth-scroll hydration — global smooth scrolling is now gated behind html[data-scroll-behavior="smooth"], with the attribute set by the root layout to avoid applying it unintentionally.
Sidebar logo sizing — sidebar logo images now pin explicit rendered width/height classes to prevent small layout shifts.

Cleanup

Removed the stale nanobot submodule pointer and the deprecated scripts/extract_numbered_items.sh stub from the repository. The v1.2.x codebase now relies on the in-tree document extraction and RAG routing paths instead of that legacy helper.

Test Suite Expansion

Backend document extraction — tests/utils/test_document_extractor.py now covers plain text, Python source, JSON, CSV, Markdown, UTF-8-BOM input, GBK fallback decoding, and SVG source extraction.
Frontend attachment classification — web/tests/doc-attachments.test.ts now covers text/code acceptance, SVG-as-document routing, case-insensitive SVG filename detection, and the new icon categories for code, JSON, config, shell, data, markup, stylesheets, and SVG.

What's Changed

fix: use npm.cmd for version detection on Windows by @jonathanzhan1975 in https://github.com/HKUDS/DeepTutor/pull/381
fix(scripts): prefer uv pip over python -m pip in start_tour.py by @rogercsi in https://github.com/HKUDS/DeepTutor/pull/376
docs: add Polish translation of README by @kKamUL in https://github.com/HKUDS/DeepTutor/pull/379

New Contributors

@jonathanzhan1975 made their first contribution in https://github.com/HKUDS/DeepTutor/pull/381
@rogercsi made their first contribution in https://github.com/HKUDS/DeepTutor/pull/376
@kKamUL made their first contribution in https://github.com/HKUDS/DeepTutor/pull/379

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.2.3...v1.2.4

View release on GitHub

v1.2.3 Breaking risk 3mo

Notable features

Document attachments (PDF, DOCX, XLSX, PPTX) in chat with preview cards
Model thinking block display with collapsible ModelThinkingCard
LLM provider core refactor with dedicated modules per provider family

Full changelog

DeepTutor v1.2.3 Release Notes

Release Date: 2026.04.24

Highlights

Document Attachments in Chat

Chat now accepts non-image file attachments (PDF, DOCX, XLSX, PPTX) alongside images. A new paperclip button in the composer opens a system file picker, and drag-and-drop / paste have been extended from images-only to all supported document types. Attached documents are rendered as typed preview cards (colour-coded icon + filename + size label) in both the pending-attachment bar and the message history. On the backend, a new document_extractor module extracts plain text from the uploaded bytes using PyMuPDF / pypdf / python-docx / openpyxl / python-pptx (all optional, graceful fallback) and injects the content into the [Attached Documents] section of the effective user message, so the LLM can read the file without a separate RAG call. File-type classification, per-file and total size limits, and duplicate-name detection are handled by web/lib/doc-attachments.ts on the frontend and DocumentValidator on the backend.

Model Thinking Block Display

Responses from reasoning models (DeepSeek-R1, Claude with extended thinking, QwQ, etc.) often contain <think> / <|thinking|> scratchpad blocks that were previously rendered as raw text. A new parseModelThinkingSegments parser (web/lib/think-segments.ts) splits assistant content into alternating text and thinking segments, and AssistantResponse now renders each thinking segment as a collapsible ModelThinkingCard — collapsed by default — so users can peek at the model's chain-of-thought without the scratchpad overwhelming the conversation. Incomplete (still-streaming) thinking blocks show a pulsing indicator and expand automatically.

Tri-State `send_dimensions` for Embeddings (#368)

Some OpenAI-compatible embedding providers reject the dimensions parameter that OpenAI's text-embedding-3-* models require. A two-part fix: first, the adapter now only sends dimensions when the model name matches text-embedding-3* (PR #368 by @jefflv). Second, a new Send Dimensions tri-state toggle (Auto / On / Off) was added to the Settings page, the .env store (EMBEDDING_SEND_DIMENSIONS), and the model catalog, giving operators explicit control. Auto (the default) preserves the #368 heuristic; On forces the parameter for providers that accept it on custom models; Off suppresses it unconditionally. The toggle is reflected end-to-end: catalog → provider_runtime.ResolvedEmbeddingConfig.send_dimensions → OpenAICompatibleEmbeddingAdapter.

LLM Provider Core Refactor

The monolithic factory.py has been rewritten around a new provider_core/ package (~3 000 lines) that gives each provider family its own module: openai_compat_provider, anthropic_provider, azure_openai_provider, github_copilot_provider, and openai_codex_provider, all inheriting from a shared BaseProvider. A provider_factory.py resolves the correct runtime provider from config, and factory.py itself shrank from ~600 to ~490 lines by deriving presets dynamically from the PROVIDERS registry instead of maintaining hard-coded dictionaries. A new context_window.py module exposes resolve_effective_context_window(), used by ContextBuilder to base the history-budget calculation on the model's true context window rather than the max_tokens output cap — improving long-conversation history recall for models with large windows.

Soul Template Editor (PR #373)

TutorBot's "Soul" (system personality) can now be authored directly inside the Agents page. The new inline editor lets users create, preview, and save SOUL.md templates with YAML frontmatter, replacing the previous workflow of hand-editing files on disk. Contributed by @srinivasrk.

Co-Writer Save to Notebook

The Co-Writer toolbar gains a Save to Notebook button (📓 icon). Clicking it opens the existing SaveToNotebookModal with a new co_writer record type, so Co-Writer drafts appear alongside chat and quiz entries in the Knowledge page's Notebooks tab. Backend: co_writer was added to the notebook record type enum, the summarize agent prompts, and the API Literal type.

Knowledge Base Management Improvements

Drag-and-drop upload — the Knowledge page now accepts file drops directly onto KB cards (or the creation form) with real-time extension/size validation, duplicate-name detection, and a typed file-selection list before upload starts.
/supported-file-types endpoint — a new REST endpoint returns the server's current upload policy (accepted extensions, per-file and per-PDF size caps) so the web client stays in sync without hard-coding.
Richer KB cards — each card now displays creation / last-updated timestamps, embedding model name, dimension, reindex-needed badge, and live progress bars with percent readout during indexing.
Delete resilience (#370) — shutil.rmtree now uses an onerror handler that clears the read-only bit and retries, preventing Docker bind-mount and Windows permission errors from leaving a KB stuck in the list.
Progress persistence — ProgressTracker now writes a snapshot file alongside the kb_config.json entry and emits task-stream events, so WebSocket subscribers and page reloads can recover live indexing state without relying on in-memory callbacks. When a KB reaches ready, the progress blob is removed from config to keep the card looking like a stable resource.

Question Generation Language Fidelity

Extracted the language-directive system from the Book engine's _language.py into a shared services/prompt/language.py module. The Deep Question agents — Generator, IdeaAgent, and FollowupAgent — now call append_language_directive() on their system prompts, ensuring that quiz questions, answer choices, and follow-up answers respect the user's configured language instead of occasionally drifting to English.

Settings Page Tour Redesign

The Run Tour button moved from the page bottom to the top toolbar, sitting alongside Save Draft and Apply. The guided tour itself was expanded to walk through the full save-and-test cycle (Save → Diagnostics → Apply). The former bottom area was replaced with a concise configuration note explaining the runtime priority of model_catalog.json over .env.

CLI: Notebook `add-md` and `replace-md`

Two new deeptutor notebook sub-commands let users add or replace Markdown content in a notebook record directly from the terminal, useful for scripted workflows and CI pipelines.

Bug Fixes

TutorBot pure-CJK bot name crash — creating a bot whose name contains only non-ASCII characters produced an empty slug, breaking the API route. A stable ASCII fallback (bot-<hash>) is now generated, and the frontend surfaces a creation error toast instead of silently failing.
React 19 I18nProvider render-time setState warning — i18n.init() was being called synchronously during the first render of I18nProvider, triggering a React 19 warning about state updates in the render phase. Initialization is now performed at module-load time; the provider body only syncs document.documentElement.lang via useEffect.
Skills default to off — skillsAutoMode now defaults to false so new users aren't surprised by automatic skill injection until they intentionally enable it.
Moonshot default Base URL — changed from https://api.moonshot.ai/v1 to https://api.moonshot.cn/v1 to match the provider's current canonical endpoint, with all multilingual README files updated.
Version badge display — fixed a minor rendering issue in the sidebar version badge introduced during the v1.2.2 merge.

Provider Registry Enhancements

custom_anthropic — a new provider spec for Anthropic-API-compatible endpoints (e.g. AWS Bedrock proxies) that routes through the Anthropic backend instead of OpenAI-compat.
thinking_style — new ProviderSpec field allowing providers like Volcengine to advertise how they signal extended-thinking mode.
Alias expansion — lm-studio, anthropic-compatible, openai-compatible (hyphenated) are now recognized; canonical_provider_name() uses to_snake() for more robust normalization.

Test Suite Expansion

Net +1 900 test lines across 15 new or expanded files: test_document_extractor.py (245), test_language_prompts.py (156), think-segments.test.ts (122), test_provider_runtime.py (103), test_llm_probe_config.py (116), test_manager_delete.py (79), doc-attachments.test.ts (75), test_context_window_detection.py (60), test_progress_tracker.py (58), quiz-question-type.test.ts (58), plus expansions to test_factory_provider_exec.py, test_context_builder.py, test_chat_params_config.py, test_config_module.py, test_knowledge_router.py, and test_start_tour.py.

What's Changed

fix(embedding): gate dimensions for text-embedding-3 models by @Jeff-Lv in https://github.com/HKUDS/DeepTutor/pull/368
feat: Revamp soul template creation and usage by @srinivasrk in https://github.com/HKUDS/DeepTutor/pull/373
feat(cli): add add-md and replace-md commands to notebook group by @zbinxp in https://github.com/HKUDS/DeepTutor/pull/371

New Contributors

@Jeff-Lv made their first contribution in https://github.com/HKUDS/DeepTutor/pull/368
@zbinxp made their first contribution in https://github.com/HKUDS/DeepTutor/pull/371

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.2.2...v1.2.3

View release on GitHub

v1.2.2 Breaking risk 3mo

Notable features

User-authored skills system with SKILL.md
Chat input performance overhaul for long conversations
Auto-fallback for response_format rejection

Full changelog

DeepTutor v1.2.2 Release Notes

Release Date: 2026.04.22

Highlights

User-Authored Skills System

Introduced a full Skills subsystem that lets users create, edit, and activate custom SKILL.md files from the web UI. Each skill lives under data/user/workspace/skills/<name>/SKILL.md with YAML frontmatter (name, description, optional triggers) and a Markdown body that is injected verbatim into the chat system prompt when active.

Backend — SkillService (deeptutor/services/skill/service.py) provides CRUD + listing + selection with strict name validation (^[a-z0-9][a-z0-9-]{0,63}$), frontmatter parsing, and directory-level isolation. REST API mounted at /api/v1/skills with GET /list, GET /{name}, POST /create, PUT /{name}, DELETE /{name}.
Frontend — a new Skills tab in the Knowledge management page (web/app/(utility)/knowledge/page.tsx) with inline SKILL.md editor, and a skill picker menu in the chat composer. Skills can be toggled individually or set to auto mode (sends ["auto"] to the backend). web/lib/skills-api.ts provides a cached client-side API layer. UnifiedChatContext carries skills through the WebSocket payload.

Chat Input Performance Overhaul (#351, #360, #362)

Eliminated input lag in long conversations through deep state colocation:

ComposerInput — extracted from ChatComposer into its own memo'd component so frequent keystroke re-renders are isolated from the rest of the composer (capability panels, reference chips, tool menus). Exposes an imperative ComposerInputHandle (clear(), getValue()) to avoid lifting input state.
SimpleComposerInput — a lightweight variant for the TutorBot chat page that strips @-mention and capability overhead entirely, fixing residual lag reported after the initial refactor.
React.memo on config panels — QuizConfigPanel, MathAnimatorConfigPanel, ResearchConfigPanel, and VisualizeConfigPanel are now wrapped in React.memo; parent callbacks in ChatPage stabilized with useCallback.
@-mention helpers exported — shouldOpenAtPopup and stripTrailingAtMention moved to ComposerInput.tsx as named exports for cross-component reuse.

Auto-Fallback for `response_format` Rejection

Extended the v1.2.1 static supports_response_format guard with a runtime auto-recovery path. When a provider unexpectedly returns HTTP 400 for response_format={"type":"json_object"} (common with LM Studio + Gemma/Qwen-style models), both LLM execution paths now detect the error, drop the field, and retry once:

aiohttp path (cloud_provider.py) — _looks_like_unsupported_response_format heuristic detects the rejection in _openai_complete and _openai_stream; on match, the request is retried without response_format and the (binding, model) pair is cached.
OpenAI SDK path (executors.py) — _create_with_format_fallback wraps client.chat.completions.create, catching BadRequestError and applying the same retry + cache logic.
Runtime cache (capabilities.py) — disable_response_format_at_runtime / is_response_format_disabled_at_runtime record discovered incompatibilities in a module-level set so subsequent calls skip response_format upfront without paying the retry cost.
_answer_now.py — stream_synthesis now checks supports_response_format before attaching response_format, preventing the 400 in the first place for fast-path answer flows.

LAN / Remote Access Fix (#340)

When another machine on the local network opens the web UI, the build-time NEXT_PUBLIC_API_BASE (typically http://localhost:8001) would resolve to the remote browser's own loopback instead of the server. A new resolveBase() helper in web/lib/api.ts detects this mismatch and swaps the hostname for window.location.hostname at runtime, so apiUrl() and wsUrl() reach the correct backend regardless of which machine opened the page.

Sidebar Version Badge & GitHub Link

The sidebar now displays the current build version alongside a status indicator:

/api/version route — a Next.js ISR endpoint that queries api.github.com/repos/HKUDS/DeepTutor/releases/latest (hourly revalidation, optional GITHUB_TOKEN for rate-limit headroom) and returns the latest tag, name, URL, and publish date.
VersionBadge — compares NEXT_PUBLIC_APP_VERSION (injected at build time via next.config.js / git describe --tags) against the fetched latest release. Shows a green dot when up-to-date, amber when outdated, and neutral when unknown. Clicking navigates to the release page.
Dockerfile — new APP_VERSION build arg piped into the Next.js env so Docker-based deployments also get accurate version display.
GitHub icon — a direct link to the repository added to both collapsed and expanded sidebar states.

Deep Solve Image Attachment Support

DeepSolveCapability now extracts the first image attachment (data-URI or URL) from context.attachments via a new _first_image_url helper and passes it to the planner and solver agents. Internally, PlannerAgent and SolverAgent were refactored to use the Attachment dataclass through BaseAgent's unified multimodal pipeline instead of manually constructing multimodal message arrays — removing two identical _build_multimodal_messages helper functions. BaseAgent.stream_llm also gained logic to construct messages from system_prompt + user_prompt when attachments are provided without explicit messages, and logs when images are stripped for non-vision models.

Agentic Chat Pipeline Attachment Passthrough

The _stage_responding and _stage_acting methods in AgenticChatPipeline now call _prepare_messages_with_attachments after building messages, ensuring that user-uploaded images are forwarded to the LLM in every chat stage — not just the initial thinking stage.

TutorBot WebSocket Resilience (#354)

Hardened the /api/v1/tutorbot/{bot_id}/ws endpoint:

Auto-start — if the bot is configured but not running when a WebSocket connects, the endpoint now calls mgr.start_bot() automatically instead of immediately closing with 4004. Unknown bot IDs still receive a JSON error payload before close.
_safe_send wrapper — all ws.send_json calls go through a helper that catches WebSocketDisconnect / RuntimeError, preventing unhandled exceptions when the client drops mid-stream.
Graceful disconnect — _handle_user_messages catches WebSocketDisconnect on receive_text and breaks cleanly.

Settings Page: API Key Masking (#355)

The API Key field on the Settings page is now rendered as type="password" by default with an Eye / EyeOff toggle button. The visibility state resets when switching between services or profiles.

Book Library UI Overhaul

Replaced the minimal "Select a book" placeholder with a full BookLibrary component (web/app/(workspace)/book/components/BookLibrary.tsx) featuring search, status-filtered cards with chapter/page counts, creation and deletion actions, and status badges (Draft / Outline / Compiling / Ready). BookSidebar was simplified to a single-book reader view with a "← All books" back button, and the book page now conditionally shows either the library or the sidebar based on the current view.

Visualization Fullscreen Mode

SVG, Mermaid, and ChartJS visualizations now have a Maximize button (top-right) that opens the graphic in a fullscreen overlay with Escape-to-close. HTML iframe visualizations are excluded since they already provide their own "Open in new tab" affordance.

Bug Fixes

Embedding adapter ConnectError not retried (#353) — added httpx.ConnectError to the retry exception tuple in OpenAICompatibleEmbeddingAdapter, so transient connection failures during embedding are retried with exponential backoff instead of raising immediately.
RAG None-embedding hardening — CustomEmbedding._aget_query_embedding and _aget_text_embedding now raise a clear ValueError when the provider returns None instead of crashing later in similarity computation. The batch method _aget_text_embeddings now determines the fallback zero-vector dimension from sibling vectors or the configured dim, and raises if neither is available (preventing silent persistence of zero-length vectors).
KB delete crash on missing directory — KnowledgeBaseManager.delete_knowledge_base now resolves the path directly via self.base_dir / name and handles the case where the on-disk folder was already removed, cleaning up the orphaned kb_config.json entry instead of raising FileNotFoundError.
CLI parse_json_object whitespace — the function now strips leading/trailing whitespace before parsing, so trailing newlines from shell piping no longer cause json.JSONDecodeError.

Test Suite Expansion

TutorBot WebSocket — 2 new tests covering auto-start of stopped-but-configured bots and JSON error payload for unknown bot IDs (tests/api/test_tutorbot_router.py).
CLI helpers — 2 tests for parse_json_object whitespace handling and invalid JSON (tests/cli/test_common.py).
Route params — 3 tests for the new firstParam utility (web/tests/route-params.test.ts).
API resolve base — tests for the LAN hostname swap logic (web/tests/api-resolve-base.test.ts).

What's Changed

fix: add ConnectError to embedding retry exceptions by @S-A-D-4 in https://github.com/HKUDS/DeepTutor/pull/353
feat: Hide the API key on settings page by @srinivasrk in https://github.com/HKUDS/DeepTutor/pull/355
fix : Tutorbot websocket resilience and CLI config parsing edge case by @srinivasrk in https://github.com/HKUDS/DeepTutor/pull/354
Fix #296: [Bug]:LLM 0, EMBEDDING 0, SEARCH 0 by @JiwaniZakir in https://github.com/HKUDS/DeepTutor/pull/340
perf(web): decouple chat input state to resolve lag in long conversat… by @jiakeboge in https://github.com/HKUDS/DeepTutor/pull/360
Perf/optimize chat input by @jiakeboge in https://github.com/HKUDS/DeepTutor/pull/362

New Contributors

@S-A-D-4 made their first contribution in https://github.com/HKUDS/DeepTutor/pull/353
@JiwaniZakir made their first contribution in https://github.com/HKUDS/DeepTutor/pull/340

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.2.1...v1.2.2

View release on GitHub

v1.2.1 New feature 3mo

Notable features

Regenerate last response across CLI (/regenerate), WebSocket (type: regenerate), and Web UI
Per-stage token limits (responding, answer_now, thinking, observing, acting, react_fallback) and temperature configurable via agents.yaml
Fixed dark code blocks unreadable on light theme, None embeddings crash in LlamaIndex, and Gemma model json_object response format rejection

Full changelog

DeepTutor v1.2.1 Release Notes

Release Date: 2026.04.21

Highlights

Per-Stage Token Limits & Temperature for Chat (#348)

Promoted the agentic chat pipeline to a first-class config citizen in agents.yaml. A new capabilities.chat block exposes per-stage max_tokens (responding, answer_now, thinking, observing, acting, react_fallback) and a shared temperature, deep-merged over baked-in defaults via services/config/loader.py::get_chat_params(). The responding and answer_now budgets jump from the previous hard-coded 1800 to 8000, eliminating the mid-sentence truncation that was clipping long answers. Internally, _ChatLimits.from_config coerces every legacy shape (missing keys, scalar instead of dict, partial overrides) into a stable dataclass so existing installs keep working without touching their YAML. 10 new unit tests cover loader resolution, deep-merge precedence, and dataclass coercion.

Regenerate Last Response — CLI, WebSocket, Web UI (#349)

Added a real regenerate flow that re-runs the previous user message in place, working uniformly across every entry point:

CLI — /regenerate (alias /retry) inside the deeptutor chat REPL.
WebSocket — type: "regenerate" message on /api/v1/ws, with optional overrides for capability, tools, knowledge_bases, language, config.
Web UI — a per-message Regenerate button (RefreshCcw icon) on the last assistant turn for chat-capability replies.

On the backend, TurnRuntimeManager.regenerate_last_turn rolls back the trailing assistant via the new SQLiteSessionStore.delete_message / get_last_message helpers, then reuses start_turn with _persist_user_message=False and _regenerate=True so the user row isn't duplicated and memory_service.refresh_from_turn isn't run a second time. Pre-flight checks raise non-fatal regenerate_busy (another turn is running) or nothing_to_regenerate (no prior user message) errors instead of silently failing. The _stage_responding LLM call also gained empty-response diagnostics that surface a structured warning when the model returns no content. 18 new tests cover all three reject paths, the delete-then-restart flow, the memory-refresh-skip contract, and the WebSocket round-trip.

UI Harmony Polish for Regenerate

Two follow-up tweaks so the new button matches the rest of the chat UI and behaves predictably under server rejection:

i18n — added Regenerate keys to web/locales/{en,zh}/app.json (Regenerate / 重新生成) and switched ChatMessages.tsx from a hardcoded "Regenerate" string to t("Regenerate"), matching the existing t("Copy") pattern in the same row.
Optimistic-pop rollback — when the server rejects a regenerate request pre-flight (regenerate_busy / nothing_to_regenerate), the optimistic POP_LAST_ASSISTANT + STREAM_START placeholder is now restored via a new RESTORE_ASSISTANT reducer action. The popped message is held in a per-key pendingRegenerateRef and cleared on done or any terminal result, so the transcript never silently loses the user's last reply.

Bug Fixes

Dark code blocks unreadable on light theme (#352) — the hard-coded #1f2937 / #292524 code-block background combined with .prose pre (forcing #D6D3D1) and .prose code:not(.md-code-block__code):not(.md-inline-code) (overriding to var(--foreground)) was producing near-black text on near-black backgrounds in light mode. Added a higher-specificity .md-renderer .md-code-block { ,pre,code } rule that pins #e5e7eb regardless of theme, and tagged the <code> elements in RichMarkdownRenderer and SimpleMarkdownRenderer fallbacks with the existing md-code-block__code class so the :not() guard kicks in. Thanks @DarkGenius.
None embeddings crashed LlamaIndex pipeline (#347, fixes #346) — when an embedding provider returns null for a chunk's vector, the None ended up in the vector index and blew up np.dot(NoneType) during similarity computation. Two-layer fix: _extract_embeddings_from_response now uses or [] instead of get(key, default) so explicit None values are caught, and CustomEmbedding._get_text_embeddings validates the batch result and substitutes a zero vector for any None slot. Thanks @kagura-agent.
Gemma models rejected json_object response_format (#345, fixes #344) — Gemma served through LM Studio (and similar local OpenAI-compatible servers) responds 400 "'response_format.type' must be 'json_schema' or 'text'" when handed response_format={"type": "json_object"}. Added supports_response_format: False to the existing gemma MODEL_OVERRIDES entry so the json_object path is skipped; the existing extract_json_object utilities in the visualize and math-animator agents already parse JSON from plain text, so all callers continue to work without further changes. Thanks @octo-patch.

Test Suite Expansion

Net +575 test lines: 10 cases for the chat-params loader / _ChatLimits coercion (tests/services/config/test_chat_params_config.py), 18 cases for the regenerate flow including all three reject paths, the in-place delete + restart, the memory-refresh skip, and the end-to-end no-duplicate-user contract (tests/services/session/test_regenerate.py), 14 cases for supports_response_format model overrides (tests/services/llm/test_capabilities.py), and a regression test for the None-embedding extraction path (tests/services/embedding/test_extract_embeddings.py).

What's Changed

fix(rag): guard against None embeddings in LlamaIndex pipeline by @kagura-agent in https://github.com/HKUDS/DeepTutor/pull/347
fix: disable json_object response_format for gemma models by @octo-patch in https://github.com/HKUDS/DeepTutor/pull/345
fix(web): ensure readable text in dark code blocks on light theme by @DarkGenius in https://github.com/HKUDS/DeepTutor/pull/352
feat(chat): make per-stage token limits and temperature configurable via agents.yaml by @DarkGenius in https://github.com/HKUDS/DeepTutor/pull/348
feat(chat): regenerate last response (CLI, WebSocket, Web UI) by @DarkGenius in https://github.com/HKUDS/DeepTutor/pull/349

Community Contributions

@DarkGenius — Make per-stage chat token limits configurable via agents.yaml (#348)
@DarkGenius — Regenerate last response across CLI / WebSocket / Web UI (#349)
@DarkGenius — Ensure readable text in dark code blocks on light theme (#352)
@kagura-agent — Guard against None embeddings in the LlamaIndex pipeline (#347)
@octo-patch — Disable json_object response_format for Gemma models (#345)

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.2.0...v1.2.1

View release on GitHub

v1.2.0 Breaking risk 3mo

Breaking changes

Guided Learning module (deeptutor/agents/guide/) and entire /guide web UI removed

Notable features

Book Engine: multi-agent pipeline (Ideation, Source Exploration, Spine Synthesis, Page Planning, Block Compilation) with 14 block types
Multi-document Co-Writer workspace with persistent per-document storage
Interactive HTML visualization support for stateful content

Full changelog

DeepTutor v1.2.0 Release Notes

Release Date: 2026.04.20

Highlights

Book Engine — Multi-Agent "Living Book" Compiler

Introduced a brand-new Book Engine (deeptutor/book/) that compiles user inputs — chat history, notebooks, knowledge bases, and free-form intent — into structured, block-based, interactive "living books". The engine sits parallel to ChatOrchestrator and drives a five-stage multi-agent pipeline:

Ideation — an LLM proposes a book outline from the user's intent and source material.
Source exploration — a SourceExplorer agent performs deep RAG retrieval and knowledge-base health checks (kb_health.py) to surface the most relevant passages.
Spine synthesis — a SpineSynthesizer agent merges the proposal with explored sources into a chapter/page tree (Spine → Chapter → Page).
Page planning — a PagePlanner agent designs each page as an ordered sequence of typed blocks.
Block compilation — a BookCompiler dispatches each block to its dedicated generator.

14 block types ship in Phase 1: text, callout, quiz, flash cards, code, figure, deep dive, animation, interactive, timeline, concept graph, section, user note, and a placeholder for blocks still compiling. Each generator has its own bilingual (en/zh) YAML prompts and can call RAG helpers for grounded content.

The web UI (web/app/(workspace)/book/) includes a BookCreator wizard (intent → proposal → spine confirmation), a SpineEditor for drag-and-drop chapter reordering, a PageReader with an outline nav rail and per-block renderers (concept graphs rendered as interactive force-directed diagrams, quizzes with inline grading, flash cards with flip animations, etc.), a BookProgressTimeline for real-time compilation tracking, a BookChatPanel for in-context Q&A, and a BookHealthBanner that warns when the underlying KB is unhealthy. A per-book WebSocket stream fans out compilation events to all connected clients.

Backend: POST /api/v1/book/create, POST /confirm-proposal, POST /confirm-spine, POST /compile-page, GET /list, GET /{book_id}, DELETE /{book_id}, plus WS /ws/{book_id}. CLI: deeptutor book create, deeptutor book list, deeptutor book show, deeptutor book delete.

Legacy Guided Learning Removed

The deeptutor/agents/guide/ module (guide manager, 4 agents, 8 prompt YAMLs) and the entire /guide web UI (components, hooks, types, API router — ~5,300 lines) have been removed. The Book Engine supersedes Guided Learning with a richer, more extensible architecture.

Multi-Document Co-Writer Workspace

Co-Writer is no longer a single-document scratchpad. Each document now gets its own persistent directory under data/user/workspace/co-writer/documents/ with atomic manifest writes (CoWriterStorage). The web UI routes to per-document pages (/co-writer/[docId]) and the sidebar shows a CoWriterRecent section for quick access. New API endpoints handle full document CRUD (GET /list, POST /create, GET /{doc_id}, PATCH /{doc_id}, DELETE /{doc_id}).

Interactive HTML Visualizations

The Visualize capability now supports a fourth render type — html — alongside svg, chartjs, and mermaid. When the LLM determines that the user request requires "user interaction + state changes + mixed text/graphics" (e.g. draggable demos, step-by-step walkthroughs, clickable practice exercises), it produces a self-contained single-file HTML page that renders inside an iframe. A local validation pass (is_valid_html_document) checks the output before serving; if the model returns something unrenderable, a styled fallback template is injected instead. The LLM review stage is skipped for HTML pages (saving 30–60s of latency with negligible quality loss). A new figure constraint mode restricts the LLM to svg/chartjs/mermaid only — used internally by the Book Engine's figure block.

Question Bank @-Mention in Chat

A new QuestionBankPicker component lets users @-mention individual Question Bank entries directly in the chat composer. Selected entries are resolved by the turn runtime (_build_question_bank_context) into structured Markdown context — including question text, options, user/reference answers, and explanations — and injected alongside notebook and history references so the LLM can reason over specific past quiz performance.

Prompt Externalization — Phase 2

Continued the migration of hard-coded LLM strings into editable YAML files:

answer_now — question, research, solve, and visualize capabilities now load their fast-path prompts from prompts/{en,zh}/answer_now.yaml.
notebook agents — analysis_agent and summarize_agent prompts moved to YAML; the Python modules now delegate to the prompt manager.
Capability modules (_answer_now.py, deep_question.py, deep_research.py, deep_solve.py, visualize.py) slimmed down accordingly.

Co-Writer Module Restructured

Moved deeptutor/agents/co_writer/ to deeptutor/co_writer/ (top-level service module) to reflect its standalone nature alongside deeptutor/book/.

Sidebar Overhaul

"Guided Learning" nav entry replaced with Book (Library icon).
Added BookRecent and CoWriterRecent sidebar sections with per-item navigation.
Sidebar collapsed/expanded state lifted into AppShellContext so it persists across route transitions.
Collapsed sidebar refined: logo + expand toggle layered with hover reveal, circular "New Chat" button, subtle dividers, and consistent spacing.

Capability & Config Panel Refresh

Updated shared type definitions and config panels across Quiz, Research, Visualize, and Math Animator to align with new capability options (e.g. html render mode, expanded locale keys). TracePanels in the chat UI received layout and styling improvements.

README & Localization Update

All nine localized README files (AR, CN, ES, FR, HI, JA, PT, RU, TH) updated to reflect the Book Engine, multi-document Co-Writer, and other v1.2.0 features.

Bug Fixes

Channel manager import-time crash — loguru.logger was imported at module scope, causing ImportError when TutorBot dependencies were absent. Replaced with a lazy _logger() factory function.
Obsolete test_app_facade.py — removed a stale test module that referenced deleted code paths.
Missing __init__.py in test packages — added init files across tests/agents/, tests/api/, tests/cli/, tests/knowledge/, tests/scripts/, tests/services/llm/, tests/services/memory/, tests/services/search/, tests/services/session/, and tests/tools/ to fix import resolution.

Test Suite & CI

CI now installs requirements/tutorbot.txt and caches it alongside server/cli requirements.
Removed the obsolete test_app_facade from the CI test manifest.
Updated conftest.py for RAG pipeline tests; refreshed prompt parity, capabilities runtime, LLM probe config, factory provider, model catalog, config loader, and notebook service tests to match restructured imports.
Added favicon and apple-touch-icon assets for PWA metadata.

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.1.2...v1.2.0

View release on GitHub

v1.1.2 Breaking risk 3mo

Breaking changes

RAG providers other than llamaindex removed; legacy rag_provider values auto-coerced to llamaindex
RAG scaffolding modules removed: chunkers, embedders, indexers, parsers, retrievers, pipeline orchestrator
PATCH /tutorbot/{bot_id} now returns HTTP 422 for invalid channel payloads instead of silently persisting

Security fixes

Channel secrets no longer exposed in API responses; tokens and passwords masked by default

Notable features

Schema-driven channels tab auto-discovers all channel types (Telegram, Slack, Discord, Matrix, Email, Feishu) from Pydantic schema with token reveal toggle
Channel secrets masked in API responses by default; plaintext available via ?include_secrets=true query parameter
Chat prompts externalized to editable YAML files per language

Full changelog

DeepTutor v1.1.2 Release Notes

Release Date: 2026.04.18

Highlights

Schema-Driven Channels Tab with Token Reveal (#338)

The Channels tab in the Agents page is no longer hard-coded for Telegram. It now auto-discovers every channel (Telegram, Slack, Discord, Matrix, Email, Feishu, …) and renders a form directly from each channel's Pydantic config schema — no per-channel front-end code required. Secret fields (tokens, passwords, API keys) render as masked inputs with an eye-toggle for explicit reveal. A last_reload_error banner warns when live listeners failed to restart after a config change.

Channel Secret Masking

API responses no longer expose raw channel secrets. Tokens and passwords are replaced with *** by default; the admin edit form uses ?include_secrets=true to fetch plaintext when needed. Create and update responses are likewise masked.

Channel Config Validation & Reload Hardening

PATCH /tutorbot/{bot_id} now validates channel payloads upfront and returns a 422 with structured errors instead of silently persisting bad config. reload_channels is serialised with a per-instance lock to prevent duplicate listeners, and any failure is recorded in last_reload_error so the UI can surface it.

RAG Simplified to a Single Pipeline

Removed ~2,600 lines of unused RAG scaffolding (chunkers, embedders, indexers, parsers, retrievers, pipeline orchestrator, type definitions) that existed as placeholders for never-shipped backends. The RAG service is now a thin wrapper over the single LlamaIndex pipeline. Legacy rag_provider values (e.g. lightrag) are silently coerced to llamaindex and the KB is flagged for re-indexing.

Centralized File Type Routing

Consolidated file-type classification into a single FileTypeRouter module with a flat API (get_document_type, classify_files, get_supported_extensions, etc.). The old per-provider extension helpers are gone — there's only one provider. Unknown extensions still fall through to content sniffing before being rejected.

No More Phantom Knowledge Bases

Closed every code path that could silently call RAG against a non-existent KB:

deep_solve — strips the rag tool when no KB is attached and warns the user.
deep_research — drops kb from sources, warns, and aborts if no sources remain.
SolveToolRuntime — returns a graceful "no KB selected" observation instead of crashing, keeping the ReAct loop alive.
ResearchPipeline — returns a structured "skipped" event instead of falling back to the old DE-all placeholder.
DecomposeAgent — no longer defaults to ai_textbook; disables RAG when no KB is provided.

Externalized Chat Prompts

Moved all hard-coded zh/en strings out of AgenticChatPipeline into editable YAML files (agentic_chat.yaml for each language). Stage labels, system prompts, user templates, and UI notices are now configurable without code changes. Falls back gracefully if the YAML is missing.

Thai README (#337)

Added README_TH.md with Thai-language documentation.

Bug Fixes

Research pipeline crashed without a KB — the DE-all fallback KB no longer exists in most installs; now short-circuits with a structured skip event.
Decompose agent tried RAG against ai_textbook — replaced the hard-coded default with None and a defensive guard.
Bad channel config persisted silently — now rejected at the API boundary with a 422 before reaching disk.
Concurrent reload_channels created duplicate listeners — serialised via an asyncio lock; failure leaves the bot channel-less with a clear error instead of half-rebuilt.
Channel tokens leaked in API responses — now masked by default across all endpoints.

Test Suite

Added 6 new test modules (1,042 lines total): file-type routing, KB config migration, channel schema introspection, channel secret masking, RAG/KB consistency at the capability layer, and research pipeline RAG safety. Extended existing tests for the tool runtime, knowledge router, TutorBot router, and RAG pipeline modules.

What's Changed

feat(tutorbot): Channels tab, Telegram UI, API channel reload, token … by @srinivasrk in https://github.com/HKUDS/DeepTutor/pull/338
docs: add Thai README documentation by @DoctorNasa in https://github.com/HKUDS/DeepTutor/pull/337
release: v1.1.2 — CI fix & release notes cleanup by @pancacake in https://github.com/HKUDS/DeepTutor/pull/341

New Contributors

@DoctorNasa made their first contribution in https://github.com/HKUDS/DeepTutor/pull/337

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.1.1...v1.1.2

View release on GitHub

v1.1.1 Breaking risk 3mo

Notable features

Answer Now fast-paths added to all capabilities (chat, deep_solve, deep_question, deep_research, math_animator, visualize)
Co-Writer editor with resizable splitter and bidirectional scroll sync via source line mapping
Save-to-Notebook now supports message selection mode with quick presets and role-based icons

Full changelog

DeepTutor v1.1.1 Release Notes

Release Date: 2026.04.17

Highlights

Universal "Answer Now" Escape Hatch — Per-Capability Fast Paths

Promoted "Answer now" from a chat-only affordance to a universal interrupt that respects each capability's output shape. A new shared helper deeptutor/capabilities/_answer_now.py provides the gate (extract_answer_now_context) and the prompt-friendly trace summary, and every built-in capability now owns its own fast-path branch at the top of run():

chat — synthesize the final markdown answer from the partial trace (existing behavior).
deep_solve — skip planning + reasoning, jump straight into the writer.
deep_question — skip ideation/templates, emit the full quiz JSON in one structured call (still rendered by QuizViewer).
deep_research — skip rephrase/decompose/research, write the report directly from accumulated evidence.
math_animator — skip analysis + design + summary but keep code generation + render, so the user still gets a real animation.
visualize — skip analysis + review, emit the final renderable code in one structured call.

Each fast-path preserves the same result envelope as the normal pipeline (so the Quiz / MathAnimator / Visualization viewers render unchanged) and prepends a > ⚡ Skipped X stage(s) notice so users know it was a best-effort early exit. The orchestrator no longer re-routes answer_now to chat; it now keeps active_capability and only falls back to chat if the originally selected capability has been removed from the registry. The frontend matches: handleAnswerNow no longer overrides the snapshot's capability, and a single-shot guarded AnswerNowRow component renders the action inline below the streaming trace panel.

Co-Writer — Resizable Split & Line-Anchored Scroll Sync

The Co-Writer page picked up a draggable splitter between the editor and preview panes (with a persisted ratio in localStorage) and a true bidirectional scroll-sync that survives soft-wrapped lines. Each preview block now carries a data-source-line attribute pointing back at its starting line in the markdown source (provided by remark's AST positions); on the editor side a hidden mirror element mimics the textarea's wrap geometry so we can read the real pixel-y of every source line. With both sides expressed as pixel coordinates the sync becomes a single piecewise-linear interpolation in either direction, with a per-source-line cache that invalidates only when the content or wrap width changes.

Save-to-Notebook — Message Selection Mode

SaveToNotebookModal now accepts an optional messages prop that flips the modal into "selection mode": the user picks exactly which user/assistant turns to include, and the transcript + userQuery shipped to the backend are rebuilt from the selected subset. Quick presets ("Select all", "Last turn", "Last 3 turns") and an auto-derived title that tracks the first selected user message keep the flow fast for the common cases. The modal also now uses Check / MessageSquare / User icons to distinguish roles at a glance, and reports loading state for the notebook list separately from the save spinner.

Real Notebook System Adoption Across the Stack

Migrated every remaining Notebook surface off the legacy quiz-category API onto the real /api/v1/notebook/* endpoints. A new web/lib/notebook-api.ts block exports typed helpers — listNotebooks, getNotebook, createNotebook, updateNotebook, deleteNotebook, deleteNotebookRecord — alongside the preserved quiz-only helpers. The Knowledge → Notebooks tab, the Guide page's notebook reference picker (useNotebookSelection), and the Save-to-Notebook modal all now resolve UUIDs end-to-end, so records saved from Co-Writer, Chat, or Guided Learning are immediately discoverable as references everywhere.

Unified Collapsible Settings Panel

Extracted the collapsible "Settings" section that previously only existed in ResearchConfigPanel into a shared CollapsibleConfigSection component. The Quiz, Math Animator, and Visualize panels now share the exact same chevron + summary header, and each form ships a summarizeXxxConfig helper so the collapsed state shows a meaningful one-liner (e.g. Custom · 5q · Hard · MCQ or Mimic · paper.pdf · max 10). The chat page now keeps a single panelCollapsed state for whichever capability is active, auto-expands on capability switch, and auto-collapses after sending a message so the composer stays compact during conversation.

Streaming Stop Button & Composer Polish

Replaced the spinner-inside-the-Send-button with a dedicated Stop button that appears in place of Send while a turn is streaming. A faint ring slowly rotates around the rim to signal "still working — click to cancel", with a white square front-and-center as the click target. The header above the messages (capability label + Save / New chat buttons) is now always rendered, and the messages container picks up a soft mask gradient at the top and bottom so streaming content fades in/out instead of clipping at the scroll edge. In Deep Research mode, sources moved into a dropdown with a compact summary line of the active picks, matching the pattern used by the tool selector.

TutorBot Config Manager Refactor

Rewrote TutorBotManager's config persistence into a small public API (load_bot_config, save_bot_config, merge_bot_config) with three meaningful improvements: writes are now atomic (write-temp + Path.replace) so a killed process never leaves a half-written config.yaml; merges have explicit-clear semantics — None means "leave as-is", an empty string or empty dict is an intentional clear — so clients can deliberately wipe a description or channels list; and the API endpoint forwards only model_dump(exclude_unset=True) so omitted fields fall through to the on-disk value. New regression tests cover the atomic-write contract, the corrupt-yaml fallback, and the four merge-semantics cases.

Markdown Renderer Refinements

The MarkdownRenderer family gained a trackSourceLines prop that propagates data-source-line attributes through every block element (headings, lists, paragraphs, etc.) and bypasses the line-shifting normalization passes (processMarkdownContent, normalizeMarkdownForDisplay) so AST positions stay faithful for editor/preview sync consumers. RichCodeBlock now skips react-syntax-highlighter entirely for unlabeled / text / plaintext fences (eliminating Prism "unknown language" warnings) and renders them as a tidy plain-monospace block. Mermaid detection was also extended to recognize editor.md style ```flow, ```seq, and ```sequence fences that get rewritten to mermaid by the preprocessor.

Theme & Guide UI Refresh

Tightened the default light and Snow themes with deeper foregrounds, warmer borders, and a slightly more saturated --primary (#B0501E) for better legibility against the new card surfaces. The Guided Learning page (/guide) was migrated off hardcoded slate-* / indigo-* palettes onto the design tokens (var(--card), var(--primary), var(--muted-foreground), etc.) so it now respects the theme switcher. HistorySessionPicker got the same treatment, plus a fix for session timestamps that were being treated as milliseconds instead of seconds (which produced 1970 dates).

System Message Rendering Fix

Backend system messages (e.g. quiz follow-up grounding context written by the turn runtime) are now filtered out at the UnifiedChatContext.hydrateMessages boundary and again defensively in ChatMessageList, so they never surface as ghost chat bubbles in the UI while still flowing into the LLM context as intended.

Bug Fixes

TutorBot channel config wiped on every server restart (#332) — create_and_start_bot was constructing a fresh BotConfig with empty defaults on every call, which _save_bot_config then persisted over the existing config.yaml, wiping user-configured channels (e.g. Telegram). The endpoint now loads the existing config first and overlays only client-supplied fields.
selective_access_log middleware crash on every non-200 response (#334 / #335) — the middleware passed four args to uvicorn's AccessFormatter which expects five (omitting http_version), raising ValueError: not enough values to unpack on every error response. Now reads http_version from the ASGI scope with a 1.1 fallback.
15 npm security vulnerabilities (#330) — bumped jspdf 4.0.0 → 4.2.0 (9 CVEs incl. critical PDF injection), next 16.1.1 → 16.2.3 (8 CVEs incl. HTTP smuggling, CSRF bypass), mermaid 11.12.2 → 11.14.0, and the matching eslint-config-next. npm audit fix swept up the indirect chain (flatted, lodash-es, minimatch, picomatch, dompurify, ajv, brace-expansion). End state: 0 vulnerabilities, no breaking changes.
.env.example_CN — removed an accidental // README suffix on the provider-list comment.

Test Suite Expansion

Added a new tests/services/tutorbot/test_manager_config.py module covering load/save round-trips, the corrupt-yaml fallback, atomic temp-file writes, failure-recovery after a mid-write OSError, and all four merge_bot_config semantics (no existing config, omitted-field passthrough, None-as-not-provided, and empty-value-as-explicit-clear). Extended tests/api/test_tutorbot_router.py with the explicit-clear test class. Rewrote the orchestrator answer-now routing tests to pin the new contract — active_capability is preserved when answer_now_context is set, the orchestrator falls back to chat only when the original capability is missing, and emits a clear error when neither is registered. Added a new tests/capabilities/test_answer_now.py module with 28 cases covering the shared helpers (extract_answer_now_context, format_trace_summary truncation/i18n, make_skip_notice, labeled_block, join_chunks) and every per-capability fast-path: chat, deep_solve, deep_question (including the unparseable-JSON fallback), deep_research, visualize (including code-fence stripping and invalid render_type recovery), and math_animator (verifying that run_analysis/run_design/run_summary are not invoked while run_code_generation/run_render are). A reverse test confirms that capabilities still take their normal pipeline when no answer_now_context is present.

What's Changed

Fix: TutorBot channel config wiped on every server restart by @srinivasrk in https://github.com/HKUDS/DeepTutor/pull/332
fix(api): add missing http_version arg in selective_access_log middleware by @kagura-agent in https://github.com/HKUDS/DeepTutor/pull/335
fix(web): resolve 15 npm security vulnerabilities by @srinivasrk in https://github.com/HKUDS/DeepTutor/pull/330

Community Contributions

@srinivasrk — Resolve 15 npm security vulnerabilities (#330)
@srinivasrk — Preserve existing TutorBot config when starting bot via API (#332)
@kagura-agent — Fix selective_access_log middleware unpack crash (#335)

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v1.1.0...v1.1.1

View release on GitHub

v1.1.0 Breaking risk 3mo

Breaking changes

LLM_TEST_MAX_TOKENS environment variable removed; use agents.yaml diagnostics.llm_probe.max_tokens instead

Notable features

LaTeX block math parsing improved with automatic delimiter promotion for multiline content
Extra headers forwarding in llm_complete and llm_stream functions
SaveToNotebookModal migrated to UUID-based notebook list API

View release on GitHub

v1.1.0-beta Breaking risk 3mo

Notable features

Resolved keystroke lag via message list virtualization and scroll debouncing
URL-based chat routing for bookmarkable and shareable sessions
WebSocket heartbeat and auto-reconnect with exponential backoff recovery

View release on GitHub

v1.0.3 Breaking risk 3mo

Notable features

Question Notebook with bookmarking and category-based organization
Mermaid diagram support in Visualize
Embedding model mismatch detection for knowledge bases

View release on GitHub

v1.0.2 Breaking risk 3mo

Notable features

Automatic consolidation for any provider without template
SearXNG generic fallback formatter

View release on GitHub

v1.0.1 New feature 3mo

Notable features

Visualize capability with Chart.js/SVG pipeline
Explicit Reference picker in chat composer
Quiz duplicate prevention

View release on GitHub

v1.0.0-beta.4 New feature 3mo

Notable features

Embedding progress tracking with batch reporting
HTTP 429 retry with exponential back-off
Cross-platform dependency auto-installation

View release on GitHub

v1.0.0-beta.3 Mixed 3mo

Notable features

Native openai and anthropic SDK integration
Windows Math Animator subprocess compatibility with asyncio.Queue
Full UI internationalization (English, Chinese)

View release on GitHub

v1.0.0-beta.2 Breaking risk 3mo

Breaking changes

Python 3.10 support dropped; Python 3.11+ required

Notable features

Hot settings reload without restart
MinerU nested output directory support

View release on GitHub

v1.0.0-beta.1 Breaking risk 3mo

Breaking changes

Complete package restructure: src/→deeptutor/+deeptutor_cli/
Package renamed from ai-tutor to deeptutor
LightRAG and RAG-Anything pipelines temporarily removed

Notable features

Agent-native runtime with two-layer plugin model (Tools + Capabilities)
Three unified entry points: CLI, WebSocket API, Python SDK
TutorBot multi-channel system supporting 12 messaging platforms

View release on GitHub

v0.6.0 New feature 6mo

Notable features

Frontend session persistence across refreshes
Incremental document upload to knowledge bases
Full Chinese localization with i18n

Full changelog

DeepTutor v0.6.0 Release Notes

Release Date: 2026.01.23

Highlights

Frontend State Persistence

Implemented robust session persistence across the application:

Solver, Guide, and other sessions now persist across browser refreshes
Improved state management with dedicated persistence layer
Better user experience with session continuity

Incremental Document Upload

Enhanced knowledge base with incremental document processing:

Add new documents to existing knowledge bases without full re-indexing
Significant performance improvement for large document collections
Smarter document change detection

Flexible RAG Pipeline Import

Refactored RAG initialization for better compatibility:

On-demand loading of RAG libraries (RAG-Anything, LlamaIndex)
Reduced startup time and memory footprint
Graceful fallback when optional dependencies are unavailable

Full Chinese Localization (i18n)

Added complete Chinese language support for the web interface:

Comprehensive translation across all pages and components
Dynamic language switching without page reload
i18n audit tools for translation consistency

Bug Fixes & Improvements

Enhanced LLM retry mechanism for complex agent operations
Fixed temperature parameter handling issues
Docker build optimizations and npm compatibility fixes
Added api_version parameter for Azure OpenAI support

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v0.5.2...v0.6.0

View release on GitHub

v0.5.2 New feature 6mo

Notable features

Docling alternative for RAG-Anything initialization
Logging system refactoring

Full changelog

DeepTutor v0.5.2 Release Notes

Release Date: 2026.01.18

Highlights

Docling Support for RAG-Anything

Added alternative RAG-Anything initialization using Docling as the document parser:

For users whose local environment is not suitable for MinerU
Provides a lightweight alternative for document processing
Same multimodal graph capabilities with different backend

Logging System Optimization

Refactored the logging system for better management:

Improved log output control across all modules
Better structured logging adapters
Enhanced console, file, and WebSocket handlers

Bug Fixes & Code Improvements

Optimized code structure across multiple modules
Fixed several bugs affecting user experience
Improved CI/CD workflows with Python 3.10/3.11 matrix testing

Full Changelog: https://github.com/HKUDS/DeepTutor/compare/v0.5.1...v0.5.2

View release on GitHub

v0.5.1 New feature 6mo

Notable features

Docling support for RAG-Anything document parsing
Logging system optimization for better control
Enhanced CI/CD workflows with Python 3.10/3.11 matrix

View release on GitHub

v0.5.0 Breaking risk 6mo

Notable features

Unified configuration system with environment variable references
Per-knowledge-base RAG pipeline selection (LlamaIndex, LightRAG, RAG-Anything)
Question generation overhaul with specialized agent architecture

View release on GitHub

v0.4.1 Breaking risk 6mo

Breaking changes

src/core module removed; migrate imports to src/services (load_config_with_main → src.services.config, llm_factory → src.services.llm, prompt_manager → src.services.prompt, logging → src.logging)

Notable features

LLM provider system overhaul with three deployment modes
Provider presets for OpenAI, Anthropic, DeepSeek, Ollama, LM Studio, vLLM, llama.cpp
Question generation JSON parsing robustness

View release on GitHub

v0.4.0 Breaking risk 6mo

Breaking changes

Environment variables renamed: OPENAI_API_KEY→LLM_API_KEY, OPENAI_API_BASE→LLM_HOST, EMBEDDING_DIM→EMBEDDING_DIMENSION
New required variables: LLM_BINDING, EMBEDDING_BINDING
Removed cloud providers from local settings page

Notable features

Multi-provider LLM support (OpenAI, Anthropic, Azure, Ollama, Groq, DeepSeek, Gemini)
Multi-provider embedding support (OpenAI, Jina, Cohere, Ollama, HuggingFace)
Dark mode with theme toggle and localStorage persistence

View release on GitHub

v0.3.0 Breaking risk 6mo

Notable features

Centralized PromptManager singleton with global caching and language fallback
GitHub Actions workflows for testing, dependencies, and Docker publishing
Pre-built Docker images via GitHub Container Registry

View release on GitHub

v0.2.0 Security relevant 6mo

Security fixes

Path traversal and injection vulnerabilities
RCE and LFI vulnerabilities

Notable features

Docker multi-stage deployment
Next.js 16 and React 19 upgrade

View release on GitHub

All releases

DeepTutor v1.3.10 Release Notes

Highlights

Remote Docker and CORS Recovery

Provider TLS and Rendering Fixes

Matrix Install Compatibility

Multi-User Runtime Compatibility

Tests

Upgrade Notes

DeepTutor v1.3.9 Release Notes

Highlights

TutorBot Channel and Provider Expansion

Model and Runtime Reliability

Web and CLI Polish

Multi-User and Session Store Parity

Tests

Upgrade Notes

What's Changed

New Contributors

DeepTutor v1.3.8 Release Notes

Highlights

Multi-User Workspaces

Safer Runtime Boundaries

Deployment and UI

Tests

Upgrade Notes

DeepTutor v1.3.7 Release Notes

Highlights

Thinking-Model and Gateway Compatibility

Knowledge Index Visibility

Co-Writer Editing Safety

Tests

Upgrade Notes

DeepTutor v1.3.6 Release Notes

Highlights

Catalog-Based Model Selection

TutorBot Model Control

RAG and Knowledge Reliability

Provider and Launch Fixes

Web UX Polish

Tests

Upgrade Notes

DeepTutor v1.3.5 Release Notes

Highlights

Smoother Local Launch

More Reliable RAG Tool Calls

Local Embedding Compatibility

Web UX Polish

Tests

Upgrade Notes

DeepTutor v1.3.4 Release Notes

Highlights

Book Engine, Page Chat, and Book References

Chat Language and Reasoning-Model Behavior

RAG, Documents, and Knowledge Base Recovery

Settings, Runtime State, and Logging

Documentation and Localization

Tests

Upgrade Notes

DeepTutor v1.3.3 Release Notes

Highlights

Provider and Embedding Coverage

Space, Chat Context, Skills, and Memory

Session Persistence and Message Normalization

Memory, Notebook, and Thinking-Model Cleanup

RAG and Knowledge Base Resilience

Tests

Upgrade Notes

DeepTutor v1.3.2 Release Notes

Highlights

Transparent Embedding Endpoint URLs

RAG Re-index and Retrieval Resilience

Memory Cleanup for Thinking Models

Settings and Runtime Polish

Tests

Upgrade Notes

DeepTutor v1.3.1 Release Notes

Highlights

Safer RAG and Knowledge Base Routing

Embedding and Index Reliability

Tri-State `send_dimensions` for Embeddings (#368)

CLI: Notebook `add-md` and `replace-md`

Auto-Fallback for `response_format` Rejection