Text Generation Web UI

v4.9 Security

This release includes 4 security fixes relevant to teams reviewing exposed deployments.

Published 14d LLM Frameworks
No CVE identifiers are listed for the security fixes in this release.

Affected surfaces

auth rce_ssrf

Summary

AI summary

Broad release covering security hardening, new features, bug fixes, dependency updates, and portable builds (including Vulkan and Linux aarch64).

Changes in this release

Security Medium

Restrict CORS to localhost by default to prevent drive-by API access.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Security Medium

UI: Improve web search security by rejecting non-HTTP links.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high
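
The link check described above can be sketched as a scheme allowlist. This is a minimal illustration, not the project's actual code: the function name `is_http_url` is hypothetical, and it assumes that "non-HTTP" means anything other than `http` and `https`.

```python
from urllib.parse import urlparse

# Assumption: both http and https count as "HTTP links".
ALLOWED_SCHEMES = {"http", "https"}


def is_http_url(url: str) -> bool:
    """Return True only for absolute http(s) URLs that include a host."""
    try:
        parsed = urlparse(url.strip())
    except ValueError:
        # urlparse raises ValueError for malformed inputs such as a broken IPv6 host.
        return False
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)


if __name__ == "__main__":
    for candidate in ("https://example.com/page", "file:///etc/passwd", "javascript:alert(1)"):
        print(candidate, is_http_url(candidate))
```

Using an allowlist rather than a blocklist means schemes such as `file:`, `javascript:`, or `data:` are rejected without having to enumerate them.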

Security Medium

Sanitize character name in load_character to prevent path traversal.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: low

Security Medium

Fix path traversal in load_template_by_name (#7562).

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: low
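
Both path traversal fixes above guard a lookup that turns a user-supplied name into a file path. One common way to do that is to resolve the final path and confirm it still sits inside the intended directory. The sketch below illustrates that general technique only; `resolve_inside`, the `.yaml` suffix, and the `user_data/characters` directory are assumptions for illustration, not the project's implementation.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, name: str, suffix: str = ".yaml") -> Path:
    """Build base_dir/name+suffix and reject results that escape base_dir."""
    base = base_dir.resolve()
    candidate = (base / f"{name}{suffix}").resolve()
    # Path.is_relative_to requires Python 3.9 or newer.
    if not candidate.is_relative_to(base):
        raise ValueError(f"Refusing to load a file outside {base}: {name!r}")
    return candidate


if __name__ == "__main__":
    characters = Path("user_data/characters")  # hypothetical directory
    print(resolve_inside(characters, "Assistant"))
    try:
        resolve_inside(characters, "../../secrets")
    except ValueError as exc:
        print("blocked:", exc)
```

Checking the resolved path, rather than stripping `..` from the input, also covers absolute paths passed in as names.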

Feature Medium

Add draft-mtp as new --spec-type option for MTP speculative decoding support.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

Web search results now include snippet excerpts, reducing need for fetch_webpage calls.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

Drop link URLs from fetch_webpage output, showing plain text links instead of markdown.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

Prettier rendering of web_search results in chat with spinner during call.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

Add info message to Activate web search checkbox.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

Show live generation speed (tokens/s) and context size while generating.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

Add Linux aarch64 portable builds for DGX Spark support.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

Add Check for updates button in Electron Session tab.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

Add folder picker for models directory in Electron.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

Add right-click context menu for copying text in Electron.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

Add spellcheck toggle in Electron Session tab.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

Store app data in user_data/cache/electron instead of OS default location.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

Disable DNS-over-HTTPS probes in Electron.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

Auto-detect and auto-select sibling mmproj files when loading a model.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

Detect mmproj-*.gguf files in main models folder, appearing in mmproj dropdown and hidden from regular model dropdown.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high
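
The split between the two dropdowns described above amounts to a filename pattern match. The following is a rough sketch under that assumption; `split_model_files` is a hypothetical helper, and the real detection logic may consider more than the filename.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Tuple


def split_model_files(filenames: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate regular GGUF models from mmproj-*.gguf projector files."""
    models: List[str] = []
    mmproj: List[str] = []
    for name in sorted(filenames):
        lowered = name.lower()
        if not lowered.endswith(".gguf"):
            continue  # ignore non-GGUF files entirely
        if fnmatchcase(lowered, "mmproj-*.gguf"):
            mmproj.append(name)  # shown only in the mmproj dropdown
        else:
            models.append(name)  # shown in the regular model dropdown
    return models, mmproj


if __name__ == "__main__":
    print(split_model_files(["model-Q4.gguf", "mmproj-model-f16.gguf", "notes.txt"]))
```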

Feature Medium

Treat negative --ctx-size values as auto (0).

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high
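
The normalization described above can be illustrated with a standalone parser. This is only a sketch: the parser below is not the project's own argument parser, and `normalize_ctx_size` is a hypothetical helper. Only the `--ctx-size` flag name and the "0 means auto" convention come from the release notes.

```python
import argparse


def normalize_ctx_size(value: int) -> int:
    """Map any negative context size to 0, which the release notes describe as auto."""
    return 0 if value < 0 else value


# Standalone parser for illustration only.
parser = argparse.ArgumentParser()
parser.add_argument("--ctx-size", type=int, default=0)

args = parser.parse_args(["--ctx-size", "-1"])
ctx_size = normalize_ctx_size(args.ctx_size)
print(ctx_size)
```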

Feature Medium

Add drag-and-drop file upload support to chat input (Gradio fork).

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: high

Feature Medium

One-click installer now tracks latest release tag, not bleeding-edge main.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: low

Feature Medium

Add project icon courtesy of LMLocalizer on Reddit.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: low

Feature Medium

Reorganize right sidebar with Mode/Character/Chat style on top.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: low

Feature Medium

Hide reasoning and tools controls in chat mode, shown only in instruct/chat-instruct.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: low

Feature Medium

Fade in new messages, fix scroll-up jump on send.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: low

Feature Medium

Rename Send dummy message/reply to Insert user/assistant message.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: low

Feature Medium

Polish character dropdown in chat tab.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: low

Feature Medium

Tighten spacing between dropdowns and refresh buttons.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: low

Feature Medium

Improve looks of Session tab.

Source: granite4.1:8b-q6_K@2026-05-20

Confidence: low

Full changelog

Changes

  • MTP speculative decoding support: Add draft-mtp as a new --spec-type option. Auto-enabled when loading MTP GGUFs (e.g. Qwen 3.6 MoE MTP builds).
  • Web search improvements:
    • Add snippet support to the web_search tool: results now include a short text excerpt that often answers the query directly, eliminating the need for a follow-up fetch_webpage call (#7548).
    • Drop link URLs from fetch_webpage output (links now appear as plain text instead of [text](url) markdown), significantly reducing tokens used per page.
    • Prettier rendering of web_search results in the chat, with a spinner during the call.
    • Add an info message to the "Activate web search" checkbox.
  • Show live generation speed (tokens/s) and context size while generating (#7563).
  • DGX Spark support: Add Linux aarch64 portable builds.
  • Electron
    • Add "Check for updates" button in the Session tab.
    • Add a folder picker for the models directory.
    • Add right-click context menu for copying text.
    • Add a spellcheck toggle in the Session tab (#7550).
    • Store app data in user_data/cache/electron instead of the OS default location.
    • Disable DNS-over-HTTPS probes.
  • One-click installer: Track the latest release tag instead of bleeding-edge main.
  • Auto-detect and auto-select sibling mmproj files when loading a model (#7564).
  • Detect mmproj-*.gguf files in the main models folder: They appear in the mmproj dropdown and are hidden from the regular model dropdown.
  • Project icon: Add an icon, courtesy of LMLocalizer on Reddit.
  • Treat negative --ctx-size values as auto (0).
  • UI
    • Add drag-and-drop file upload support to the chat input (Gradio fork).
    • Reorganize the right sidebar with Mode/Character/Chat style on top.
    • Hide reasoning and tools controls in chat mode (only shown in instruct / chat-instruct).
    • Fade in new messages, fix scroll-up jump on send.
    • Rename "Send dummy message/reply" to "Insert user/assistant message".
    • Polish character dropdown in chat tab.
    • Tighten spacing between dropdowns and refresh buttons.
    • Improve the looks of the Session tab.

Security

  • Restrict CORS to localhost by default to prevent drive-by API access. --listen and --public-api opt into network exposure.
  • Sanitize character name in load_character to prevent path traversal.
  • Fix path traversal in load_template_by_name (#7562). Thanks, @Allen930311.
  • UI: Improve web search security by rejecting non-HTTP links.
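
To make the CORS item above concrete, here is a minimal sketch of a localhost-only origin check. It is not the project's code: `allowed_origin` and the `network_exposed` parameter are hypothetical, and treating `--listen` or `--public-api` as a single boolean that lifts the restriction is an assumption about how the opt-in might be modeled.

```python
from typing import Optional
from urllib.parse import urlparse

# Hosts treated as local; an assumption for this sketch.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def allowed_origin(origin: Optional[str], network_exposed: bool = False) -> Optional[str]:
    """Return the value to echo in Access-Control-Allow-Origin, or None to deny."""
    if not origin:
        return None
    if network_exposed:
        # Hypothetical stand-in for the user opting into network exposure.
        return origin
    try:
        host = urlparse(origin).hostname
    except ValueError:
        return None
    return origin if host in LOCAL_HOSTS else None


if __name__ == "__main__":
    print(allowed_origin("http://localhost:7860"))
    print(allowed_origin("https://evil.example"))
```

With a default like this, a page on another site that a user happens to visit cannot read responses from the local API in the browser, which is the "drive-by" scenario the fix targets.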

Bug fixes

  • Fix llama-server not being killed when the parent process exits on Windows, e.g. when closing the console window or killing python.exe (#7574).
  • Fix streaming output leaking across chats when switching mid-stream (#7555).
  • Fix continue-mode regressions across template families.
  • Fix incorrect prompts generated with continue mode. Thanks, @MeemeeLab.
  • Fix thinking channel being lost across tool-call turns (#7578).
  • Fix API model load silently dropping hyphenated arg keys (#7577).
  • Fix chat deletion failing when user_data/logs is a symlink (#7579).
  • Fix token count not being set in non-streaming mode.
  • Keep web search blocks closed when the user closes them mid-stream.
  • Windows: Set PYTHONUTF8 for compatibility with non-ASCII locales (#7560). Thanks, @jerry78424.
  • Set TORCH_VERSION to 2.9.0 to match xformers 0.0.33's torch pin (#7581). Thanks, @AJ-Gazin.
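
The PYTHONUTF8 item above relies on Python's UTF-8 mode, which makes the interpreter default to UTF-8 for file and console I/O regardless of the system locale. The sketch below shows the general mechanism by launching a child interpreter with the variable set. How and where the project sets it is not stated in the notes, so `run_with_utf8_mode` is purely illustrative.

```python
import os
import subprocess
import sys


def run_with_utf8_mode(code: str) -> str:
    """Run a Python snippet in a child process with UTF-8 mode enabled."""
    env = dict(os.environ, PYTHONUTF8="1")  # copy the environment, then add the flag
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # sys.flags.utf8_mode is 1 when UTF-8 mode is active.
    print(run_with_utf8_mode("import sys; print(sys.flags.utf8_mode)"))
```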

Dependency updates

  • Update llama.cpp to https://github.com/ggml-org/llama.cpp/commit/e947228222147356bc7e64154d3439e142481632
  • Update ik_llama.cpp to https://github.com/ikawrakow/ik_llama.cpp/commit/40254a51daf485b2b644bcb82a84278d95745ee5
  • Update ExLlamaV3 to 0.0.34

Portable builds

TextGen is now a desktop app for local LLMs. Download, unzip, double-click.

Note (NVIDIA GPU): If nvidia-smi reports CUDA Version >= 13.1, use the cuda13.1 build. Otherwise, use cuda12.4.
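
The rule in the note above can be written as a small version comparison. This is a sketch only: `choose_cuda_build` is a hypothetical helper, and it expects the version string as it appears after "CUDA Version:" in the nvidia-smi header, for example "13.1".

```python
def choose_cuda_build(reported_version: str) -> str:
    """Pick the portable build name from the CUDA version reported by nvidia-smi."""
    parts = reported_version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    # Compare as a (major, minor) tuple so that 13.0 sorts below 13.1.
    return "cuda13.1" if (major, minor) >= (13, 1) else "cuda12.4"


if __name__ == "__main__":
    for version in ("12.8", "13.0", "13.1"):
        print(version, choose_cuda_build(version))
```

Comparing tuples instead of floats avoids mistakes with versions such as 13.10, which a float comparison would treat as equal to 13.1.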

ik_llama.cpp is a llama.cpp fork with new quant types. If unsure, use the llama.cpp column.

Windows

| GPU/Platform | llama.cpp | ik_llama.cpp |
|---|---|---|
| NVIDIA (CUDA 12.4) | Download (936 MB) | Download (1.24 GB) |
| NVIDIA (CUDA 13.1) | Download (840 MB) | Download (1.33 GB) |
| AMD/Intel (Vulkan) | Download (336 MB) | — |
| AMD (ROCm 7.2) | Download (617 MB) | — |
| CPU only | Download (319 MB) | Download (335 MB) |

Linux

| GPU/Platform | llama.cpp | ik_llama.cpp |
|---|---|---|
| NVIDIA (CUDA 12.4) | Download (893 MB) | Download (1.21 GB) |
| NVIDIA (CUDA 13.1) | Download (826 MB) | Download (1.33 GB) |
| NVIDIA ARM64 (CUDA 13.1) | Download (910 MB) | — |
| AMD/Intel (Vulkan) | Download (324 MB) | — |
| AMD (ROCm 7.2) | Download (409 MB) | — |
| CPU only | Download (307 MB) | Download (338 MB) |

macOS

macOS note: You need to run xattr -cr /path/to/your/textgen-folder on the extracted folder before launching. See https://github.com/oobabooga/textgen/issues/7558.

| Architecture | llama.cpp |
|---|---|
| Apple Silicon (arm64) | Download (272 MB) |
| Intel (x86_64) | Download (284 MB) |

Updating a portable install:

  1. Download and extract the latest version.
  2. Replace the new version's user_data folder with the one from your existing install. All your settings and models carry over.

Starting with 4.0, you can also move user_data one folder up, next to the install folder. It will be detected automatically, making updates easier:

textgen-4.6/
textgen-4.7/
user_data/    <-- shared by both installs
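
The shared layout above implies a lookup that checks next to the install folder as well as inside it. The sketch below shows one way such a lookup could work. `find_user_data` is hypothetical, and the search order (inside the install first, then one folder up) is an assumption, since the notes only say the shared folder is detected automatically.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_user_data(install_dir: Path) -> Optional[Path]:
    """Return the first user_data folder found inside or next to the install folder."""
    # Assumed order: the install's own folder first, then the shared one a level up.
    for candidate in (install_dir / "user_data", install_dir.parent / "user_data"):
        if candidate.is_dir():
            return candidate
    return None


# Demo: recreate the layout above in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "textgen-4.7").mkdir()
    (root / "user_data").mkdir()
    found = find_user_data(root / "textgen-4.7")
    print(found)
```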

About Text Generation Web UI

The original local LLM interface. Text, vision, tool-calling, training, and more. 100% offline.
