Text Generation Web UI

v4.8 Security

This release patches 1 CVE for security teams tracking exposure across their dependency inventory.

Published 27d LLM Frameworks

View tool

1 patched CVE

Read the diff → Tool health → What is this tool? →

This release patches 1 known CVE CVE-2023-4863 EPSS 93%

1 CVEs patched

Summary

AI summary

Redesigned chat composer with taller input and bottom‑pinned actions.

Full changelog

Changes

Redesigned chat composer: Taller input area with the paperclip and message-action buttons pinned to the bottom, similar to Gemini and DeepSeek.
Smooth scroll animation when sending a new message: Inspired by Gemini's chat UI.
Electron improvements:
- Persist window bounds and maximize state across launches.
- Add a --no-electron flag to skip the desktop window and use the web UI in the browser instead.
- Disable spellcheck in the chat input.
API: Add support for list-format content in tool and assistant messages.
Add more space below the last chat/chat-instruct message so its action buttons have breathing room.

Bug fixes

Fix speculative decoding broken by upstream llama.cpp arg renames (#7541).
Fix truncation length reverting after model load on UI reload (#7540).
Don't clear the chat input when sending a message with no model loaded (#7542).
Electron:
- Fix big character picture failing to load (#7540).
- Fix --listen mode in the launcher.
- Fix missing log colors on Windows.

Dependency updates

Update llama.cpp to https://github.com/ggml-org/llama.cpp/commit/68380ae11b564af67196afc70f10c99dbb532fa9
Update ik_llama.cpp to https://github.com/ikawrakow/ik_llama.cpp/commit/9a26522af234f8db079ae3735f35ab6c20fe2c66

Portable builds

TextGen is now a desktop app for local LLMs. Download, unzip, double-click.

[!NOTE]
NVIDIA GPU: If nvidia-smi reports CUDA Version >= 13.1, use the cuda13.1 build. Otherwise, use cuda12.4.

ik_llama.cpp is a llama.cpp fork with new quant types. If unsure, use the llama.cpp column.

Windows

| GPU/Platform | llama.cpp | ik_llama.cpp |
|---|---|---|
| NVIDIA (CUDA 12.4) | Download (891 MB) | Download (1.23 GB) |
| NVIDIA (CUDA 13.1) | Download (817 MB) | Download (1.33 GB) |
| AMD/Intel (Vulkan) | Download (336 MB) | — |
| AMD (ROCm 7.2) | Download (604 MB) | — |
| CPU only | Download (319 MB) | Download (334 MB) |

Linux

| GPU/Platform | llama.cpp | ik_llama.cpp |
|---|---|---|
| NVIDIA (CUDA 12.4) | Download (848 MB) | Download (1.20 GB) |
| NVIDIA (CUDA 13.1) | Download (803 MB) | Download (1.33 GB) |
| AMD/Intel (Vulkan) | Download (324 MB) | — |
| AMD (ROCm 7.2) | Download (396 MB) | — |
| CPU only | Download (307 MB) | Download (334 MB) |

macOS

| Architecture | llama.cpp |
|---|---|
| Apple Silicon (arm64) | Download (271 MB) |
| Intel (x86_64) | Download (283 MB) |

Updating a portable install:

Download and extract the latest version.
Replace the user_data folder with the one in your existing install. All your settings and models will be moved.

Starting with 4.0, you can also move user_data one folder up, next to the install folder. It will be detected automatically, making updates easier:

textgen-4.6/
textgen-4.7/
user_data/    <-- shared by both installs

View diff on GitHub

Weekly OSS security release digest.

The CVE patches and breaking changes that affected production tools this week. One email, every Sunday.

No spam, unsubscribe anytime.

Share this release

Share on X Share on Bluesky

Track Text Generation Web UI

Get notified when new releases ship.

About Text Generation Web UI

The original local LLM interface. Text, vision, tool-calling, training, and more. 100% offline.

All releases →