langroid

v0.65.0 Breaking

This release includes 2 breaking changes for platform teams planning a safe upgrade.

Published 1mo AI Agents & Assistants

✓ No known CVEs patched


Topics

agents ai chatgpt function-calling llm gpt-4 gpt4 information-retrieval language-model llama llm-agent llm-framework local-llm multi-agent-systems openai-api retrieval-augmented-generation

Affected surfaces

breaking_upgrade

ReleasePort's take

Moderate signal

editorial:auto 1mo

The default PDF parser now uses pypdfium2 instead of pymupdf4llm, removing the AGPL dependency. The pymupdf packages are removed from core and offered as optional extras.

Why it matters: Affects any code relying on pymupdf4llm or pymupdf for PDF parsing; migration required before next upgrade cycle.

Summary

AI summary

Switches the default PDF parser from pymupdf4llm to pypdfium2, moves pymupdf4llm and pymupdf to opt-in extras, and documents the migration path for code that relied on the old default.

Changes in this release

Type	Severity	Summary	CVE
Breaking	High	Default PDF parser switched from pymupdf4llm to pypdfium2; AGPL dependency removed. (Source: llm_adapter@2026-05-29; confidence: high)	—
Feature	Low	`pypdfium2` added as core dependency; `PyPDFium2Parser` extracts text page‑by‑page. (Source: llm_adapter@2026-05-29; confidence: high)	—
Deprecation	Medium	`pymupdf4llm` and `pymupdf` removed from core dependencies; available as opt‑in extras. (Source: llm_adapter@2026-05-29; confidence: high)	—

Full changelog

Permissive-by-default PDF parsing — no AGPL in the default install

Langroid is MIT-licensed, but until now a plain `pip install langroid` pulled in
`pymupdf4llm` (and transitively `pymupdf`), which are AGPL-3.0 licensed. This
release removes that AGPL dependency from the default install and switches the
default PDF parser to the permissively licensed `pypdfium2`
(Apache-2.0 / BSD-3-Clause). Resolves #1026.
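
To check whether an existing environment still carries AGPL-licensed packages after upgrading, a rough audit along the following lines may help. This is a minimal sketch using only the standard library: the helper names `is_agpl` and `find_agpl_distributions` are hypothetical, and the check is a simple substring heuristic over package metadata, not a legal determination.

```python
from importlib import metadata


def is_agpl(license_text, classifiers):
    """Heuristic: True if the license metadata mentions the AGPL."""
    haystack = " ".join([license_text or "", *classifiers]).lower()
    return "agpl" in haystack or "affero" in haystack


def find_agpl_distributions():
    """Return sorted names of installed distributions whose metadata mentions AGPL."""
    hits = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        if meta is None:  # skip installs with missing metadata
            continue
        # Older packages use "License"; newer ones may use "License-Expression".
        fields = [meta.get("License"), meta.get("License-Expression")]
        license_text = " ".join(f for f in fields if f)
        classifiers = meta.get_all("Classifier") or []
        if is_agpl(license_text, classifiers):
            hits.add(meta.get("Name", "unknown"))
    return sorted(hits)


if __name__ == "__main__":
    for name in find_agpl_distributions():
        print(name)
```

Run inside the target virtual environment, this prints the name of any installed distribution whose metadata mentions the AGPL. Packages that declare their license only in a bundled file would not be detected.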

What changed

- `pypdfium2` is now the default PDF parser, added as a core dependency. A new `PyPDFium2Parser` extracts text page-by-page via PDFium.
- `pymupdf4llm` / `pymupdf` removed from core dependencies. They remain available as opt-in extras: `doc-chat`, `pdf-parsers`, `all`, or `pymupdf4llm`.
- `DocChatAgent` now defaults to `pypdfium2` as well, so document-chat works out of the box on a bare install with no AGPL code.
- Per the py-pdf benchmarks (https://github.com/py-pdf/benchmarks), `pypdfium2` matches or exceeds `pymupdf` on raw text-extraction accuracy.

Breaking change & migration

A bare `pip install langroid` no longer installs `pymupdf4llm`/`pymupdf`, and the
default PDF parser now emits plain text rather than `pymupdf4llm`'s structured
Markdown.
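
Downstream code that splits extracted text on Markdown headers is one place this change can surface, because plain-text output has no `#` lines to split on. The sketch below shows one possible way to tolerate both output styles. The function `split_sections` is hypothetical and not part of Langroid; it falls back to blank-line paragraph splitting when no headers are found.

```python
import re

# A Markdown ATX header: 1-6 '#' characters, then spaces/tabs, then text.
HEADER_RE = re.compile(r"^#{1,6}[ \t]+\S", re.MULTILINE)


def split_sections(text: str) -> list[str]:
    """Split on Markdown headers if present, otherwise on blank lines."""
    if HEADER_RE.search(text):
        # Zero-width split just before each header line (Python 3.7+).
        parts = re.split(r"(?m)^(?=#{1,6}[ \t]+\S)", text)
    else:
        # Plain text: treat blank-line-separated paragraphs as sections.
        parts = re.split(r"\n\s*\n", text)
    return [p.strip() for p in parts if p.strip()]


markdown_style = "# Intro\nSome text.\n\n## Details\nMore text."
plain_style = "Intro\nSome text.\n\nDetails\nMore text."

print(split_sections(markdown_style))
print(split_sections(plain_style))
```

Both sample inputs yield two sections, but only the Markdown-style input keeps header markers that later stages could use for titles or hierarchy.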

If you want the richer Markdown extraction (headers, tables, multi-column reflow)
from pymupdf4llm, install an extra and select it explicitly:

pip install "langroid[doc-chat]"   # or [pdf-parsers], [all], [pymupdf4llm]

from langroid.parsing.parser import ParsingConfig, PdfParsingConfig

cfg = ParsingConfig(pdf=PdfParsingConfig(library="pymupdf4llm"))
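
Where the same code runs in environments both with and without the extras, it may be useful to check whether `pymupdf4llm` is importable before selecting it. The following is a minimal sketch using only the standard library. The helper `choose_pdf_library` is hypothetical, and it assumes that the library name `"pypdfium2"` is accepted by `PdfParsingConfig` in the same way as `"pymupdf4llm"`.

```python
from importlib.util import find_spec


def choose_pdf_library(preferred: str = "pymupdf4llm",
                       fallback: str = "pypdfium2") -> str:
    """Return `preferred` if that top-level package is importable, else `fallback`."""
    # find_spec returns None when the top-level module cannot be located.
    return preferred if find_spec(preferred) is not None else fallback


library = choose_pdf_library()
print(f"Selected PDF library: {library}")
```

The returned name could then be passed as `PdfParsingConfig(library=library)`, following the configuration snippet above, so that a bare install does not fail with an import error.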

Thanks to @alexagr for reporting the licensing issue (#1026). See #1028 for details.

Breaking Changes

Removed core dependencies `pymupdf4llm` and `pymupdf`; they are now opt‑in extras (doc-chat, pdf-parsers, all, pymupdf4llm).
Default PDF parser changed from `pymupdf4llm` to `pypdfium2`, altering output from structured Markdown to plain text.


About langroid

Harness LLMs with Multi-Agent Programming


Earlier breaking changes

v0.65.9 MCP tool parameters now strictly validate enums, unions, and nested models.
v0.65.5 Blocks code‑execution, file, and network primitives in creation tools by default.
v0.65.5 Restricts retrieval tools to read‑only queries; write or admin clauses are rejected by default.
v0.65.3 Raw user messages containing tools registered with `enable_message(..., use=False, handle=True)` are now dropped instead of executed.
v0.65.2 Restricts eval'd expression builtins to a curated safe set, breaking code that relied on full Python builtins (e.g., __import__, open).