Skip to content

Release history

langroid releases

Harness LLMs with Multi-Agent Programming

All releases

27 shown

Config change
0.65.0 Breaking risk
Breaking upgrade

PDF parser switch

Upgrade now
0.64.0 Breaking risk
RCE / SSRF Breaking upgrade

Path‑traversal + SQL file read fixes

No immediate action
0.63.0 Maintenance

Routine maintenance and dependency updates.

0.62.0 Bug fix
Notable features
  • Adds Python 3.13 support; CI now tests against Python 3.11 and 3.13.
Full changelog

Python 3.13 support and Task.init() memory leak fix

Python 3.13 support

PR [#1011](https://github.com/langroid/langroid/pull/1011) bumps the supported Python range to >=3.10,<3.14 and adds Python 3.13 to the CI matrix alongside 3.11, so Langroid is now tested on both versions on every push.

  • pyproject.toml: requires-python = "<3.14,>=3.10"
  • .github/workflows/validate.yml: lint/test job now runs against ["3.11", "3.13"]

Fix ObjectRegistry memory leak in Task.init()

PR [#1022](https://github.com/langroid/langroid/pull/1022) (originally [#1021](https://github.com/langroid/langroid/pull/1021)) fixes a memory leak in Task.init(). The method built a temporary ChatDocument solely to feed log_message(Entity.SYSTEM, ...), but ChatDocument.__init__ auto-registers every instance in the class-level ObjectRegistry, so each new Task left behind a system-message doc that was never freed. In long-running services that spawn many tasks (e.g. servers), this accumulated significantly over time.

The fix:

  • explicitly removes the temporary doc via ChatDocument.delete_id(...) after logging
  • renames the local to system_message_temp_doc to make its transient nature obvious
  • adds a regression test (test_task_init_no_registry_leak) mirroring the PR #939 leak-check pattern, asserting that repeated Task(...).init() calls leave zero leaked ChatDocuments in the registry

Thanks to @alexagr for the catch and the original fix.

0.61.1 Bug fix

Fixes MiniMax context window size to 204,800 tokens (API limit), preventing unnecessary prompt truncation.

0.61.0 New feature
Notable features
  • MiniMax LLM provider with 7 models and up to 1M context
  • OpenAI-compatible API with minimax/ prefix
0.60.3 Bug fix

Fixes Gemini model name suffix handling for experimental, latest, and preview variants.

0.60.2 Bug fix

Fixed context-length preflight checks for Gemini model aliases and file attachments.

0.60.1 New feature
Notable features
  • Thread-safe cache operations protected by RLock in multi-threaded environments
  • LRU tracking with monotonic timestamps to identify and manage stale entries
  • prune_cache(max_age_seconds) function to evict stale client entries
Full changelog

Thread-safe client cache with LRU eviction

PR #993 enhances the client cache in langroid/language_models/client_cache.py with thread-safety and LRU (Least Recently Used) eviction.

What's New

  • Thread safety: All cache operations are now protected by a threading.RLock() to prevent race conditions in multi-threaded environments
  • LRU tracking: Cache entries store a last-used monotonic timestamp, refreshed on each access
  • prune_cache(max_age_seconds): New function to evict stale client entries older than the specified age
  • Helper functions: _get_cached_client() and _store_client() encapsulate cache access with consistent locking and timestamp management

Applies to all cached client getters: get_openai_client(), get_async_openai_client(), get_groq_client(), get_async_groq_client(), get_cerebras_client(), and get_async_cerebras_client().

0.60.0 New feature
Notable features
  • Seltz web search provider integration
  • SeltzSearchTool for agent usage
  • Setup and integration documentation
0.59.39 New feature
Notable features
  • Async HTTP client factory support
  • Tuple pattern for paired sync and async clients
0.59.38 Bugfix

Fix empty tool arguments serialization (`None` → `"{}"`) that caused `INVALID_ARGUMENT` errors with Gemini models on VertexAI (#988)

0.59.37 Maintenance

Handle null deltas in OpenAIGPT streaming: https://github.com/langroid/langroid/pull/987

0.59.34 Mixed
Notable features
  • Configurable context overflow strategy with 'truncate' (default) and 'drop_turns' options
  • Cleaner OpenAI API error logging without full tracebacks for server-side errors
Full changelog

Configurable context overflow strategy (#967, #974)

Added a context_overflow_strategy option to ChatAgentConfig for handling message history that exceeds the model's context length. Two strategies are available:

  • "truncate" (default): Truncates content of early messages while preserving all messages in the sequence. Maintains backward compatibility and the alternating message structure required by LLM APIs.
  • "drop_turns": Drops complete conversation turns (a USER message and all responses until the next USER message). More aggressive but cleaner — particularly useful for voice agents with limited context models (e.g., llama-3-8b with 8192 tokens), where individual messages are already short and truncation is ineffective.
config = lr.ChatAgentConfig(
    context_overflow_strategy="drop_turns",  # or "truncate" (default)
    ...
)

Cleaner OpenAI API error logging (#975)

OpenAI API errors (authentication, bad request, rate limits, etc.) are now logged without a full Python traceback, since these errors originate server-side and a local stack trace adds no diagnostic value. Network-level errors (APIConnectionError, APITimeoutError) still include the full traceback to aid in diagnosing local issues.

Minor fixes

  • Fixed defensive check for empty/missing choices in OpenAI API response parsing (#975)
  • Formatting cleanup in inline reasoning tests (#977)
  • Updated llms*.txt documentation files
0.59.33 New feature
Notable features
  • Inline reasoning support in streaming responses for models using embedded thinking/reasoning delimiters
  • Added `extra_content` field to `OpenAIToolCall` for storing additional metadata from tool calls
  • New `_split_inline_reasoning()` method with configurable delimiters and state tracking across streaming chunks
Full changelog

Support inline reasoning in streaming responses (#973)

Added support for handling inline reasoning content in streaming LLM responses, for models that embed thinking/reasoning inside the main content stream (e.g., using <think>...</think> delimiters) rather than in a separate reasoning field.

Key Changes

  • Added extra_content field to OpenAIToolCall: New optional field to store additional metadata from tool calls, with updated from_dict() and api_dict() methods
  • Implemented inline reasoning splitting: New _split_inline_reasoning() static method that separates reasoning tokens from text tokens based on configurable delimiters, tracking state across streaming chunks
  • Enhanced streaming event processing: Updated _process_stream_event() and _process_stream_event_async() to route split tokens to correct streamers (TEXT vs REASONING) with proper state tracking
  • Updated Ruff: Bumped pre-commit ruff version from v0.14.14 to v0.15.0
0.59.32 Security relevant
Security fixes
  • CVE-2025-46724: RCE bypass prevented via dunder attribute blocking in AST validator
0.59.31 New feature
Notable features
  • Reasoning parameter in `show_llm_response` and `finish_llm_stream` callbacks to expose LLM chain-of-thought
  • Automatic reasoning display in Chainlit UIs with '💭 Reasoning' label
Full changelog

Add reasoning parameter to LLM response callbacks (#965)

This release adds support for passing chain-of-thought reasoning from LLMs (like DeepSeek R1, Claude extended thinking) to UI callbacks.

Features

  • Reasoning in callbacks: show_llm_response and finish_llm_stream callbacks now receive a reasoning parameter containing the LLM's chain-of-thought reasoning (when available)

  • Chainlit integration: Reasoning is automatically displayed as a nested message with a "💭 Reasoning" label in Chainlit UIs

  • Backward compatible: Existing custom callbacks without the reasoning parameter continue to work - uses signature inspection to only pass reasoning if supported

Documentation

  • Added "Displaying Reasoning in UI Callbacks" section to reasoning-content.md with examples for custom callback implementations

Thanks to @alexagr for the initial PR adding the reasoning parameter!

0.59.30 Maintenance
Notable features
  • Message routing configuration documented
  • OpenAIAssistant routing behavior fix
  • Text-based routing test coverage
0.59.29 New feature
Notable features
  • Preserve thought tags in message history for inline reasoning models
  • Added message_with_reasoning and content_with_reasoning fields
0.59.28 Breaking risk
Notable features
  • Vertex AI support for Gemini models
  • New GEMINI_API_BASE environment variable
0.59.27 Breaking risk

User-provided API parameters like reasoning_effort are no longer silently filtered.

Full changelog

Fixed

  • Respect user-provided API parameters: Removed overly aggressive parameter filtering that silently dropped params like reasoning_effort for models added to MODEL_INFO. The library now trusts user configuration and lets the API validate parameter support. (#956 - thanks @alexagr)
0.59.26 New feature
Notable features
  • Callbacks `show_llm_response` and `finish_llm_stream` now have separate `content` (text messages) and `tools_content` (serialized tool calls) parameters
Full changelog

What's Changed

Improvements

Callback API Enhancement: Separate content and tools_content (#952, #945)

The show_llm_response and finish_llm_stream callbacks now include separate content and tools_content parameters:

  • content: Always contains the text message generated by the model
  • tools_content: Contains serialized functions/tools if present, empty string otherwise

Why this matters: Previously, content mixed text messages with JSON-serialized tool calls. This caused issues for applications using callbacks for purposes like text-to-speech, where tool call JSON was incorrectly processed as regular text.

Backward compatible: The tools_content parameter defaults to an empty string, so existing callback implementations continue to work.

Full Changelog: https://github.com/langroid/langroid/compare/0.59.25...0.59.26

0.59.25 Breaking risk
Notable features
  • GPT-5.2 and GPT-5.2-Pro support
  • Gemini 3 Flash and Gemini 3 Pro support
  • Claude 4 family including 4.5 variants
0.59.24 Mixed
Notable features
  • Azure OpenAI API v1 support with chat_model_orig parameter
  • Fixed tool message caching in multi-agent scenarios
  • Fixed ObjectRegistry memory leak
0.59.23 Breaking risk

Minor fixes and improvements.

Full changelog

What's Changed

  • Fix: Replace deprecated datetime.utcnow() with datetime.now(timezone.utc) to remove DeprecationWarnings in Python 3.12+ (#937)

Beta — feedback welcome: [email protected]