This release includes 2 breaking changes for platform teams planning a safe upgrade.
✓ No known CVEs patched in this version
ReleasePort's take
Light signal: `LLM.run` and `LLM.run_async` now require keyword arguments; positional `messages` or `streaming_callback` are no longer accepted.
Why it matters: This breaking change (severity 70) forces callers to update every invocation of `LLM.run`/`LLM.run_async` before upgrading to v2.30.0; calls that still pass these arguments positionally will fail at runtime.
Summary
AI summary: A mixed release with bug fixes, new features, and enhancement notes.
Changes in this release
| Type | Severity | Summary | Source | Confidence | CVE |
|---|---|---|---|---|---|
| Breaking | High | `LLM.run` and `LLM.run_async` no longer accept positional `messages` or `streaming_callback` arguments. | llm_adapter@2026-06-03 | high | None |
| Feature | Medium | All Haystack ChatGenerator components now accept a plain string for the `messages` parameter. | llm_adapter@2026-06-03 | high | None |
| Feature | Medium | Introduced the `PythonCodeSplitter` component for syntax-aware Python source splitting. | llm_adapter@2026-06-03 | high | None |
| Feature | Low | `TextEmbeddingRetriever`, `MultiQueryEmbeddingRetriever`, and `MultiQueryTextRetriever` now support async execution via `run_async`. | llm_adapter@2026-06-03 | low | None |
| Feature | Low | `TextEmbeddingRetriever`, `MultiQueryEmbeddingRetriever`, and `MultiQueryTextRetriever` gained `run_async` support for native coroutine use in `AsyncPipeline`. | granite4.1:30b@2026-06-03-audit | low | None |
| Feature | Low | `LLM` now supports template-variable mode (Jinja2 variables) and pass-through mode for `messages`, enabling flexible prompt construction. | granite4.1:30b@2026-06-03-audit | low | None |
| Feature | Low | `Pipeline.draw()` and `Pipeline.show()` validate Mermaid server responses (magic-byte signature and `Content-Type`) before writing files, raising `PipelineDrawingError` on mismatches. | granite4.1:30b@2026-06-03-audit | low | None |
| Bugfix | Medium | `Agent` now exits correctly when multiple tool calls include the configured exit-condition tool, regardless of order. | llm_adapter@2026-06-03 | high | None |
| Bugfix | Medium | `LLMMetadataExtractor.run_async` now respects the `max_workers` concurrency limit. | llm_adapter@2026-06-03 | high | None |
| Bugfix | Low | `Document.from_dict()` no longer mutates the input dictionary during deserialization. | llm_adapter@2026-06-03 | high | None |
| Bugfix | Low | `AnswerBuilder.run()` no longer mutates the `meta` dict of input `Document` objects. | llm_adapter@2026-06-03 | high | None |
| Bugfix | Low | `DocumentLanguageClassifier` no longer crashes when `Document.content` is `None`. | llm_adapter@2026-06-03 | low | None |
| Bugfix | Low | `DocumentJoiner` in concatenate mode now treats documents with a score of exactly 0.0 as scored, not unscored. | llm_adapter@2026-06-03 | low | None |
| Bugfix | Low | `DocumentLanguageClassifier` now handles `Document.content=None` gracefully without crashing, logging a warning instead. | granite4.1:30b@2026-06-03-audit | low | None |
| Bugfix | Low | `DocumentJoiner` in concatenate mode treats a score of exactly 0.0 as valid, preventing unintended duplicate removal. | granite4.1:30b@2026-06-03-audit | low | None |
| Refactor | Low | `ToolsType` type hint updated to accept any class inheriting from `Tool` or `Toolset` in sequences like list or tuple. | granite4.1:30b@2026-06-03-audit | low | None |
Full changelog
Highlights
Syntax-aware Python code splitting with PythonCodeSplitter
The new `PythonCodeSplitter` is a syntax-aware splitter for Python source files, built for code-RAG and code-search pipelines where naive line-based splitting tends to cut through functions and lose structural context. It parses sources with the `ast` module and greedily merges units, such as the module docstring, import blocks, top-level functions, class headers, methods, and nested classes, into chunks of roughly `max_effective_lines`, keeping whole functions and methods together. For functions that exceed `oversized_factor * max_effective_lines`, it falls back to a line-based secondary split with overlap.
Two options make the resulting chunks more useful downstream: `strip_docstrings=True` moves docstrings into chunk metadata, and `preserve_class_definition=True` prepends the enclosing class signature to chunks whose members live in a later chunk. Each chunk also carries rich metadata including `start_line`, `end_line`, `unit_kinds`, `include_classes`, `decorators`, `docstrings`, `source_id`, and `split_id`.
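
    from haystack import Document
    from haystack.components.preprocessors import PythonCodeSplitter

    # Any Document holding Python source works; this one is a placeholder.
    doc = Document(content="def greet(name):\n    return f'Hello, {name}'\n")

    splitter = PythonCodeSplitter(
        max_effective_lines=80,
        strip_docstrings=True,
        preserve_class_definition=True,
    )
    result = splitter.run(documents=[doc])
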
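The greedy merging idea can be illustrated with the standard library alone. The following is a minimal sketch, not the `PythonCodeSplitter` implementation: the function name `split_top_level` and its `max_lines` parameter are hypothetical, it only looks at top-level statements, and it counts raw lines rather than the "effective lines" the component uses.

```python
import ast


def split_top_level(source: str, max_lines: int = 20) -> list:
    """Greedily merge top-level statements into (start_line, end_line) chunks.

    Hypothetical helper for illustration; a unit is never cut in half.
    """
    tree = ast.parse(source)
    chunks = []
    start = end = None
    for node in tree.body:
        # Decorators sit above node.lineno for functions and classes.
        decorators = getattr(node, "decorator_list", [])
        node_start = min([node.lineno] + [d.lineno for d in decorators])
        node_end = node.end_lineno
        if start is None:
            start, end = node_start, node_end
        elif node_end - start + 1 <= max_lines:
            end = node_end  # the unit still fits: extend the current chunk
        else:
            chunks.append((start, end))  # close the chunk at a unit boundary
            start, end = node_start, node_end
    if start is not None:
        chunks.append((start, end))
    return chunks


SOURCE = '''import os

def first():
    return 1

def second():
    x = 2
    return x

class Third:
    def method(self):
        return 3
'''

chunks = split_top_level(SOURCE, max_lines=5)
print(chunks)
```

With a budget of five lines, the import and `first` share a chunk, while `second` and `Third` each get their own because adding them would exceed the budget.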
Pass a plain string to any ChatGenerator
All Haystack ChatGenerator components now accept a plain string for the messages parameter in addition to a list of ChatMessage objects. The string is automatically wrapped in a ChatMessage with the user role. This makes switching from a Generator to a ChatGenerator a one-line change. The change applies to AzureOpenAIChatGenerator, AzureOpenAIResponsesChatGenerator, FallbackChatGenerator, HuggingFaceAPIChatGenerator, HuggingFaceLocalChatGenerator, OpenAIChatGenerator, and OpenAIResponsesChatGenerator, and will soon be rolled out to the ChatGenerators in Haystack Core Integrations.

    from haystack.components.generators.chat import OpenAIChatGenerator

    generator = OpenAIChatGenerator()

    # passing a string is equivalent to passing [ChatMessage.from_user("...")]
    response = generator.run("What's Natural Language Processing?")
    print(response["replies"][0].text)

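The string handling described above amounts to a small normalization step at the start of `run`. The sketch below shows one way such a step could look; `Message` is a hypothetical stand-in for `ChatMessage`, and `normalize_messages` is an illustrative name, not a Haystack function.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Message:
    """Hypothetical stand-in for haystack's ChatMessage."""
    role: str
    text: str


def normalize_messages(messages: Union[str, List[Message]]) -> List[Message]:
    """Wrap a plain string in a single user message; pass lists through."""
    if isinstance(messages, str):
        return [Message(role="user", text=messages)]
    return list(messages)


wrapped = normalize_messages("What's Natural Language Processing?")
print(wrapped[0].role, "-", wrapped[0].text)
```

Accepting both shapes at the boundary keeps the rest of the component working on a single type, which is why the change is cheap for callers switching from a Generator.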
Upgrade Notes
- `DALLEImageGenerator` has been updated to account for OpenAI's retirement of the DALL-E models. The default model is now `gpt-image-2` (previously `dall-e-3`). To migrate:
  - Update `model` value: besides `gpt-image-2`, `gpt-image-1` and `gpt-image-1-mini` are also supported.
  - Update `quality` value: the new accepted values are `auto`, `high`, `medium`, or `low` (previously `standard` or `hd`).
  - Update `size` value: the new accepted values are `1024x1024`, `1024x1536`, `1536x1024`, or `auto`. `gpt-image-2` also supports arbitrary sizes.
  - The `response_format` parameter is now ignored. The component always returns base64-encoded JSON.
- `LLM.run` and `LLM.run_async` no longer accept `messages` and `streaming_callback` as positional arguments; they must now be passed as keyword arguments. Update any direct calls accordingly:
  - Before: `llm.run([message], my_callback)`
  - After: `llm.run(messages=[message], streaming_callback=my_callback)`
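To see what the keyword-only requirement means for callers, here is a small sketch using a stand-in class. `FakeLLM` is hypothetical, and the bare `*` in its signature is an assumption about how such a restriction is typically expressed in Python, not a copy of the Haystack source.

```python
class FakeLLM:
    """Hypothetical stand-in; only the call signature matters here."""

    def run(self, *, messages, streaming_callback=None):
        # Everything after the bare * must be passed by keyword.
        return {"messages": messages, "streamed": streaming_callback is not None}


llm = FakeLLM()

# Keyword form: accepted.
ok = llm.run(messages=["hello"], streaming_callback=print)

# Positional form: rejected with a TypeError before any work is done.
try:
    llm.run(["hello"], print)
    positional_rejected = False
except TypeError:
    positional_rejected = True

print(ok, positional_rejected)
```

Because the failure is a `TypeError` at call time, a quick search for `.run(` call sites before upgrading is usually enough to find the spots that need updating.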
New Features
- Introduced the `PythonCodeSplitter` component, a syntax-aware splitter for Python source files:
  - Parses sources with the `ast` module and merges units (module docstring, import blocks, top-level functions, class headers, methods, nested classes, and remaining statements) greedily into chunks of roughly `max_effective_lines`.
  - Keeps whole functions and methods together; falls back to a line-based secondary split (using `DocumentSplitter`) with overlap only for functions whose effective length exceeds `oversized_factor * max_effective_lines`.
  - Optionally strips docstrings into chunk metadata via `strip_docstrings=True`, and prepends the enclosing class signature to chunks whose members live in a later chunk via `preserve_class_definition=True`.
  - Emits per-chunk metadata including `start_line`, `end_line`, `unit_kinds`, `include_classes`, `decorators`, `docstrings`, `source_id`, and `split_id`.
- All Haystack `ChatGenerator` components now also accept a plain string for the `messages` parameter in addition to a list of `ChatMessage` objects. The string is automatically converted into a list containing a `ChatMessage` with the `user` role. This is done to simplify switching from Generators to ChatGenerators; Generators might be removed in Haystack 3.0.

  This applies to `AzureOpenAIChatGenerator`, `AzureOpenAIResponsesChatGenerator`, `FallbackChatGenerator`, `HuggingFaceAPIChatGenerator`, `HuggingFaceLocalChatGenerator`, `OpenAIChatGenerator`, and `OpenAIResponsesChatGenerator`. The same change will soon be applied to the ChatGenerators available in Haystack Core Integrations.

  Example:

      from haystack.components.generators.chat import OpenAIChatGenerator

      generator = OpenAIChatGenerator()

      # passing a string is equivalent to passing [ChatMessage.from_user("...")]
      response = generator.run("What's Natural Language Processing?")
      print(response["replies"][0].text)

Enhancement Notes
- Added `run_async` to `TextEmbeddingRetriever`, `MultiQueryEmbeddingRetriever`, and `MultiQueryTextRetriever`. These components now execute natively as coroutines in `AsyncPipeline`, delegating to each wrapped component's `run_async` when available and falling back to a thread executor otherwise.
- Fix grammar in the `AzureOpenAIGenerator` and `AzureOpenAIChatGenerator` docstring code examples (`"<this a model name..."` to `"<this is a model name..."`) so that copy-pasted snippets read correctly.
- `LLM` now supports two usage modes:
  - Template-variable mode: provide a `user_prompt` with Jinja2 variables (e.g. `{{ query }}`). Those variables become pipeline inputs and `messages` is optional. The rendered `user_prompt` is always appended after any `messages` provided at runtime.
  - Pass-through mode: omit `user_prompt` or provide one with no template variables. `messages` becomes a required input, allowing a fully-constructed list of `ChatMessage`s to be passed from upstream.
- Update `ToolsType` to improve type checking for the `tools` parameter. Any class that inherits from either `Tool` or `Toolset` is now accepted in any sequence (list, tuple, etc).
- `Pipeline.draw()` and `Pipeline.show()` now validate the Mermaid server response before writing it to disk. The response body is checked against the expected output format (PNG, JPEG, WebP, SVG, or PDF) via its magic-byte signature, and the `Content-Type` header is checked as well. If the response is empty or does not match the requested format, a `PipelineDrawingError` is raised and no file is written. This prevents a misconfigured or untrusted `server_url` from causing arbitrary content (for example an HTML error page) to be saved verbatim to the output path.
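The response validation in the last item can be sketched with well-known file signatures. This is an illustration under stated assumptions rather than the Haystack code: `DrawingError` stands in for `PipelineDrawingError`, `validate_render` is a hypothetical name, and WebP and SVG are omitted to keep the example short.

```python
# Well-known leading bytes and MIME types for three of the formats.
SIGNATURES = {
    "png": (b"\x89PNG\r\n\x1a\n", "image/png"),
    "jpeg": (b"\xff\xd8\xff", "image/jpeg"),
    "pdf": (b"%PDF-", "application/pdf"),
}


class DrawingError(Exception):
    """Hypothetical stand-in for haystack's PipelineDrawingError."""


def validate_render(body: bytes, fmt: str, content_type: str) -> bytes:
    """Return body unchanged if it looks like fmt; raise before anything is written."""
    magic, expected_type = SIGNATURES[fmt]
    if not body:
        raise DrawingError("empty response")
    if not body.startswith(magic):
        raise DrawingError(f"response body is not {fmt}")
    # Ignore parameters such as "; charset=utf-8" when comparing the header.
    if content_type.split(";")[0].strip().lower() != expected_type:
        raise DrawingError(f"unexpected Content-Type {content_type!r}")
    return body


png_bytes = b"\x89PNG\r\n\x1a\n" + b"...image data..."
checked = validate_render(png_bytes, "png", "image/png")

try:
    validate_render(b"<html>502 Bad Gateway</html>", "png", "text/html")
    html_rejected = False
except DrawingError:
    html_rejected = True

print(len(checked), html_rejected)
```

Checking both the body and the header means an HTML error page is rejected even when a server labels it with an image MIME type.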
Bug Fixes
- Prevent `Document.from_dict()` from mutating the input dictionary during deserialization.
- Prevent `DocumentLanguageClassifier` from crashing when `Document.content` is `None`; such documents are now marked as unmatched and a warning is logged.
- Fixed a bug where `Agent` would not exit when the model emitted multiple tool calls in a single turn and the configured exit-condition tool was not the first one in the list. Previously, only the first tool call in each assistant message was checked against `exit_conditions`, so a reply like `[search, finish]` (with `exit_conditions=["finish"]`) would silently fail to stop the loop and keep iterating until `max_agent_steps` was reached. Since parallel tool calls are now the norm for frontier models, this could quietly turn a single successful turn into dozens of wasted LLM calls. The `Agent` now inspects every tool call in the message, so the exit condition is honored regardless of ordering.
- Fix `AnswerBuilder.run()` mutating the `meta` dict of input `Document` objects. `source_index` (and `referenced` when `reference_pattern` is set) are now only added to the document copies inside `GeneratedAnswer.documents`, not to the originals.
- Fixed `DocumentJoiner` in `concatenate` mode so that documents with a score of exactly `0.0` are no longer treated as unscored during deduplication. Previously a truthiness check coerced `score=0.0` to `-inf`, which could cause a worse, negatively-scored duplicate to be kept instead of the `0.0`-scored document. The `merge` mode was updated to the same explicit `is not None` check for consistency; its observable behavior is unchanged.
- Fixed in-place mutation of `ExtractedAnswer.meta` in `ExtractiveReader._add_answer_page_number` when the answer's `meta` was `None`. Now uses `dataclasses.replace` to avoid triggering the dataclass mutation warning.
- Fixed `ExtractiveReader` raising `ValueError` when the number of valid answer spans for a sequence was smaller than `answers_per_seq` (for example with short documents or when `answers_per_seq` exceeded the number of upper-triangular, non-masked (start, end) token pairs). `_postprocess` now filters the per-sequence probabilities by the same validity mask it already applied to the start/end token indices, so the three structures always have matching lengths.
- `HierarchicalDocumentSplitter` no longer mutates the metadata of the input `Document`. `_add_meta_data` now returns a new `Document` with a copied `meta` dict via `dataclasses.replace` instead of writing `__block_size`, `__parent_id`, `__children_ids` and `__level` onto the caller's `Document`.
- Fixed a bug in `LLMMetadataExtractor.run_async` where the `asyncio.Semaphore` intended to bound concurrent LLM calls to `max_workers` was acquired once around the outer `gather(...)` call instead of inside each task. As a result, `max_workers` had no effect in `run_async` and all LLM requests for a batch were issued simultaneously. The semaphore is now acquired per task, so `max_workers` correctly caps in-flight requests.
- `expand_page_range()` previously raised an unhelpful `ValueError: too many values to unpack` when a page range string contained more than one hyphen (e.g. `"10-20-30"`). The parser now validates the format and raises a clear `ValueError` with an explanatory message for invalid inputs.
- `LLMMetadataExtractor` now raises a clear `ValueError` when the `prompt` contains no template variables. Previously this case raised an unhelpful `IndexError: list index out of range`. The error message now consistently explains that the prompt must contain exactly one variable called `document`.
- Fixed `HuggingFaceAPIDocumentEmbedder` serialization to preserve the configured `concurrency_limit`.
- Fix `TransformersZeroShotDocumentClassifier.to_dict` to include the `classification_field` and `multi_label` init parameters, which were previously dropped on serialization and reset to defaults after `from_dict`.
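The `LLMMetadataExtractor.run_async` fix is an instance of a general asyncio pattern: a semaphore only limits concurrency when each task acquires it. The sketch below demonstrates that pattern with a simulated call; `run_bounded` and `fake_llm_call` are hypothetical names and no Haystack code is involved.

```python
import asyncio


async def run_bounded(n_tasks: int, max_workers: int) -> int:
    """Run n_tasks simulated calls and return the peak number in flight."""
    semaphore = asyncio.Semaphore(max_workers)
    in_flight = 0
    peak = 0

    async def fake_llm_call() -> None:
        nonlocal in_flight, peak
        # Acquired inside each task, so at most max_workers bodies run at once.
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stands in for the network round trip
            in_flight -= 1

    # Wrapping this gather in a single "async with semaphore" instead would
    # take one permit for the whole batch and limit nothing.
    await asyncio.gather(*(fake_llm_call() for _ in range(n_tasks)))
    return peak


peak = asyncio.run(run_bounded(n_tasks=10, max_workers=3))
print(peak)
```

Tracking the peak number of in-flight calls, as done here, is also a convenient way to write a regression test for this kind of bug.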
Big thank you to everyone who contributed to this release!
@Aarkin7, @anakin87, @bogdankostic, @davidsbatista, @etairl, @HamidOna, @julian-risch, @kota-wilson, @maxdswain, @MechaCritter, @pragnyanramtha, @rautaditya2606, @sachinn854, @sjrl
Breaking Changes
- `LLM.run` and `LLM.run_async` no longer accept `messages` and `streaming_callback` as positional arguments; they must be passed as keyword arguments.
- `DALLEImageGenerator` default model changed from `dall-e-3` to `gpt-image-2`; quality options updated (auto, high, medium, low) and size options updated (1024x1024, 1024x1536, 1536x1024, auto); `response_format` parameter is ignored.
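Before upgrading, it can help to scan existing `DALLEImageGenerator` settings for values that are no longer accepted. The helper below is a hypothetical sketch built only from the accepted values listed in these notes; `find_outdated_settings` is not part of Haystack, and it deliberately suggests no replacement values because the notes do not define a mapping.

```python
# Accepted values as listed in the upgrade notes.
SUPPORTED_MODELS = {"gpt-image-2", "gpt-image-1", "gpt-image-1-mini"}
SUPPORTED_QUALITIES = {"auto", "high", "medium", "low"}
LISTED_SIZES = {"1024x1024", "1024x1536", "1536x1024", "auto"}


def find_outdated_settings(model: str, quality: str, size: str) -> list:
    """Return the settings that fall outside the values listed for this release."""
    problems = []
    if model not in SUPPORTED_MODELS:
        problems.append(f"model={model!r}")
    if quality not in SUPPORTED_QUALITIES:
        problems.append(f"quality={quality!r}")
    # gpt-image-2 also supports arbitrary sizes, so only other models are
    # checked against the fixed list.
    if model != "gpt-image-2" and size not in LISTED_SIZES:
        problems.append(f"size={size!r}")
    return problems


legacy = find_outdated_settings("dall-e-3", "hd", "1792x1024")
current = find_outdated_settings("gpt-image-2", "auto", "2048x1024")
print(legacy, current)
```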
About haystack
Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and conversational systems.
Earlier breaking changes
- v2.29.0 `LLM.run` and `LLM.run_async` now require keyword arguments for `messages` and `streaming_callback`.