This release includes 1 breaking change for platform teams planning a safe upgrade.
✓ No known CVEs patched in this version
ReleasePort's take
Moderate signal: Haystack v2.29.0 makes a breaking change to LLM.run and LLM.run_async, which now require keyword arguments for messages and streaming_callback. The release also adds MultiRetriever for parallel retrieval execution and async support for CacheChecker.
Why it matters: Apps calling LLM.run/run_async with positional arguments must migrate to keyword syntax before upgrading. No auto-conversion path. Test in dev and plan the upgrade accordingly.
Summary
AI summary: LLM.run and LLM.run_async now require keyword arguments for messages and streaming_callback.
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Breaking | Medium | `LLM.run` and `LLM.run_async` now require keyword arguments for `messages` and `streaming_callback`. Source: llm_adapter@2026-05-21. Confidence: low | - |
| Feature | Medium | MultiRetriever runs multiple text retrievers in parallel and merges results. Source: llm_adapter@2026-05-21. Confidence: high | - |
| Feature | Medium | TextEmbeddingRetriever wraps an embedding-based retriever with an embedder for MultiRetriever compatibility. Source: llm_adapter@2026-05-21. Confidence: high | - |
| Feature | Medium | `CacheChecker.run_async` enables use in AsyncPipeline without blocking the event loop. Source: llm_adapter@2026-05-21. Confidence: low | - |
| Feature | Medium | `MultiRetriever` now supports `join_mode` parameter with … Source: llm_adapter@2026-05-21. Confidence: low | - |
| Feature | Medium | CacheChecker gains a `run_async` method to be used in AsyncPipeline without blocking. Source: granite4.1:30b@2026-05-23-audit. Confidence: low | - |
| Feature | Medium | MultiRetriever adds a `join_mode` parameter supporting "reciprocal_rank_fusion" (default) and "concatenate". Source: granite4.1:30b@2026-05-23-audit. Confidence: low | - |
| Bugfix | Medium | NamedEntityExtractor restores spaCy/Thinc device state correctly after execution. Source: granite4.1:30b@2026-05-23-audit. Confidence: low | - |
| Bugfix | Medium | OpenAIChatGenerator's `tools_strict=True` now recursively applies schema strictness to nested tool parameters. Source: granite4.1:30b@2026-05-23-audit. Confidence: low | - |
| Bugfix | Low | Preserve resumable snapshots when some inputs or outputs are non-serializable, omitting only failing fields. Source: granite4.1:30b@2026-05-23-audit. Confidence: low | - |
| Refactor | Medium | Documented input ordering behavior of auto-promoted lazy variadic sockets in `Pipeline.connect()`. Source: llm_adapter@2026-05-21. Confidence: low | - |
| Refactor | Low | LLM now supports template-variable mode and pass-through mode for prompt handling. Source: granite4.1:30b@2026-05-23-audit. Confidence: low | - |
Full changelog
⭐️ Highlights
🔍 Combine Retrievers with MultiRetriever and TextEmbeddingRetriever
Two new retriever components make it easier to build hybrid search pipelines. MultiRetriever runs multiple text retrievers in parallel and merges their results into a single deduplicated list, ranked by reciprocal rank fusion by default. You can selectively enable or disable individual retrievers at runtime using the active_retrievers parameter. This is useful when you want to skip the embedding retriever for short or keyword-only queries, for example.
TextEmbeddingRetriever wraps an embedding-based retriever together with a text embedder into a single component, making it compatible with MultiRetriever by implementing the TextRetriever protocol. Here's how to combine BM25 and embedding retrieval in a single component:
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.retrievers import MultiRetriever, TextEmbeddingRetriever
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever, InMemoryEmbeddingRetriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

# Assumption: in a real application this store already holds indexed
# documents (with embeddings for the embedding retriever).
doc_store = InMemoryDocumentStore()

retriever = MultiRetriever(
    retrievers={
        "bm25": InMemoryBM25Retriever(document_store=doc_store),
        "embedding": TextEmbeddingRetriever(
            retriever=InMemoryEmbeddingRetriever(document_store=doc_store),
            text_embedder=SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2"),
        ),
    },
    top_k=3,
)

# Run all retrievers
result = retriever.run(query="green energy sources")

# Run only the BM25 retriever
result = retriever.run(query="green energy sources", active_retrievers=["bm25"])
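For readers unfamiliar with reciprocal rank fusion, the merge step that MultiRetriever applies by default can be sketched in plain Python. This is a minimal illustration of the general technique, not Haystack's implementation: the function name, the use of bare document IDs, and the constant `k=60` (a common choice in the literature) are all assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document IDs into one deduplicated ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from two retrievers
bm25_hits = ["doc_a", "doc_b", "doc_c"]
embedding_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_hits, embedding_hits])
print(fused)
```

Documents returned by both retrievers accumulate score from each list, so they tend to rise above documents that only one retriever found.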
⬆️ Upgrade Notes
- `LLM.run` and `LLM.run_async` no longer accept `messages` and `streaming_callback` as positional arguments; they must now be passed as keyword arguments. Update any direct calls accordingly:

    # Before
    llm.run([message], my_callback)

    # After
    llm.run(messages=[message], streaming_callback=my_callback)
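To see what this kind of change looks like at call sites, here is a small sketch of a keyword-only signature in Python. `SketchLLM` is a hypothetical stand-in written for this example; the real `LLM.run` signature may include other parameters, and the changelog only states that `messages` and `streaming_callback` must be passed by keyword.

```python
class SketchLLM:
    """Hypothetical stand-in used only to illustrate keyword-only arguments."""

    def run(self, *, messages=None, streaming_callback=None):
        # The bare `*` makes every following parameter keyword-only.
        return {
            "messages": list(messages or []),
            "streamed": streaming_callback is not None,
        }


llm = SketchLLM()

# Keyword call: accepted
ok = llm.run(messages=["hello"], streaming_callback=print)

# Positional call: Python raises TypeError before the method body runs
try:
    llm.run(["hello"], print)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

A positional call to a keyword-only signature fails with `TypeError` at call time, so a test run in a development environment should surface any unmigrated call sites that the tests exercise.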
🚀 New Features
- Add `run_async` to `CacheChecker`, enabling it to be used in `AsyncPipeline` without blocking the event loop.
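One general way to expose a blocking lookup through an async method is to offload it to a worker thread. The sketch below shows that pattern with a hypothetical `SketchCacheChecker`; it is not Haystack's `CacheChecker`, and the changelog does not say how the real `run_async` is implemented. The class name, the `hits`/`misses` keys, and the simulated delay are assumptions for illustration.

```python
import asyncio
import time


class SketchCacheChecker:
    """Hypothetical stand-in showing a sync method with an async counterpart."""

    def __init__(self, cached):
        self.cached = set(cached)

    def run(self, items):
        time.sleep(0.05)  # simulates a blocking cache lookup
        hits = [item for item in items if item in self.cached]
        misses = [item for item in items if item not in self.cached]
        return {"hits": hits, "misses": misses}

    async def run_async(self, items):
        # Run the blocking call in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self.run, items)


async def main():
    checker = SketchCacheChecker(cached=["a.txt"])
    ticks = 0

    async def heartbeat():
        # Keeps counting only if the event loop is not blocked.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    beat = asyncio.create_task(heartbeat())
    result = await checker.run_async(["a.txt", "b.txt"])
    beat.cancel()
    return result, ticks


result, ticks = asyncio.run(main())
```

The heartbeat task keeps ticking while the lookup runs, which is the property an async pipeline needs from its components.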
⚡️ Enhancement Notes
- Document the input ordering behavior of auto-promoted lazy variadic sockets in `Pipeline.connect()`. When multiple senders are connected to the same list-typed receiver socket, ordering depends on the pipeline class. With `Pipeline`, items are ordered alphabetically by sender component name (because `Pipeline.run()` schedules components in alphabetical order for deterministic execution), not by the order of `connect()` calls. With `AsyncPipeline`, no ordering is guaranteed, since components in different branches may run in parallel. The docstrings now point users to a dedicated joiner component when they need explicit ordering.
- Add `join_mode` parameter to the experimental `MultiRetriever` component, supporting `"reciprocal_rank_fusion"` (default) and `"concatenate"`. Reciprocal Rank Fusion merges the ranked result lists from all retrievers into a single deduplicated list ordered by RRF score. The underlying RRF logic is extracted into a shared utility `_reciprocal_rank_fusion` in `haystack.utils.misc`, which is now also used by `DocumentJoiner`.
- `LLM` now supports two usage modes:
  - Template-variable mode: provide a `user_prompt` with Jinja2 variables (e.g. `{{ query }}`). Those variables become pipeline inputs and `messages` is optional. The rendered `user_prompt` is always appended after any `messages` provided at runtime.
  - Pass-through mode: omit `user_prompt` or provide one with no template variables. `messages` becomes a required input, allowing a fully-constructed list of `ChatMessage`s to be passed from upstream.
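The distinction between the two `LLM` usage modes can be sketched as a small decision function. This is a simplified illustration under stated assumptions: it uses a regular expression to spot simple `{{ name }}` placeholders instead of real Jinja2 parsing, represents messages as plain strings rather than `ChatMessage` objects, and the helper names `describe_inputs` and `build_messages` are hypothetical.

```python
import re

# Matches simple placeholders such as {{ query }}; real Jinja2 syntax is richer.
VAR_PATTERN = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def describe_inputs(user_prompt=None):
    """Report which mode a prompt implies and whether messages are required."""
    variables = sorted(set(VAR_PATTERN.findall(user_prompt or "")))
    if variables:
        return {"mode": "template-variable", "inputs": variables, "messages_required": False}
    return {"mode": "pass-through", "inputs": [], "messages_required": True}


def build_messages(user_prompt, messages=None, **variables):
    """Render the prompt and append it after any runtime messages."""
    rendered = VAR_PATTERN.sub(lambda match: str(variables[match.group(1)]), user_prompt)
    return list(messages or []) + [rendered]


template_mode = describe_inputs("Answer the question: {{ query }}")
pass_through = describe_inputs(None)
conversation = build_messages("Q: {{ query }}", messages=["earlier turn"], query="hi")
```

In this sketch the rendered prompt always lands after the runtime messages, mirroring the ordering the changelog describes for template-variable mode.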
🐛 Bug Fixes
- Fixed a bug in `NamedEntityExtractor` where the spaCy/Thinc device state was not correctly restored after execution, potentially affecting the device configuration of other spaCy components in the same process.
- Preserve resumable snapshots when some inputs or outputs are non-serializable. Haystack now omits only the failing top-level fields (for example non-serializable callbacks or runtime objects) instead of replacing the whole payload with an empty dictionary. This applies both to agent sub-component inputs (`chat_generator` and `tool_invoker`) and to pipeline-level `inputs`, `original_input_data`, and `pipeline_outputs` captured by `_create_pipeline_snapshot`. When every field fails to serialize, the snapshot still stores a structurally valid empty payload (`{"serialization_schema": {"type": "object", "properties": {}}, "serialized_data": {}}`) so that resuming the snapshot does not raise `DeserializationError`, for example when resuming from a `ToolBreakpoint` where the sub-component's inputs are not strictly required.
- Fixed `tools_strict=True` in `OpenAIChatGenerator` to recursively apply `additionalProperties: false` and `required` to all nested objects in tool parameter schemas. Previously only the top-level object was transformed, causing OpenAI's strict mode to reject tools with nested parameters.
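The idea behind the `tools_strict` fix, applying strictness to every nested object rather than only the top level, can be sketched as a recursive walk over a JSON-schema dictionary. This is an illustration of the technique, not the code in `OpenAIChatGenerator`: the function names are hypothetical, and marking every property as required is an assumption about what strict mode expects.

```python
import copy


def make_strict(schema):
    """Return a copy of a JSON-schema dict with strictness applied at every level."""
    strict = copy.deepcopy(schema)  # leave the caller's schema untouched
    _apply_strictness(strict)
    return strict


def _apply_strictness(node):
    if isinstance(node, dict):
        if node.get("type") == "object" and "properties" in node:
            node["additionalProperties"] = False
            # Assumption: strict mode wants every declared property listed as required.
            node["required"] = list(node["properties"])
        for value in node.values():
            _apply_strictness(value)
    elif isinstance(node, list):
        for item in node:
            _apply_strictness(item)


# Hypothetical tool parameter schema with a nested object and an array of objects
params = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "options": {"type": "object", "properties": {"units": {"type": "string"}}},
        "stops": {
            "type": "array",
            "items": {"type": "object", "properties": {"name": {"type": "string"}}},
        },
    },
}

strict = make_strict(params)
```

A top-level-only transform would leave `options` and the `stops` items unchanged, which matches the failure mode the bug fix describes.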
💙 Big thank you to everyone who contributed to this release!
@Aftabbs, @albertodiazdurana, @anakin87, @ArkaD171717, @bilgeyucel, @bogdankostic, @davidsbatista, @FuturMix, @julian-risch, @kacperlukawski, @ritikraj2425, @saivedant169, @shaun0927, @sjrl, @SyedShahmeerAli12
Breaking Changes
- `LLM.run` and `LLM.run_async` no longer accept `messages` and `streaming_callback` as positional arguments — they must be passed as keyword arguments.
About haystack
Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and conversational systems.
Related context
Earlier breaking changes
- v2.30.0 LLM.run and LLM.run_async no longer accept positional messages or streaming_callback arguments.