- Custom score rerankers in scoreModifiers via BM25 or vector closeness
- lexicalOperand parameter for controlling lexical operators (or, and, weakAnd)
- CONTAINS filter for token-level matching on lexical fields
Release history
marqo releases
Adds configurable connection recycling to mitigate connection imbalance in long-running deployments.
- center parameter for reproducible recency scores
- applyToSubqueries to control recency in hybrid search
Full changelog
2.25.1
New Features
- Add center and applyToSubqueries parameters to recency scoring (https://github.com/marqo-ai/marqo/pull/1376)
- center — A fixed Unix epoch timestamp (seconds) to use as the reference point instead of now(), enabling reproducible recency scores across queries
- applyToSubqueries — Control which hybrid subqueries ("tensor", "lexical", or both) receive recency boosting in RRF hybrid search
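As a rough illustration of the two parameters above, the sketch below uses a hypothetical exponential-decay boost and plain reciprocal rank fusion. The function names, the half-life, and the multiplicative boost are assumptions made for this example, not Marqo's actual formula or API; only the ideas of a fixed `center` timestamp and of choosing which subqueries are boosted come from the release notes.

```python
import time


def recency_boost(doc_ts, center=None, half_life_s=86_400.0):
    """Hypothetical exponential-decay boost in (0, 1]; not Marqo's formula."""
    # Without a fixed center the reference is now(), so scores drift over time.
    reference = time.time() if center is None else center
    age_s = max(0.0, reference - doc_ts)
    return 0.5 ** (age_s / half_life_s)


def rrf_with_recency(subqueries, timestamps, center, apply_to, k=60):
    """Fuse subquery results with reciprocal rank fusion (RRF).

    subqueries: {"tensor": {doc_id: score}, "lexical": {doc_id: score}}
    apply_to:   names of the subqueries whose scores get the recency boost
    """
    fused = {}
    for name, scores in subqueries.items():
        boosted = {
            doc: score * (recency_boost(timestamps[doc], center) if name in apply_to else 1.0)
            for doc, score in scores.items()
        }
        # Rank documents within this subquery, best score first.
        ranked = sorted(boosted, key=boosted.get, reverse=True)
        for rank, doc in enumerate(ranked, start=1):
            fused[doc] = fused.get(doc, 0.0) + 1.0 / (k + rank)
    return fused


CENTER = 1_700_000_000  # fixed Unix epoch seconds (assumed value for the example)
timestamps = {"old": CENTER - 10 * 86_400, "new": CENTER}
subqueries = {
    "tensor": {"old": 0.9, "new": 0.8},
    "lexical": {"old": 0.5, "new": 0.6},
}
tensor_only = rrf_with_recency(subqueries, timestamps, CENTER, apply_to={"tensor"})
no_boost = rrf_with_recency(subqueries, timestamps, CENTER, apply_to=set())
```

Because `center` is fixed, repeating the call later yields the same boosts. In this toy data, boosting only the tensor subquery is enough to move the newer document ahead in the fused ranking, while with no boost the two documents tie.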
- Triton-based inference orchestrator
- Model Management Container with lifecycle management
- Centralized model registry in marqo-common
Full changelog
2.25.0
Changes
Triton-based Inference Architecture
Marqo's inference layer has been restructured from a monolithic design into three dedicated components:
- Inference Orchestrator — A FastAPI service that coordinates inference requests. (https://github.com/marqo-ai/marqo/pull/1315)
- Model Management Container — A FastAPI service for managing ML model lifecycles with Triton Inference Server, including model loading/unloading, health checks, and environment variable consistency. (https://github.com/marqo-ai/marqo/pull/1322)
- Marqo API adaptations — The core Marqo API has been updated to work with the new Triton-backed components. (https://github.com/marqo-ai/marqo/pull/1317)
This architecture enables independent scaling and deployment of inference, model management, and search API layers.
Other Changes
- Centralize model registry into a shared components/common package (marqo-common). (https://github.com/marqo-ai/marqo/pull/1356)
- Fix model download auth handling to support public S3 buckets. (https://github.com/marqo-ai/marqo/pull/1352)
Minor fixes and improvements.
Full changelog
2.24.15
Bug Fixes and Minor Changes
- Back-port update_index_settings API to 2.24 release branch to support modifying modelProperties of an existing index (https://github.com/marqo-ai/marqo/pull/1369)
Minor fixes and improvements.
Full changelog
2.24.13
Bug Fixes and Minor Changes
- Use orjson in get_document(s) endpoints (https://github.com/marqo-ai/marqo/pull/1349)
- Support choosing the representative document of each collapsed group by sorting on a numeric field (https://github.com/marqo-ai/marqo/pull/1350)
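The collapse behaviour described above can be pictured with a small sketch. This is plain Python over in-memory hits, assuming hypothetical field names (`parent`, `price`) and a hypothetical helper `collapse_by`; it is not how Marqo implements collapsing, only an illustration of picking one representative per group by a numeric field.

```python
def collapse_by(hits, collapse_field, sort_field, descending=False):
    """Keep one hit per collapse_field value, chosen by sort_field.

    Groups stay in first-seen order, so the incoming relevance order
    of the groups is preserved.
    """
    best = {}
    for hit in hits:
        key = hit[collapse_field]
        current = best.get(key)
        if current is None:
            best[key] = hit
            continue
        if descending:
            better = hit[sort_field] > current[sort_field]
        else:
            better = hit[sort_field] < current[sort_field]
        if better:
            best[key] = hit
    return list(best.values())


hits = [
    {"_id": "a1", "parent": "A", "price": 30},
    {"_id": "b1", "parent": "B", "price": 12},
    {"_id": "a2", "parent": "A", "price": 20},
]
cheapest = collapse_by(hits, "parent", "price")
priciest = collapse_by(hits, "parent", "price", descending=True)
```

Here `cheapest` keeps `a2` for group `A`, while `priciest` keeps `a1`; group `B` has a single variant either way.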
Added support for dual-stack endpoints in S3 model downloads.
- weakAnd lexical retrieval support
- Second-phase score modifiers
Full changelog
2.24.11
Bug Fixes and Minor Changes
- Support weakAnd lexical retrieval and second-phase score modifiers (https://github.com/marqo-ai/marqo/pull/1344)
- IPv6 socket support with IPv4 default
Full changelog
2.24.10
Bug Fixes and Minor Changes
- Add support for IPv6 sockets and set the default to IPv4 (https://github.com/marqo-ai/marqo/pull/1341)
- Recency-based relevance score boosting
Full changelog
Fixed performance degradation when using collapsingFields with attributesToRetrieve.
Fixed _relevantCandidates changing unexpectedly with relevance cut-off and collapse field.
- Typeahead search with intelligent suggestions
- Caching for base64-encoded image embeddings
Fixes totalHits count in collapsed searches to reflect unique values instead of total matching documents.
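The difference between the two counts can be shown with a minimal sketch, assuming a hypothetical `parent` collapse field and a hypothetical helper `total_hits`; it only illustrates the counting rule, not Marqo's implementation.

```python
def total_hits(hits, collapse_field=None):
    """Count hits; with a collapse field, count unique field values instead."""
    if collapse_field is None:
        return len(hits)
    return len({hit[collapse_field] for hit in hits})


matching = [
    {"_id": "a1", "parent": "A"},
    {"_id": "a2", "parent": "A"},
    {"_id": "b1", "parent": "B"},
]
uncollapsed_count = total_hits(matching)
collapsed_count = total_hits(matching, collapse_field="parent")
```

With three matching documents spread over two `parent` values, the collapsed count is 2 rather than 3.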
- Configurable stemming for text fields
- Collapse fields for variant grouping
- Search result sorting with sortBy parameter
- Personalization with context documents
- Query logging for slow and failed queries
Fixed a bug so that base64 image strings are omitted from search responses, reducing response size.
- Base64-encoded image search
- Language support for lexical fields
Fixed bug affecting recall when combining searchable_attributes with required terms in lexical and hybrid queries.
Fixed a bug affecting filter application in hybrid mode lexical search.
Improved model warm-up logic to reduce memory usage by warming models on a single device.
- Inference cache (40% throughput improvement, 25% latency improvement)
- SigLIP2 model support
- Approximate threshold parameter