claude-flow

v3.10.19 Feature

This release adds 2 notable features for engineering teams evaluating rollout.

Published 1mo AI Agents & Assistants

✓ No known CVEs patched

Topics

agentic-ai agentic-framework agentic-workflow agents ai-agents ai-assistant

ai-coding ai-skills autonomous-agents claude-code codex harness mcp-server multi-agent multi-agent-systems npm skills swarm swarm-intelligence typescript

Summary

AI summary

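Adds multi-field BM25 retrieval (subject weighted 3× over body) and an opt-in type penalty for meta-commits. On the release's own benchmark (N=385, 10 queries), top-1 hit rate rises from 0% to 80% relative to the 3.10.17 cosine baseline.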

Changes in this release

Type	Severity	Summary	CVE
Feature
Feature	Medium	Adds multi-field BM25 retrieval treating subject and body as separate fields (subjectWeight=3, bodyWeight=1). Source: llm_adapter@2026-05-30 Confidence: high	—
Feature	Low	Adds opt‑in type penalty for meta‑commits via `typePenalty()` function (default no‑op). Source: llm_adapter@2026-05-30 Confidence: high	—
Feature	Low	Adds new parameters `subjectWeight`, `bodyWeight`, and `typePenaltyFactor` to the neural_patterns MCP tool. Source: llm_adapter@2026-05-30 Confidence: high	—
Performance
Performance	Medium	Improves top‑1 hit rate from 0% to 80% with multi‑field BM25 (Δ +80pp). Source: llm_adapter@2026-05-30 Confidence: high	—
Performance	Medium	Improves MRR@3 from 0.000 to 0.800 (Δ +0.800). Source: llm_adapter@2026-05-30 Confidence: high	—
Performance	Low	Increases average query latency from 28.7 ms to 39.0 ms (Δ +10 ms) relative to the cosine baseline. Source: llm_adapter@2026-05-30 Confidence: high	—
Refactor	Low	Expands `__tests__/hybrid-retrieval.test.ts` to 39 unit tests (up from 21). Source: llm_adapter@2026-05-30 Confidence: high	—

Full changelog

What ships

Multi-field BM25 (subject 3× over body) and an opt-in type penalty for
meta-commits. Both improvements come with a full ablation table — and the
ablation drove a real decision (type penalty defaults to OFF because it
hurts top-1 when multi-field BM25 is doing the heavy lifting).

The cumulative win (3.10.17 cosine → 3.10.19)

| Metric (N=385, 10 queries) | 3.10.17 cosine | 3.10.18 hybrid | 3.10.19 | Δ since cosine |
|---|---:|---:|---:|---:|
| Top-1 hit rate | 0% | 50% | 80% | +80pp |
| Top-3 hit rate | 0% | 70% | 80% | +80pp |
| MRR@3 | 0.000 | 0.583 | 0.800 | +0.800 |
| Top-1 diversity | 100% | 80% | 100% | 0pp (recovered) |
| Avg query latency | 28.7 ms | 40.6 ms | 39.0 ms | +10 ms |
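
For readers unfamiliar with the metrics in the table, here is a minimal sketch of how top-k hit rate and MRR@k are conventionally computed from the rank of the first relevant result per query. The function names and the rank list are illustrative assumptions, not taken from the benchmark script; the ranks are just one distribution consistent with the 3.10.18 column, not the actual per-query data.

```typescript
// Rank (1-based) of the first relevant result for a query, or null if none was returned.
type FirstRelevantRank = number | null;

// Fraction of queries whose first relevant result appears within the top k.
function hitRateAtK(ranks: FirstRelevantRank[], k: number): number {
  if (ranks.length === 0) return 0;
  const hits = ranks.filter((r) => r !== null && r <= k).length;
  return hits / ranks.length;
}

// Mean reciprocal rank, counting only relevant results within the top k.
function mrrAtK(ranks: FirstRelevantRank[], k: number): number {
  if (ranks.length === 0) return 0;
  let sum = 0;
  for (const r of ranks) {
    if (r !== null && r <= k) sum += 1 / r;
  }
  return sum / ranks.length;
}

// Hypothetical ranks for 10 queries: five hits at rank 1, one at rank 2,
// one at rank 3, three misses. Chosen only to match the 3.10.18 column.
const illustrativeRanks: FirstRelevantRank[] = [1, 1, 1, 1, 1, 2, 3, null, null, null];

console.log(hitRateAtK(illustrativeRanks, 1)); // top-1 hit rate
console.log(hitRateAtK(illustrativeRanks, 3)); // top-3 hit rate
console.log(mrrAtK(illustrativeRanks, 3).toFixed(3)); // MRR@3
```

With ten queries, each query moves a hit rate by 10pp, which is why the table's values fall on multiples of 10%.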

The ablation that drove the decisions

| Configuration | Top-1 | Top-3 | MRR@3 |
|---|:---:|:---:|:---:|
| Cosine baseline | 0/10 | 0/10 | 0.000 |
| Single-field BM25, no penalty (~3.10.18) | 5/10 | 7/10 | 0.583 |
| Single-field BM25 + type penalty 0.5 | 7/10 | 7/10 | 0.700 |
| Multi-field BM25 3:1, no penalty (3.10.19 default) | 8/10 | 8/10 | 0.800 |
| Multi-field BM25 3:1 + type penalty 0.5 | 7/10 | 8/10 | 0.750 |

Multi-field BM25 alone wins. Adding the type penalty hurts top-1 because
some real work commits start with Merge feat/... and get falsely demoted.
The penalty stays in the codebase as an opt-in for callers with different
commit conventions.
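
The false demotion described above can be illustrated with a small sketch of an opt-in multiplicative penalty. Everything here is an assumption for illustration: `applyTypePenalty`, `EXAMPLE_META_REGEX`, and the [0, 1] factor range are hypothetical and are not the exported `typePenalty()` signature or the real `META_COMMIT_REGEX`.

```typescript
// Hypothetical pattern for "meta" commits; the real META_COMMIT_REGEX is not
// reproduced in these notes and is likely different.
const EXAMPLE_META_REGEX = /^(merge|revert|bump|release)\b/i;

// Multiplies the score by `factor` when the subject looks like a meta-commit.
// factor = 1.0 is a no-op (the default); 0.5 halves the score of matches.
function applyTypePenalty(
  score: number,
  subject: string | undefined,
  factor: number = 1.0,
  metaRegex: RegExp = EXAMPLE_META_REGEX,
): number {
  // Assumed bounds: a factor outside [0, 1] would boost or flip scores.
  if (factor < 0 || factor > 1) throw new RangeError("factor must be in [0, 1]");
  if (!subject) return score; // undefined or empty subject: leave the score alone
  return metaRegex.test(subject) ? score * factor : score;
}

// A merge commit that carries real feature work is demoted when the penalty is on.
console.log(applyTypePenalty(2.0, "Merge feat/retrieval-tuning", 0.5));
// With the default factor the same commit keeps its score.
console.log(applyTypePenalty(2.0, "Merge feat/retrieval-tuning"));
```

A prefix-based rule cannot tell a housekeeping merge from a merge that lands a feature branch, which is consistent with the ablation result that the penalty costs one top-1 hit once multi-field BM25 is in place.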

What changed in code

- `multiFieldBM25()` in `src/memory/hybrid-retrieval.ts`: treats pattern subject (`name`) and body (`content`) as separate fields with independent IDF distributions. Defaults: `subjectWeight=3.0`, `bodyWeight=1.0`.
- `typePenalty()` + `META_COMMIT_REGEX`: exported, but the penalty defaults to a no-op (`typePenaltyFactor=1.0`). Callers can pass `0.5` for aggressive meta-commit suppression.
- `neural_patterns` MCP tool: new params `subjectWeight`, `bodyWeight`, `typePenaltyFactor`. Response shape unchanged.
- 39 unit tests in `__tests__/hybrid-retrieval.test.ts` (was 21 in 3.10.18), covering ranking, weight collapse cases, regex coverage, factor bounds, and undefined-name safety.
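
To make the multi-field idea concrete, here is a minimal sketch that scores subject and body with separate BM25 statistics (each field has its own document frequencies and average length) and combines them as a weighted sum. It is not the code in `src/memory/hybrid-retrieval.ts`: the names `rankMultiField`, `buildFieldIndex`, and `bm25`, the tokenizer, and the `k1`/`b` values are assumptions, and the real implementation may combine fields differently.

```typescript
interface Pattern { name?: string; content: string }

interface FieldIndex { docs: string[][]; df: Record<string, number>; avgLen: number }

// Lowercase and split on non-alphanumerics; an undefined field yields no tokens.
const tokenize = (text: string | undefined): string[] =>
  (text ?? "").toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);

// Per-field statistics: token lists, document frequencies, average length.
function buildFieldIndex(texts: (string | undefined)[]): FieldIndex {
  const docs = texts.map(tokenize);
  const df: Record<string, number> = Object.create(null);
  for (const tokens of docs) {
    const unique = tokens.filter((t, i) => tokens.indexOf(t) === i);
    for (const term of unique) df[term] = (df[term] ?? 0) + 1;
  }
  const total = docs.reduce((n, d) => n + d.length, 0);
  return { docs, df, avgLen: docs.length > 0 ? total / docs.length : 0 };
}

// Standard BM25 for one document within one field.
function bm25(index: FieldIndex, docId: number, query: string[], k1 = 1.2, b = 0.75): number {
  const tokens = index.docs[docId];
  const n = index.docs.length;
  let score = 0;
  for (const term of query) {
    const tf = tokens.filter((t) => t === term).length;
    if (tf === 0) continue;
    const df = index.df[term] ?? 0;
    const idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
    const lengthRatio = index.avgLen > 0 ? tokens.length / index.avgLen : 0;
    score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + b * lengthRatio));
  }
  return score;
}

// Weighted sum of per-field BM25 scores, sorted best first.
function rankMultiField(
  patterns: Pattern[],
  query: string,
  subjectWeight = 3.0,
  bodyWeight = 1.0,
): { index: number; score: number }[] {
  const subject = buildFieldIndex(patterns.map((p) => p.name));
  const body = buildFieldIndex(patterns.map((p) => p.content));
  const q = tokenize(query);
  return patterns
    .map((_, i) => ({
      index: i,
      score: subjectWeight * bm25(subject, i, q) + bodyWeight * bm25(body, i, q),
    }))
    .sort((x, y) => y.score - x.score);
}

// Made-up commits: one matches the query in its subject, one only in its body.
const examplePatterns: Pattern[] = [
  { name: "fix: hybrid retrieval ranking", content: "adjust weights" },
  { name: "docs: update readme", content: "mentions hybrid retrieval ranking in passing among many other unrelated words about installation" },
  { content: "routine dependency update" }, // missing name must not throw
];
console.log(rankMultiField(examplePatterns, "hybrid retrieval"));
```

In this toy corpus the subject match ranks first under the 3:1 default, while setting `subjectWeight` to 0 hands the top spot to the body-only match, which is the kind of weight collapse case the unit tests are described as covering.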

Reproduce

git clone https://github.com/ruvnet/ruflo && cd ruflo
npm install && ( cd v3/@claude-flow/cli && npx tsc )

# Unit tests
( cd v3/@claude-flow/cli && npx vitest run __tests__/hybrid-retrieval.test.ts __tests__/pretrain-from-github.test.ts )

# Live A/B
cd v3/@claude-flow/cli
node scripts/pretrain-from-github.mjs
node scripts/benchmark-pretrained-retrieval.mjs              # 3.10.19 default → 80% top-1
HYBRID=0 node scripts/benchmark-pretrained-retrieval.mjs     # cosine baseline → 0% top-1

Honest limits

- N=385 with 10 queries is small. The relevance metric is a regex over the commit subject; a labelled held-out corpus would tighten confidence intervals. The direction (0% → 80% top-1) is robust to noise.
- The subject:body 3:1 weight was chosen by inspection, not grid search. A future ADR could grid-search on a wider corpus.
- The type penalty regex is hand-curated for ruflo's conventions. Other repos with different conventions need their own regex; the function takes one as a parameter.
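
To put a rough number on the small-sample caveat above, here is a sketch of a 95% Wilson score interval for a hit rate measured on 10 queries. This is a standard textbook formula, not something the release computes; it assumes the queries behave like independent trials, which a hand-picked query set may not satisfy, and `wilsonInterval` is a hypothetical helper name.

```typescript
// Wilson score interval for a binomial proportion (z = 1.96 for about 95%).
function wilsonInterval(successes: number, trials: number, z = 1.96): [number, number] {
  if (trials <= 0) throw new RangeError("trials must be positive");
  const p = successes / trials;
  const z2 = z * z;
  const denom = 1 + z2 / trials;
  const center = (p + z2 / (2 * trials)) / denom;
  const half = (z * Math.sqrt((p * (1 - p)) / trials + z2 / (4 * trials * trials))) / denom;
  return [Math.max(0, center - half), Math.min(1, center + half)];
}

const [lowNew, highNew] = wilsonInterval(8, 10); // 3.10.19 top-1: 8/10
const [lowOld, highOld] = wilsonInterval(0, 10); // cosine baseline top-1: 0/10
console.log(lowNew.toFixed(3), highNew.toFixed(3));
console.log(lowOld.toFixed(3), highOld.toFixed(3));
```

Under these assumptions, 8/10 gives an interval of roughly 49% to 94% and 0/10 gives roughly 0% to 28%. The two intervals do not overlap, which fits the claim that the direction is robust, while the width of each shows why a larger labelled corpus would help pin down the magnitude.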

What's next

- Cross-encoder reranker (3.11.0, MINOR, new dep): paper-proven path for closing the remaining 80% → 100% top-1 gap
- Learned distiller (paper's 11× compression): #2241 round-D
- Grid-search for subject/body weights on a wider held-out corpus
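
As a rough idea of what the planned weight grid search could look like, here is a minimal sketch that tries every subject/body weight pair and keeps the best-scoring one. `gridSearchWeights` and the `evaluate` callback are hypothetical; in practice `evaluate` would run the retrieval benchmark and return a metric such as MRR@3, whereas the toy objective below is a stand-in with no connection to real results.

```typescript
// Returns a quality metric (higher is better) for one weight pair.
type Evaluate = (subjectWeight: number, bodyWeight: number) => number;

interface GridResult { subjectWeight: number; bodyWeight: number; score: number }

// Exhaustive search over the Cartesian product of candidate weights.
// Ties keep the first pair encountered.
function gridSearchWeights(
  subjectWeights: number[],
  bodyWeights: number[],
  evaluate: Evaluate,
): GridResult {
  let best: GridResult = { subjectWeight: NaN, bodyWeight: NaN, score: -Infinity };
  for (const s of subjectWeights) {
    for (const b of bodyWeights) {
      const score = evaluate(s, b);
      if (score > best.score) best = { subjectWeight: s, bodyWeight: b, score };
    }
  }
  return best;
}

// Toy objective that peaks when subject:body is 3:1. Purely illustrative.
const toyObjective: Evaluate = (s, b) => -Math.abs(s / b - 3);

console.log(gridSearchWeights([1, 2, 3, 4], [1, 2], toyObjective));
```

Tuning on the same 10 queries used for reporting would overfit, which is presumably why the roadmap item calls for a wider held-out corpus.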

Install

npx [email protected]    # latest / alpha / v3alpha all aligned

Full ADR: v3/docs/adr/ADR-079-multifield-bm25-and-type-penalty.md
