claude-flow

v3.10.18 Feature

This release adds 2 notable features for engineering teams evaluating rollout.

Published 1mo AI Agents & Assistants

View tool

✓ No known CVEs patched

Read the diff → Tool health → What is this tool? →

✓ No known CVEs patched in this version

Topics

agentic-ai agentic-framework agentic-workflow agents ai-agents ai-assistant

+14 more

ai-coding ai-skills autonomous-agents claude-code codex harness mcp-server multi-agent multi-agent-systems npm skills swarm swarm-intelligence typescript

Summary

AI summary

Updates What changed, What's next, and Honest limits across a mixed release.

Changes in this release

Type	Severity	Summary	CVE
Feature
Feature	Medium	Adds hybrid retrieval (BM25 + cosine + MMR) with outcome signal to pretrain harvester. Adds hybrid retrieval (BM25 + cosine + MMR) with outcome signal to pretrain harvester. Source: llm_adapter@2026-05-30 Confidence: high	—
Feature	Medium	Introduces new `neural_patterns` MCP tool search parameters: mode, alpha, mmrLambda, limit. Introduces new `neural_patterns` MCP tool search parameters: mode, alpha, mmrLambda, limit. Source: llm_adapter@2026-05-30 Confidence: high	—
Feature	Medium	Persists source text in `Pattern.content` field for BM25 tokenisation; backwards compatible fallback to `name`. Persists source text in `Pattern.content` field for BM25 tokenisation; backwards compatible fallback to `name`. Source: llm_adapter@2026-05-30 Confidence: high	—
Feature	Medium	Adds outcome signal detection (reverted, hotfixed) in pretrain harvester with verdict mix and metadata fields. Adds outcome signal detection (reverted, hotfixed) in pretrain harvester with verdict mix and metadata fields. Source: llm_adapter@2026-05-30 Confidence: high	—
Performance	Low	Hybrid retrieval increases average query latency from 28.7 ms to 40.6 ms (≈40% slower). Hybrid retrieval increases average query latency from 28.7 ms to 40.6 ms (≈40% slower). Source: llm_adapter@2026-05-30 Confidence: high	—

Full changelog

What ships

Hybrid retrieval (BM25 + cosine + MMR) plus outcome signal for the
pretrain harvester. Closes the relevance gap that ADR-077 exposed: cosine-only
search was returning plausible-but-off-topic results because the bridge ONNX
bi-encoder gets distracted on small corpora by IDF-cheap shared tokens.

The actual win

Measured on this checkout (N=385 patterns, 10 queries, real bridge ONNX
embedder — same setup ADR-077 used):

| Metric | Cosine (pre-3.10.18) | Hybrid (3.10.18) | Δ |
|---|---:|---:|---:|
| Top-1 hit rate (RELEVANCE) | 0% | 50% | +50pp |
| Top-3 hit rate (RELEVANCE) | 0% | 70% | +70pp |
| MRR@3 | 0.000 | 0.583 | +0.583 |
| Top-1 diversity | 100% | 80% | -20pp |
| Avg query latency | 28.7 ms | 40.6 ms | +11.9 ms |

Cosine was returning 0% relevant top-3 results — finding "something" but never
the right thing. Hybrid lands a relevant top-1 50% of the time, top-3 70%.

What changed

src/memory/hybrid-retrieval.ts — pure functions, no deps:
tokenize, buildCorpusStats, bm25Score, normalise, hybridScores,
cosineSim, mmrRerank. 21 unit tests covering edge cases.
neural_patterns MCP tool — new search params:
- mode: 'hybrid' | 'cosine' (default hybrid; cosine preserved for A/B)
- alpha — cosine weight in [0,1] (default 0.6)
- mmrLambda — 1.0 = pure relevance, 0.0 = pure diversity (default 0.5)
- limit — top-K (default 10, max 100)
- Response includes hybridScore, cosineScore, bm25Score, mmrScore
  so callers can inspect why a result ranked where it did
Pattern.content field — neural store now persists source text (cap
4096 chars). BM25 needs tokens to score against. Backwards compatible:
pre-3.10.18 patterns fall back to name for BM25 tokenisation.
Outcome signal in pretrain harvester — detects:
- reverted — later commit's subject is Revert "<this subject>"
- hotfixed — later commit (within window) shares ≥50% files AND has
  fix/hotfix/patch in subject
- Verdict mix in summary.feed.verdictMix; original outcome in
  metadata.outcomeVerdict on each trajectory

Reproduce

git clone https://github.com/ruvnet/ruflo && cd ruflo
npm install && ( cd v3/@claude-flow/cli && npx tsc -b )

# Unit tests (no I/O) — 21 + 7 tests
( cd v3/@claude-flow/cli && npx vitest run __tests__/hybrid-retrieval.test.ts __tests__/pretrain-from-github.test.ts )

# A/B benchmark
node v3/@claude-flow/cli/scripts/pretrain-from-github.mjs
node v3/@claude-flow/cli/scripts/benchmark-pretrained-retrieval.mjs        # hybrid (default)
HYBRID=0 node v3/@claude-flow/cli/scripts/benchmark-pretrained-retrieval.mjs   # cosine baseline

Honest limits

N=385 with 10 queries is small. The relevance metric is regex-over-subject —
a labelled held-out set would be stronger. Direction is robust; magnitude
could move on a different corpus.
Hybrid is 40% slower per query (28.7 → 40.6 ms). Still <50 ms but worth
budgeting on hot paths. Cosine-only mode preserved for callers who need it.
This checkout has zero reverts/hotfixes in the 200 most recent commits, so
the outcome detector emits a clean success=200 distribution. Detector is
unit-tested; the empty count reflects a clean recent history.

What's next

Cross-encoder reranker (3.11.0, MINOR — new dep): standard SOTA pattern
for another +0.05-0.15 MRR
Learned distiller (paper's 11× compression target): #2241 round-D
Negative-reward propagation on retrieval miss: needs agent-level success
attribution we don't yet emit reliably

Install

npx [email protected]             # or @latest, @alpha, @v3alpha (all aligned)

Full ADR: v3/docs/adr/ADR-078-hybrid-retrieval-and-outcome-signal.md

View diff on GitHub

Weekly OSS security release digest.

The CVE patches and breaking changes that affected production tools this week. One email, every Sunday.

No spam, unsubscribe anytime.

Share this release

Share on X Share on Bluesky

Track claude-flow

Get notified when new releases ship.

About claude-flow

Deploy multi-agent swarms with coordinated workflows.

All releases →