This release adds 2 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
AI summary: updates to What changed in code, What's next, and Honest limits across a mixed release.
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Feature | Medium | Adds multi-field BM25 retrieval treating subject and body as separate fields (subjectWeight=3, bodyWeight=1). | None |
| Feature | Low | Adds opt-in type penalty for meta-commits via `typePenalty()` function (default no-op). | None |
| Feature | Low | Adds new parameters `subjectWeight`, `bodyWeight`, and `typePenaltyFactor` to the `neural_patterns` MCP tool. | None |
| Performance | Medium | Improves top-1 hit rate from 0% (3.10.17 cosine baseline) to 80% with multi-field BM25 (Δ +80pp). | None |
| Performance | Medium | Improves MRR@3 from 0.000 (3.10.17 cosine baseline) to 0.800 (Δ +0.800). | None |
| Performance | Low | Average query latency rises from 28.7 ms (3.10.17 cosine baseline) to 39.0 ms (Δ +10 ms), slightly below the 40.6 ms of 3.10.18. | None |
| Refactor | Low | Brings `__tests__/hybrid-retrieval.test.ts` to 39 unit tests (up from 21 in 3.10.18). | None |
Source for all rows: llm_adapter@2026-05-30. Confidence: high.
Full changelog
What ships
Multi-field BM25 (subject 3× over body) and an opt-in type penalty for
meta-commits. Both improvements come with a full ablation table — and the
ablation drove a real decision (type penalty defaults to OFF because it
hurts top-1 when multi-field BM25 is doing the heavy lifting).
The cumulative win (3.10.17 cosine → 3.10.19)
| Metric (N=385, 10 queries) | 3.10.17 cosine | 3.10.18 hybrid | 3.10.19 | Δ since cosine |
|---|---:|---:|---:|---:|
| Top-1 hit rate | 0% | 50% | 80% | +80pp |
| Top-3 hit rate | 0% | 70% | 80% | +80pp |
| MRR@3 | 0.000 | 0.583 | 0.800 | +0.800 |
| Top-1 diversity | 100% | 80% | 100% | 0pp (recovered) |
| Avg query latency | 28.7 ms | 40.6 ms | 39.0 ms | +10 ms |
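For readers who want to check how the hit-rate and MRR@3 rows relate, here is a minimal sketch of both metrics. The function names and the rank list are illustrative assumptions: the ranks are one distribution consistent with the 3.10.18 column (50% top-1, 70% top-3, MRR@3 0.583), not the actual per-query results, and this is not the project's benchmark script.

```typescript
// 1-based rank of the first relevant result for a query,
// or null when no relevant result was returned.
type FirstRelevantRank = number | null;

// Fraction of queries whose first relevant result is within the top k.
function hitRateAtK(ranks: FirstRelevantRank[], k: number): number {
  const hits = ranks.filter((r) => r !== null && r <= k).length;
  return hits / ranks.length;
}

// Mean reciprocal rank, counting only hits within the top k.
function mrrAtK(ranks: FirstRelevantRank[], k: number): number {
  const total = ranks.reduce<number>(
    (sum, r) => sum + (r !== null && r <= k ? 1 / r : 0),
    0,
  );
  return total / ranks.length;
}

// ASSUMED ranks: five hits at rank 1, one at rank 2, one at rank 3, three misses.
const hybridRanks: FirstRelevantRank[] = [1, 1, 1, 1, 1, 2, 3, null, null, null];

console.log(hitRateAtK(hybridRanks, 1)); // 0.5
console.log(hitRateAtK(hybridRanks, 3)); // 0.7
console.log(mrrAtK(hybridRanks, 3).toFixed(3)); // "0.583"
```

With only 10 queries, each query moves a hit rate by 10pp, so the table is best read as counts out of 10.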
The ablation that drove the decisions
| Configuration | Top-1 | Top-3 | MRR@3 |
|---|:---:|:---:|:---:|
| Cosine baseline | 0/10 | 0/10 | 0.000 |
| Single-field BM25, no penalty (~3.10.18) | 5/10 | 7/10 | 0.583 |
| Single-field BM25 + type penalty 0.5 | 7/10 | 7/10 | 0.700 |
| Multi-field BM25 3:1, no penalty (3.10.19 default) | 8/10 | 8/10 | 0.800 |
| Multi-field BM25 3:1 + type penalty 0.5 | 7/10 | 8/10 | 0.750 |
Multi-field BM25 alone wins. Adding the type penalty hurts top-1 because
some real work commits start with Merge feat/... and get falsely demoted.
The penalty stays in the codebase as an opt-in for callers with different
commit conventions.
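The false-demotion effect can be illustrated with a small sketch. Everything here is an assumption for illustration: `EXAMPLE_META_REGEX` is a stand-in and not the project's `META_COMMIT_REGEX`, `examplePenalty` is a hypothetical signature rather than the real `typePenalty()`, and the commit subjects are invented.

```typescript
// Hypothetical stand-in for META_COMMIT_REGEX; the real pattern may differ.
const EXAMPLE_META_REGEX = /^(merge|revert|bump|release)\b/i;

// Returns a multiplier for a retrieval score: `factor` for subjects that
// look like meta-commits, 1.0 otherwise. factor = 1.0 makes this a no-op.
function examplePenalty(
  subject: string | undefined,
  factor: number = 1.0,
  metaRegex: RegExp = EXAMPLE_META_REGEX,
): number {
  if (!subject) return 1.0; // undefined or empty subject: leave the score alone
  return metaRegex.test(subject) ? factor : 1.0;
}

// A merge commit that carries real feature work is still demoted at 0.5.
console.log(examplePenalty("Merge feat/example-branch into main", 0.5)); // 0.5
// An ordinary feature commit is untouched.
console.log(examplePenalty("feat: add multi-field scoring", 0.5)); // 1
// With the default factor the penalty does nothing.
console.log(examplePenalty("Merge feat/example-branch into main")); // 1
```

Passing the regex as a parameter is what would let a repository with different commit conventions supply its own pattern.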
What changed in code
- `multiFieldBM25()` in `src/memory/hybrid-retrieval.ts`: treats pattern subject (name) and body (content) as separate fields with independent IDF distributions. Defaults: `subjectWeight=3.0`, `bodyWeight=1.0`.
- `typePenalty()` + `META_COMMIT_REGEX`: exported but defaults to no-op (`typePenaltyFactor=1.0`). Callers can pass `0.5` for aggressive meta-commit suppression.
- `neural_patterns` MCP tool: new params `subjectWeight`, `bodyWeight`, `typePenaltyFactor`. Response shape unchanged.
- 39 unit tests in `__tests__/hybrid-retrieval.test.ts` (was 21 in 3.10.18); covers ranking, weight collapse cases, regex coverage, factor bounds, undefined-name safety.
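To make "separate fields with independent IDF distributions" concrete, here is a minimal sketch of one common way to do multi-field BM25: score each field with its own document frequencies and average length, then take a weighted sum. It is not the code in `src/memory/hybrid-retrieval.ts`. The `Pattern` shape, the tokenizer, the `k1`/`b` values, the weighted-sum combination, and the toy corpus are all assumptions.

```typescript
interface Pattern {
  name?: string; // subject; may be undefined
  content: string; // body
}

const tokenize = (text: string | undefined): string[] =>
  (text ?? "").toLowerCase().match(/[a-z0-9]+/g) ?? [];

// Plain BM25 over a single field. Document frequencies and average length
// are computed from this field only, so each field gets its own IDF.
function bm25Field(query: string[], docs: string[][], k1 = 1.2, b = 0.75): number[] {
  const n = docs.length;
  const avgLen = docs.reduce((s, d) => s + d.length, 0) / Math.max(n, 1) || 1;
  const df = new Map<string, number>();
  docs.forEach((d) =>
    Array.from(new Set(d)).forEach((t) => df.set(t, (df.get(t) ?? 0) + 1)),
  );
  const terms = Array.from(new Set(query));
  return docs.map((d) => {
    let score = 0;
    for (const q of terms) {
      const tf = d.filter((t) => t === q).length;
      if (tf === 0) continue;
      const dfq = df.get(q) ?? 0;
      const idf = Math.log(1 + (n - dfq + 0.5) / (dfq + 0.5));
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * d.length) / avgLen));
    }
    return score;
  });
}

// Weighted sum of per-field scores; defaults mirror the release's 3:1 ratio.
function multiFieldScores(
  query: string,
  patterns: Pattern[],
  subjectWeight = 3.0,
  bodyWeight = 1.0,
): number[] {
  const q = tokenize(query);
  const subject = bm25Field(q, patterns.map((p) => tokenize(p.name)));
  const body = bm25Field(q, patterns.map((p) => tokenize(p.content)));
  return patterns.map((_, i) => subjectWeight * (subject[i] ?? 0) + bodyWeight * (body[i] ?? 0));
}

// Toy corpus (invented): pattern 0 matches in the subject, pattern 1 only in a long body.
const toyPatterns: Pattern[] = [
  { name: "hybrid retrieval ranking fix", content: "tune weights" },
  { name: "update changelog", content: "mentions hybrid retrieval ranking in release notes for hybrid retrieval" },
  { name: "bump deps", content: "routine dependency update" },
];

const weighted = multiFieldScores("hybrid retrieval ranking", toyPatterns);
const flat = multiFieldScores("hybrid retrieval ranking", toyPatterns, 1, 1);
console.log(weighted[0] > weighted[1]); // true: subject match ranks first at 3:1
console.log(flat[1] > flat[0]); // true: at 1:1 the repeated body match wins
```

In this toy corpus the 3:1 weighting is what lifts the subject match above a body that merely repeats the query terms.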
Reproduce
```bash
git clone https://github.com/ruvnet/ruflo && cd ruflo
npm install && ( cd v3/@claude-flow/cli && npx tsc )

# Unit tests
( cd v3/@claude-flow/cli && npx vitest run __tests__/hybrid-retrieval.test.ts __tests__/pretrain-from-github.test.ts )

# Live A/B
cd v3/@claude-flow/cli
node scripts/pretrain-from-github.mjs
node scripts/benchmark-pretrained-retrieval.mjs           # 3.10.19 default → 80% top-1
HYBRID=0 node scripts/benchmark-pretrained-retrieval.mjs  # cosine baseline → 0% top-1
```
Honest limits
- N=385, 10 queries is small. The relevance metric is regex-over-commit-subject; a labelled held-out corpus would tighten confidence intervals. Direction (0% → 80% top-1) is robust to noise.
- Subject:body 3:1 weight chosen by inspection, not grid-search. A future ADR could grid-search on a wider corpus.
- Type penalty regex is hand-curated for ruflo's conventions. Other repos with different conventions need their own regex; the function takes one as a parameter.
What's next
- Cross-encoder reranker (3.11.0, MINOR, new dep): paper-proven path for closing the remaining 80% → 100% top-1 gap
- Learned distiller (paper's 11× compression): #2241 round-D
- Grid-search for subject/body weights on a wider held-out corpus
Install
npx [email protected] # latest / alpha / v3alpha all aligned
Full ADR: v3/docs/adr/ADR-079-multifield-bm25-and-type-penalty.md