This release adds 3 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
AI summary: A mixed release that updates retrieval defaults in code, the labelled benchmark results, and v3/docs/adr/ADR-083-joint-rerank-grid.md.
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Feature | Medium | `subjectWeight` default now conditional on `useRerank` flag (3.0 when reranking, 2.0 otherwise). | — |
| Feature | Medium | Updated default hybrid and cross-encoder weights to hw=0.7, cw=0.3. | — |
| Feature | Low | Extended `scripts/grid-search-retrieval.mjs` with joint rerank sweep (28 configs). | — |
| Refactor | Low | Updated schema descriptions to reflect conditional defaults. | — |

Source for all entries: llm_adapter@2026-05-30 (confidence: high).
Full changelog
What ships
Joint rerank re-grid: the rerank path's hybrid sub-params (α, sw) had been
tuned against the old α=0.6/sw=3.0 baseline. With ADR-082 changing α/sw underneath it,
a joint re-grid was the next ceiling-raiser. It paid off: rerank nDCG@3 rose from 0.900 to 0.963.
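To illustrate what "joint" means here, the sketch below enumerates every combination of hw/cw, α, and sw instead of tuning one parameter at a time, then keeps the best-scoring config. The type, function names, candidate values, and the `evaluate` callback are hypothetical placeholders; the real sweep lives in `scripts/grid-search-retrieval.mjs` and its 28 configs are not reproduced here.

```typescript
// Illustrative joint grid search. Candidate values are placeholders,
// NOT the 28 configs swept by the real script.
interface RerankConfig {
  hybridWeight: number;  // hw
  ceWeight: number;      // cw
  alpha: number;         // α
  subjectWeight: number; // sw
}

// Build the full cartesian product of the three parameter axes.
function buildJointGrid(
  weightPairs: Array<[number, number]>,
  alphas: number[],
  subjectWeights: number[],
): RerankConfig[] {
  const grid: RerankConfig[] = [];
  for (const [hybridWeight, ceWeight] of weightPairs) {
    for (const alpha of alphas) {
      for (const subjectWeight of subjectWeights) {
        grid.push({ hybridWeight, ceWeight, alpha, subjectWeight });
      }
    }
  }
  return grid;
}

// Score every config with a caller-supplied metric and keep the best.
// Ties keep the earlier config.
function pickBest(
  grid: RerankConfig[],
  evaluate: (config: RerankConfig) => number,
): { config: RerankConfig; score: number } {
  if (grid.length === 0) throw new Error("grid must not be empty");
  let best = { config: grid[0], score: evaluate(grid[0]) };
  for (const config of grid.slice(1)) {
    const score = evaluate(config);
    if (score > best.score) best = { config, score };
  }
  return best;
}

// 2 weight pairs × 2 alphas × 2 subject weights = 8 configs.
const grid = buildJointGrid([[0.5, 0.5], [0.7, 0.3]], [0.5, 0.6], [2.0, 3.0]);
```

In a real run, `evaluate` would be whatever benchmark metric is being optimised (nDCG@3 in these notes); the joint sweep matters because the best sw for one α or hw/cw setting need not be the best for another.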
The key finding
The rerank path wants different hybrid sub-params than the non-rerank path:
| Path | Best α | Best sw | Best hw/cw | nDCG@3 |
|---|---:|---:|---|---:|
| Non-rerank (hybrid only) | 0.5 | 2.0 | — | 0.963 |
| Rerank | 0.5 | 3.0 | hw=0.7 cw=0.3 | 0.963 |
When the cross-encoder is doing semantic understanding downstream, the hybrid
stage can be more keyword-focused (higher subjectWeight). When hybrid is
the final stage, lower subjectWeight gives body tokens room to contribute.
Implementation: the `subjectWeight` default is now conditional on the `useRerank` flag
(3.0 when reranking, 2.0 otherwise). An explicitly passed value still overrides the default.
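A minimal sketch of that default resolution, assuming a plain params object. The `RetrievalParams` shape and the `resolveSubjectWeight` name are hypothetical; the actual code in `src/mcp-tools/neural-tools.ts` may be structured differently.

```typescript
// Hypothetical parameter shape; the real tool schema may differ.
interface RetrievalParams {
  useRerank?: boolean;
  subjectWeight?: number;
}

function resolveSubjectWeight(params: RetrievalParams): number {
  // An explicitly passed value always wins over the conditional default.
  if (params.subjectWeight !== undefined) return params.subjectWeight;
  // Defaults stated in the release notes: 3.0 when reranking, 2.0 otherwise.
  return params.useRerank ? 3.0 : 2.0;
}
```

Checking for `undefined` rather than falsiness means an explicit `subjectWeight: 0` would still be respected.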
The win
| Metric (rerank path, labelled) | 3.10.22 | 3.10.23 | Δ |
|---|---:|---:|---:|
| Label top-1 | 90% | 90% | tied |
| Label top-3 | 90% | 100% | +10pp |
| Label MRR@3 | 0.925 | 0.950 | +0.025 |
| Label precision@3 | 0.700 | 0.700 | tied |
| Label nDCG@3 | 0.900 | 0.963 | +0.063 (+7%) |
| Label nDCG@5 | 0.904 | 0.944 | +0.040 |
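For readers less familiar with these metrics, the sketch below shows the textbook definitions of nDCG@k and reciprocal rank (MRR@k is the mean of the latter over all queries), plus the relative-gain arithmetic behind the "+7%" figure. It assumes linear gain and a log2 discount; the benchmark script's exact gain and labelling scheme is not shown in these notes, so treat this as an illustration rather than the script's implementation.

```typescript
// DCG@k with linear gain: sum of rel_i / log2(i + 1), ranks starting at 1.
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

// nDCG@k: DCG of the actual ranking divided by DCG of the ideal ordering.
// `judged` holds every relevance label known for the query; it defaults
// to the ranked list itself (an assumption made to keep the sketch short).
function ndcgAtK(ranked: number[], k: number, judged: number[] = ranked): number {
  const ideal = [...judged].sort((a, b) => b - a);
  const idealDcg = dcgAtK(ideal, k);
  return idealDcg === 0 ? 0 : dcgAtK(ranked, k) / idealDcg;
}

// Reciprocal rank of the first relevant hit within the top k (0 if none).
function reciprocalRankAtK(ranked: number[], k: number): number {
  const index = ranked.slice(0, k).findIndex((rel) => rel > 0);
  return index === -1 ? 0 : 1 / (index + 1);
}

// Relative improvement between two metric values.
function relativeGain(before: number, after: number): number {
  return (after - before) / before;
}
```

With the table's values, `relativeGain(0.900, 0.963)` is 0.063 / 0.900 = 0.07, which matches the reported +7% for nDCG@3.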
Both paths now at corpus ceiling (nDCG@3 = 0.963)
The choice between them is now purely cost vs richness:
| Path | Latency | Top-3 precision | Use when |
|---|---:|---:|---|
| Hybrid | 39 ms | 0.533 | hot paths, throughput-bound |
| Rerank | 1000 ms | 0.700 | richness-first, latency-tolerant |
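One way to encode this trade-off in calling code is a simple latency-budget check. The helper below is a hypothetical sketch, not part of ruflo; the latency figures are the ones reported in the table above and will likely vary with hardware and corpus size.

```typescript
type RetrievalPath = "hybrid" | "rerank";

// Latencies reported in these release notes; treat them as rough guides.
const REPORTED_LATENCY_MS: Record<RetrievalPath, number> = {
  hybrid: 39,
  rerank: 1000,
};

function choosePath(latencyBudgetMs: number): RetrievalPath {
  // Prefer rerank (higher top-3 precision) only when the budget covers it.
  return latencyBudgetMs >= REPORTED_LATENCY_MS.rerank ? "rerank" : "hybrid";
}
```

A caller on a hot path with, say, a 100 ms budget would get `"hybrid"`, while a batch job with several seconds to spare would get `"rerank"`.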
Cumulative SOTA push since cosine baseline (3.10.17 → 3.10.23)
| Metric (labelled) | 3.10.17 | 3.10.19 | 3.10.20 | 3.10.22 | 3.10.23 |
|---|---:|---:|---:|---:|---:|
| Hybrid nDCG@3 | 0.000 | 0.900 | 0.900 | 0.963 | 0.963 |
| Rerank nDCG@3 | — | — | 0.913 | 0.900 | 0.963 |
| Hybrid top-3 | 0% | 90% | 90% | 100% | 100% |
| Rerank top-3 | — | — | 100% | 90% | 100% |
| Rerank precision@3 | — | — | 0.667 | 0.700 | 0.700 |
What changed in code
- `subjectWeight` default is now conditional on `useRerank` in `src/mcp-tools/neural-tools.ts` (3.0 if reranking, 2.0 otherwise).
- `hybridWeight`/`ceWeight` defaults updated to grid winners: 0.5/0.5 → 0.7/0.3.
- `scripts/grid-search-retrieval.mjs` extended with joint rerank sweep (28 configs across hw/cw × α × sw).
- Schema descriptions updated to reflect the conditional defaults.
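These notes do not spell out how `hybridWeight` and `ceWeight` are combined. A common approach, assumed here, is a linear blend of the hybrid score and the cross-encoder score. The `BlendWeights` type, the `blendScores` name, and the premise that both scores sit on a comparable scale are all hypothetical.

```typescript
interface BlendWeights {
  hybridWeight: number; // hw
  ceWeight: number;     // cw
}

// New defaults named in this release (previously 0.5 / 0.5).
const DEFAULT_BLEND: BlendWeights = { hybridWeight: 0.7, ceWeight: 0.3 };

// Assumed linear blend; the real combination rule may differ.
function blendScores(
  hybridScore: number,
  ceScore: number,
  weights: BlendWeights = DEFAULT_BLEND,
): number {
  return weights.hybridWeight * hybridScore + weights.ceWeight * ceScore;
}
```

Under this assumption, moving from 0.5/0.5 to 0.7/0.3 means the hybrid stage's ordering carries more weight in the final ranking, with the cross-encoder acting as a lighter correction.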
Pending for next iteration
Cross-repo generalisation test — all numbers in ADRs 077-083 are on the
ruflo corpus. The real SOTA test is "does this hold up on a different repo's
history?" Pretrain on agentdb / agentic-flow, run a similar labelled bench,
see if nDCG@3 stays near 0.96. Tracked for 3.10.24 (or its own ADR-084).
Reproduce
git clone https://github.com/ruvnet/ruflo && cd ruflo
npm install && ( cd v3/@claude-flow/cli && npx tsc )
node v3/@claude-flow/cli/scripts/pretrain-from-github.mjs
# Joint grid (~25 min)
cd v3/@claude-flow/cli && node scripts/grid-search-retrieval.mjs
# Verify both paths at corpus ceiling
BENCH_NO_WRITE=1 node scripts/benchmark-pretrained-retrieval.mjs # hybrid → nDCG@3 0.963
RERANK=1 BENCH_NO_WRITE=1 node scripts/benchmark-pretrained-retrieval.mjs # rerank → nDCG@3 0.963 (was 0.900)
Install
npx [email protected] # latest / alpha / v3alpha all aligned
Full ADR: v3/docs/adr/ADR-083-joint-rerank-grid.md