This release includes breaking changes for platform teams planning a safe upgrade.
✓ No known CVEs patched in this version
ReleasePort's take
Light signal: v2.0.0b1 introduces a configurable cross-encoder reranker, four idempotent query rewrites, and ReDoS hardening, and fixes the reranker's default model ID along with regressions caused by query lowercasing.
Why it matters: Upgrade to v2.0.0b1 before handling untrusted input so that the new ReDoS safeguards are in place.
Changes in this release
| Type | Severity | Summary | CVE | Source | Confidence |
|---|---|---|---|---|---|
| Security | Medium | Adds ReDoS hardening via bounded quantifiers and token-count limits. | None | llm_adapter@2026-05-21 | high |
| Feature | Medium | Adds optional cross-encoder reranker via [reranker] extra installation. | None | llm_adapter@2026-05-21 | high |
| Feature | Medium | Adds four idempotent rule-based query rewrites to base install. | None | llm_adapter@2026-05-21 | high |
| Feature | Medium | Enforces 5-second wall-clock timeout on first-call reranker model load. | None | llm_adapter@2026-05-21 | high |
| Feature | Medium | Reranker sets persistent kill switch after first load failure. | None | llm_adapter@2026-05-21 | high |
| Feature | Medium | Decomposes X-of-Y and possessive queries into structured entity+attribute hints. | None | llm_adapter@2026-05-21 | high |
| Feature | Medium | Applies token-level misspelling substitution from bundled ~40-entry curated map. | None | llm_adapter@2026-05-21 | low |
| Feature | Medium | Strips leading articles from queries unless matching canonical entity titles. | None | llm_adapter@2026-05-21 | low |
| Feature | Medium | Adds master kill switch to disable all four query rewrite rules atomically. | None | llm_adapter@2026-05-21 | low |
| Bugfix | Medium | Fixes reranker default model ID from unsupported Xenova to BAAI/bge-reranker-base. | None | llm_adapter@2026-05-21 | high |
| Bugfix | Medium | Fixes four code paths broken by unconditional query lowercasing. | None | llm_adapter@2026-05-21 | low |
| Bugfix | Low | Fixes several code paths broken by unconditional query lowercasing (e.g., command keywords, section extraction). | None | granite4.1:30b@2026-05-21-audit | low |
| Refactor | Medium | Adds 68 new tests: 43 query rewrite, 25 reranker; updates 13 pre-existing. | None | llm_adapter@2026-05-21 | low |
| Refactor | Medium | CI expands to ubuntu/macos/windows with Python 3.12 and 3.13; all pass. | None | llm_adapter@2026-05-21 | low |
Full changelog
First b-series release. Two new Phase D features ship together in the alpha → beta cutover: a cross-encoder reranker behind an opt-in extra (sub-D-1, originally PR #163) and four idempotent rule-based query rewrites in the base install (sub-D-2, PR #164). Combined: every user sees a cleaner query reaching Xapian; users who opt into [reranker] also see semantic reranking on the BM25 results. Pre-release pending the live-MCP smoke sweep.
Sub-D-1 — Cross-encoder reranker behind [reranker] extra (#163)
Optional dependency. pip install openzim-mcp[reranker] pulls in FastEmbed (~150 MB Python packages) and the first rerank-eligible query lazily downloads BAAI/bge-reranker-base (~1.1 GB) from HuggingFace into the FastEmbed cache. Air-gapped operators run openzim-mcp download-models once on a network-connected machine to pre-stage the model. The base install (pip install openzim-mcp) is untouched — reranker code lazy-imports inside openzim_mcp/ml/reranker.py and is never loaded when the extra is absent.
Wired into all four search surfaces: _handle_search, _handle_filtered_search, _handle_search_all, and synthesize._collect_passages. Each surface emits per-call telemetry:
- `reranker_engaged`: cross-encoder actually scored the candidates (the returned list carries a `rerank_score` field).
- `reranker_skipped.not_installed`: `[reranker]` absent OR `OPENZIM_RERANKER_DISABLE=1` OR `config.ml.reranker.enabled=false`.
- `reranker_skipped.no_results`: Xapian returned zero candidates; nothing to rerank.
- `reranker_skipped.passthrough`: reranker ran but returned without scoring (skip-on-short-query gate fired below `min_query_tokens=4`, OR mid-inference failure tripped `@ml_fallback`).
Skip-on-short-query gate: queries with fewer than 4 word tokens bypass rerank because single-word entity queries (Berlin, Photosynthesis) already get a Xapian-score-1.0 canonical-title hit — the cross-encoder adds cost without value there.
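The telemetry labels and the short-query gate described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: `rerank_telemetry_label` and its parameters are hypothetical, whitespace splitting stands in for whatever tokenizer the project uses, and the order of the checks is a guess. Only the label names and the 4-token threshold come from the release notes.

```python
MIN_QUERY_TOKENS = 4  # threshold quoted in the release notes


def rerank_telemetry_label(
    query: str,
    candidates: list[dict],
    reranker_available: bool,
) -> str:
    """Return the label a search surface might emit (hypothetical helper)."""
    if not reranker_available:
        # Extra missing, env kill switch set, or disabled in config.
        return "reranker_skipped.not_installed"
    if not candidates:
        # Xapian returned nothing, so there is nothing to rerank.
        return "reranker_skipped.no_results"
    if len(query.split()) < MIN_QUERY_TOKENS:
        # Short entity queries keep their Xapian ordering.
        return "reranker_skipped.passthrough"
    return "reranker_engaged"
```

A single-word query such as `Berlin` would take the passthrough branch here, while a longer question would reach the cross-encoder.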
Risk mitigations baked in:
- 5-second wall-clock timeout on first-call model load with non-blocking shutdown: operators don't hang waiting on a stuck download. (The original `with ThreadPoolExecutor` pattern from the plan had a latent bug where `__exit__` blocked on `shutdown(wait=True)` despite the timeout firing; the fix manages the executor lifecycle manually with `shutdown(wait=False)`.)
- Persistent kill switch on load failure: `BGEReranker._load_failed` is set after the first failure; every subsequent call returns `None` immediately. No retry storms.
- `@ml_fallback` on `rerank()`: mid-inference exceptions (OOM, garbled UTF-8 tokens) degrade to Xapian-only ordering via `_rerank_passthrough`, log WARNING once + DEBUG on retry.
- Production model ID guarded by integration test: `test_production_default_model_is_supported_by_fastembed` checks `BAAI/bge-reranker-base` is still in `TextCrossEncoder.list_supported_models()`. Catches FastEmbed registry drift before users do.
- Cross-archive rerank in `_handle_search_all` redistributes globally-top-K hits back to per-archive buckets; ordering test pins this contract.
- `synthesize` rerank propagates `rerank_score` in `top["score"]` so the downstream `_boost_by_section_affinity` sort doesn't silently revert to BM25 order.
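The first two mitigations (timeout with manual executor lifecycle, plus a persistent kill switch) might look something like the sketch below. `LazyModelLoader` and `load_fn` are hypothetical names used for illustration; only the 5-second default, the `shutdown(wait=False)` choice, and the `_load_failed` flag mirror what the notes describe.

```python
import concurrent.futures
from typing import Callable, Optional


class LazyModelLoader:
    """Hypothetical stand-in for a reranker's lazy first-call model load."""

    def __init__(self, load_fn: Callable[[], object], timeout_s: float = 5.0) -> None:
        self._load_fn = load_fn
        self._timeout_s = timeout_s
        self._model: Optional[object] = None
        self._load_failed = False  # persistent kill switch

    def get_model(self) -> Optional[object]:
        if self._load_failed:
            return None  # no retries after the first failure
        if self._model is not None:
            return self._model
        # Manage the executor by hand: a `with` block would call
        # shutdown(wait=True) on exit and block behind the stuck worker.
        executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = executor.submit(self._load_fn)
        try:
            self._model = future.result(timeout=self._timeout_s)
        except Exception:  # timeout or any error raised by load_fn
            self._load_failed = True
        finally:
            executor.shutdown(wait=False)  # return without joining the worker
        return self._model
```

One caveat with this pattern: `shutdown(wait=False)` only stops the caller from waiting. The worker thread keeps running until `load_fn` returns, and the interpreter still joins it at exit.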
Reality-check correction from the integration test: the original plan's default Xenova/bge-reranker-base-onnx model ID does not exist in FastEmbed 0.8.0. Caught by test_reranker_live.py on first run; fixed in config.py to BAAI/bge-reranker-base.
Sub-D-2 — Tier 1 query rewriting (#164)
Four idempotent rule-based rewrites in the base install (no extras, no models). Run before the existing _strip_* chain in IntentParser.parse_intent, so every downstream pipeline — Xapian search, intent regex matching, the sub-D-1 reranker — inherits a cleaner query.
| Rule | Method | Behavior |
|---|---|---|
| 1 | _normalize_topic_case | Lowercase the query. Consolidates scattered .lower() calls into a single named pass. No telemetry (fires on every query). |
| 2 | _apply_misspelling_map | Token-level substitution from a bundled dict[str, str] (~40 starter entries from Wikipedia's "List of common misspellings for machines"). An optional title-index probe suppresses substitution when the original token is a canonical entity name. Telemetry: query_rewrite.misspelling. |
| 3 | _detect_stopword_phrase | Strip leading articles (the, a, an, of) unless the full query is a canonical title (The Beatles, Of Mice and Men stay intact when the probe is provided). Telemetry: query_rewrite.stopword_phrase. |
| 4 | _decompose_x_of_y | Decompose population of berlin and berlin's population shapes. Emits BOTH a cleaner query string (berlin population) AND a structured {"entity": ..., "attribute": ...} hint that rides inside params["decomposition_hint"]. _handle_tell_me_about consumes the hint and uses the structured entity directly, skipping its own topic-extraction. Telemetry: query_rewrite.x_of_y. |
Rule order is fixed: 1 → 2 → 3 → 4. Each rule is idempotent (running twice produces no further change). All four are pure-Python; no I/O on the hot path (the bundled data files load once at module init via @functools.lru_cache).
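To make the fixed ordering and the idempotency claim concrete, here is a deliberately simplified sketch of the four rules. Everything in it is an assumption for illustration: the two-entry `MISSPELLINGS` map, the `canonical_titles` set standing in for the title-index probe, and the single `rewrite` function are not the project's implementation. It covers only the `X of Y` shape, not possessives.

```python
import re
from typing import Optional

# Hypothetical two-entry map; the bundled file is described as ~40 entries.
MISSPELLINGS = {"recieve": "receive", "goverment": "government"}
ARTICLES = frozenset({"the", "a", "an", "of"})
# Bounded quantifiers, in the spirit of the ReDoS hardening in the notes.
X_OF_Y_RE = re.compile(r"^(?P<attr>\w{1,200}) of (?P<entity>\w{1,200})$")


def rewrite(
    query: str, canonical_titles: frozenset[str] = frozenset()
) -> tuple[str, Optional[dict[str, str]]]:
    """Apply rules 1 to 4 in order; return (rewritten query, hint or None)."""
    # Rule 1: lowercase.
    tokens = query.lower().split()
    # Rule 2: token-level misspelling substitution.
    tokens = [MISSPELLINGS.get(tok, tok) for tok in tokens]
    # Rule 3: strip leading articles unless the query is a canonical title.
    if " ".join(tokens) not in canonical_titles:
        while len(tokens) > 1 and tokens[0] in ARTICLES:
            tokens.pop(0)
    q = " ".join(tokens)
    # Rule 4: "X of Y" becomes "Y X" plus a structured hint.
    hint = None
    match = X_OF_Y_RE.match(q)
    if match:
        hint = {"entity": match["entity"], "attribute": match["attr"]}
        q = f"{match['entity']} {match['attr']}"
    return q, hint
```

Feeding the output back through `rewrite` leaves the query string unchanged, which is the property the notes call idempotency. In this sketch the hint is only produced on the pass that actually performs the decomposition.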
Risk mitigations baked in:
- Master kill switch (`QueryRewriteConfig.enabled = False`) skips all four rules, not just telemetry. Verified by `test_query_rewrite_disabled_skips_all_rules`.
- Title-index probe (when an archive is in scope) suppresses false-positive rewrites of real proper nouns. `Bilogy` is in the misspelling list but is also a surname; the probe checks for a canonical hit before substituting.
- Hard 500-entry cap on the misspellings map; starter file ships with ~40 high-confidence entries and grows reactively from beta-test observations.
- Exclusions file (empty seed) lets operators pin specific tokens as "never rewrite."
- ReDoS hardening: both regex compile sites use bounded quantifiers (`{1,200}`) and token-count bounds (`(?:\s+\S+){0,8}`) so adversarial inputs cannot induce polynomial backtracking. Confirmed by SonarCloud's regex analyzer.
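As a rough illustration of the ReDoS hardening bullet, the sketch below builds a possessive-query pattern from the two bounds the notes quote. The pattern itself, `POSSESSIVE_RE`, and `parse_possessive` are hypothetical; the project's actual regexes are not shown in the release notes.

```python
import re
from typing import Optional

# Hypothetical pattern for "<entity>'s <attribute words>".
# Token length is capped at 200 characters and the attribute at 1 + 8 tokens.
POSSESSIVE_RE = re.compile(r"^(\S{1,200})'s\s+(\S{1,200}(?:\s+\S+){0,8})$")


def parse_possessive(query: str) -> Optional[dict[str, str]]:
    """Return an entity/attribute hint, or None when the query does not fit."""
    match = POSSESSIVE_RE.match(query)
    if match is None:
        return None
    return {"entity": match.group(1), "attribute": match.group(2)}
```

Because each quantifier that could otherwise run across the whole input has an explicit ceiling, an oversized token or an overlong token list simply fails to match instead of giving the engine room to backtrack.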
Reality-check ripples discovered during integration: Rule 1 unconditionally lowercases every query, which broke code paths that depended on case-preserving inputs. Compensating fixes shipped in the same PR:
- `_DECOMPOSE_SKIP_ATTRS` frozenset prevents Rule 4 from decomposing intent command keywords (`structure of X`, `links of Y`, `table of contents`, etc.).
- `_RULE3_SECTION_COMMAND_RE` prevents Rule 3 from stripping `the` in `get_section` command phrases (`the evolution section of biology` stays parseable).
- `_extract_get_zim_entries` regex widened from `[A-Z]/` to `[A-Za-z]/` for lowercase namespace paths (`m/image.png` instead of `M/Image.png`).
- `_looks_like_bare_topic` length threshold lowered from 5→2 chars (so post-lowercased `dna`, `pi`, `ai` still qualify); filler-token set expanded with common 2-char prepositions.
- `_handle_find_by_title` D6 redirect switched from `isupper()` to `isalpha()`; the dead `not title[0].isupper()` post-zero-results sub-condition removed.
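The namespace-path fix is a good example of why unconditional lowercasing rippled outward. The sketch below assumes a simplified pattern shape; the notes give only the character classes (`[A-Z]/` and `[A-Za-z]/`), so `OLD_ENTRY_RE`, `NEW_ENTRY_RE`, and `extract_entries` are illustrative names rather than the project's code.

```python
import re

# Assumed shapes: a one-letter namespace, a slash, then the entry path.
OLD_ENTRY_RE = re.compile(r"\b[A-Z]/\S+")
NEW_ENTRY_RE = re.compile(r"\b[A-Za-z]/\S+")


def extract_entries(query: str, pattern: "re.Pattern[str]") -> list[str]:
    """Return every namespace path the pattern finds in the query."""
    return pattern.findall(query)


original = "get entries M/Image.png"
lowered = original.lower()  # what Rule 1 hands to downstream code
```

With these assumed patterns, the uppercase-only class finds the path in `original` but nothing in `lowered`, while the widened class still finds `m/image.png` after lowercasing.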
13 pre-existing test files in the a-series regression suites updated to expect lowercase params (assert params["topic"] == "Berlin" → "berlin"). No tests were deleted or weakened.
What's NOT in this release
- Multi-hop questions (`what year did the inventor of X die`): deferred to a potential sub-D-3 if live evidence warrants.
- HyDE / hypothetical document synthesis: locked-in non-goal.
- Algorithmic spell-correction libraries (`pyspellchecker`, `autocorrect`, `symspellpy`): wrong precision/recall tradeoff for encyclopedia search; the curated map + title-index probe is the right tool here.
- Embeddings sidecar (`hnswlib`, deferred sub-D-4): gated on live evidence that reranker engagement rate ≥15% AND operator-reported semantic-divergent misses.
- Hybrid intent parser (deferred sub-D-3 alternative): gated on live evidence of ≥5% low-confidence `parse_intent` calls OR multi-hop transcript failures.
Stats
- Tests: 2245 passing, 54 skipped on the no-extras path. ~43 new tests in `tests/test_query_rewrite_tier1.py` (per-rule fix/no-op/boundary triads, integration, composition, hint handoff). ~25 new tests in `tests/ml/` (reranker unit + integration). 13 pre-existing test files updated for the lowercase ripple.
- CI: Full matrix passes (ubuntu/macos/windows × Python 3.12/3.13). New `test-reranker` job runs FastEmbed integration tests on Linux only (cost-controlled).
- SonarCloud: Quality Gate OK, with 0 open issues and 0 unreviewed hotspots.
- Commits since a25: 27 (sub-D-1 squash + sub-D-2 squash; sub-D-1 originally shipped as 23 commits, sub-D-2 as 14).
Install + test
```bash
# Base install (sub-D-2 query rewriting only):
uv tool install --force --reinstall openzim-mcp==2.0.0b1

# With reranker (sub-D-1 + sub-D-2):
uv tool install --force --reinstall 'openzim-mcp[reranker]==2.0.0b1'

# Pre-stage the reranker model offline (only needed if using [reranker]):
openzim-mcp download-models
```
A two-pass live-MCP test prompt for this build is bundled in the repo at docs/superpowers/specs/2026-05-21-v2-b1-live-test-prompt.md — paste it into a fresh Claude conversation that has openzim-mcp connected to your archive, and it'll walk through both per-rule probes and cross-feature integration with explicit telemetry verification.
About cameronrye/openzim-mcp
Modern, secure MCP server for accessing ZIM format knowledge bases offline. Enables AI models to search and navigate Wikipedia, educational content, and other compressed knowledge archives with smart retrieval, caching, and comprehensive API.
Related context
Earlier breaking changes
- v2.0.0a15 _attribute_sections falls back to first section when no section brackets located passage
- v2.0.0a13 canonical‑splice gate tightened to require exact path equality, fixing H2/H3 surface end‑to‑end behavior across all shapes.
- v2.0.0a11 Exposed `content_offset` as top-level `zim_query` parameter, validated >=0, threaded through options.
- v2.0.0a10 `get article M/<key>` now returns ZIM metadata entry rather than aliased C-namespace article body.
- v2.0.0a10 `metadata for <file>` returns concise metadata strings instead of full article bodies for new-scheme archives.