This release adds 3 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
ReleasePort's take
Light signal: This release fixes several parsing and handling bugs related to trailing politeness tokens across multiple components, adds a new invariant test for regex modal class sharing, refactors politeness processing into a unified path, and reports no performance or security regressions.
Why it matters: Patch now if your code relies on correct handling of multi‑token possessives or trailing politeness; the fixes resolve incorrect query stripping and decomposition issues. No migration deadline is imposed, but testing in dev is recommended to verify unchanged behavior.
Summary
AI summary: Updates Out of scope, Pass-1 defects, and Methodology evolution across a mixed release.
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Security | Medium | No security vulnerabilities introduced; all scans passed. Source: llm_adapter@2026-05-22. Confidence: low | - |
| Feature | Medium | New invariant test ensures leading and trailing politeness regexes share modal class. Source: llm_adapter@2026-05-22. Confidence: low | - |
| Performance | Medium | No performance regression reported; benchmarks unchanged across CI runs. Source: llm_adapter@2026-05-22. Confidence: low | - |
| Bugfix | Medium | Trailing modal politeness ≥2 words now correctly stripped from user queries. Source: llm_adapter@2026-05-22. Confidence: high | - |
| Bugfix | Medium | Reranker telemetry comment now emitted on no-results searches. Source: llm_adapter@2026-05-22. Confidence: high | - |
| Bugfix | Medium | Compact filtered search now retains the “filtered” qualifier in results. Source: llm_adapter@2026-05-22. Confidence: high | - |
| Bugfix | Medium | Possessive topic handling now retries decomposition for multi‑token possessives. Source: llm_adapter@2026-05-22. Confidence: low | - |
| Bugfix | Medium | Possessive multi-token topics now retry decomposition in `_handle_tell_me_about`. Source: granite4.1:30b@2026-05-22-audit. Confidence: low | - |
| Refactor | Medium | Universal trailing‑politeness regex now shares modal class with leading counterpart. Source: llm_adapter@2026-05-22. Confidence: low | - |
| Refactor | Medium | Chained‑intent guidance now strips trailing politeness from both halves before rendering. Source: llm_adapter@2026-05-22. Confidence: low | - |
Full changelog
Post-b2 sweep packaged from PR #167 (commits 45de8da → cc26b3d).
Sweep shape: 4 → 1 → 1 across pass-1, pass-2, pass-3. All eight b2
user-facing fix families verified clean on live MCP first; sweep then
probed the adversarial shapes the b2 fixes unlocked. Both pass-2 and
pass-3 surfaced single narrow-scope siblings of pass-1 fixes —
consistent with the "narrow-scope sibling" pattern (now 8 sweeps
strong) and the "fix unlocks new paths" pattern (now 9 sweeps strong).
Pass-1 defects (4, 45de8da)
- D1: trailing modal politeness ≥2 words falls through. The
trailing-politeness regex in `_extract_tell_me_about` only matched
`please` / `to me` / `for me`; the LEADING regex (line ~374)
recognised the modal class (`could` / `can` / `would` / `will` + `you`) but
the trailing twin was missing. Live: `tell me about Tokyo if you would` → `Would` (verb stub); `... if you could` → `Could`; `... would you` → `Would_You` disambig. Fix: add a trailing pattern
symmetric to the leading one (both branches require a `you` so a
bare trailing modal verb in real article titles isn't stripped).
- D2: reranker telemetry comment suppressed on no-results. The
b1 D-1 in-band telemetry contract promised `<!-- reranker=<state> -->`
on every multi-token search. The `_handle_search` compact path
early-returned on `total == 0` BEFORE reaching
`_maybe_rerank_compact`, so neither `_RERANKER_SKIPPED_NO_RESULTS`
nor `_RERANKER_SKIPPED_NOT_INSTALLED` bumped and the envelope
writer skipped the comment. Live: `search for asdfqwerzxcv nonexistent` → no reranker comment. Fix: invoke
`_maybe_rerank_compact` on the empty payload before the bail
(no-op aside from the counter bump; the rerank singleton is
cached).
- D3: Rule 2 + multi-token possessive picks wrong token. Live:
`tell me about Photosythesis's reproduction` → `Reproduction`
article (expected `Photosynthesis`). Rule 2's affix retry
correctly fires (`Photosythesis's` → `Photosynthesis's`), but
the b1 P1-D5 fix unlocked the path: pre-fix returned `No search results found`, post-fix returns a SILENT WRONG ANSWER.
Root cause: Rule 4's `_POSSESSIVE_RE` is `^...$`-anchored and
runs against the FULL query at parse time; the verb prefix
prevents the match. Fix: in `_handle_tell_me_about`, when no
decomposition hint was attached AND the topic carries an
apostrophe-s followed by another token, retry
`_decompose_x_of_y` on the bare topic. Scope narrowed to
the possessive shape ONLY (NOT `X of Y`) to avoid regressing
non-canonical X-of-Y queries.
- D4: compact filtered search drops "filtered" qualifier.
Live: `search Berlin in namespace C` → `Found 3 matches for "Berlin"` (legacy non-compact path emits `Found N filtered matches for "X"<filter_text>`). Both paths shared
`_format_search_text`; pre-fix the formatter had no filter
awareness. Fix: add optional `filter_text` kwarg to
`_format_search_text` (mirrors `display_query`); compact filtered
call site threads through `_format_filter_text` helper. Symmetric
treatment for filtered no-results.
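The D1 fix can be illustrated with a minimal sketch. This is not the project's actual regex: `MODAL_CLASS`, `TRAILING_POLITENESS`, and `strip_trailing_politeness` are hypothetical names, and the pattern only assumes the shapes quoted above (`please`, `to me`, `for me`, plus modal forms that always include `you`).

```python
import re

# Hypothetical modal class; the project's real regex source is not shown here.
MODAL_CLASS = r"(?:could|can|would|will)"

# Trailing politeness: the original short forms plus modal forms that
# always include "you", so a bare trailing modal verb is never stripped.
TRAILING_POLITENESS = re.compile(
    rf"""
    [\s,]+                          # separator before the politeness tail
    (?:
        please
      | (?:to|for)\s+me
      | if\s+you\s+{MODAL_CLASS}    # "if you would", "if you could"
      | {MODAL_CLASS}\s+you         # "would you", "can you"
    )
    [\s?.!]*$                       # optional trailing punctuation
    """,
    re.IGNORECASE | re.VERBOSE,
)


def strip_trailing_politeness(topic: str) -> str:
    """Remove one trailing politeness phrase, if present."""
    return TRAILING_POLITENESS.sub("", topic).strip()


print(strip_trailing_politeness("Tokyo if you would"))
print(strip_trailing_politeness("Free Will"))
```

Requiring `you` in both modal branches is what keeps a title such as `Free Will` intact while `Tokyo if you would` is reduced to `Tokyo`.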
Pass-2 sibling (1, ed674b5)
- D1 universal-layer mirror. Pass-1 added the modal-politeness
strip inside `_extract_tell_me_about` only, but the universal
`_TRAILING_POLITENESS_RE` (called by `_strip_trailing_politeness`
at `parse_intent` line 1048) was added by the post-a20 PD2-1
sweep specifically so every extractor sees the cleaned query.
Every NON-tell_me_about intent kept leaking the modal class:
`search for biology if you would` → `query="biology if you would"`; `find article titled Berlin if you would` → looks up
`Berlin if you would` (not found). Fix: lift the modal class into
`_TRAILING_POLITENESS_RE`. Pass-1 extractor-level strip kept as
defense-in-depth. New invariant pinned:
`TestD1RegexSync.test_leading_and_trailing_share_modal_class`:
leading + trailing politeness regexes must share the modal class.
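One way to pin an invariant like this is to probe both regexes with every modal and report any that one side misses. The sketch below is an assumption-laden illustration, not the project's `TestD1RegexSync` test: `MODALS`, `LEADING_RE`, `TRAILING_RE`, `OLD_TRAILING_RE`, and `modals_out_of_sync` are hypothetical stand-ins.

```python
import re

# Hypothetical single source of truth for the modal class.
MODALS = ("could", "can", "would", "will")
_MODAL = "(?:" + "|".join(MODALS) + ")"

# Stand-ins for the leading and trailing politeness regexes.
LEADING_RE = re.compile(rf"^\s*{_MODAL}\s+you\s+", re.IGNORECASE)
TRAILING_RE = re.compile(
    rf"[\s,]+(?:please|(?:to|for)\s+me|if\s+you\s+{_MODAL}|{_MODAL}\s+you)[\s?.!]*$",
    re.IGNORECASE,
)
# Pre-fix shape described in the changelog: no modal class on the trailing side.
OLD_TRAILING_RE = re.compile(r"[\s,]+(?:please|(?:to|for)\s+me)[\s?.!]*$", re.IGNORECASE)


def modals_out_of_sync(leading: re.Pattern, trailing: re.Pattern) -> list:
    """Return every modal that is not recognised by BOTH regexes."""
    out = []
    for modal in MODALS:
        leading_ok = leading.match(f"{modal} you tell me about Tokyo") is not None
        trailing_ok = trailing.search(f"tell me about Tokyo if you {modal}") is not None
        if not (leading_ok and trailing_ok):
            out.append(modal)
    return out


print(modals_out_of_sync(LEADING_RE, TRAILING_RE))      # synced pair
print(modals_out_of_sync(LEADING_RE, OLD_TRAILING_RE))  # drifted pair
```

A test asserting that the returned list is empty would fail as soon as a modal is added to one regex and forgotten in the other.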
Pass-3 sibling (1, cc26b3d)
- Chained-intent trailing-politeness leak.
`_chained_intent_guidance` runs UPSTREAM of `parse_intent` on the
raw user query. The post-a24 P1-D6 sweep mirrored the param-leak
strip there; the equivalent mirror of `_strip_trailing_politeness`
was never added. Pre-fix every trailing-politeness token (the
full set, including the pass-2 modal class) leaked into chain
rejection bullets: `tell me about Tokyo if you would then list namespaces` produced a rejection whose left bullet read
`tell me about Tokyo if you would` verbatim, modal politeness and
all. A caller would copy the suggested left half back,
re-introducing the politeness on every iteration. Same structural
sibling pattern as the post-a24 P1-D6 param-leak version. Fix:
apply `_strip_trailing_politeness` to BOTH chain halves after the
existing connector / punct trim loop, before bullets render.
Per-half rather than full-query because the politeness can appear
inside the chain (not just at the very end). Structurally safe:
`_CHAINED_OPERATION_PREFIX_RE` checks the LEADING op verb, which
the trailing strip never touches.
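The per-half strip can be sketched as follows. The connector list, the regexes, and `split_chain_halves` are hypothetical and chosen only to reproduce the example above; the real `_chained_intent_guidance` is not shown in this changelog.

```python
from __future__ import annotations

import re

# Hypothetical stand-ins for the project's politeness and connector regexes.
_TRAILING_POLITENESS = re.compile(
    r"[\s,]+(?:please|(?:to|for)\s+me|if\s+you\s+(?:could|can|would|will)"
    r"|(?:could|can|would|will)\s+you)[\s?.!]*$",
    re.IGNORECASE,
)
_CONNECTOR = re.compile(r"\s+(?:and\s+then|then)\s+", re.IGNORECASE)


def split_chain_halves(query: str) -> tuple[str, str] | None:
    """Split on the first connector and clean each half separately."""
    parts = _CONNECTOR.split(query, maxsplit=1)
    if len(parts) != 2:
        return None  # not a chained query
    # Strip per half: politeness may sit inside the chain, not only at the end.
    left, right = (
        _TRAILING_POLITENESS.sub("", half.strip(" ,;.")).strip() for half in parts
    )
    return left, right


print(split_chain_halves("tell me about Tokyo if you would then list namespaces"))
```

Stripping the full query once would only clean the right half; cleaning each half means the suggested left bullet no longer carries `if you would` back to the caller.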
Out of scope (deferred design call)
- D5: `death of stalin` → `Death_and_state_funeral_of_Joseph_Stalin`
instead of the 2017 Iannucci film. P1-D3 probe-gate correctly
suppressed the Stalin disambig misroute; title-probe picked a
different canonical X-related title rather than the film
(canonical is `The_Death_of_Stalin`). Picking the film would
require either a prefix-widening probe (`The <query>`), which has unwanted side
effects on arbitrary bare topics, or a popularity ranker. Both
are design choices beyond the b2 sweep scope.
D2 / D3 / D4 sibling audits clean
- D2: `_handle_filtered_search` always routes through
`_maybe_rerank_compact`; `_handle_search_all` uses its own
rerank apply that bumps a counter on every path. `_handle_search`
was the only early-return gap.
- D3: `_handle_tell_me_about` is the only handler that
auto-fetches a single article based on the extracted topic.
Other intents take the topic literally; synthesize uses RAG-style
passage retrieval where decomposition would lose the attribute
context (pre-existing design out of scope).
- D4: `_format_search_text` has three call sites; only the
compact filtered one needed `filter_text`.
`search_with_filters_with_canonical_splice` (non-compact filtered)
already uses `_format_filtered_response`, which natively emits
the qualifier.
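The D4 change, an optional `filter_text` keyword that other call sites can ignore, might look roughly like this. The function below is a hypothetical stand-in for `_format_search_text`; the no-results wording and the example filter suffix are assumptions, not the project's actual strings.

```python
from __future__ import annotations


def format_search_text(
    query: str,
    total: int,
    *,
    display_query: str | None = None,
    filter_text: str = "",
) -> str:
    """Hypothetical header formatter with optional filter awareness."""
    shown = display_query if display_query is not None else query
    # The qualifier appears only when a filter description was threaded through.
    qualifier = "filtered " if filter_text else ""
    if total == 0:
        # Assumed wording for the symmetric no-results case.
        return f'No {qualifier}matches found for "{shown}"{filter_text}'
    noun = "match" if total == 1 else "matches"
    return f'Found {total} {qualifier}{noun} for "{shown}"{filter_text}'


print(format_search_text("Berlin", 3))
print(format_search_text("Berlin", 3, filter_text=" (namespace=C)"))
```

Because the keyword defaults to an empty string, the unfiltered call sites keep their existing output without modification.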
Cross-feature composition verified
- `search for Photosythesis's reproduction in namespace C if you would` → universal trailing strip peels `if you would` → intent
= filtered_search → `_maybe_rerank_compact` bumps counter →
`_format_search_text` renders with `filter_text`. D1+D2+D4
compose.
- `tell me about Photosythesis's reproduction if you would` →
universal strip peels `if you would` → intent = tell_me_about
→ D3 retry fires on possessive topic → `photosynthesis`. D1+D3
compose.
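The D3 retry step in the second composition can be sketched under stated assumptions: the shape check runs on the bare topic (after the verb prefix has been removed), and it fires only for an apostrophe-s followed by another token. `_POSSESSIVE_TOPIC` and `retry_possessive_decomposition` are hypothetical names, not the project's `_decompose_x_of_y`.

```python
import re

# Hypothetical shape check: "<owner>'s <attribute...>" on the bare topic.
_POSSESSIVE_TOPIC = re.compile(r"^(?P<owner>.+?)['’]s\s+(?P<attribute>\S.*)$")


def retry_possessive_decomposition(topic, hint=None):
    """Return (owner, attribute) for a possessive multi-token topic.

    Returns None when a decomposition hint already exists or the topic is
    not of the possessive shape, so plain "X of Y" topics are left alone.
    """
    if hint is not None:
        return None
    match = _POSSESSIVE_TOPIC.match(topic.strip())
    if match is None:
        return None
    return match.group("owner"), match.group("attribute")


print(retry_possessive_decomposition("Photosynthesis's reproduction"))
print(retry_possessive_decomposition("Death of Stalin"))
```

Restricting the retry to the possessive shape mirrors the narrowed scope described for D3: an `X of Y` topic returns `None` here and continues down its existing path.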
Tests
- 40 new tests in `tests/test_post_b2_beta_fixes.py` across 10
classes (`TestD1TrailingModalPoliteness`,
`TestD1ParseIntentEndToEnd`, `TestD1SiblingUniversalTrailingModal`,
`TestD1RegexSync`, `TestD1Pass3ChainedIntentPolitenessLeak`,
`TestD2RerankerCounterOnNoResults`,
`TestD3PossessiveDecompositionRetry`,
`TestD4FilteredSearchEchoQualifier`, `TestRegressionGuards`).
- Full suite: 2360 passing, 54 skipped, 38 deselected. mypy
clean across 52 source files. black + flake8 clean. CI checks
all green (CodeQL, SonarCloud, bandit, security scanning,
6 OS × Python matrix, both `[reranker]`-extra suites,
performance benchmarks).
Methodology evolution
- "Narrow-scope sibling" pattern: now 8 sweeps strong. Both
pass-2 and pass-3 surfaced a single sibling of pass-1's D1
fix-family: pass-2 caught the universal-layer mirror (modal
class missing from `_TRAILING_POLITENESS_RE`); pass-3 caught the
upstream-chained-guidance mirror (trailing-politeness strip
missing from `_chained_intent_guidance`). Both are STRUCTURAL
mirrors of fixes already shipped: pass-2's sibling mirrors the
post-a20 PD2-1 universal-strip extension, pass-3's sibling
mirrors the post-a24 P1-D6 param-leak strip placement.
- "Fix unlocks new paths": 9th consecutive sweep. D3 is
particularly nasty because the failure mode changed from
explicit `No search results found` (pre-b1 P1-D5) to silent
wrong answer (post-b1 P1-D5 affix retry → post-b2 D3 retry).
- New invariants pinned via canonical-source tests: two
feature-level guards: (a) leading + trailing politeness regexes
must share the modal class; (b) the no-results early-return path
in `_handle_search` must route through `_maybe_rerank_compact`.
These pin the "added X to one side, forgot the other side"
drift class that drove both pass-2 and pass-3 defects.
About cameronrye/openzim-mcp
Modern, secure MCP server for accessing ZIM format knowledge bases offline. Enables AI models to search and navigate Wikipedia, educational content, and other compressed knowledge archives with smart retrieval, caching, and comprehensive API.
Earlier breaking changes
- v2.0.0a15 `_attribute_sections` falls back to the first section when no section brackets the located passage.
- v2.0.0a13 canonical‑splice gate tightened to require exact path equality, fixing H2/H3 surface end‑to‑end behavior across all shapes.
- v2.0.0a11 Exposed `content_offset` as top-level `zim_query` parameter, validated >=0, threaded through options.
- v2.0.0a10 `get article M/<key>` now returns ZIM metadata entry rather than aliased C-namespace article body.
- v2.0.0a10 `metadata for <file>` returns concise metadata strings instead of full article bodies for new-scheme archives.