This release fixes issues for SREs watching stability and regressions.
✓ No known CVEs patched in this version
Summary
AI summary: Fixed multiple parsing, validation, and response-format defects affecting disambiguation handling, intent extraction, namespace walking, browsing, suggestions, chained intents, and politeness stripping.
Full changelog
The multi-pass live sweep of a15 against
`wikipedia_en_all_maxi_2026-02.zim` (~118 GB, ~27.2 M entries) ran
across seven passes. Pass 1 surfaced four user-facing defects (D4 in
the `tell_me_about` disambig-page handling for Mercury-class bare
titles; D5 in the intent parser's politeness-prefix regex; D6 in
`find_by_title`'s response to namespace-prefixed input; D7 a
schema-consistency gap in `walk_namespace`). Pass 2 self-audited
every D-fix in both verbose and compact rendering modes and
exercised the canonical-article paths (Berlin / Apollo 11 / Java)
that the disambig-detection logic must not regress. Pass 3 re-tested
across a broader disambig set (Mars, Sun, Moon, Paris, Apollo bare),
walked empty namespaces B / X / Z, and exercised cross-fix
interactions (`could you find article titled M/Title`); both
passes 2 and 3 found zero new defects. Pass 4 then deliberately
stress-tested the four landed D-fixes from angles the earlier
passes hadn't probed (more bare-title disambigs, pathological
politeness combinations, `find_by_title` edge cases, `walk_namespace`
malformed args) AND exercised the intent paths the earlier passes
had barely touched (`synthesize`, `browse namespace`,
`show structure of`, `links in`, `suggestions for`,
`search in namespace`); it surfaced three more defects
(P4-D1 / P4-D2 / P4-D3). Pass 5 verified those three fixes; zero new
defects. Pass 6 went deeper: a source-level audit of every intent
handler for the silent-default pattern P4-D3 fixed
(`params.get("X", DEFAULT)`) caught the same shape in
`_handle_browse`, and a parallel audit of every intent extractor
for the trigger-word-capture pattern P4-D1 fixed caught a sibling
extractor permissiveness in `_extract_browse`; plus a
leading-politeness probe surfaced a third defect (P6-D3):
`please tell me about X` leaks the leading politeness into the
parsed topic just like the original D5 did for modal verbs. Pass 7
verified all ten fixes and audited cumulative regressions across the
three commits; zero new defects.
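The silent-default shape that the Pass 6 audit searched for can be sketched as follows. This is a minimal illustration and not the project's handler code: `handle_walk_silent`, `handle_walk_guarded`, the error dictionary layout, and the rule wording are all hypothetical names chosen for the example.

```python
import string


def handle_walk_silent(params: dict) -> dict:
    # Pre-fix shape: a missing or rejected argument quietly becomes "C",
    # so the caller never learns that its input was ignored.
    namespace = params.get("namespace", "C")
    return {"namespace": namespace}


def handle_walk_guarded(params: dict) -> dict:
    # Post-fix shape: validate first and report the problem instead of guessing.
    namespace = params.get("namespace")
    is_single_letter = (
        isinstance(namespace, str)
        and len(namespace) == 1
        and namespace in string.ascii_letters
    )
    if not is_single_letter:
        return {
            "error": "Missing or Invalid Namespace",
            # Hypothetical wording for the rule / examples fields.
            "rule": "namespace must be a single ASCII letter",
            "examples": ["walk namespace C", "walk namespace M"],
        }
    return {"namespace": namespace.upper()}
```

With an empty `params`, the first function reports namespace `C` as if the caller had asked for it, while the second returns the structured error, which is the difference the audit was looking for.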
Fixed
- D4:
`tell me about Mercury` no longer attaches a misleading
`_May also refer to: Mercury_Monterey — use tell me about <full title>_`
footer to the disambiguation-page body. Two cooperating
bugs: `SimpleToolsHandler._is_disambig_lead` returned False
whenever `pre_h2` exceeded 400 chars. Mercury's 628-char pre-H2
(the "most commonly refers to" preamble, three top-level entries,
and the "may also refer to" header) blew past the cap, so the
existing disambig-page detection in `_lead_with_toc` never fired;
AND the trailing-footer block in `_handle_tell_me_about` had no
way to suppress the `disambig_twin_path` / `related_extends_paths`
hints when the resolved body was itself a disambig page. Fixed
by checking only the trailing 400 characters of `pre_h2` (the
regex-free `endswith` stays bounded, but long preambles now
trigger) and by gating both trailing footers on a fresh
`body_is_disambig_page` check on the fetched body. Canonical
pages with disambig twins (Berlin) keep their footer; canonical
pages with extends-topic siblings (Apollo 11 → anniversaries /
lunar sample display / goodwill messages) keep their footer.
- D5:
`could you tell me about Photosynthesis` now parses
`topic = "Photosynthesis"` instead of leaking the modal lead-in
into the topic. The verb-prefix regex in
`_extract_tell_me_about` anchored at `^\s*` and never matched
"could you" / "can you" / "would you" / "will you", so the whole
query fell through to the `topic = query.strip()` fallback and
downstream relied on the tail-probe entity rescue to find the
article anyway. Fixed by stripping the modal scaffold
(`(?:could|can|would|will)\s+(?:you|we|i)\s+(?:please\s+)?`) before
the verb regex runs. Leaves non-modal queries unchanged; combines
cleanly with the existing trailing-politeness strip
(`could you tell me about X please` → topic=X).
- D6:
`find article titled M/Title` now redirects to
`get article M/Title` instead of returning a silent `0_hits`.
The title index only stores titles (`M/Title`'s title is "Title"),
so passing a ZIM namespace path through the title-lookup backend
was guaranteed to return nothing, with no signal to the caller that
the wrong tool was in use. `_handle_find_by_title` now detects the
uppercase-letter + slash + non-empty-suffix shape upfront and
returns a structured "Namespace Path, Not a Title" message that
points at both `get article <path>` (direct lookup) and
`find article titled <stripped>` (title-only fallback). Lowercase
prefixes (`a/b`) and titles without the namespace shape pass
through to the backend unchanged.
- D7:
`walk namespace A` (and any other empty new-scheme
namespace) now includes `namespace_entry_count: 0` in the
response. The short-circuit at
`openzim_mcp/zim/namespace.py` for new-scheme non-C/M/W namespaces
built an empty result without passing `namespace_entry_count` to
`_build_walk_result`, so the field was omitted entirely while
walk-M and walk-W (which surface their bounded totals) included
it. Downstream consumers had to special-case "missing" vs "zero".
Fixed by passing `namespace_entry_count=0` in the short-circuit.
Updated the `walk_A_10` golden to reflect the new schema; walk-M
and walk-W goldens are unchanged (already carried the field).
- P4-D1:
`suggestions for` (no actual prefix) now returns the
structured "Missing Search Term" error instead of silently
autocompleting against the literal word "for". The regex's
optional `(?:for\s+)?` group failed to match without trailing
whitespace, so the mandatory capture greedily swallowed "for"
itself; the handler's existing missing-arg guard then saw a
non-empty `partial_query` and ran the suggestion fallback (which
spent ~70 s scanning for "for", a high-frequency English token).
Fixed in `_extract_suggestions` by discarding a bare-"for"
capture so the guard takes over. Legitimate prefixes that happen
to start with "for" (e.g., `suggestions for forest`) still work.
- P4-D2: chained-intent detector no longer bypassed by a modal
lead-in. `_chained_intent_guidance`'s
`_CHAINED_OPERATION_PREFIX_RE` is anchored at `^` and only
recognised operation verbs at position 0, so
`could you tell me about Photosynthesis then list namespaces`
shifted the verb past the anchor: `left_is_op` evaluated False,
the chain gate failed, and the query fell through to normal intent
classification where the higher-confidence `list_namespaces` won
and silently dropped the `tell me about` half. The D5 modal-strip
lives inside `_extract_tell_me_about`; it only runs AFTER the chain
detector has already decided. Fixed by pre-stripping the same modal
scaffold (`(?:could|can|would|will)\s+(?:you|we|i)\s+(?:please\s+)?`)
at the top of `_chained_intent_guidance` so detection sees the
cleaned query.
- P4-D3:
`walk namespace` with a malformed argument now returns
a structured "Missing or Invalid Namespace" error instead of
silently walking C. Multi-char (`AB`), digit (`1`), special
(`_`), and missing-argument forms all fell through to
`params.get("namespace", "C")` in `_handle_walk_namespace` with
no signal to the caller that the input was rejected. Sibling
tools (`find_by_title`, `links_in`, `suggestions`,
`tell_me_about`) already return structured missing-arg errors;
this one didn't. Fixed by adding an upfront guard that mirrors
their shape (rule / examples) before the C-default kicks in.
- P6-D1 + P6-D2:
`browse namespace` now reaches input-validation
parity with `walk namespace`. Two cooperating gaps: the
handler `_handle_browse` had the same
`params.get("namespace", "C")` silent-default that P4-D3 fixed
for walk; AND the extractor `_extract_browse` accepted multi-char,
digit, and special-character namespace arguments
(`browse namespace AB` / `1` / `_`) without uppercasing lowercase
input, diverging from the strict
`_extract_walk_namespace`. The two siblings now agree: regex
tightened to `namespace\s+['"]?([A-Za-z])\b['"]?` with `.upper()`
on the captured letter, and the handler returns a structured
"Missing or Invalid Namespace" error when the extractor produces
nothing.
- P6-D3: leading
`please`/`kindly` now strip cleanly from the
parsed topic. `please tell me about Photosynthesis` and
`kindly describe Photosynthesis` previously parsed with the
politeness phrase leaking into the topic: same shape as the
pass-1 D5 defect but for non-modal politeness words. The article
still resolved via tail-probe rescue, but the parsed topic was
wrong. Fix extends the leading-strip in `_extract_tell_me_about`
to cover `please`/`kindly` AND wraps both the modal-strip and
the politeness-strip in a loop so composite phrases
(`please could you tell me about X`,
`please please tell me about X`) peel cleanly. Same loop also
applied to the chain-detector's `_chained_intent_guidance`
pre-strip so leading politeness doesn't bypass chain detection
(mirror of P4-D2). Leaves the existing trailing-politeness strip
alone, so `tell me about X please` still works, and the
leading-only anchor (`^\s*`) prevents stripping mid-query mentions
of `please`/`kindly` that are legitimately part of the topic.
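The looped leading-strip described for D5, P4-D2, and P6-D3 can be sketched roughly as below. The modal pattern is the one quoted in the notes; the `please`/`kindly` pattern, the case-insensitive flag, and the function name `strip_leading_politeness` are assumptions made for this illustration and may differ from the actual code in `_extract_tell_me_about`.

```python
import re

# Modal scaffold quoted in the release notes, anchored at the start of the query.
_MODAL_RE = re.compile(
    r"^\s*(?:could|can|would|will)\s+(?:you|we|i)\s+(?:please\s+)?",
    re.IGNORECASE,
)
# Assumed shape for the non-modal politeness words.
_POLITE_RE = re.compile(r"^\s*(?:please|kindly)\s+", re.IGNORECASE)


def strip_leading_politeness(query: str) -> str:
    """Peel modal and politeness prefixes until the query stops changing."""
    previous = None
    current = query
    while current != previous:
        previous = current
        current = _MODAL_RE.sub("", current, count=1)
        current = _POLITE_RE.sub("", current, count=1)
    return current.strip()
```

Because both patterns are anchored at the start, a mid-query `please` (as in `tell me about please in linguistics`) and a trailing `please` are left untouched, and the loop ends as soon as one full round strips nothing.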
Tests
`tests/test_post_a15_beta_fixes.py`: 80 regression tests
pinning all ten defects. Each defect gets:
- The fix-case test (Mercury body has no misleading trailer;
`could you tell me about X` parses topic=X;
`find article titled M/Title` returns redirect;
`_build_walk_result` exposes the zero-count field;
`suggestions for` triggers the missing-arg guard;
`could you tell me about X then list namespaces` is
detected as chained; `walk namespace AB` returns the
missing-namespace error; `browse namespace AB` returns the same
error and `browse namespace c` uppercases to "C";
`please tell me about X` strips cleanly).
- Negative self-audit cases (Berlin keeps its disambig-twin
footer; non-modal queries unchanged; lowercase `a/b` not
redirected by `find_by_title`; `namespace_entry_count` omitted
when caller passes None; legitimate `suggestions for forest`
still captures the prefix; non-chained
`could you tell me about X` not tripped by the chain detector;
trailing `please` still works; mid-query
`please in linguistics` not stripped).
- Cross-defect probes (Java disambig body suppresses
`disambig_twin_path` footer too;
`please could you tell me about X` peels both layers;
`please tell me about X then list namespaces` trips chain
detector).
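As an illustration of how a fix-case test pairs with a negative self-audit case, here is a small sketch built around the tightened namespace regex quoted under P6-D1 + P6-D2. It is not taken from `tests/test_post_a15_beta_fixes.py`: the helper `extract_browse_namespace` and both test names are hypothetical stand-ins.

```python
import re
from typing import Optional

# Tightened pattern quoted in the release notes: exactly one ASCII letter.
_NAMESPACE_RE = re.compile(r"""namespace\s+['"]?([A-Za-z])\b['"]?""")


def extract_browse_namespace(query: str) -> Optional[str]:
    """Return the uppercased namespace letter, or None when it is malformed."""
    match = _NAMESPACE_RE.search(query)
    return match.group(1).upper() if match else None


def test_fix_case_rejects_malformed_namespace():
    # Multi-char, digit, special-character, and missing arguments yield nothing,
    # which lets the handler return its structured error.
    for query in (
        "browse namespace AB",
        "browse namespace 1",
        "browse namespace _",
        "browse namespace",
    ):
        assert extract_browse_namespace(query) is None


def test_negative_case_single_letter_still_parses():
    # Legitimate input keeps working, and lowercase input is uppercased.
    assert extract_browse_namespace("browse namespace C") == "C"
    assert extract_browse_namespace("browse namespace c") == "C"
```

The `\b` after the capture group is what rejects `AB`: there is no word boundary between `A` and `B`, so the match fails instead of capturing only the first letter.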
About cameronrye/openzim-mcp
Modern, secure MCP server for accessing ZIM format knowledge bases offline. Enables AI models to search and navigate Wikipedia, educational content, and other compressed knowledge archives with smart retrieval, caching, and comprehensive API.
Earlier breaking changes
- v2.0.0a15 `_attribute_sections` falls back to the first section when no section brackets the located passage.
- v2.0.0a13 canonical‑splice gate tightened to require exact path equality, fixing H2/H3 surface end‑to‑end behavior across all shapes.
- v2.0.0a11 Exposed `content_offset` as top-level `zim_query` parameter, validated >=0, threaded through options.
- v2.0.0a10 `get article M/<key>` now returns ZIM metadata entry rather than aliased C-namespace article body.
- v2.0.0a10 `metadata for <file>` returns concise metadata strings instead of full article bodies for new-scheme archives.