This release adds 1 notable feature for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
AI summary: Updates Tests, P1-D1, and P1-D2 across a mixed release.
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Bugfix | Medium | Cross-tool cursor reuse with stuffed s.q now reports tool-mismatch instead of q-mismatch (P1-D1). | None |
| Bugfix | Medium | Soft-connector footer now suppresses for asymmetric alias cases (P1-D2). | None |
| Bugfix | Medium | Trailing politeness now strips across all simple-mode intents (PD2-1). | None |
| Bugfix | Medium | zim_query tool docstring no longer contains literal path example (PD2-2). | None |
| Bugfix | Medium | _normalize_zim_file_path auto-selects when single archive loaded, even for slashed candidates (PD2-3). | None |
| Bugfix | Medium | 'ZIM File Not Found' error now surfaces real archive paths and omit-to-auto-select recovery (PD2-4). | None |

Source for all summaries: granite4.1:8b-q6_K@2026-05-19. Confidence: high.
Full changelog
Live-MCP beta sweep against wikipedia_en_all_maxi_2026-02.zim on
the freshly-deployed v2.0.0a20 build, plus a live small-model
failure transcript review. Pass 1 confirmed all nine prior fixes
(post-a17 P1-D1/P1-D2/P1-D3, post-a18 P3-D1/P3-D2/P1-D4, post-a19
P1-D1/P1-D2/P1-D3) still work as designed in production, then
surfaced two new defects. Pass 2 wave 1 widened probe coverage to
politeness wrappers across all simple-mode intents (one defect).
Pass 2 wave 2 reviewed a Qwen3-8B-Q4 failure transcript and
surfaced three more defects — a docstring-bait hallucinated path
that dropped small models into a retry loop. Pass 3 source-level
audit found zero new siblings across all six fix sites.
All six defects follow one of two patterns. P1-D1, P1-D2, and PD2-1
follow the recurring "fixes unlock previously-broken code paths"
pattern: they landed on surfaces that a20's three fixes opened up.
PD2-2/3/4 follow the "weak-instruction-follower defect class"
pattern: small-model behaviour that adversarial query probes
structurally can't reach. The latter shape is new to the
methodology and is captured in the post-a20 refinement:
live-transcript review should join live-MCP probing as a recurring
sweep input.
Fixed
- Cross-tool cursor reuse with stuffed `s.q` now reports
  tool-mismatch instead of q-mismatch (P1-D1). The dispatcher's
  cursor-decode block runs the cursor's `s.q` overlap check before
  any handler-level `_cursor_tool_mismatch` guard fires. When a
  cross-tool cursor carries an `s.q` field (a hand-stuffed
  walk_namespace cursor with `s.q="biology"` passed to
  `search for photosynthesis`, or a real search cursor reused with a
  different tool), the dispatcher previously emitted the misleading
  "Cursor was issued for query X; current request shares no terms"
  error and advised the user to start the search over, even though
  the cursor was from a different tool entirely. Fix: scope the
  dispatcher's q-overlap check to cursors whose `t` claims a
  q-emitting tool (`search_zim_file` / `search_with_filters`, the
  only `Cursor.encode` callsites that put `s.q` in their envelope).
  Cursors claiming `walk_namespace` / `browse_namespace` /
  `extract_article_links` now pass through the dispatcher's
  q-check; the handler-edge guard emits the correct
  `Cursor / Tool Mismatch` diagnosis.
- Soft-connector footer now suppresses for asymmetric alias
  cases (P1-D2). `_soft_connector_footer`'s post-a18 P3-D2
  alias-fallback was gated on `not left_in and not right_in`, so it
  only ran when BOTH halves missed the substring check. The
  asymmetric case (one half matches substring, the other matches
  only via title alias) slipped through:
  `tell me about Köln or Cologne` returned the Cologne article
  with a footer suggesting `tell me about Köln`, but Köln's
  title-index entry redirects back to Cologne: a 2-hop journey
  to the same article. Same shape reproduced for `京都 or Kyoto`,
  `上海 or Shanghai`, `München or Munich`, `Москва or Moscow`,
  `Αθήνα or Athens`, and their reverse-order variants. Fix:
  widen the gate to `not (left_in and right_in)` so the alias
  probe runs whenever either half misses substring. The probe
  still only upgrades a half whose top-scored title-index hit
  equals `top_path`, so genuinely different chain halves
  (`Berlin and 東京`) still surface the footer correctly. The
  irreducible `東京 or Tokyo` case stays unsuppressed: `東京`
  has its own disambig article that doesn't alias to `Tokyo`.
- Trailing politeness now strips across all simple-mode
  intents (PD2-1). Pre-fix, `tell_me_about` was the only
  intent that stripped trailing `please` / `kindly` / `thanks` /
  `thank you`. Every other extractor that captured the topic with
  a greedy end-anchored pattern (`_extract_search`,
  `_extract_search_all`, `_extract_find_by_title`,
  `_extract_related`, `_extract_suggestions`,
  `_extract_entry_path_keyworded`, which feeds get_article / links /
  structure / toc / summary, plus `_extract_get_zim_entries` /
  `_extract_get_section`) silently swallowed the politeness:
  `search for biology please` searched for `"biology please"`
  (ranking `Thanks Maa` above `Biology`);
  `find article titled Berlin please` looked up `"Berlin please"`
  (not found); `links in Photosynthesis please` and
  `show structure of Photosynthesis please` showed the same shape.
  Comma forms (`"biology, please"`) and combinations
  (`"biology, thanks please"`) reproduced too. Fix: lift the
  trailing-politeness strip into `IntentParser.parse_intent` at
  the entry point: a single end-anchored regex, looped so
  combinations peel cleanly, runs before pattern matching +
  extractor dispatch. Legitimate content uses
  (`search for "Please Understand Me"`, a song title) are
  unaffected because the strip is end-anchored and quoted phrases
  enclose the content.
- `zim_query` tool docstring no longer contains a literal-looking
  path example (PD2-2). The parameter description for
  `zim_file_path` previously included
  `(e.g. /data/wikipedia_en_all_maxi.zim)` as an illustrative
  path. Small models with weak instruction-following parse "e.g."
  inconsistently and routinely copied the example as the actual
  `zim_file_path` value. Real archives are date-suffixed in
  production (`wikipedia_en_all_maxi_2026-02.zim`) so the
  basename doesn't match either. Live transcript captured
  Qwen3-8B-Q4 doing exactly this and dropping into a
  `File does not exist` retry loop with no recovery signal.
  Fix: rewrote the docstring to lead with
  `Omit entirely (recommended)`, dropped the literal path
  example, added an explicit "do NOT invent a path from this
  docstring" line. A regression test pins the absence of the
  bait string so any future docstring edit reintroducing it
  fails CI.
- `_normalize_zim_file_path` auto-selects when single archive
  loaded, even for slashed candidates (PD2-3). The previous
  contract (H14: "explicit paths must reach the backend so it can
  surface a clearer error") only made sense when there was
  genuine ambiguity about which archive the caller wanted;
  single-archive setups have none. Pre-fix, a slashed candidate
  that didn't match anything still fell through to the backend in
  single-archive setups, producing the same `File does not exist`
  error that small models can't act on. Fix: when the candidate
  matches nothing via path-or-basename AND exactly one archive
  is loaded, auto-select regardless of separator. Multi-archive
  setups still preserve the candidate so the backend error
  surfaces and PD2-4 enriches it with the actual listing: H14
  narrowed but intact for the case it was actually defending.
- "ZIM File Not Found" error now surfaces real archive paths
  and the omit-to-auto-select recovery (PD2-4). The catch-all
  in `handle_zim_query` previously emitted a generic four-step
  troubleshooting block that gave small models no learning
  signal; they just retried with the same args. Fix: detect the
  `validate_zim_file` exception family (`File does not exist` /
  `Path is not a file` / `is not a zim file` / `Access denied`)
  and replace the template with a `ZIM File Not Found` shape:
  single-archive setups get "omit the parameter: only one
  archive loaded" + the actual path (defence-in-depth alongside
  PD2-3); multi-archive setups get a bulleted listing of real
  archive paths with "pass one verbatim" guidance. The generic
  template's step 1 was also rewritten to suggest
  "omit `zim_file_path`" as the canonical fix.
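The check ordering described for P1-D1 can be sketched as follows. This is a minimal illustration, not the project's code: the dict-shaped cursor, the `check_cursor` helper, and the term-overlap test are assumptions; only the tool names and the `t` / `s.q` fields come from the notes above.

```python
from typing import Optional

# Tools whose cursors carry s.q, per the release notes.
Q_EMITTING_TOOLS = {"search_zim_file", "search_with_filters"}


def check_cursor(cursor: dict, tool: str, query: str) -> Optional[str]:
    """Return an error label, or None when the cursor is acceptable.

    Hypothetical helper: the dispatcher check and the handler-edge guard
    are separate code paths in the project; they are collapsed into one
    function here only to show the ordering.
    """
    claimed_tool = cursor.get("t")
    stored_q = cursor.get("s", {}).get("q")

    # Dispatcher stage: the q-overlap check is scoped to q-emitting tools.
    if claimed_tool in Q_EMITTING_TOOLS and stored_q is not None:
        shared = set(stored_q.lower().split()) & set(query.lower().split())
        if not shared:
            return "q-mismatch"

    # Handler-edge stage: the cursor must belong to the tool being called.
    if claimed_tool != tool:
        return "tool-mismatch"
    return None
```

With this ordering, a walk_namespace cursor stuffed with `s.q="biology"` and passed to a search for photosynthesis skips the q-check and is diagnosed as a tool mismatch.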
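The P1-D2 gate change can be illustrated with a small sketch. The `footer_suppressed` function, the `title_index` mapping (title to the path of its top-scored hit), and the `A/...` paths are hypothetical stand-ins; only the two gate expressions and the `top_path` comparison are taken from the notes above.

```python
from __future__ import annotations


def footer_suppressed(
    left: str,
    right: str,
    top_title: str,
    top_path: str,
    title_index: dict[str, str],
    widened: bool = True,
) -> bool:
    """Return True when both connector halves resolve to the top article."""
    left_in = left.lower() in top_title.lower()
    right_in = right.lower() in top_title.lower()

    if widened:
        run_probe = not (left_in and right_in)   # post-fix gate
    else:
        run_probe = not left_in and not right_in  # pre-fix gate

    if run_probe:
        # The probe only upgrades a half whose top hit equals top_path.
        left_in = left_in or title_index.get(left) == top_path
        right_in = right_in or title_index.get(right) == top_path

    return left_in and right_in


index = {"Köln": "A/Cologne", "Berlin": "A/Berlin", "東京": "A/東京"}
```

Under these assumptions, `Köln or Cologne` is suppressed only with the widened gate, while `Berlin and 東京` keeps its footer either way because `東京` does not resolve to the Berlin article.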
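The looped, end-anchored strip described for PD2-1 might look roughly like this. The regex and the `strip_trailing_politeness` name are assumptions for illustration; the actual pattern in `IntentParser.parse_intent` is not shown in these notes.

```python
import re

# End-anchored: only matches a politeness word at the very end of the text,
# preceded by whitespace or a comma and optionally followed by "." or "!".
_TRAILING_POLITENESS = re.compile(
    r"[\s,]+(?:please|kindly|thanks|thank\s+you)[\s.!]*$",
    re.IGNORECASE,
)


def strip_trailing_politeness(text: str) -> str:
    """Peel trailing politeness words one at a time until none remain."""
    previous = None
    while previous != text:
        previous = text
        text = _TRAILING_POLITENESS.sub("", text)
    return text


print(strip_trailing_politeness("search for biology, thanks please"))
print(strip_trailing_politeness('search for "Please Understand Me"'))
```

The loop is what lets combinations such as "thanks please" peel cleanly, and the `$` anchor is why a quoted title that merely contains "Please" is left alone: the text ends with a closing quote, not a politeness word.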
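The PD2-3 selection rule can be sketched as below. The signature, the list-of-paths representation of loaded archives, and the exact meaning of "path-or-basename" matching are assumptions; the real `_normalize_zim_file_path` may differ.

```python
from __future__ import annotations

import posixpath


def normalize_zim_file_path(candidate: str, loaded: list[str]) -> str:
    """Resolve a caller-supplied path against the loaded archives."""
    # 1. Exact path or basename match wins (assumed matching rule).
    for path in loaded:
        if candidate == path:
            return path
        if posixpath.basename(candidate) == posixpath.basename(path):
            return path

    # 2. Nothing matched and exactly one archive is loaded: auto-select it,
    #    whether or not the candidate contains a path separator.
    if len(loaded) == 1:
        return loaded[0]

    # 3. Zero or multiple archives: preserve the candidate so the backend
    #    error can surface (the narrowed H14 contract).
    return candidate
```

For example, the bait path `/data/wikipedia_en_all_maxi.zim` resolves to the single loaded archive in a single-archive setup, but is passed through unchanged when two archives are loaded.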
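One way to picture the PD2-4 error shape is the sketch below. The `format_zim_error` helper, the substring-based detection, the message wording, and the zero-archive branch are all assumptions; only the four marker strings and the single-archive / multi-archive split come from the notes above.

```python
from __future__ import annotations

from typing import Optional

# Messages from the validate_zim_file exception family, per the notes.
VALIDATION_MARKERS = (
    "File does not exist",
    "Path is not a file",
    "is not a zim file",
    "Access denied",
)


def format_zim_error(exc: Exception, loaded: list[str]) -> Optional[str]:
    """Build an actionable message, or None to use the generic template."""
    message = str(exc).lower()
    if not any(marker.lower() in message for marker in VALIDATION_MARKERS):
        return None

    lines = ["ZIM File Not Found"]
    if len(loaded) == 1:
        lines.append(
            f"Omit zim_file_path: only one archive is loaded ({loaded[0]})."
        )
    elif loaded:
        lines.append("Pass one of these paths verbatim as zim_file_path:")
        lines.extend(f"- {path}" for path in loaded)
    else:
        # Assumed wording for the zero-archive edge case.
        lines.append("No archives are currently loaded.")
    return "\n".join(lines)
```

The point of the shape is that every branch names a concrete next action, so a model that retries has something different to try.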
Tests
- 65 new regression tests in
  `tests/test_post_a20_beta_fixes.py`
  covering all six defects plus the edge cases probed live
  (reverse-order alias variants, irreducible Tokyo disambig,
  multi-archive H14 preservation, zero archives edge case,
  defence-in-depth backend-failure paths, quoted-inner-please
  content preservation, etc.).
- 4 existing H14 tests updated to reflect the
  narrowed-to-multi-archive contract (single-archive auto-select +
  multi-archive preserve split into separate cases).
- 4 mock-realism updates in post-a16 / post-a17 test files (the
  widened P1-D2 alias-fallback calls the title backend for
  connector halves, so the blanket `return_value` mocks that
  reported every half resolves to `top_path` now use per-title
  `side_effect`).
Full suite: 1902 passed, 50 skipped.
Methodology refinement (post-a20)
Live-transcript review is a distinct test surface from live-MCP
probing. The Qwen3-8B-Q4 transcript captured PD2-2 (a
docstring-bait hallucination source) that adversarial query
probes structurally couldn't reach — the bait was in the TOOL
DESCRIPTION, not in any user query. The transcript also exposed
PD2-4 ("no learning signal on retry" failure mode) that mocked
tests can't easily catch. Future sweeps should incorporate
small-model transcript review when available — the marginal
cost is low and the defect class it catches
(tool-self-described hallucination sources + error-message-
quality issues for weak-instruction-follower models) is
otherwise invisible.
About cameronrye/openzim-mcp
Modern, secure MCP server for accessing ZIM format knowledge bases offline. Enables AI models to search and navigate Wikipedia, educational content, and other compressed knowledge archives with smart retrieval, caching, and comprehensive API.
Earlier breaking changes
- v2.0.0a15 `_attribute_sections` falls back to the first section when no section brackets the located passage.
- v2.0.0a13 canonical-splice gate tightened to require exact path equality, fixing H2/H3 surface end-to-end behavior across all shapes.
- v2.0.0a11 Exposed `content_offset` as top-level `zim_query` parameter, validated >=0, threaded through options.
- v2.0.0a10 `get article M/<key>` now returns ZIM metadata entry rather than aliased C-namespace article body.
- v2.0.0a10 `metadata for <file>` returns concise metadata strings instead of full article bodies for new-scheme archives.