This release fixes issues for SREs watching stability and regressions.
✓ No known CVEs patched in this version
Summary
AI summary: Updates Pass-2 source-level audit, P3-D1, and P3-D2 across a mixed release.
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Feature | Medium | Telemetry counter `subject_attribute_table_dominated` added for future tuning. Source: granite4.1:8b-q6_K@2026-05-19. Confidence: low. | None |
| Dependency | Medium | Tests added for P3-D1, P3-D2, and P1-D4 fixes in `tests/test_post_a18_beta_fixes.py`. Source: granite4.1:8b-q6_K@2026-05-19. Confidence: low. | None |
| Performance | Medium | `_maybe_render_subject_section` detects placeholder dominance and substitutes recovery pointer efficiently. Source: granite4.1:8b-q6_K@2026-05-19. Confidence: low. | None |
| Bugfix | Medium | Subject-attribute section dominated by table placeholders now falls back to recovery pointer. Source: granite4.1:8b-q6_K@2026-05-19. Confidence: high. | None |
| Bugfix | Medium | Soft-connector footer recognizes non-Latin halves via title-alias fallback. Source: granite4.1:8b-q6_K@2026-05-19. Confidence: high. | None |
| Bugfix | Medium | Cross-tool cursor reuse rejected at simple-tools handler edge with clear mismatch error. Source: granite4.1:8b-q6_K@2026-05-19. Confidence: high. | None |
| Refactor | Medium | Cursor decoding now stashes `decoded_payload.get("t")` into `options["_cursor_t"]` for tool validation. Source: granite4.1:8b-q6_K@2026-05-19. Confidence: low. | None |
Full changelog
Live-MCP beta sweep against `wikipedia_en_all_maxi_2026-02.zim` on
the freshly-deployed v2.0.0a18 build. Pass 1 confirmed that all three
a18 fixes (P1-D1 soft-connector title-spans, P1-D2 Unicode tail
tokenisation, P1-D3 walk-namespace cursor `ai` preservation) work
as designed in production, then surfaced three new user-facing
defects. Pass 2, a source-level self-audit, found zero new defects.
Both P3-D1 and P3-D2 are examples of a recurring pattern: fixes
that unlock previously-broken code paths surface new defects in
those paths. Neither was reachable from the canonical reproducers
before a18 because the Unicode tokenisation defect intercepted
every non-Latin topic earlier in the pipeline.
Fixed
- Subject-attribute section dominated by table placeholders falls
  back to a recovery pointer (P3-D1). The query `musicians from München`
  resolved correctly to Munich via the new Unicode tail probe,
  then subject-attribute decomposition fired on the Notable people
  section. But that section consists of two H3 sub-tables
  (Born in Munich / Notable residents), which compact mode
  renders as `[Table N: M rows x P cols - pass compact=False to expand]`
  placeholders. The LLM got zero substantive content
  from a query that should list musicians: exactly the
  content-less-response shape wave 4's empty-lead fallback was
  designed to prevent. The bundle `get_section_data` reads is
  always built with `compact=True` (`openzim_mcp/bundle.py:307`),
  so the section can't be re-emitted with tables expanded.
  `_maybe_render_subject_section` now detects placeholder
  dominance (≥1 placeholder AND <100 chars of substantive prose
  after stripping them) and substitutes a `compact=False`
  recovery pointer that names the exact call to make. Adds the telemetry
  counter `subject_attribute_table_dominated` for future tuning.
- Soft-connector footer recognises non-Latin halves via title-alias
  fallback (P3-D2). `tell me about Berlin and München`
  resolved correctly to Munich (right-promote), but the
  soft-connector footer was silently suppressed. The substring check
  `"berlin" in "munich"` is False; `"münchen" in "munich"` is
  also False: resolution went through the title-alias index, which
  crosses the Unicode + language boundary (München → Munich), and
  substring matching can't see through that. So
  `left_in == right_in == False` hit the suppression branch for the
  case where neither half is in the title and it is unclear which
  was picked. The user never learned Berlin was dropped. Fix: when both
  halves fail the substring check, fall back to title-alias probing:
  probe the title index for each half, and if a half's top-scored hit
  resolves to `top_path`, treat that half as "in title"
  semantically. Cheap (in-memory title-index lookup) and only
  fires on the rare both-missed branch. The legacy positional-only
  call signature (without `zim_file_path`/`top_path` kwargs)
  continues to work; alias fallback is gated on those kwargs.
- Cross-tool cursor reuse rejected at the simple-tools handler
  edge (P1-D4, deferred from the post-a17 sweep).
  `walk namespace M` emits a cursor; passing that cursor to
  `browse namespace M` previously walked browse silently from
  walk's offset (=3 in the canonical reproducer), returning
  entries 4-6 and emitting a fresh `browse_namespace` cursor as
  if nothing was wrong. The simple-tools dispatcher had decoded
  only `s.o` and `s.ns` from any received cursor, ignoring `s.t`
  (issuing tool). The advanced tools already enforce tool-binding
  via `Cursor.decode(expected_tool=...)`. Fix: stash
  `decoded_payload.get("t")` into `options["_cursor_t"]` at decode
  time; add the `_cursor_tool_mismatch` helper alongside the
  existing `_cursor_ns_mismatch`; fire it at the top of both
  `_handle_browse` and `_handle_walk_namespace` (defence-in-depth
  for the symmetric direction). The user now sees a clear
  `Cursor / Tool Mismatch` rejection before any backend call.
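The cursor fix in the last item can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: the base64-over-JSON encoding, the payload keys `o` and `ns`, the tool name `walk_namespace`, and the functions `encode_cursor`, `decode_cursor_into_options`, `cursor_tool_mismatch`, and `handle_browse` are hypothetical. Only the `t` key, `options["_cursor_t"]`, the `browse_namespace` tool name, and the `Cursor / Tool Mismatch` wording come from the notes above.

```python
import base64
import json
from typing import Any, Dict, Optional


def encode_cursor(tool: str, namespace: str, offset: int) -> str:
    """Hypothetical cursor encoding: URL-safe base64 over a JSON payload."""
    payload = {"t": tool, "ns": namespace, "o": offset}
    raw = json.dumps(payload).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor_into_options(cursor: str, options: Dict[str, Any]) -> None:
    """Decode a cursor and stash offset, namespace, and issuing tool."""
    decoded_payload = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
    options["offset"] = decoded_payload.get("o", 0)
    options["namespace"] = decoded_payload.get("ns")
    # The step the fix adds: remember which tool issued the cursor.
    options["_cursor_t"] = decoded_payload.get("t")


def cursor_tool_mismatch(options: Dict[str, Any], expected_tool: str) -> Optional[str]:
    """Return a rejection message if the cursor came from another tool."""
    issued_by = options.get("_cursor_t")
    if issued_by is None or issued_by == expected_tool:
        return None  # no cursor supplied, or a same-tool round-trip
    return (
        "Cursor / Tool Mismatch: cursor was issued by "
        f"'{issued_by}' but was passed to '{expected_tool}'."
    )


def handle_browse(options: Dict[str, Any]) -> str:
    """Stand-in handler: the guard runs before any backend work."""
    error = cursor_tool_mismatch(options, expected_tool="browse_namespace")
    if error is not None:
        return error
    return f"browse {options.get('namespace')} from offset {options.get('offset', 0)}"
```

In this sketch a cursor minted for `walk_namespace` is rejected by `handle_browse`, while a `browse_namespace` cursor continues from its stored offset. Running the check at the top of the handler means a mismatched cursor never reaches the backend.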
Tests
9 regression tests in `tests/test_post_a18_beta_fixes.py`:
- P3-D1 (3): table-dominated falls back to recovery pointer;
  prose + 1 table returns body unchanged; zero tables unchanged.
- P3-D2 (3): alias-resolved half makes the footer fire;
  neither-half-resolves still suppresses; legacy positional-only
  call signature still works.
- P1-D4 (3): walk cursor passed to browse → rejected; browse
  cursor passed to walk → rejected; same-tool round-trip preserves
  the post-a17 P1-D3 fix.
Full test suite: 1823 passed, 50 skipped (up from 1814 in a18).
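The P3-D1 rule (at least one placeholder and fewer than 100 characters of prose once placeholders are stripped) and its three test cases can be sketched as below. This is an illustrative approximation, not the body of `_maybe_render_subject_section`: the regex, the whitespace handling, the function names `is_table_dominated` and `render_subject_section`, and the wording of the recovery pointer are all assumptions.

```python
import re

# Matches compact-mode placeholders such as
# "[Table 2: 14 rows x 3 cols - pass compact=False to expand]".
TABLE_PLACEHOLDER = re.compile(
    r"\[Table \d+: \d+ rows x \d+ cols - pass compact=False to expand\]"
)
MIN_PROSE_CHARS = 100  # threshold quoted in the changelog


def is_table_dominated(section_text: str) -> bool:
    """True when placeholders are present and little prose remains."""
    if not TABLE_PLACEHOLDER.search(section_text):
        return False  # zero tables: never dominated
    prose = TABLE_PLACEHOLDER.sub("", section_text)
    prose = " ".join(prose.split())  # collapse whitespace before measuring
    return len(prose) < MIN_PROSE_CHARS


def render_subject_section(section_text: str, entry_path: str, section_title: str) -> str:
    """Return the body unchanged, or a hypothetical recovery pointer."""
    if not is_table_dominated(section_text):
        return section_text
    return (
        f"Section '{section_title}' of {entry_path} is mostly tables. "
        "Re-fetch it with compact=False to see the rows."
    )
```

Note that this sketch counts sub-headings such as "Born in Munich" as prose. Whether the real check treats headings as substantive is not stated in the notes.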
Pass-2 source-level audit (no siblings)
- P3-D1: the table-placeholder shape is unique to
  subject-attribute decomposition. Other content-fetch paths surface
  tables embedded in larger prose bodies; the defect class is
  the "single section, all-tables" shape.
- P3-D2: `_soft_connector_footer` is the only substring-in-title
  site in the codebase. Other footers (disambig twin probe,
  related extends paths) use exact path/title matching from search
  results.
- P1-D4: `_handle_search` / `_handle_links` /
  `_handle_filtered_search` also read `options["offset"]` from any
  decoded cursor, but they use search-tool offsets that aren't
  cross-tool meaningful in the same way as walk/browse's shared
  namespace-offset semantics. Filed as a follow-up opportunity to
  widen the tool-mismatch guard later if a live probe ever
  surfaces the issue.
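For the P3-D2 site, the both-halves-missed fallback can be sketched as follows. This is a simplified stand-in for `_soft_connector_footer`, not its real signature: the `resolve_top_hit` callable replaces the title-index probe (the real function is described as taking `zim_file_path`/`top_path` kwargs), the footer wording is invented, and suppressing the footer when both halves match is an assumption the notes do not cover.

```python
from typing import Callable, Optional, Tuple

# Hypothetical probe type: maps a query half to the path of its top-scored hit.
Probe = Callable[[str], Optional[str]]


def halves_in_title(
    left: str,
    right: str,
    top_title: str,
    top_path: Optional[str] = None,
    resolve_top_hit: Optional[Probe] = None,
) -> Tuple[bool, bool]:
    """Substring check first; alias probing only when both halves miss."""
    title = top_title.casefold()
    left_in = left.casefold() in title
    right_in = right.casefold() in title
    both_missed = not left_in and not right_in
    # The fallback is gated on the optional arguments, so legacy
    # positional-only callers keep the old behaviour.
    if both_missed and top_path is not None and resolve_top_hit is not None:
        left_in = resolve_top_hit(left) == top_path
        right_in = resolve_top_hit(right) == top_path
    return left_in, right_in


def soft_connector_footer(
    left: str,
    right: str,
    top_title: str,
    top_path: Optional[str] = None,
    resolve_top_hit: Optional[Probe] = None,
) -> Optional[str]:
    """Return a footer naming the dropped half, or None when unclear."""
    left_in, right_in = halves_in_title(left, right, top_title, top_path, resolve_top_hit)
    if left_in == right_in:
        return None  # neither (or, in this sketch, both) matched
    kept, dropped = (left, right) if left_in else (right, left)
    return f"Showing '{top_title}' for '{kept}'; '{dropped}' was not included."
```

With a probe that maps `münchen` to the path `Munich`, the Berlin and München query yields a footer that names Berlin as dropped. Without the probe arguments the same call returns `None`, matching the pre-fix behaviour.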
PR: #147.
Commit on the sweep branch: 7be575e (pass-1 fixes + 9 tests).
About cameronrye/openzim-mcp
Modern, secure MCP server for accessing ZIM format knowledge bases offline. Enables AI models to search and navigate Wikipedia, educational content, and other compressed knowledge archives with smart retrieval, caching, and comprehensive API.
Related context
Earlier breaking changes
- v2.0.0a15: `_attribute_sections` falls back to the first section when no section brackets the located passage.
- v2.0.0a13: canonical-splice gate tightened to require exact path equality, fixing H2/H3 surface end-to-end behavior across all shapes.
- v2.0.0a11: exposed `content_offset` as a top-level `zim_query` parameter, validated >=0 and threaded through options.
- v2.0.0a10: `get article M/<key>` now returns the ZIM metadata entry rather than the aliased C-namespace article body.
- v2.0.0a10: `metadata for <file>` returns concise metadata strings instead of full article bodies for new-scheme archives.