cameronrye/openzim-mcp

v2.0.0b7 Feature

This release adds 3 notable features for engineering teams evaluating rollout.

Published 12d MCP Data & Storage
✓ No known CVEs patched

Topics

kiwix mcp mcp-server openzim zim

Affected surfaces

auth rbac

ReleasePort's take

Light signal
editorial:auto 12d

Release v2.0.0b7 fixes two HIGH-severity defects: Z1, where the D1 filter missed associative redirects, and Z2, where synthesize pass-0 produced a malformed insert.

Why it matters: Addresses bugs in title promotion synthesis and path handling, improving reliability for affected workflows.

Summary

AI summary

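Fixes two HIGH-severity defects in possessive-topic title promotion and in synthesize pass-0 inserts, and keeps hits without a match_type accepted for backwards compatibility.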

Changes in this release

Feature Medium

Adds `pre_redirect_path` annotation propagation through `find_entry_by_title_data`.

Source: granite4.1:30b@2026-05-23-audit

Confidence: low

Feature Medium

Introduces `extract_possessor_tokens` helper to parse possessor tokens from topics.

Source: granite4.1:30b@2026-05-23-audit

Confidence: low

Feature Medium

Adds shared filter `accept_possessive_promotion` centralizing possessive topic logic.

Source: granite4.1:30b@2026-05-23-audit

Confidence: low

Bugfix High

Rejects possessive topic associative redirects lacking matching possessor token.

Source: granite4.1:30b@2026-05-23-audit

Confidence: low

Bugfix High

Ensures synthesize pass-0 inserts use correct `search_top_k` shape.

Source: granite4.1:30b@2026-05-23-audit

Confidence: low

Bugfix Medium

Fixes D1 filter missing associative redirects and Z2 synthesize malformed insert.

Source: llm_adapter@2026-05-23

Confidence: low

Other Low

Adds 20 new tests covering pre-redirect-path propagation, possessor token extraction, redirect filter, and synthesize insert shape.

Source: granite4.1:30b@2026-05-23-audit

Confidence: low

Full changelog

Post-b6 sweep packaged from PR #174. Live-MCP verification against
v2.0.0b6 confirmed that all prior fixes land cleanly (b3 Einstein's /
Plato's canonicals, b4 non-possessive carve-out, b3 trailing-modal
politeness, b2 D3 typo retry, all earlier b-series invariants).
Two new HIGH-severity defects were unlocked by deeper probing of the
match_type="redirect" shape and the synthesize-path
promotion's insert contract.

Z1 (HIGH) — D1 filter misses associative redirects

The post-b4 D1 filter rejected fuzzy_suggest for possessive
topics but accepted redirect blindly. libzim's suggestion-search
occasionally produces an associative redirect: a redirect entry
whose pre-resolution path is unrelated to the user's possessor
entity, but whose redirect chain walks to a canonical that shares
one user-typed token.

Live silent-wrong-answers:

  • tell me about Darwin's evolution → Evolution (cert=0.85)
  • tell me about Plato's republic philosophy → Czech_philosophy
    (cert=0.85)

Z2 (HIGH) — Synthesize pass-0 produces malformed insert

The post-b4 D3 synthesize pass-0 inserted the raw find_title_match
dict into top_hits. The dict has shape
{path, title, zim_file, match_type, pre_redirect_path}, but top_hits
items expect the search_top_k shape {path, snippet, score}.
Downstream score-sort demoted the canonical to the bottom when it
wasn't already in top_hits.

Live impact (synthesize=true): Einstein's theory →
Theory_of_relativity surfaced at rank 6 with score 0 (BM25 hits
dominate; the buggy insert was demoted). Plato's cave happened
to work because Allegory_of_the_cave IS in BM25 top_hits, so the
reorder branch fired with the existing properly-shaped entry.
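
The demotion can be reproduced in a few lines. This is a minimal sketch rather than the project's code: the BM25 hit values, the `rank` helper, and the assumption that the downstream sort treats a missing score as 0 are all hypothetical, chosen only to match the rank-6, score-0 symptom described above.

```python
from typing import Any, Dict, List

Hit = Dict[str, Any]

# Hypothetical BM25 results in the search_top_k shape {path, snippet, score}.
bm25_hits: List[Hit] = [
    {"path": f"Bm25_hit_{i}", "snippet": "placeholder", "score": 0.9 - 0.1 * i}
    for i in range(5)
]

# Raw find_title_match dict: it carries no "snippet" and no "score" key.
raw_title_match: Hit = {
    "path": "Theory_of_relativity",
    "title": "Theory of relativity",
    "zim_file": "example.zim",  # hypothetical archive name
    "match_type": "direct",
}


def rank(hits: List[Hit]) -> List[Hit]:
    """Assumed downstream behaviour: sort by score, missing score counts as 0."""
    return sorted(hits, key=lambda hit: hit.get("score", 0), reverse=True)


def rank_of(hits: List[Hit], path: str) -> int:
    """1-based position of `path` in an already ranked list."""
    return [hit["path"] for hit in hits].index(path) + 1


# Buggy insert: the raw dict is placed first but sinks below every BM25 hit.
buggy = rank([raw_title_match] + bm25_hits)

# Fixed insert: the minimal fallback shape {path, snippet: "", score: 1.0}.
promoted: Hit = {"path": raw_title_match["path"], "snippet": "", "score": 1.0}
fixed = rank([promoted] + bm25_hits)

buggy_rank = rank_of(buggy, "Theory_of_relativity")
fixed_rank = rank_of(fixed, "Theory_of_relativity")
```

With five BM25 hits ahead of it, the raw dict lands at rank 6, while the properly shaped hit stays at rank 1.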

Fixes

  1. pre_redirect_path annotation through
    find_entry_by_title_data (fast-path + suggestion-search).
    find_title_match propagates the field. Schema is
    non-breaking (FindEntryHit.pre_redirect_path is
    NotRequired[str]).

  2. New extract_possessor_tokens(topic) helper pulls bare
    possessor tokens from each X's/X' shape.
    "Plato's cave" → ["plato"]; "O'Brien" → []
    (name, not possessive).

  3. New shared filter accept_possessive_promotion in
    title_promotion (single source of truth for simple_tools
    AND synthesize). Acceptance matrix:

    • Non-possessive topic: accept all match_types (b4 win preserved).
    • Possessive + direct: accept.
    • Possessive + fuzzy_suggest: REJECT (b6 D1).
    • Possessive + redirect: accept iff any query possessor token
      appears in the pre-redirect path's tokens.
    • Missing match_type: accept (backwards-compat).
  4. search_top_k-shaped pass-0 insert in synthesize.
    _build_pass0_promoted_hit re-probes via
    search_handler.title_match_hit(archive, full_probe.title)
    to produce the proper {path, snippet, score: 1.0} shape.
    Fallback to a minimal {path, snippet: "", score: 1.0} hit
    when the re-probe handler misses.

Tests

20 new tests in tests/test_post_b6_beta_fixes.py across 5
classes (TestPreRedirectPathPropagation,
TestPossessorTokenExtraction with 12 parametrized cases,
TestRedirectFilterRejectsUnrelatedRedirect with 3 parametrized
cases, TestSynthesizePass0InsertShape, TestRegressionGuards).
Updated 2 b4 tests + 1 golden snapshot.

Full suite: 2410 passing, 54 skipped. mypy clean across 52
source files. black + flake8 + pip-audit clean. All 14 CI checks
pass on PR #174 (after three cleanup waves: SonarCloud S1192 /
S5869 / S5799 deduplication; helper consolidation to
title_promotion; S5852 ReDoS bound on the possessor regex).

Methodology — "fix unlocks new paths" 14 sweeps strong

Each prior sweep peeled back another layer; post-b6 added two:

  1. match_type="redirect" was assumed semantic. The post-b6
    live probe revealed associative redirects where libzim's fuzzy
    token-matching produces a redirect entry whose pre-resolution
    path is unrelated to the user's possessor.
  2. The synthesize pass-0 insert worked only when the canonical
    was already in BM25 top_hits. Otherwise the malformed insert
    leaked through and was demoted by score-sort.

Three new invariants pinned: pre-redirect-path propagation;
possessor-token filter for redirects; search_top_k shape for
synthesize pass-0 inserts.
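
The first invariant, pre-redirect-path propagation with a non-breaking schema, might look like the sketch below. Only the FindEntryHit name, the listed keys, and the NotRequired[str] typing of pre_redirect_path come from the changelog. Treating match_type as optional and the `annotate_pre_redirect` helper are assumptions made for illustration.

```python
try:
    from typing import NotRequired, TypedDict  # Python 3.11+
except ImportError:  # older interpreters need the typing_extensions backport
    from typing_extensions import NotRequired, TypedDict


class FindEntryHit(TypedDict):
    """Title-match hit; the optional keys keep older payloads valid."""

    path: str
    title: str
    zim_file: str
    match_type: NotRequired[str]  # assumption: optional, per the compat rule
    pre_redirect_path: NotRequired[str]


def annotate_pre_redirect(hit: FindEntryHit, requested_path: str) -> FindEntryHit:
    """Hypothetical helper: record the path that was looked up before
    redirect resolution, but only when it differs from the resolved path."""
    annotated = hit.copy()
    if requested_path != hit["path"]:
        annotated["pre_redirect_path"] = requested_path
    return annotated


resolved: FindEntryHit = {
    "path": "Evolution",
    "title": "Evolution",
    "zim_file": "example.zim",  # hypothetical archive name
    "match_type": "redirect",
}

# Hypothetical redirect entry name; the real pre-redirect path is not published.
via_redirect = annotate_pre_redirect(resolved, "Some_redirect_entry")
direct_lookup = annotate_pre_redirect(resolved, "Evolution")
```

Because the field is optional, consumers that predate it can ignore the key, and hits produced without a redirect simply omit it.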


About cameronrye/openzim-mcp

Modern, secure MCP server for accessing ZIM format knowledge bases offline. Enables AI models to search and navigate Wikipedia, educational content, and other compressed knowledge archives with smart retrieval, caching, and comprehensive API.

Related context

Earlier breaking changes

  • v2.0.0a15 _attribute_sections falls back to first section when no section brackets located passage
  • v2.0.0a13 canonical‑splice gate tightened to require exact path equality, fixing H2/H3 surface end‑to‑end behavior across all shapes.
  • v2.0.0a11 Exposed `content_offset` as top-level `zim_query` parameter, validated >=0, threaded through options.
  • v2.0.0a10 `get article M/<key>` now returns ZIM metadata entry rather than aliased C-namespace article body.
  • v2.0.0a10 `metadata for <file>` returns concise metadata strings instead of full article bodies for new-scheme archives.
