cameronrye/openzim-mcp releases

No immediate action

v2.5.3 Bug fix 26d

Cache invalidation fix

No immediate action

v2.5.2 Maintenance 1mo

Routine maintenance and dependency updates.

Upgrade now

v2.5.1 Maintenance 1mo

Dependencies

Routine maintenance and dependency updates.

No immediate action

v2.5.0 New feature 1mo

Smithery + MCP Registry publishing

No immediate action

v2.4.5 Maintenance 1mo

Routine maintenance and dependency updates.

Upgrade now

v2.4.4 Maintenance 1mo

Dependencies

Routine maintenance and dependency updates.

Upgrade now

v2.4.3 Mixed 1mo

Dependencies

Result materialization fix + dep security

Upgrade now

v2.4.2 Bug fix 1mo

Dependencies

Resolved zim_query test findings

No immediate action

v2.4.1 Maintenance 1mo

Routine maintenance and dependency updates.

No immediate action

v2.4.0 New feature 1mo

dispatch‑eval + link‑graph improvements

No immediate action

v2.3.0 Mixed 1mo

Link‑graph sidecar + prompt routing + pagination

No immediate action

v2.2.2 Maintenance 1mo

Routine maintenance and dependency updates.

No immediate action

v2.2.1 Bug fix 1mo

Docker stdout fix

No immediate action

v2.2.0 New feature 1mo

archive-type presets

No immediate action

v2.1.8 Maintenance 1mo

Routine maintenance and dependency updates.

Upgrade now

v2.1.7 Bug fix 1mo

Auth Breaking upgrade

Tail-hijack fix

Upgrade now

v2.1.6 Security relevant 1mo

Auth Breaking upgrade

pyjwt security bump

No immediate action

v2.1.4 Security relevant 1mo

Tail‑hijack fix

No immediate action

v2.1.3 Bug fix 1mo

cross‑archive leak fix

No immediate action

v2.1.2 Bug fix 1mo

HTTP allowed‑hosts fix

No immediate action

v2.1.1 Bug fix 1mo

Empty‑link fix & media exclusion

No immediate action

v2.1.0 New feature 1mo

Native libzim reader capabilities

No immediate action

v2.0.5 Maintenance 1mo

Routine maintenance and dependency updates.

No immediate action

v2.0.4 Maintenance 1mo

Routine maintenance and dependency updates.

No immediate action

v2.0.2 Bug fix 1mo

Dispatcher fixes & audit

No immediate action

v2.0.1 Bug fix 2mo

Bug fixes + documentation updates

No immediate action

v2.0.0 Maintenance 2mo

Stage E results & low dispatch accuracy on new ops

No immediate action

v2.0.0b13 Bug fix 2mo

Disambig phrase extension

Review required

v2.0.0b12 New feature 2mo

Auth RBAC

Z4 check + disambig rejection

Review required

v2.0.0b9 Mixed 2mo

Auth RBAC

Tail‑hijack fix + possessive rule relaxation

Review required

v2.0.0b8 Bug fix 2mo

Auth

Possessive redirect fix

Review required

v2.0.0b7 Bug fix 2mo

Auth RBAC

Possessive redirect fix + synthesize insert

Upgrade now

v2.0.0b6 Security relevant 2mo

Dependencies

starlette CVE patch

Review required

v2.0.0b4 Bug fix 2mo

Auth

Fixes possessive auto-fetch bug

No immediate action

v2.0.0b3 Bug fix 2mo

Trailing politeness + rerank + possessive + filter fix

Review required

v2.0.0b2 Mixed 2mo

Auth RBAC

CLI env fix + OTP cooldowns + timeout raise

Review required

v2.0.0b1 Breaking risk 2mo

Auth RBAC

Reranker + query rewrites

Review required

v2.0.0a25 Bug fix 2mo

Auth RBAC

Slashed‑compound widening + politeness expansion + param‑leak fixes

Review required

v2.0.0a24 Mixed 2mo

Auth RBAC

Query‑param leak + ALL-CAPS acronym fix

No immediate action

v2.0.0a23 Mixed 2mo

Multi‑entity parse fix + SMS politeness + drift guard

Review required

v2.0.0a22 Mixed 2mo

Auth RCE / SSRF

Multi‑entity chains + politeness strip

No immediate action

v2.0.0a21 Bug fix 2mo

Cursor, alias, politeness, docstring, path, error fixes

Review required

v2.0.0a20 Bug fix 2mo

Auth RBAC

Cursor guard fixes + Unicode footer

Monitor

v2.0.0a19 Bug fix 2mo

Table fallback, footer fix, cursor rejection

Review required

v2.0.0a18 Bug fix 2mo

Auth RCE / SSRF

Connector footer + Unicode tokenisation + Cursor ai

No immediate action

v2.0.0a17 Breaking risk 2mo

Section routing + empty‑lead fallback

No immediate action

v2.0.0a16 Bug fix 2mo

Defect fixes

No immediate action

v2.0.0a15 Bug fix 2mo

Citation attribution + bold handling fixes

No immediate action

v2.0.0a14 Breaking risk 2mo

Entity resolution + section affinity

No immediate action

v2.0.0a13 Bug fix 2mo

Canonical splice fix

Review required

v2.0.0a12 Bug fix 2mo

Auth RBAC

France query fix

Review required

v2.0.0a11 Breaking risk 2mo

`content_offset` exposure + infobox fixes

Review required

v2.0.0a10 Breaking risk 2mo

Auth RBAC

Metadata correctness + cursor validation

Review required

v2.0.0a9 Breaking risk 2mo

Auth Breaking upgrade

Cache accounting fixes + search error fix

v2.0.0a8 Security relevant 2mo

Security fixes

dep: CVE-2026-44431 — fixed by upgrading urllib3 to 2.7.0
dep: CVE-2026-44432 — fixed by upgrading urllib3 to 2.7.0

Notable features

make security now passes --skip-editable so pip-audit does not fail on the local package

Full changelog

Re-cut of v2.0.0a7 — the v2.0.0a7 tag exists but its GitHub Release
failed to publish because pip-audit surfaced two upstream urllib3
CVEs (CVE-2026-44431 / 44432) that landed in the audit database
between the v2.0.0a6 and v2.0.0a7 builds. v2.0.0a8 carries the same
v2.0.0a7 content plus the urllib3 → 2.7.0 bump that closes the CVEs.
Also adjusts make security to pass --skip-editable so pip-audit
doesn't fail looking for the local package on PyPI mid-release.

Defect + opportunity batch on top of v2.0.0a6, found by end-to-end
testing against a real Wikipedia ZIM (118 GB, 27.2M entries,
Feb 2026 snapshot). 14 defects fixed, 8 opportunities added.
1388 tests pass (+13 from new test modules); no regressions.

Fixed — Phase A (snippets, infobox, typo fallback)

#14: _typo_variants now reaches "Photosythesis" → "Photosynthesis".
v2.0.0a4 shipped only transposition + deletion edits — mathematically
unable to recover the missing 'n' (insertion). Added insertion +
substitution against the full a-z alphabet, length-gated at ≥ 5 chars
to bound cost (~700 variants for a 13-char input; ≤ 10 ms/call).
#1: snippet highlighter no longer produces malformed markdown.
_highlight_terms previously wrapped query terms verbatim, producing
`**Artificial **photosynthesis****`, `_****Berlin****_`, and
`[**Photosynthesis**](**Photosynthesis** "**Photosynthesis**")` when
the match landed inside existing bold / italic / link constructs.
Added a skip regex covering paired emphasis runs and full
`[text](href "tooltip")` link constructs (deliberately not bare
parens, so prose like `(also called assimilation)` keeps its
highlighting).
#1: snippet fallback to stem-prefix substring match. When no
whole-word match existed, the snippet used to drop to the lead
paragraph. Now it falls back to a stem-prefix substring (first ⅔ of
the query term) so "photosynthesis" catches paragraphs mentioning
"photosynthetic" instead of returning the article's unrelated lead.
Op1: snippets drop the duplicate # <Title> H1. create_snippet
accepts an optional title=; _get_entry_snippet forwards the
entry title so the heading that already appears in the result row
doesn't burn 5–15 tokens per result.
#2 / Op5: infobox extraction tracks parent-section context.
extract_infobox now prefixes labels with their parent
<th colspan> heading row, so a Berlin infobox renders
Area — City/State / Population — City/State instead of three
identical City/State rows. Also skips rows whose nearest table
ancestor isn't the infobox (handles nested chronology / coords
microformats) and rejects <th> / <td> candidates borrowed from
inside nested tables.
Op6: strip image-caption / hatnote / sidebar / navbox / inline
citation noise. UNWANTED_HTML_SELECTORS now drops figure,
figcaption, .thumb, .thumbcaption, .gallery, .hatnote,
.sidebar, .navbox, .metadata.mbox-small, sup.reference,
.reference, .mw-collapsible-toggle, and the .geo-* coordinate
microformats. Article leads now start with the actual prose, not
`Schematic of … For other uses, see X (disambiguation). Part of a series on … 52°31'07"N 13°24'16"E …`.
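
The expanded edit set from #14 can be illustrated with a minimal sketch. This is not the project's _typo_variants: the name `typo_variants`, the `min_expand_len` parameter, and the lowercase a-z alphabet are assumptions made for illustration. It generates edit-distance-1 candidates and only adds the costlier insertion and substitution edits for terms of at least 5 characters.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: plain a-z, as in the notes


def typo_variants(term: str, min_expand_len: int = 5) -> set[str]:
    """Return edit-distance-1 candidates for a possibly misspelled term."""
    term = term.lower()
    splits = [(term[:i], term[i:]) for i in range(len(term) + 1)]

    # Cheap edits: always generated.
    deletions = {left + right[1:] for left, right in splits if right}
    transpositions = {
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    }
    variants = deletions | transpositions

    # Costlier edits: gated on length to bound the candidate count.
    if len(term) >= min_expand_len:
        substitutions = {
            left + ch + right[1:]
            for left, right in splits
            if right
            for ch in ALPHABET
        }
        insertions = {
            left + ch + right for left, right in splits for ch in ALPHABET
        }
        variants |= substitutions | insertions

    variants.discard(term)  # the input itself is not a correction
    return variants
```

In this sketch a 13-character input yields at most 13 + 12 + 338 + 364 = 727 raw candidates before de-duplication, which is in line with the ~700 figure quoted above.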

Fixed — Phase B (response contract)

#3 / Op8: zim_query accepts a cursor parameter. Tools advertised
opaque base64 cursors in their responses, but the simple-mode
zim_query tool only took an integer offset — the cursors were
decorative. Now decoded; s.o populates options["offset"] and the
per-tool state is preserved. Length-capped at 2 KB as
defense-in-depth.
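
A hedged sketch of what decoding such a cursor might look like. The payload layout (`{"s": {"o": <offset>}}`, guessed from the "s.o" wording above), the helper names `encode_cursor` / `decode_cursor`, and the choice of URL-safe base64 over JSON are all assumptions; only the 2 KB cap and the `options["offset"]` target come from the notes.

```python
import base64
import json

MAX_CURSOR_CHARS = 2048  # the 2 KB cap mentioned in the notes


def encode_cursor(offset, state=None):
    """Pack an offset plus optional per-tool state into an opaque cursor."""
    payload = {"s": {"o": offset, **(state or {})}}
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def decode_cursor(cursor):
    """Validate and unpack a cursor; raise ValueError on anything suspicious."""
    if not isinstance(cursor, str) or not cursor or len(cursor) > MAX_CURSOR_CHARS:
        raise ValueError("cursor missing or too long")
    padded = cursor + "=" * (-len(cursor) % 4)  # restore stripped padding
    try:
        raw = base64.b64decode(padded, altchars=b"-_", validate=True)
        payload = json.loads(raw)
    except ValueError as exc:  # covers binascii.Error and JSON/Unicode errors
        raise ValueError("cursor is not valid base64-encoded JSON") from exc
    state = payload.get("s") if isinstance(payload, dict) else None
    offset = state.get("o") if isinstance(state, dict) else None
    if isinstance(offset, bool) or not isinstance(offset, int) or offset < 0:
        raise ValueError("cursor carries no usable offset")
    return {"offset": offset, "state": state}
```

A caller would then copy `decode_cursor(cursor)["offset"]` into `options["offset"]` and keep the rest of the state alongside it. Checking the length before decoding keeps oversized input away from the base64 and JSON parsers.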

Fixed — Phase C (primitives)

#9 / #7: get_section table rendering now matches get_zim_entry.
The bundle's rendered_markdown was built with compact=False while
get_zim_entry rendered with compact=True. Result: get_section
"Geography" returned pipe-soup tables while the surrounding article
fetch path showed
`[Table N: M rows x P cols - pass compact=False to expand]`
placeholders. Bundle and search-snippet rendering paths now
both apply compact=True, so the markdown is consistent everywhere.
#10 / D8: synthesize attribution carries the #section_id suffix.
_locate_passage couldn't find passages containing **bold**
highlight markers inside the bundle's plain markdown — every citation
fell back to entry-level (section_id: null). Now strips **
markers before locating so attribution resolves correctly.
#10 / D5: synthesize strips natural-language interrogative prefix.
synthesize=True with "tell me about Berlin" previously fed the
entire phrase to BM25 — returning Irving Berlin songs, Nat King Cole
albums, and a graffiti article instead of the canonical Berlin
entry. The query is now intent-parsed first and only the topic is
handed to the search stage; the original query is preserved for the
response echo.
#10 / D8 / Op4: response dedupe + link-strip in compact mode.
passages[].text_markdown previously duplicated answer_markdown
verbatim (~50% token bloat on every synthesize call). In compact
mode, passages now omit the body text. Wikipedia link-soup
([text](href "tooltip")) is also stripped from passages — small
models can't follow inline links from inside tool responses anyway.
Op3: get_section supports narrow scoping. New
include_subsections=False parameter on get_section_data (and the
narrow section X of Y / just section X of Y query syntax in
simple mode) ends the slice at the next heading of any level, so a
caller can fetch just the H2 lead paragraphs without the cascading
H3 sub-tree.
Op2: compact structure response carries per-heading summaries.
The 80-char summary field is derived from each section's body
preview so a small model can choose which section to drill into,
not just see which exist.
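
The narrow-scoping rule from Op3 can be sketched as follows. This is an illustration only, assuming the article is available as markdown with ATX (`#`) headings; `slice_section` is a hypothetical helper, not the project's get_section_data.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def slice_section(markdown, title, include_subsections=True):
    """Return the section named `title`, optionally without its sub-tree."""
    lines = markdown.splitlines()
    start = level = None
    for i, line in enumerate(lines):
        match = HEADING.match(line)
        if not match:
            continue
        if start is None:
            # Still looking for the requested heading.
            if match.group(2).lower() == title.lower():
                start, level = i, len(match.group(1))
            continue
        depth = len(match.group(1))
        # Narrow mode stops at the next heading of any level; the default
        # stops only at a heading of the same or a higher level.
        if not include_subsections or depth <= level:
            return "\n".join(lines[start:i]).rstrip()
    if start is None:
        raise KeyError(title)
    return "\n".join(lines[start:]).rstrip()
```

With `include_subsections=False` a request for an H2 returns only its lead paragraphs, because the first H3 underneath it already ends the slice.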

Fixed — namespace / metadata / `tell me about`

D2: browse namespace C no longer crashes on new-scheme archives.
Legacy code built a full 27 M-entry list before slicing 50 rows out
of it — slow, memory-hostile, and triggered "session expired" errors
on real Wikipedia archives. New _browse_new_scheme_c_paginated
pages directly through the entry-id range.
D3: browse namespace W returns the actual W entries. New-scheme
archives keep W off libzim's iterable surface, but the well-known
paths (W/mainPage, W/favicon, ...) are reachable via
has_entry_by_path. New _browse_new_scheme_w_paginated probes
them so the response matches list_namespaces' count.
D11: metadata previews cap at 800 chars. Wikipedia ZIMs store
M/Title as a full HTML document (~1 MB) rather than the bare title
string. The metadata for <archive> call previously returned 980 KB,
starving every other metadata field. Each entry is now capped with
a [truncated, N chars total] marker.
D6 / Op7: tell me about <topic> auto-fetches on title-index hit.
When the top BM25 result wasn't a strong-title match (Xapian ranked
List of songs about Berlin above the canonical Berlin article),
the response used to render the search list. Now falls back to
find_entry_by_title_data; promotes any score-1.0 result past the
BM25 ranking and inlines the article body.
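
The metadata cap from D11 amounts to a small truncation helper. The sketch below is an assumption about its shape: `cap_preview` is a hypothetical name, and whether the marker counts toward the 800-character limit is not stated in the notes (here it does not).

```python
def cap_preview(value: str, limit: int = 800) -> str:
    """Truncate a metadata value and note its original length."""
    if len(value) <= limit:
        return value
    # Keep the first `limit` characters, then append the marker.
    return f"{value[:limit]} [truncated, {len(value)} chars total]"
```

Short values pass through unchanged, so only oversized entries such as an HTML-document M/Title pay the truncation cost.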

CI / quality

3 new test modules, 47 additional assertions covering each fix:
test_typo_variants_v2a7.py, test_content_processor_fixes_v2a7.py,
test_v2a7_fixes_helpers.py. End-to-end proof that "Photosythesis"
resolves through the full call path (mock archive + suggester); perf
guard against quadratic regressions in _typo_variants; cursor
garbage-rejection; metadata cap on both long and short values.
Goldens regenerated (all strict improvements): pipe-soup infobox
snippet → clean lead-paragraph snippet for Einstein; H1 dedup +
section attribution on the Berlin / Munich synthesize fixtures.
Test infra: explicit encoding="utf-8" on golden read/write so
non-ASCII characters in goldens survive Windows runners.
SonarCloud quality gate: factored shared test setup
(_make_simple_handler, _build_metadata_mock_archive,
_wire_typo_fallback_archive) and namespace browse-payload shape
(_new_scheme_browse_payload, _materialise_paths) so new-code
duplication stays under 3%.
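
The golden-file encoding fix can be illustrated with a short sketch. The helper names and the file name are hypothetical; the point is only that both the write and the read pass encoding="utf-8" explicitly instead of relying on the platform default, which on Windows runners is often a legacy code page.

```python
import tempfile
from pathlib import Path


def write_golden(path: Path, text: str) -> None:
    # Explicit encoding: do not depend on the runner's locale.
    path.write_text(text, encoding="utf-8")


def read_golden(path: Path) -> str:
    return path.read_text(encoding="utf-8")


def round_trip(text: str) -> str:
    """Write a golden to a temporary directory and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        golden = Path(tmp) / "berlin.golden.md"  # hypothetical file name
        write_golden(golden, text)
        return read_golden(golden)
```

A golden containing coordinates or umlauts, for example `round_trip("52°31'07\"N Zürich")`, should come back unchanged on any platform.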
