This release includes 2 breaking changes for platform teams planning a safe upgrade.
✓ No known CVEs patched in this version
ReleasePort's take
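Moderate signal: Release v0.9.0 removes the `packs/ru.yaml` and `packs/uk.yaml` configuration packs and sets the default active packs to an empty tuple.
Why it matters: Deleting `packs/ru.yaml` and `packs/uk.yaml` removes the hand-written Russian and Ukrainian lexical packs. RU/UK recall is reported as unaffected because the embedder carries it, but on a cold, daemon-less stdio path the RU/UK lexical matchers no longer fire until ALD distils them from traffic. Operators relying on those locales should plan to run the daemon after upgrading.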
Summary
AI summary: Updates Honest limits, Verified, and Highlights across a mixed release.
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Breaking | High | Deletes `packs/ru.yaml` and `packs/uk.yaml`; default active packs set to empty tuple. Source: llm_adapter@2026-06-13. Confidence: high. | - |
| Feature | Medium | Introduces Semantic Anchor Engine (SAE) for intent detection and keyed-fact extraction using English semantic anchors. Source: llm_adapter@2026-06-13. Confidence: high. | - |
| Feature | Medium | Adds Anchor-Lexicon Distillation (ALD) that self-compiles high-precision n-grams from traffic into `$PMB_HOME/lang/auto.yaml`. Source: llm_adapter@2026-06-13. Confidence: high. | - |
| Feature | Low | Provides guidance to upgrade the embedder for weak recall in non-English languages (`pmb config set embedding.model BAAI/bge-m3 && pmb reindex`). Source: llm_adapter@2026-06-13. Confidence: high. | - |
| Bugfix | Medium | Ensures RU/UK recall remains byte-identical after pack removal by leveraging the embedder. Source: llm_adapter@2026-06-13. Confidence: high. | - |
Full changelog
PMB now understands many languages through the embedder instead of a
hand-written pack per language. The RU/UK packs are gone; recall, intent
detection and keyed-fact extraction ride one mechanism that transfers across
every language the embedder knows, and the cold path teaches itself the rest
from your own traffic.
✨ Highlights
- Semantic Anchor Engine (SAE). Intent detection and keyed-fact extraction
run on English semantic anchors, classified by margin against calibrated
per-set thresholds (FPR ≤ 1%). The multilingual embedder projects any language
next to the English exemplars, so "was sind meine Ziele" and "what are my
open goals" hit the same anchor - no per-language data.
- Anchor-Lexicon Distillation (ALD). The cold lexical path self-compiles
from your traffic: the maintenance tick mines high-precision n-grams that
co-fire with anchors into `$PMB_HOME/lang/auto.yaml`. A language you actually
use gets faster over time, with zero configuration.
- One mechanism, every language. ~2,000 lines of hand-written RU/UK lists
deleted in favour of the embedder + anchors. Adding a language now usually
requires nothing.
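The margin-against-threshold idea in the SAE bullet can be sketched as follows. This is a minimal illustration, not PMB's implementation: the function names, the toy 3-dimensional vectors, the anchor set names, and the threshold values are all hypothetical, and "margin" is assumed here to mean the best anchor-set score minus the runner-up score. A real deployment would use embedder output and calibrated thresholds.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def classify_by_margin(query_vec, anchor_sets, thresholds):
    """Return the winning anchor set name, or None if the margin is too small.

    Assumption: margin = best set score - runner-up set score, compared
    against a per-set threshold (hypothetical values below).
    """
    scores = {
        name: max(cosine(query_vec, exemplar) for exemplar in exemplars)
        for name, exemplars in anchor_sets.items()
    }
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_name, best_score = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    margin = best_score - runner_up
    return best_name if margin >= thresholds[best_name] else None

# Toy stand-ins for embedded English exemplars (hypothetical).
ANCHORS = {
    "list_goals": [[1.0, 0.0, 0.0]],
    "add_fact": [[0.0, 1.0, 0.0]],
}
THRESHOLDS = {"list_goals": 0.2, "add_fact": 0.2}

clear_hit = classify_by_margin([0.9, 0.1, 0.0], ANCHORS, THRESHOLDS)
ambiguous = classify_by_margin([0.7, 0.7, 0.0], ANCHORS, THRESHOLDS)
```

Because the decision uses only vector geometry, a query in any language the embedder places near the English exemplars would be classified the same way, which is the property the bullet describes.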
⚠️ Breaking changes
- `packs/ru.yaml` and `packs/uk.yaml` are deleted; no pack is active by
default (`_DEFAULT_ACTIVE = ()`). The packs-off eval is now a blocking CI gate.
- RU/UK recall is unaffected - the embedder carries it (verified byte-identical).
- On a cold, daemon-less stdio path, RU/UK lexical matchers (first-person,
self-intent, relation, negation, future-intent, general atomic extraction) no
longer fire until ALD distils them from traffic. The warm-daemon path (the
default) is unaffected; the anchor tier handles those.
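To make "until ALD distils them from traffic" more concrete, here is a rough sketch of mining n-grams that co-fire with anchors. It is an assumption-laden illustration rather than PMB's algorithm: `distil_lexicon`, the `min_support` and `min_precision` parameters, and the sample traffic are hypothetical, and the sketch returns a plain dict instead of writing `$PMB_HOME/lang/auto.yaml`, whose format is not described here.

```python
from collections import Counter

def ngrams(tokens, n_max=3):
    """Yield all 1..n_max word n-grams as space-joined strings."""
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

def distil_lexicon(traffic, min_support=3, min_precision=0.95):
    """Keep n-grams that almost always co-occur with one anchor.

    traffic: iterable of (text, anchor_name or None), where anchor_name is
    the anchor that fired for that utterance on the warm path (hypothetical).
    """
    seen = Counter()      # utterances containing the n-gram
    cofired = Counter()   # (n-gram, anchor) -> utterances where anchor fired
    for text, anchor in traffic:
        # Whitespace tokenisation: only meaningful for space-delimited text.
        for gram in set(ngrams(text.lower().split())):
            seen[gram] += 1
            if anchor is not None:
                cofired[(gram, anchor)] += 1
    lexicon = {}
    for (gram, anchor), hits in cofired.items():
        if hits >= min_support and hits / seen[gram] >= min_precision:
            lexicon[gram] = anchor
    return lexicon

traffic = [
    ("was sind meine ziele", "list_goals"),
    ("was sind meine offenen ziele", "list_goals"),
    ("meine ziele bitte", "list_goals"),
    ("was sind meine pläne", None),
]
lexicon = distil_lexicon(traffic)
```

In this toy run only `ziele` survives: `meine` appears often but also in an utterance where no anchor fired, so its precision falls below the cutoff. A cold path could then match such entries without loading the embedder.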
✅ Verified
- V1 recall: en/ru/uk top-1 = 1.00 (RU/UK byte-identical to the pack era).
- Multilingual eval (101 queries): overall top-1 = 0.77, top-3 = 0.91;
top-1 = 1.00 for en/fr/pt/ru.
- Anchor classify latency: p50 ≈ 48 ms, p95 ≈ 81 ms.
- Tests: 1308 passed / 4 skipped / 0 failed · eval gates 18 passed / 0 failed.
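For readers who want to reproduce figures like top-1 and top-3 on their own queries, the metric itself is simple to compute. The sketch below assumes each eval case pairs an expected memory ID with a ranked list of returned IDs; the data shape and names are illustrative, not PMB's eval harness.

```python
def top_k_accuracy(results, k):
    """Fraction of cases whose expected ID appears in the first k results.

    results: list of (expected_id, ranked_ids) pairs (hypothetical shape).
    """
    if not results:
        return 0.0
    hits = sum(1 for expected, ranked in results if expected in ranked[:k])
    return hits / len(results)

# Four toy eval cases: two exact hits, one hit at rank 2, one miss.
results = [
    ("a", ["a", "b", "c"]),
    ("b", ["c", "b", "a"]),
    ("c", ["a", "b", "d"]),
    ("d", ["d", "a", "b"]),
]
top1 = top_k_accuracy(results, k=1)  # 0.5
top3 = top_k_accuracy(results, k=3)  # 0.75
```

A gap between top-1 and top-3, as reported above for the multilingual eval, means the right item is often retrieved but not always ranked first.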
Honest limits (0.9, not 1.0)
- Non-English intents/extraction are warm-only (need the daemon); the cold
path self-heals with use.
- CJK (zh/ja) is weak on exact top-1 (strong in top-3); ALD covers
space-delimited languages only.
- The new hypothesis-margin keyed extraction ships default-off pending
real-world precision data.
- ALD's real-traffic self-healing rate is proven in tests, not yet measured in
the wild.
- The latency SLO was measured on a constrained box - re-measure on your hardware.
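Since the latency numbers should be re-measured locally, a small timing harness along these lines may help. It is a generic sketch: `measure_latency_ms` and its parameters are hypothetical, and the lambda at the end is a stand-in workload that you would replace with whatever call exercises anchor classification in your setup.

```python
import statistics
import time

def measure_latency_ms(fn, runs=200, warmup=20):
    """Time fn() repeatedly and return p50/p95 in milliseconds."""
    for _ in range(warmup):
        fn()  # discard warm-up runs (caches, lazy loading)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=100) returns 99 cut points; index 49 is p50, index 94 is p95.
    cuts = statistics.quantiles(samples, n=100)
    return {"p50": cuts[49], "p95": cuts[94]}

# Stand-in workload; replace with the operation you want to measure.
stats = measure_latency_ms(lambda: sum(range(1000)), runs=50, warmup=5)
```

Comparing the resulting p50 and p95 against the published figures gives a rough sense of how your hardware differs from the constrained box used for the release.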
⬆️ Upgrading
No action needed for English. For other languages: run the daemon (so the warm
anchor tier + ALD are active) and use PMB normally - the cold path fills in over
a few days. If recall is weak for your language, upgrade the embedder:
`pmb config set embedding.model BAAI/bge-m3 && pmb reindex`.
Details: https://github.com/oleksiijko/pmb/blob/main/docs/adding-a-language.md
Full changelog: https://github.com/oleksiijko/pmb/blob/main/CHANGELOG.md ·
https://github.com/oleksiijko/pmb/compare/v0.8.0...v0.9.0
Breaking Changes
- Deleted `packs/ru.yaml` and `packs/uk.yaml`; no pack is active by default (`_DEFAULT_ACTIVE = ()`).
- On a cold, daemon-less stdio path, RU/UK lexical matchers no longer fire until ALD distils them from traffic.