This release adds 2 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
AI summary: Embeddings now raise EmbeddingError instead of silently returning zero vectors, fixing corruption bugs.
Full changelog
Highlights
Critical fix: no more silent zero-vector corruption (#36)
FastEmbedEmbeddings.__call__ no longer swallows exceptions and returns [[0.0]*dim, ...] when the ONNX model fails to load. The bug already existed on master but was masked: ChromaDB stored the zero embeddings without complaint, count() reported normal numbers, smart-reindex skipped them as "already indexed", and queries returned garbage similarity scores with no visible error. Lazy loading in v3.8.0 widened the impact by moving failures from startup to query time.
Now raises EmbeddingModelLoadError / EmbeddingError loudly. Sticky _load_failed flag prevents retry storms against HuggingFace rate limits. Sanity checks in __call__ catch count and dim mismatches.
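The behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: LoudEmbedder, its loader and dim parameters, and the _load_error attribute are hypothetical, and making EmbeddingModelLoadError a subclass of EmbeddingError is a guess about the hierarchy.

```python
from typing import Callable, List, Optional, Sequence

Model = Callable[[Sequence[str]], List[List[float]]]


class EmbeddingError(RuntimeError):
    """Embedding a batch failed or produced malformed output."""


class EmbeddingModelLoadError(EmbeddingError):
    """The model could not be loaded (the subclassing is an assumption)."""


class LoudEmbedder:
    """Hypothetical stand-in, not the project's FastEmbedEmbeddings."""

    def __init__(self, loader: Callable[[], Model], dim: int) -> None:
        self._loader = loader          # assumed: returns a callable model
        self._dim = dim
        self._model: Optional[Model] = None
        self._load_failed = False      # sticky: set once, never cleared
        self._load_error: Optional[BaseException] = None

    def _ensure_model(self) -> Model:
        if self._load_failed:
            # Do not call the loader again after a failed attempt.
            raise EmbeddingModelLoadError(
                "model load already failed; not retrying"
            ) from self._load_error
        if self._model is None:
            try:
                self._model = self._loader()
            except Exception as exc:
                self._load_failed = True
                self._load_error = exc
                raise EmbeddingModelLoadError(f"could not load model: {exc}") from exc
        return self._model

    def __call__(self, texts: Sequence[str]) -> List[List[float]]:
        model = self._ensure_model()
        try:
            vectors = [list(v) for v in model(texts)]
        except Exception as exc:
            # Raise instead of substituting zero vectors.
            raise EmbeddingError(f"embedding failed: {exc}") from exc
        if len(vectors) != len(texts):
            raise EmbeddingError(
                f"expected {len(texts)} vectors, got {len(vectors)}"
            )
        for i, vec in enumerate(vectors):
            if len(vec) != self._dim:
                raise EmbeddingError(
                    f"vector {i} has dim {len(vec)}, expected {self._dim}"
                )
        return vectors
```

Because the flag is sticky, a second call after a failed load raises immediately without invoking the loader again, which is the property that avoids repeated download attempts.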
Windows CI flake eliminated (#38)
Five stacked root causes finally untangled:
- Test flake: test_concurrent_first_call_loads_once spawned threads with join(timeout=5). If the Windows scheduler delayed t2 past the timeout, with patch(TextEmbedding=slow_init) exited while t2 was still scheduled; t2 then called the REAL TextEmbedding and triggered an HF download. Replaced with deterministic test_double_checked_lock_prevents_double_load.
- HF download leakage: any real TextEmbedding(...) call spawned concurrent.futures.ThreadPoolExecutor worker threads that outlived pytest and crashed on closed-stdout warnings. Set HF_HUB_OFFLINE=1 + TRANSFORMERS_OFFLINE=1 + HF_HUB_DISABLE_PROGRESS_BARS=1 in CI to fail fast on any leak.
- pytest atexit OSError: cleanup_numbered_dir glob raised on Windows during interpreter shutdown. conftest.py wraps it as defense-in-depth.
- PowerShell shell: GHA windows-latest default pwsh propagated stderr writes as exit code 1 even when pytest returned 0. Forced shell: bash on the test step.
- Version drift: __init__.py, pyproject.toml, npm/package.json were drifting since v3.5.x. Now atomic.
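One way to test double-checked locking without threads or timeouts is sketched below. It is an illustration only and does not reproduce the project's test_double_checked_lock_prevents_double_load, whose contents are not shown here. LazyModel, _RacingLock, and loads_after_lost_race are hypothetical names. The fake lock simulates another thread finishing the load while the caller waits to acquire the lock.

```python
import threading
from typing import Callable, List, Optional


class LazyModel:
    """Hypothetical lazy holder that uses double-checked locking."""

    def __init__(self, factory: Callable[[], object]) -> None:
        self._factory = factory
        self._model: Optional[object] = None
        self._lock = threading.Lock()

    def get(self) -> object:
        if self._model is None:            # first check, without the lock
            with self._lock:
                if self._model is None:    # second check, under the lock
                    self._model = self._factory()
        return self._model


class _RacingLock:
    """Test double: on acquire, pretend a rival thread already loaded."""

    def __init__(self, holder: LazyModel, rival_value: object) -> None:
        self._holder = holder
        self._rival_value = rival_value

    def __enter__(self) -> "_RacingLock":
        self._holder._model = self._rival_value
        return self

    def __exit__(self, *exc_info: object) -> bool:
        return False  # never suppress exceptions


def loads_after_lost_race() -> int:
    """Return how many times the factory ran after losing the race."""
    calls: List[str] = []

    def factory() -> object:
        calls.append("load")
        return "mine"

    holder = LazyModel(factory)
    holder._lock = _RacingLock(holder, "theirs")
    # The first check passes (model is None), the rival "wins" while we
    # wait for the lock, and the second check must skip the factory.
    assert holder.get() == "theirs"
    return len(calls)
```

Since no real threads are involved, the result does not depend on scheduler timing, which is the failure mode described in the test-flake item.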
The Windows runner is now genuinely green — no admin merges needed for the v3.8.1 release.
Changes
- FIX (critical) Loud-fail embeddings: no more silent zero-vector corruption (#36)
- FIX Sticky _load_failed to avoid HF retry storm
- NEW EmbeddingError + EmbeddingModelLoadError exception classes
- NEW Embed count + dim mismatch sanity checks
- TEST 7 new regression cases including test_does_not_return_zero_vectors_silently
- CI HF_HUB_OFFLINE=1 + TRANSFORMERS_OFFLINE=1 + HF_HUB_DISABLE_PROGRESS_BARS=1 (#38)
- CI shell: bash on Windows test step (#38)
- TEST Deterministic replacement for flaky concurrent test (#38)
- TEST conftest.py wraps _pytest.pathlib.cleanup_numbered_dir (defense-in-depth, #38)
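The conftest.py wrapper in the last item might look roughly like the sketch below. This is an assumption about its shape, not the project's code: swallow_oserror is a hypothetical helper, and the commented-out patch relies on a private pytest function whose name and call sites could change between pytest versions.

```python
import functools
from typing import Any, Callable, Optional


def swallow_oserror(func: Callable[..., Any]) -> Callable[..., Optional[Any]]:
    """Wrap func so an OSError becomes a None return.

    Other exception types still propagate, so unrelated bugs stay visible.
    """

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Optional[Any]:
        try:
            return func(*args, **kwargs)
        except OSError:
            return None

    return wrapper


# In a conftest.py the patch could be applied like this (assumes pytest is
# installed and that the private helper keeps its current name):
#
#   import _pytest.pathlib
#   _pytest.pathlib.cleanup_numbered_dir = swallow_oserror(
#       _pytest.pathlib.cleanup_numbered_dir
#   )
```

Catching only OSError keeps the wrapper narrow, which fits the defense-in-depth framing: it suppresses the shutdown-time cleanup failure without hiding other errors.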
Install
pip install knowledge-rag==3.8.1
npx -y [email protected]
docker pull ghcr.io/lyonzin/knowledge-rag:3.8.1
Backwards compatibility
- Public API unchanged
- __call__ raises instead of returning silent zeros; callers that already handled exceptions (index_all line 771, search_knowledge line 1010) work without changes
- New exception classes are additive
Who should upgrade
All v3.8.0 users, especially anyone who saw "0 results" queries or has models_cache/ with 0-byte files.
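To check for the 0-byte files mentioned above, a short script along these lines could be used. It is a sketch under assumptions: find_empty_files is a hypothetical helper, and the location of models_cache/ depends on how the tool is installed, so the path in the usage lines is a placeholder.

```python
from pathlib import Path
from typing import List, Union


def find_empty_files(cache_dir: Union[str, Path]) -> List[Path]:
    """Return every 0-byte regular file under cache_dir, sorted by path."""
    root = Path(cache_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    # "models_cache" is a placeholder; point this at your actual cache.
    for path in find_empty_files("models_cache"):
        print(path)
```

An empty result does not prove that an index is free of zero vectors; it only rules out this one symptom.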
Full Changelog: https://github.com/lyonzin/knowledge-rag/compare/v3.8.0...v3.8.1
About lyonzin/knowledge-rag
Local RAG system for Claude Code with hybrid search (BM25 + semantic), cross-encoder reranking, markdown-aware chunking, query expansion, and 12 MCP tools. Runs entirely offline with zero external servers.