lyonzin/knowledge-rag

v3.8.1 Feature

This release fixes a critical silent-corruption bug in embeddings and eliminates a Windows CI flake. It also adds two exception classes and new embedding sanity checks.

Published 24d MCP Developer Tools
✓ No known CVEs patched

Topics

antigravity claude claude-code claude-code-cli codex cursor-ai document-search hybrid-search inteligencia-artificial knowledge-base local-ai mcp mcp-server llm rag-chatbot rag-pipeline reranking retrieval-augmented-generation semantic-search vector-db

Summary

AI summary

Embeddings now raise EmbeddingError instead of silently returning zero vectors, fixing corruption bugs.

Full changelog

Highlights

Critical fix: no more silent zero-vector corruption (#36)

FastEmbedEmbeddings.__call__ no longer swallows exceptions and returns [[0.0]*dim, ...] when the ONNX model fails to load. The bug already existed in master but was masked: ChromaDB happily stored the zero embeddings, count() reported normal numbers, smart-reindex skipped the affected documents as "already indexed", and queries returned garbage similarity scores with no visible error. The v3.8.0 lazy-load change widened the impact: failures moved from startup to query time.

__call__ now raises EmbeddingModelLoadError / EmbeddingError loudly. A sticky _load_failed flag prevents retry storms against HuggingFace rate limits, and sanity checks in __call__ catch count and dim mismatches.
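
The loud-fail behavior described above can be illustrated with a minimal sketch. This is not the project's code: the LoudEmbedder class, its loader argument, and the assumption that EmbeddingModelLoadError subclasses EmbeddingError are all hypothetical, chosen only to show how a sticky failure flag and the count/dim checks might fit together.

```python
import threading


class EmbeddingError(RuntimeError):
    """Embedding failed or produced malformed output."""


class EmbeddingModelLoadError(EmbeddingError):
    """The underlying model could not be loaded (subclassing is an assumption)."""


class LoudEmbedder:
    """Hypothetical wrapper: `loader()` returns an object with .embed(texts)."""

    def __init__(self, loader, dim):
        self._loader = loader
        self._dim = dim
        self._model = None
        self._load_failed = False  # sticky: once set, the load is never retried
        self._lock = threading.Lock()

    def _get_model(self):
        if self._model is not None:
            return self._model
        with self._lock:
            if self._load_failed:
                raise EmbeddingModelLoadError("previous load failed; not retrying")
            if self._model is None:
                try:
                    self._model = self._loader()
                except Exception as exc:
                    self._load_failed = True
                    raise EmbeddingModelLoadError(str(exc)) from exc
        return self._model

    def __call__(self, texts):
        model = self._get_model()
        try:
            vectors = [list(v) for v in model.embed(texts)]
        except Exception as exc:
            raise EmbeddingError(f"embedding failed: {exc}") from exc
        # Sanity checks: never hand back a silently wrong result.
        if len(vectors) != len(texts):
            raise EmbeddingError(
                f"expected {len(texts)} vectors, got {len(vectors)}"
            )
        for vec in vectors:
            if len(vec) != self._dim:
                raise EmbeddingError(
                    f"expected dim {self._dim}, got {len(vec)}"
                )
        return vectors
```

With this shape, a failing loader is invoked exactly once; every later call raises immediately instead of hitting the network again.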

Windows CI flake eliminated (#38)

Five stacked root causes finally untangled:

  1. Test flake: test_concurrent_first_call_loads_once spawned threads with join(timeout=5). If the Windows scheduler delayed t2 past the timeout, the with patch(TextEmbedding=slow_init) block exited while t2 was still scheduled; t2 then called the REAL TextEmbedding and triggered an HF download. Replaced with the deterministic test_double_checked_lock_prevents_double_load.
  2. HF download leakage: any real TextEmbedding(...) call spawned concurrent.futures.ThreadPoolExecutor worker threads that outlived pytest and crashed on closed-stdout warnings. CI now sets HF_HUB_OFFLINE=1 + TRANSFORMERS_OFFLINE=1 + HF_HUB_DISABLE_PROGRESS_BARS=1 to fail fast on any leak.
  3. pytest atexit OSError: the cleanup_numbered_dir glob raised on Windows during interpreter shutdown. conftest.py now wraps it as defense-in-depth.
  4. PowerShell shell: the default pwsh on GHA windows-latest propagated stderr writes as exit code 1 even when pytest returned 0. The test step now forces shell: bash.
  5. Version drift: __init__.py, pyproject.toml, and npm/package.json had been drifting since v3.5.x. They are now bumped atomically.
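
One way to test a double-checked lock without threads or timeouts, as item 1 suggests, is to swap the lock for a test double that simulates another thread finishing the load at the moment the lock is acquired. The sketch below is an assumption about how such a test could look, not the project's actual test_double_checked_lock_prevents_double_load; LazyModel and RacingLock are hypothetical names.

```python
import threading


class LazyModel:
    """Hypothetical lazy loader using double-checked locking."""

    def __init__(self, factory):
        self._factory = factory
        self._model = None
        self._lock = threading.Lock()

    def get(self):
        if self._model is None:          # first check, without the lock
            with self._lock:
                if self._model is None:  # second check, under the lock
                    self._model = self._factory()
        return self._model


class RacingLock:
    """Test double: entering the lock simulates a rival thread's finished load."""

    def __init__(self, owner, value):
        self._owner = owner
        self._value = value

    def __enter__(self):
        self._owner._model = self._value
        return self

    def __exit__(self, *exc_info):
        return False


def check_double_checked_lock():
    calls = []

    def factory():
        calls.append(1)
        return "fresh"

    lazy = LazyModel(factory)
    lazy._lock = RacingLock(lazy, "loaded-by-other-thread")
    # The first check passes (model is None), the "race" happens on acquire,
    # and the second check must prevent the factory from running.
    return lazy.get(), len(calls)
```

Because no real thread is scheduled, the outcome does not depend on OS timing, which is the property the flaky test lacked.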

The Windows runner is now genuinely green — no admin merges needed for the v3.8.1 release.
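
Version drift of the kind listed in item 5 can be caught with a small consistency check. The sketch below assumes that __init__.py declares `__version__ = "..."` and that pyproject.toml has a top-level `version = "..."` line; both are guesses about the file layout, and the function names are hypothetical.

```python
import json
import re


def extract_versions(init_text, pyproject_text, package_json_text):
    """Pull the version string out of each file's text (layout is assumed)."""
    init_m = re.search(
        r'^__version__\s*=\s*["\']([^"\']+)["\']', init_text, re.MULTILINE
    )
    py_m = re.search(
        r'^version\s*=\s*["\']([^"\']+)["\']', pyproject_text, re.MULTILINE
    )
    return {
        "__init__.py": init_m.group(1) if init_m else None,
        "pyproject.toml": py_m.group(1) if py_m else None,
        "npm/package.json": json.loads(package_json_text).get("version"),
    }


def versions_in_sync(versions):
    """True only if every file yielded a version and all of them agree."""
    values = set(versions.values())
    return None not in values and len(values) == 1
```

Run in CI against the three files, a check like this would fail the build as soon as one of them is bumped without the others.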

Changes

  • FIX (critical) Loud-fail embeddings — no more silent zero-vector corruption (#36)
  • FIX Sticky _load_failed to avoid HF retry storm
  • NEW EmbeddingError + EmbeddingModelLoadError exception classes
  • NEW Embed count + dim mismatch sanity checks
  • TEST 7 new regression cases including test_does_not_return_zero_vectors_silently
  • CI HF_HUB_OFFLINE=1 + TRANSFORMERS_OFFLINE=1 + HF_HUB_DISABLE_PROGRESS_BARS=1 (#38)
  • CI shell: bash on Windows test step (#38)
  • TEST Deterministic replacement for flaky concurrent test (#38)
  • TEST conftest.py wraps _pytest.pathlib.cleanup_numbered_dir (defense-in-depth, #38)
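
The conftest.py wrapper mentioned in the last item might look roughly like the following. This is a sketch under assumptions: swallow_oserror is a hypothetical helper, _pytest.pathlib is a private pytest module whose contents can change between versions, and whether the patched function is the one pytest registers at exit depends on when the patch runs.

```python
import functools


def swallow_oserror(func):
    """Wrap func so an OSError during cleanup is ignored, not propagated."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except OSError:
            return None  # best-effort cleanup; other exceptions still surface

    return wrapper


try:
    import _pytest.pathlib as _pathlib  # private API, may change
except ImportError:
    _pathlib = None

# Patch only if the attribute exists in the installed pytest version.
if _pathlib is not None and hasattr(_pathlib, "cleanup_numbered_dir"):
    _pathlib.cleanup_numbered_dir = swallow_oserror(
        _pathlib.cleanup_numbered_dir
    )
```

Only OSError is suppressed, so a genuine bug in the cleanup path (for example a TypeError) would still be reported.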

Install

pip install knowledge-rag==3.8.1
npx -y [email protected]
docker pull ghcr.io/lyonzin/knowledge-rag:3.8.1

Backwards compatibility

  • Public API unchanged
  • __call__ raises instead of returning silent zeros — callers that already handled exceptions (index_all line 771, search_knowledge line 1010) work without changes
  • New exception classes are additive
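
For callers that did not previously handle exceptions, the change means wrapping the embedding call. A minimal sketch of the caller side, assuming a hypothetical index_documents function, a dict-like store, and a locally defined EmbeddingError standing in for the library's class:

```python
class EmbeddingError(RuntimeError):
    """Stand-in for the library's exception class (defined here for the sketch)."""


def index_documents(docs, embed, store):
    """Embed (doc_id, text) pairs and store the vectors.

    Returns (indexed_count, error_message). Nothing is stored when
    embedding fails, so the store never receives placeholder vectors.
    """
    try:
        vectors = embed([text for _, text in docs])
    except EmbeddingError as exc:
        return 0, f"embedding failed: {exc}"
    for (doc_id, _), vec in zip(docs, vectors):
        store[doc_id] = vec
    return len(docs), None
```

The point of the design is that a failure leaves the index untouched and visible to the caller, rather than filling it with vectors that look valid.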

Who should upgrade

All v3.8.0 users, especially anyone who saw "0 results" queries or has models_cache/ with 0-byte files.
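
To check whether an installation shows the symptoms above, a short script can look for 0-byte files under models_cache/ and for all-zero vectors in stored embeddings. The helper names below are hypothetical, and how the stored vectors are read out of the index is left to the reader.

```python
from pathlib import Path


def find_empty_files(cache_dir):
    """Return all 0-byte files below cache_dir, sorted by path."""
    root = Path(cache_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.stat().st_size == 0
    )


def is_zero_vector(vec):
    """True if every component is exactly 0.0, the signature of the old bug."""
    return all(x == 0.0 for x in vec)
```

For example, `find_empty_files("models_cache")` returning a non-empty list suggests an interrupted model download, and any stored embedding for which `is_zero_vector` is true was likely written by the pre-fix code path.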

Full Changelog: https://github.com/lyonzin/knowledge-rag/compare/v3.8.0...v3.8.1

About lyonzin/knowledge-rag

Local RAG system for Claude Code with hybrid search (BM25 + semantic), cross-encoder reranking, markdown-aware chunking, query expansion, and 12 MCP tools. Runs entirely offline with zero external servers.
