This release includes 2 breaking changes for platform teams planning a safe upgrade.
✓ No known CVEs patched in this version
ReleasePort's take
Light signal: Fastembed's sidecar and gate-refresh runner have been rewritten in native Rust.
Why it matters: In the release's A/B benchmark, migrating these components to Rust reduces `/embed` cold latency by ~0.86 s (≈16% improvement) and warm p95 latency by ~121 ms (≈73% improvement).
Summary
AI summary: Migrated fastembed sidecar and gate-refresh runner from Python to native Rust.
Changes in this release
| Type | Severity | Summary | CVE |
|---|---|---|---|
| Feature | Medium | A/B benchmark snapshot shows performance improvements across fastembed lanes. Source: llm_adapter@2026-05-21. Confidence: low | — |
| Performance | Medium | Fastembed `/embed` cold latency reduced from 5.465659s to 4.600979s. Source: llm_adapter@2026-05-21. Confidence: high | — |
| Performance | Medium | Fastembed `/embed` warm average latency reduced from 0.073445s to 0.020961s. Source: llm_adapter@2026-05-21. Confidence: high | — |
| Performance | Medium | Fastembed `/embed` warm p95 latency reduced from 0.166389s to 0.045181s. Source: llm_adapter@2026-05-21. Confidence: high | — |
| Refactor | Medium | `fastembed-sidecar` migrated from Python/Uvicorn to native Rust service. Source: llm_adapter@2026-05-21. Confidence: high | — |
| Refactor | Medium | `fastembed-gate-refresh` runner migrated from Python to native Rust service. Source: llm_adapter@2026-05-21. Confidence: high | — |
| Refactor | Medium | Updated compose healthchecks to `wget` probes for runtime alignment. Source: llm_adapter@2026-05-21. Confidence: low | — |
Full changelog
ContextLattice Public v3.3.25
Runtime updates
- Migrated `fastembed-sidecar` from Python/Uvicorn to native Rust service.
- Migrated `fastembed-gate-refresh` runner from Python to native Rust service.
- Updated compose healthchecks to `wget` probes so health is runtime-aligned with non-Python sidecars.
A/B benchmark snapshot (same model/cache)
| lane | old cold | new cold | old warm avg | new warm avg | old warm p95 | new warm p95 |
|---|---:|---:|---:|---:|---:|---:|
| fastembed /embed | 5.465659s | 4.600979s | 0.073445s | 0.020961s | 0.166389s | 0.045181s |
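
The absolute and relative reductions implied by the benchmark row can be checked with a few lines of arithmetic. The sketch below is illustrative only: the `BENCH` dictionary and the `improvement` helper are names introduced here, not part of the release, and the values are copied from the table above.

```python
# Benchmark values copied from the A/B table (seconds): (old, new).
BENCH = {
    "cold": (5.465659, 4.600979),
    "warm avg": (0.073445, 0.020961),
    "warm p95": (0.166389, 0.045181),
}


def improvement(old: float, new: float) -> tuple[float, float]:
    """Return (absolute reduction in seconds, relative reduction in percent)."""
    delta = old - new
    return delta, 100.0 * delta / old


for name, (old, new) in BENCH.items():
    delta, pct = improvement(old, new)
    print(f"{name}: {delta:.3f}s faster ({pct:.1f}% lower)")
```

The relative figure is computed against the old value, which is why the sub-second warm-path gains show up as much larger percentages than the cold-start gain.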
Operational result
`http://127.0.0.1:8075/status` healthy after restart: `9/9` services, strict no-python-runtime unchanged for gateway lanes.
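
A post-upgrade check against the status endpoint could look like the sketch below. It assumes only that the endpoint answers a plain GET with an HTTP 2xx status when healthy, roughly what a `wget` probe would test; the response body format is not documented here, so it is not parsed. The helper name `is_healthy` is hypothetical.

```python
import http.client
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET to `url` answers with an HTTP 2xx status.

    Assumption: a 2xx response means "healthy". Any connection error,
    timeout, malformed URL, or non-2xx status is treated as unhealthy.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # urlopen raises HTTPError (a URLError subclass) for 4xx/5xx responses.
        return False


if __name__ == "__main__":
    # URL taken from the operational result above.
    STATUS_URL = "http://127.0.0.1:8075/status"
    print("healthy" if is_healthy(STATUS_URL) else "unhealthy")
```

Running this after restarting the stack gives a quick yes/no signal; confirming the `9/9` service count would still require reading the status output itself.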
Breaking Changes
- Removed Python/Uvicorn implementation of `fastembed-sidecar` and replaced it with a native Rust service.
- Removed Python implementation of `fastembed-gate-refresh` runner and replaced it with a native Rust service.
About sheawinkler/ContextLattice
Private-by-default memory and context layer for agents with Go/Rust runtime, staged retrieval across fused data backends, and long-horizon context continuity.