This release includes 2 breaking changes that platform teams should review before upgrading.
Published 1mo
AI Agents & Assistants
✓ No known CVEs patched
Topics
agents
inference
kv-cache
llm
reinforcement-learning
sglang
vllm
Affected surfaces
breaking_upgrade
Summary
AI summary: Strict mode now raises on native extension load failure, eliminating the slow Python fallback path.
Full changelog
Highlights
- Freeze throughput: 0.19 → 19.62 GB/s on H100 (104× rewrite of the pipelined freeze path, O_DIRECT pwrite, double-buffered WC pinned memory)
- KV cache freeze/restore wired through the pipelined Rust path: no more Python-loop-bound serialization for the competitive-moat KV snapshot
- New on-disk KV format: standard .thaw file + .meta JSON sidecar. Legacy single-file readers preserved for backward compat
- Unified restore_from_bytes_auto: tries cudaHostRegister zero-copy first, falls back to memcpy-through-staging on failure. One entry point, no caller-side flag
- WC pinned-buffer thread-local cache: amortizes cudaHostAlloc across freeze/restore calls. Critical for thaw serve hot-swap
- Strict mode default flipped: failed native extension load now raises instead of silently falling back to 50 MB/s Python path. Opt out via THAW_ALLOW_PYTHON_FALLBACK=1
- Slow PyO3 siblings removed (freeze_to_file/restore_from_file): silent 100× footgun eliminated
Upgrade
pip install -U thaw-native thaw-vllm
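After upgrading, it may help to confirm which versions actually landed in the environment. The sketch below uses only the standard-library importlib.metadata module and the two distribution names from the pip command above; the helper name installed_version is an assumption made for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names taken from the upgrade command above.
    for name in ("thaw-native", "thaw-vllm"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```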
Breaking Changes
- Strict mode default flipped: failed native extension load now raises instead of silently falling back to the 50 MB/s Python path. Opt out via THAW_ALLOW_PYTHON_FALLBACK=1
- Removed slow PyO3 siblings freeze_to_file and restore_from_file
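The strict-mode change can be pictured with the following sketch. It is an illustration of the described behavior, not Thaw's actual loader: the function load_backend and its parameters are hypothetical, and only the environment variable name THAW_ALLOW_PYTHON_FALLBACK comes from the changelog. The sketch also assumes the opt-out is recognized only when the variable is exactly "1".

```python
import importlib
import os


def load_backend(native_module: str, env=os.environ):
    """Import the native module; fall back only when explicitly allowed.

    Returns the imported module, or None to signal that the caller should
    use the slow pure-Python path (hypothetical convention for this sketch).
    """
    try:
        return importlib.import_module(native_module)
    except ImportError:
        # Strict by default: re-raise unless the opt-out variable is set to "1".
        if env.get("THAW_ALLOW_PYTHON_FALLBACK") == "1":
            return None
        raise
```

Under this pattern, a deployment that previously relied on the silent fallback would now fail at load time unless THAW_ALLOW_PYTHON_FALLBACK=1 is exported before the process starts.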