This release adds 3 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
CPU hardware acceleration (Intel MKL, Apple Accelerate, AVX2) and a thread over-subscription fix that limits math workers to two threads by default.
Full changelog
This release adds hardware-accelerated CPU math for machine learning workloads and fixes thread over-subscription under concurrent load.
🧠 CPU Hardware Acceleration (AVX2, MKL, Accelerate)
- Intel MKL Support: `candle` now compiles with the `mkl` feature on Linux and Windows, linking against the Intel Math Kernel Library. This yields a 3x-5x speedup for matrix multiplication during embedding generation.
- Apple Accelerate: macOS builds now compile with the `accelerate` feature to use the Apple Accelerate framework on Intel and Apple Silicon chips.
- AVX2 & FMA Instructions: CI/CD now builds the binaries targeting the `x86-64-v3` architecture level (covering most Intel/AMD CPUs since ~2014). This enables LLVM to emit AVX2 and FMA vector instructions, roughly doubling performance on generic loops.
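Because an `x86-64-v3` binary assumes AVX2 and FMA are present, it can help to check an older host before rolling out. The following is a minimal sketch of such a check, not part of the project: the function name `detect_avx2_fma` is hypothetical, and it tests only the two features named above rather than the full `x86-64-v3` feature set. It should be compiled as a separate diagnostic with the default target CPU, since a binary built for `x86-64-v3` may fault before the check runs.

```rust
// Diagnostic sketch: report whether the host CPU exposes AVX2 and FMA.
// `detect_avx2_fma` is a hypothetical helper, not an API of the server.

/// On x86_64, returns Some((avx2, fma)) using runtime CPU feature detection.
#[cfg(target_arch = "x86_64")]
fn detect_avx2_fma() -> Option<(bool, bool)> {
    Some((
        std::arch::is_x86_feature_detected!("avx2"),
        std::arch::is_x86_feature_detected!("fma"),
    ))
}

/// On other architectures (for example Apple Silicon) the check does not apply.
#[cfg(not(target_arch = "x86_64"))]
fn detect_avx2_fma() -> Option<(bool, bool)> {
    None
}

fn main() {
    match detect_avx2_fma() {
        Some((true, true)) => println!("AVX2 and FMA available"),
        Some((avx2, fma)) => println!("missing features: avx2={avx2}, fma={fma}"),
        None => println!("not an x86_64 host; AVX2/FMA check skipped"),
    }
}
```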
🧵 Thread Over-Subscription Fix
Previously, when the server ran on an async worker pool (such as tokio), concurrent inference tasks could cause Rayon and MKL to spawn hundreds of threads (up to N_cores * N_requests), and the resulting OS context switching (thread thrashing) degraded performance.
- The server now limits the underlying math worker threads by defaulting `RAYON_NUM_THREADS`, `MKL_NUM_THREADS`, and `OMP_NUM_THREADS` to `2` during startup.
- You can still override these with your own environment variables; the default is tuned for multi-tenant, high-throughput environments.
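The arithmetic and the override rule above can be sketched as follows. This is an illustration under stated assumptions, not the server's actual startup code: `worst_case_threads` and `resolve_thread_env` are hypothetical names, the 16-core / 32-request figures are made up for the example, and the sketch only computes the values it would use rather than mutating the process environment.

```rust
use std::env;

// Variable names and the default of 2 come from the release notes.
const THREAD_VARS: [&str; 3] = ["RAYON_NUM_THREADS", "MKL_NUM_THREADS", "OMP_NUM_THREADS"];
const DEFAULT_THREADS: &str = "2";

/// Upper bound on math threads when every in-flight request gets its own pool.
fn worst_case_threads(threads_per_request: usize, concurrent_requests: usize) -> usize {
    threads_per_request * concurrent_requests
}

/// For each variable, keep the user's value if `lookup` finds one,
/// otherwise fall back to the default. `lookup` is injected so the
/// rule can be exercised without touching the real environment.
fn resolve_thread_env(lookup: impl Fn(&str) -> Option<String>) -> Vec<(&'static str, String)> {
    THREAD_VARS
        .iter()
        .map(|&name| (name, lookup(name).unwrap_or_else(|| DEFAULT_THREADS.to_string())))
        .collect()
}

fn main() {
    // Hypothetical host: 16 cores, 32 concurrent inference requests.
    println!("uncapped: {} threads", worst_case_threads(16, 32)); // 512
    println!("capped:   {} threads", worst_case_threads(2, 32)); // 64

    // Resolve against the real environment; user-set values take precedence.
    for (name, value) in resolve_thread_env(|name| env::var(name).ok()) {
        println!("{name}={value}");
    }
}
```

Running it with `MKL_NUM_THREADS=8` set in the shell would print `MKL_NUM_THREADS=8` and `2` for the other two variables, which matches the override behavior described above.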
About pomazanbohdan/memory-mcp-1file
A self-contained Memory server with single-binary architecture (embedded DB & models, no dependencies). Provides persistent semantic and graph-based memory for AI agents.