Hivemind turns agent traces into skills and shares with your team

v0.7.32 Bugfix

This release fixes issues for SREs watching stability and regressions.

Published 2mo AI Agents & Assistants

✓ No known CVEs patched in this version

Topics

ai ai-agents ai-memory anthropic artificial-intelligence claude claude-agent-sdk claude-agents claude-code-plugin claude-skills codex embeddings long-term-memory memory-engine openclaw openclaw-skills postgresql llm

Affected surfaces

breaking_upgrade

Summary

AI summary

Fixed wasted re-spawns on every turn and added stale lock recovery with a 10‑minute timeout.

Changes in this release

Type	Severity	Summary	CVE
Feature	Medium	Writes current timestamp to lock file on acquire for staleness detection.	None
Performance	Medium	Reduces unnecessary worker spawns, cutting ~50ms Node cold-start and ~200ms DB I/O per session.	None
Bugfix	High	Detects and removes stale lock files, preventing permanent mining halts.	None
Bugfix	Medium	Prevents redundant worker spawns in long sessions via runtime deduplication.	None
Refactor	Low	Introduces Set to track spawned sessions within a gateway runtime.	None

Source for all rows: granite4.1:8b-q6_K@2026-05-19. Confidence: high.

Full changelog

Fixes #100 and #110.

Why

Two spawn-lifecycle bugs in openclaw/src/index.ts:

#100: Wasted re-spawns. agent_end fires on every turn. The on-disk lock at ~/.deeplake/state/skillify/<projectKey>.worker.lock prevents overlapping workers, but as soon as a worker exits and releases its lock, the next agent_end re-acquires it and spawns a fresh worker. The fresh worker does one watermark-check SQL round trip, sees nothing new to mine, and exits, but each spawn still costs ~50ms of Node cold-start plus ~200ms of DB I/O. A 50-turn session ends up doing 2-5 spawns instead of 1.
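
As a rough worked example of the overhead described above, the sketch below multiplies the number of redundant spawns by the per-spawn cost quoted in these notes. The function and constant names are hypothetical and used only for illustration; the figures are the approximate ones stated above, not measurements.

```typescript
// Approximate per-spawn costs quoted in the notes above (not measured here).
const NODE_COLD_START_MS = 50;
const DB_IO_MS = 200;

// Hypothetical helper: overhead of every spawn beyond the one a session needs.
function wastedSpawnMs(spawns: number): number {
  const redundant = Math.max(0, spawns - 1);
  return redundant * (NODE_COLD_START_MS + DB_IO_MS);
}

// A session with 2 to 5 spawns wastes roughly 250ms to 1000ms.
const low = wastedSpawnMs(2);
const high = wastedSpawnMs(5);
```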

#110: Stale locks halt mining permanently. tryAcquireOpenclawSkillifyLock opens the lock file with O_CREAT | O_EXCL | O_WRONLY and treats any pre-existing lock as "live worker, skip." There is no staleness check. If a worker dies abnormally (host kill, OOM, segfault) before its finally block releases the lock, the lock persists forever and every subsequent agent_end silently skips mining for that project_key. This was hit live during the 2026-05-07 PR #98 E2E run, where a manual rm <lockfile> was needed to recover.

What changed

Per-runtime dedup (#100)

New module-level const skillifySpawnedFor = new Set<string>(). Tracks which session IDs have already triggered a spawn in this gateway runtime.
agent_end handler now wraps the spawnOpenclawSkillifyWorker(...) call in if (!skillifySpawnedFor.has(sid)) { skillifySpawnedFor.add(sid); … }.
The on-disk lock stays authoritative across processes (e.g. multiple gateway restarts). The new in-memory Set only suppresses within-runtime redundancy.
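
The dedup guard described above can be sketched roughly as follows. This is a minimal illustration, not the code from openclaw/src/index.ts: onAgentEnd and the SpawnFn type are hypothetical stand-ins, since the real handler and the signature of spawnOpenclawSkillifyWorker are not shown in these notes.

```typescript
// Hypothetical stand-in for spawnOpenclawSkillifyWorker(...); real signature unknown.
type SpawnFn = (sessionId: string) => void;

// Module-level, so it lives exactly as long as this gateway runtime.
const skillifySpawnedFor = new Set<string>();

// Hypothetical handler body: spawns at most once per session ID per runtime.
// Returns true if a spawn was triggered, false if it was suppressed.
function onAgentEnd(sid: string, spawn: SpawnFn): boolean {
  if (skillifySpawnedFor.has(sid)) {
    return false; // this session already triggered a spawn in this runtime
  }
  // Record before spawning, matching the order given in the notes.
  skillifySpawnedFor.add(sid);
  spawn(sid);
  return true;
}
```

Because the Set is in-memory only, a gateway restart starts with it empty, which is why the on-disk lock remains the cross-process guard.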

Stale-lock recovery (#110)

Lock file now writes String(Date.now()) on acquire (was an empty file).
On O_EXCL failure, reads the existing lock body and parses it as a millisecond timestamp. If Date.now() - ts exceeds 10 minutes, or the body is unparseable (NaN), the lock is treated as stale: it is unlinked and the acquire is retried.
Mirrors the staleness logic in src/skillify/state.ts:tryAcquireWorkerLock for the non-openclaw agents.
Migration: empty pre-existing lock files (from earlier code) parse as NaN and are treated as immediately stale on the first patched run — no manual cleanup needed.
The 10-minute max age is generous compared with typical worker runtime (<30s + buffer). A pathological hang longer than that releases the spawn slot to the next agent_end instead of blocking mining for the rest of the gateway's lifetime.
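
The acquire-with-recovery flow above might look roughly like the sketch below, assuming Node's built-in fs module. tryAcquireLock, isStaleLockBody, and MAX_LOCK_AGE_MS are hypothetical names chosen for illustration; this is not the body of tryAcquireOpenclawSkillifyLock, and the use of parseInt is an assumption consistent with empty bodies parsing as NaN.

```typescript
import * as fs from "node:fs";

// 10 minutes, per the notes above; the real constant name is not shown.
const MAX_LOCK_AGE_MS = 10 * 60 * 1000;

// Stale if the body is not a parseable ms timestamp or is older than maxAgeMs.
function isStaleLockBody(body: string, now: number, maxAgeMs: number = MAX_LOCK_AGE_MS): boolean {
  const ts = Number.parseInt(body.trim(), 10); // "" parses as NaN
  return Number.isNaN(ts) || now - ts > maxAgeMs;
}

// Hypothetical helper: returns true if this caller now holds the lock.
function tryAcquireLock(lockPath: string, now: number = Date.now()): boolean {
  const flags = fs.constants.O_CREAT | fs.constants.O_EXCL | fs.constants.O_WRONLY;
  // At most two attempts: the original try, plus one retry after stale removal.
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      const fd = fs.openSync(lockPath, flags, 0o600);
      try {
        fs.writeSync(fd, String(now)); // timestamp enables later staleness checks
      } finally {
        fs.closeSync(fd);
      }
      return true;
    } catch (err) {
      // Only "file already exists" is expected; anything else is a real error.
      if ((err as { code?: string }).code !== "EEXIST") throw err;
    }
    let body: string;
    try {
      body = fs.readFileSync(lockPath, "utf8");
    } catch {
      continue; // lock vanished between open and read; try again
    }
    if (!isStaleLockBody(body, now)) return false; // live worker, skip
    try {
      fs.unlinkSync(lockPath);
    } catch {
      // another process may have removed it first; the retry will find out
    }
  }
  return false;
}
```

This sketch does not try to close the window in which two processes both judge the same lock stale and race to replace it; whether the real implementation handles that is not stated in these notes.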

Tests

npm run typecheck — clean
npm test — 2380/2380 passing (one bundle-scan regex distance bumped 500→1500 to accommodate the new dedup comment block between Auto-captured and the spawn site; same assertion intent)

Test plan after merge

[ ] Long-running openclaw session (50+ turns). grep -c "Auto-captured" /tmp/openclaw/openclaw-*.log should be many; ls ~/.deeplake/state/skillify/*.worker.lock should show at most one mtime-bump per session (one spawn, not 2-5).
[ ] Kill a worker mid-mine (kill -9 $WORKER_PID). Wait 11 minutes. Next agent_end should successfully re-acquire the lock (stale-recovery path).

Summary by CodeRabbit

Bug Fixes
- Improved reliability of background worker spawning in extended agent sessions by preventing redundant spawn attempts
- Enhanced detection and cleanup of stale worker states
- Added error handling to gracefully manage worker startup failures
Tests
- Updated test validations for worker spawning behavior

Related context

Earlier breaking changes

v0.7.52 Removes `hivemind tasks` CLI and related code surfaces.
v0.7.51 Removes `hivemind tasks` CLI and related code surfaces.
v0.7.19 Module name skilify replaced with skillify; affects all imports
v0.7.19 CLI command skilify removed; renamed to skillify without deprecation alias
v0.7.18 CLI subcommand renamed from `skilify` to `skillify`; no deprecation alias.