This release fixes worker-spawn and stale-lock issues relevant to SREs watching stability and regressions.

✓ No known CVEs patched

Topics

ai ai-agents ai-memory anthropic artificial-intelligence claude
claude-agent-sdk claude-agents claude-code-plugin claude-skills codex embeddings long-term-memory memory-engine openclaw openclaw-skills postgresql llm

Affected surfaces

breaking_upgrade

Summary

AI summary

Fixed wasted re-spawns on every turn and added stale lock recovery with a 10‑minute timeout.

Changes in this release

Feature Medium

Writes current timestamp to lock file on acquire for staleness detection.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Performance Medium

Reduces unnecessary worker spawns, cutting ~50ms Node cold-start and ~200ms DB I/O per session.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Bugfix High

Detects and removes stale lock files, preventing permanent mining halts.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Bugfix Medium

Prevents redundant worker spawns in long sessions via runtime deduplication.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Refactor Low

Introduces Set to track spawned sessions within a gateway runtime.

Source: granite4.1:8b-q6_K@2026-05-19

Confidence: high

Full changelog

Fixes #100 and #110.

Why

Two spawn-lifecycle bugs in openclaw/src/index.ts:

#100 — Wasted re-spawns: agent_end fires on every turn. The on-disk lock at ~/.deeplake/state/skillify/<projectKey>.worker.lock prevents overlapping workers, but as soon as a worker exits and releases its lock, the NEXT agent_end re-acquires it and spawns a fresh worker. The fresh worker does one watermark-check SQL roundtrip, sees nothing new to mine, and exits — but each spawn costs ~50ms Node cold-start + ~200ms DB I/O. A 50-turn session ends up doing 2-5 spawns instead of 1.

#110 — Stale locks halt mining permanently: tryAcquireOpenclawSkillifyLock does O_CREAT | O_EXCL | O_WRONLY and treats any pre-existing lock as "live worker, skip." There's no staleness check. If a worker dies abnormally (host kill, OOM, segfault) before its finally releases the lock, the lock persists forever and every subsequent agent_end silently no-ops mining for that project_key permanently. Hit live during the 2026-05-07 PR #98 E2E — a manual rm <lockfile> was needed to recover.

What changed

Per-runtime dedup (#100)

  • New module-level const skillifySpawnedFor = new Set<string>(). Tracks which session IDs have already triggered a spawn in this gateway runtime.
  • agent_end handler now wraps the spawnOpenclawSkillifyWorker(...) call in if (!skillifySpawnedFor.has(sid)) { skillifySpawnedFor.add(sid); … }.
  • The on-disk lock stays authoritative across processes (e.g. multiple gateway restarts). The new in-memory Set only suppresses within-runtime redundancy.
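
The per-runtime dedup described above can be sketched roughly as follows. This is a minimal illustration, not the code from openclaw/src/index.ts: `onAgentEnd` and the `SpawnFn` type are hypothetical names, and the callback stands in for spawnOpenclawSkillifyWorker(...), whose real arguments are not shown in this changelog.

```typescript
// Session IDs that have already triggered a spawn in this gateway runtime.
const skillifySpawnedFor = new Set<string>();

// Hypothetical stand-in for spawnOpenclawSkillifyWorker(...).
type SpawnFn = (sid: string) => void;

// Hypothetical handler body. Returns true when a spawn was attempted,
// false when the session was already seen in this runtime.
function onAgentEnd(sid: string, spawn: SpawnFn): boolean {
  if (skillifySpawnedFor.has(sid)) return false;
  // Record the session before spawning so a later agent_end is suppressed.
  skillifySpawnedFor.add(sid);
  spawn(sid);
  return true;
}

// Usage: several agent_end events across two sessions.
const spawned: string[] = [];
const record: SpawnFn = (sid) => {
  spawned.push(sid);
};
onAgentEnd("session-a", record);
onAgentEnd("session-a", record); // suppressed
onAgentEnd("session-b", record);
onAgentEnd("session-a", record); // suppressed
console.log(spawned); // ["session-a", "session-b"]
```

Because the Set lives in module scope, it is cleared on every gateway restart, which is why the on-disk lock is still needed for cross-process exclusion.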

Stale-lock recovery (#110)

  • Lock file now writes String(Date.now()) on acquire (was an empty file).
  • On O_EXCL failure, reads the existing lock body, parses it as a ms timestamp. If Date.now() - ts > 10 minutes OR the body is unparseable (NaN), the lock is treated as stale → unlinked → retry acquire.
  • Mirrors the staleness logic in src/skillify/state.ts:tryAcquireWorkerLock for the non-openclaw agents.
  • Migration: empty pre-existing lock files (from earlier code) parse as NaN and are treated as immediately stale on the first patched run — no manual cleanup needed.
  • The 10-minute max age is generous relative to typical worker runtime (<30s + buffer). A pathological hang longer than that releases the spawn slot to the next agent_end instead of blocking mining for the rest of the gateway's lifetime.
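
The stale-lock recovery above might look roughly like the sketch below. It is written under assumptions: the helper names (`createExclusive`, `isStale`, `tryAcquireLock`) and the injectable `now` parameter are hypothetical and exist only to make the example testable; the real function is tryAcquireOpenclawSkillifyLock and its exact shape is not shown in this changelog.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

const MAX_LOCK_AGE_MS = 10 * 60 * 1000; // 10-minute max age from the changelog

// Create the lock exclusively and write the acquire timestamp into it.
// Returns false if the file already exists (EEXIST).
function createExclusive(lockPath: string, now: number): boolean {
  const flags = fs.constants.O_CREAT | fs.constants.O_EXCL | fs.constants.O_WRONLY;
  let fd: number;
  try {
    fd = fs.openSync(lockPath, flags);
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "EEXIST") return false;
    throw err;
  }
  try {
    fs.writeSync(fd, String(now));
  } finally {
    fs.closeSync(fd);
  }
  return true;
}

// A lock is stale if its body is not a parseable ms timestamp (NaN),
// or if the timestamp is older than the max age.
function isStale(lockPath: string, now: number): boolean {
  let ts = NaN;
  try {
    ts = Number.parseInt(fs.readFileSync(lockPath, "utf8"), 10);
  } catch {
    // The lock vanished between open and read; treat it as stale.
  }
  return Number.isNaN(ts) || now - ts > MAX_LOCK_AGE_MS;
}

// Hypothetical name for the acquire routine.
function tryAcquireLock(lockPath: string, now: number = Date.now()): boolean {
  if (createExclusive(lockPath, now)) return true;
  if (!isStale(lockPath, now)) return false; // live worker, skip
  try {
    fs.unlinkSync(lockPath);
  } catch {
    // Already removed by someone else.
  }
  return createExclusive(lockPath, now); // single retry
}

// Usage in a throwaway temp directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "lock-demo-"));
const lock = path.join(dir, "demo.worker.lock");
const t0 = Date.now();
const first = tryAcquireLock(lock, t0); // no lock yet
const second = tryAcquireLock(lock, t0 + 1000); // lock is 1s old, live
const third = tryAcquireLock(lock, t0 + 11 * 60 * 1000); // older than 10 min
fs.writeFileSync(lock, ""); // simulate a legacy empty lock file
const fourth = tryAcquireLock(lock, t0 + 11 * 60 * 1000); // NaN body is stale
console.log(first, second, third, fourth); // true false true true
```

The sketch does not close the small window between the unlink and the retry, so it should be read as an illustration of the staleness rule rather than a hardened lock.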

Tests

  • npm run typecheck: clean
  • npm test: 2380/2380 passing (one bundle-scan regex distance bumped 500→1500 to accommodate the new dedup comment block between Auto-captured and the spawn site; same assertion intent)

Test plan after merge

  • [ ] Long-running openclaw session (50+ turns). grep -c "Auto-captured" /tmp/openclaw/openclaw-*.log should be many; ls ~/.deeplake/state/skillify/*.worker.lock should show at most one mtime-bump per session (one spawn, not 2-5).
  • [ ] Kill a worker mid-mine (kill -9 $WORKER_PID). Wait 11 minutes. Next agent_end should successfully re-acquire the lock (stale-recovery path).

Summary by CodeRabbit

  • Bug Fixes

    • Improved reliability of background worker spawning in extended agent sessions by preventing redundant spawn attempts
    • Enhanced detection and cleanup of stale worker states
    • Added error handling to gracefully manage worker startup failures
  • Tests

    • Updated test validations for worker spawning behavior

Related context

Earlier breaking changes

  • v0.7.52 Removes `hivemind tasks` CLI and related code surfaces.
  • v0.7.51 Removes `hivemind tasks` CLI and related code surfaces.
  • v0.7.19 Module name skilify replaced with skillify; affects all imports
  • v0.7.19 CLI command skilify removed; renamed to skillify without deprecation alias
  • v0.7.18 CLI subcommand renamed from `skilify` to `skillify`; no deprecation alias.
