agentic-context-engine

v0.12.0 Breaking

This release includes 2 breaking changes that platform teams should review before upgrading.

Published 2mo AI Agents & Assistants

✓ No known CVEs patched

Topics

agent-learning agent-memory agents ai ai-agents ai-tools context-engineering llm machine-learning memory python

Summary

AI summary

Skillbook v1 legacy aliases are removed, so code that relies on the v1 schema breaks; v2 is the only schema.

Full changelog

This release merges two release lines that had not yet shipped to PyPI: the 0.11.0 architectural rewrite and the 0.12.0 SkillManager hardening. There is no separate v0.11.0 tag, because v0.12.0 is a superset of it.

0.11.0 — Architectural rewrite

RecursiveAgent core abstraction extracted from RR (ace/core/recursive_agent.py). Generic recursive PydanticAI agent with sandbox, microcompaction, default tool set, depth-aware sub-agent registration.
RR collapsed into a single RRStep. Orchestrator/worker split, batch machinery, and AttachInsightSourcesStep removed. RR is now a true recursive loop.
Skillbook v2 — full schema rewrite, section-grouped storage (context / harness), richer InsightSource provenance, BM25-backed retrieval (rank-bm25 runtime dep). Skillbook.as_prompt() now returns markdown; python-toon dropped.
Agentic SkillManager (first cut) — tool-calling loop (ace/implementations/sm_tools.py) with atomic mutation tools (add_skill, update_skill, remove_skill, tag_skill) and read-only tools (search_skills, read_skill).
Reflector skillbook tools — Reflector can introspect / propose updates from inside the recursive loop.
Anthropic prompt caching enabled by default for RR; cache_read_tokens / cache_write_tokens forwarded in run metadata.
Logfire spans around recursive agent sessions.
Online / offline mode in the ACE runner.
record_observation renamed to think.
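
The BM25-backed retrieval listed for Skillbook v2 can be pictured as ranking stored skills by lexical overlap with a query. The following is a minimal pure-Python sketch of one common Okapi BM25 variant. It does not use the rank-bm25 package or the Skillbook API, and the function name, skill ids, and skill texts are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    if not corpus_tokens:
        return []
    n_docs = len(corpus_tokens)
    # Fall back to 1.0 so an all-empty corpus does not divide by zero.
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs or 1.0
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for doc in corpus_tokens for term in set(doc))

    scores = []
    for doc in corpus_tokens:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_tokens:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Smoothed IDF; the "+ 1.0" keeps the value non-negative.
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical skill insights keyed by hypothetical ids.
skills = {
    "skill-1": "verify the order status before issuing a refund",
    "skill-2": "ask the user to confirm the shipping address before changes",
    "skill-3": "summarize the conversation when it ends",
}
ids = list(skills)
corpus = [skills[i].lower().split() for i in ids]
scores = bm25_scores("refund order".split(), corpus)
best = ids[max(range(len(ids)), key=scores.__getitem__)]
```

Here only the first skill shares tokens with the query, so it is the only one with a non-zero score. A real retriever would also need consistent tokenization between indexing and querying.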

0.12.0 — SM hardening

Cross-trace generalization gate (four-criterion: ≥3 instances across ≥2 domains, named slot, no API-specific params in action, verifiable runtime trigger). Backed by skill_generalization.md (14 cited sources).
Action-equivalence rule — splits on action, not trigger surface.
Atomicity rule for insight — one trigger + one action; explicit good/bad shape examples.
ICL-grounded insight format drawn from icl_skill_formatting.md: 15-50 word cap, imperative voice, positive framing default.
Evidence-only tagging — SM no longer iterates injected_skill_ids; tags only skills the reflection actually implicates.
Broaden-via-comparison for UPDATE — same root cause in different niches → broaden issue, don't duplicate.
Prompt caching for SM via CachePoint(ttl="5m"), mirroring RR.
Hard removal cap removed — harmful_count >= 3 no longer auto-REMOVES skills.
update_skills signature: source is optional; SkillbookView dropped from parameters.
Skillbook v1 legacy aliases removed — v2 is the only schema.
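
The four-criterion generalization gate above can be read as a conjunction of checks on a candidate skill. The sketch below illustrates that reading only. The release notes describe the gate as SkillManager guidance and do not show an implementation, so the class, field names, and domain labels here are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateSkill:
    # Hypothetical fields; none of these names come from the project.
    instance_count: int                    # traces where the pattern was observed
    domains: frozenset                     # distinct domains those traces cover
    has_named_slot: bool                   # the insight names the slot it generalizes over
    action_has_api_specific_params: bool   # the action hard-codes API-specific parameters
    has_verifiable_runtime_trigger: bool   # the trigger can be checked at runtime


def failed_criteria(candidate):
    """Return a description of every gate criterion the candidate fails."""
    failures = []
    if candidate.instance_count < 3 or len(candidate.domains) < 2:
        failures.append("needs >=3 instances across >=2 domains")
    if not candidate.has_named_slot:
        failures.append("needs a named slot")
    if candidate.action_has_api_specific_params:
        failures.append("action must not contain API-specific params")
    if not candidate.has_verifiable_runtime_trigger:
        failures.append("needs a verifiable runtime trigger")
    return failures


def passes_gate(candidate):
    """A candidate generalizes only if all four criteria hold."""
    return not failed_criteria(candidate)


general = CandidateSkill(3, frozenset({"retail", "airline"}), True, False, True)
narrow = CandidateSkill(2, frozenset({"retail"}), True, False, True)
```

Returning the list of failed criteria, rather than a bare boolean, makes it easier to see why a candidate was kept trace-specific.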

End-to-end retail result (Haiku 4.5)

| Metric | Value |
|---|---|
| Baseline pass@1 | 45.0% |
| With learned skillbook | 67.5% |
| Δ pass@1 | +22.5 pp (12 improved, 3 regressed) |
| Skillbook size | 35 skills |
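
The figures in the table can be cross-checked with a little arithmetic. The release notes do not state the number of tasks. The sketch below infers it under two assumptions: each task is scored once, and both pass@1 values are computed over the same task set.

```python
from fractions import Fraction

# Values copied from the table, in percent.
baseline = Fraction("45.0")
with_skillbook = Fraction("67.5")
improved, regressed = 12, 3

# Difference in percentage points.
delta_pp = with_skillbook - baseline

# Net change in passing tasks.
net_tasks = improved - regressed

# Assumption: one scored trial per task, so net_tasks / N equals delta_pp / 100.
implied_tasks = net_tasks / (delta_pp / 100)

# Pass counts implied by that task count.
baseline_passes = baseline / 100 * implied_tasks
skillbook_passes = with_skillbook / 100 * implied_tasks
```

Under those assumptions the numbers are mutually consistent: a net gain of 9 tasks at +22.5 pp implies 40 tasks, with 18 passing at baseline and 27 with the learned skillbook.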

Tau-bench fix

evaluation_type=ALL_WITH_NL_ASSERTIONS is now set at both the run_task and run_tasks call sites in ace-eval/src/ace_eval/e2e/benchmarks/tau_bench.py. Retail and any future benchmark with NL_ASSERTION in reward_basis now produce real reward numbers instead of crashing during reward computation.

See CHANGELOG.md for full details.

Breaking Changes

Skillbook v1 legacy aliases removed — only Skillbook v2 schema remains
`record_observation` renamed to `think`
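
For the `record_observation` rename, one way to find and update stale references in your own prompts or configuration is a whole-word substitution. This is a sketch using only the standard library. The helper name and the sample string are hypothetical, and where the old name appears in a given codebase is an assumption.

```python
import re

OLD_NAME = "record_observation"
NEW_NAME = "think"

# Word boundaries avoid touching longer identifiers that merely contain the old name.
_PATTERN = re.compile(rf"\b{re.escape(OLD_NAME)}\b")


def rename_references(text):
    """Return (updated_text, replacement_count) for one source string."""
    return _PATTERN.subn(NEW_NAME, text)


# Hypothetical line from a user's own configuration.
sample = 'allowed_tools = ["record_observation", "search_skills"]  # my_record_observation_helper stays'
updated, count = rename_references(sample)
```

Reviewing each replacement before committing is advisable, since the same string may also appear in logs or stored traces that should stay unchanged.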
