This release adds 3 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
AI summary: Ingero adds full CUDA Graph lifecycle tracing, a remediation API via Unix socket, and straggler detection.
Full changelog
Ingero can now trace the full CUDA Graph lifecycle — capture, instantiate, launch — via eBPF uprobes on libcudart.so.
It requires no application modification and no CUPTI dependency, and its overhead is production-safe.
CUDA Graph Observability
- eBPF probes for cudaStreamBeginCapture, cudaStreamEndCapture, cudaGraphInstantiate, and cudaGraphLaunch — covers the stream capture path used by PyTorch torch.compile, vLLM, and TensorRT-LLM
- Causal correlation connects graph events to system state: OOM during graph capture, CPU scheduling interference delaying graph dispatch, graph launch frequency anomalies (pool exhaustion), and captured-but-never-launched graphs wasting VRAM
- MCP tools: graph_lifecycle (timeline of all graph events for a PID) and graph_frequency (per-executable launch rates, hot/cold graph classification, pool saturation detection)
- ingero explain now includes graph context in causal chains when graph events are relevant
- Graceful degradation — if graph API symbols are absent (older CUDA), Ingero skips graph probes silently and continues normally
- Validated at 5,000+ GraphLaunch/sec on EC2 g4dn.xlarge with torch.compile(mode="reduce-overhead"); measured overhead stayed within the <2% budget
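The hot/cold classification and the captured-but-never-launched check described above can be illustrated with a small sketch. This is not Ingero's implementation or the output format of its MCP tools: the event tuple layout, the label names, and the `hot_per_sec` threshold are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical event shape: (timestamp in seconds, event name, graph executable handle).
Event = Tuple[float, str, int]


def classify_graphs(
    events: Iterable[Event],
    window_s: float,
    hot_per_sec: float = 100.0,  # assumed threshold, not an Ingero default
) -> Dict[int, str]:
    """Label each graph executable as "hot", "cold", or "never_launched"."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")

    instantiated = set()
    launches = Counter()
    for _ts, name, handle in events:
        if name == "instantiate":
            instantiated.add(handle)
        elif name == "launch":
            launches[handle] += 1

    labels: Dict[int, str] = {}
    for handle in instantiated | set(launches):
        count = launches[handle]  # Counter returns 0 for handles never launched
        if count == 0:
            labels[handle] = "never_launched"  # instantiated, so it may still hold VRAM
        elif count / window_s >= hot_per_sec:
            labels[handle] = "hot"
        else:
            labels[handle] = "cold"
    return labels
```

A handle labeled `never_launched` corresponds to the "captured-but-never-launched" case in the list above; in practice the threshold would need tuning per workload.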
Remediation API
Ingero now exposes an optional remediation API over a Unix domain socket (/tmp/ingero-remediate.sock) using type-discriminated NDJSON. External tools can consume real-time {"type":"memory"} and {"type":"straggle"} signals to build custom remediation workflows. Enable with --remediate on ingero trace. See docs/remediation-protocol.md for integration details.
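A consumer for this socket could look roughly like the sketch below. Only the socket path and the "type" discriminator values come from the release notes; the assumption that Ingero listens and the consumer connects as a client, and any fields beyond "type", are unverified here, so docs/remediation-protocol.md remains the authority. The helper names `dispatch` and `consume` are hypothetical.

```python
import json
import socket
from typing import Callable, Dict, Iterable

SOCKET_PATH = "/tmp/ingero-remediate.sock"  # path given in the release notes

Handler = Callable[[dict], None]


def dispatch(lines: Iterable[str], handlers: Dict[str, Handler]) -> int:
    """Route each NDJSON record to the handler registered for its "type".

    Blank lines, malformed JSON, and unknown types are skipped, so a new
    signal type does not break an older consumer. Returns the number of
    records that were handled.
    """
    handled = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        kind = record.get("type")
        if not isinstance(kind, str) or kind not in handlers:
            continue
        handlers[kind](record)
        handled += 1
    return handled


def consume(path: str, handlers: Dict[str, Handler]) -> int:
    """Connect to the socket and dispatch records until the peer closes it.

    Assumption: Ingero is the listening side and this process is the client.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        with sock.makefile("r", encoding="utf-8") as stream:
            return dispatch(stream, handlers)
```

With `ingero trace --remediate` running, a call such as `consume(SOCKET_PATH, {"memory": print, "straggle": print})` would print each signal as it arrives, under the assumptions stated above.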
Straggler Detection
- New internal/straggler package: per-PID throughput baseline tracking using an exponential moving average (EMA), combined with sched_switch contention counting
- Correlated detection: a straggler is reported only when both a throughput drop and scheduling contention fire, which avoids false positives
- Sustained signal re-emission for downstream consumers that need periodic updates
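The correlated detection described above can be sketched as follows. This is an illustration of the idea, not the internal/straggler package: the smoothing factor, the drop ratio, the contention threshold, and the choice to freeze the baseline while a straggle is active are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StragglerDetector:
    """Tracks one PID; keep one instance per PID (for example in a dict)."""

    alpha: float = 0.2        # EMA smoothing factor (assumed value)
    drop_ratio: float = 0.7   # below 70% of baseline counts as a drop (assumed)
    min_switches: int = 50    # sched_switch count per window treated as contention (assumed)
    baseline: Optional[float] = None

    def observe(self, throughput: float, sched_switches: int) -> bool:
        """Feed one measurement window; return True while the PID is straggling."""
        if self.baseline is None:
            self.baseline = throughput  # first window seeds the baseline
            return False

        dropped = throughput < self.drop_ratio * self.baseline
        contended = sched_switches >= self.min_switches
        straggling = dropped and contended  # both signals must fire

        if not straggling:
            # Standard EMA update: new = alpha * sample + (1 - alpha) * old.
            self.baseline = self.alpha * throughput + (1 - self.alpha) * self.baseline
        # While straggling, the baseline is left unchanged so a sustained
        # slowdown is not absorbed into it, and every later window that still
        # meets both conditions returns True again (sustained re-emission).
        return straggling
```

Freezing the baseline during a straggle is a design choice of this sketch: without it, a long slowdown would gradually become the new baseline and the signal would stop.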
About ingero-io/ingero
eBPF-based GPU causal observability agent with MCP server. Traces CUDA Runtime/Driver APIs and host kernel events to build causal chains explaining GPU latency.
Earlier breaking changes
- v0.17.0 Dropped 'annotate --socket' option from CLI.