last9/gpu-telemetry
Monitoring & Metrics
A vendor‑neutral, OpenTelemetry‑based GPU telemetry agent that attributes utilization and health metrics to Kubernetes pods or Slurm jobs for per‑team accounting.
Features
- Emits OTLP telemetry with built‑in workload attribution (Kubernetes pod/namespace/deployment or Slurm job/user)
- Supports NVIDIA, AMD MI300X/MI325X and Intel Gaudi GPUs via unified collectors
- Works as a Helm DaemonSet for Kubernetes or as a pip‑installable systemd service on bare metal
- Provides pre‑built Grafana dashboards and Prometheus alert rules for fleet monitoring
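To make the per-team accounting idea concrete, here is a minimal sketch of rolling attributed GPU samples up into GPU-seconds per owner. It is not the agent's actual code or API. `GpuSample` and `gpu_seconds_by` are hypothetical names introduced for illustration. The `k8s.namespace.name` and `k8s.pod.name` keys follow OpenTelemetry's Kubernetes resource conventions, while `slurm.job.id` is an assumed key, since the attribute names this project emits are not stated here.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class GpuSample:
    """One utilization reading for one GPU (hypothetical shape, for illustration)."""
    gpu_uuid: str
    utilization: float   # fraction of the interval the GPU was busy, 0.0 to 1.0
    interval_s: float    # length of the sampling interval in seconds
    attributes: dict = field(default_factory=dict)  # workload attribution labels


def gpu_seconds_by(samples, key):
    """Sum busy GPU-seconds per value of the attribution attribute `key`.

    Samples that lack the attribute are grouped under "unattributed" so that
    unowned usage stays visible instead of being silently dropped.
    """
    totals = defaultdict(float)
    for s in samples:
        owner = s.attributes.get(key, "unattributed")
        totals[owner] += s.utilization * s.interval_s
    return dict(totals)


samples = [
    GpuSample("GPU-0", 0.50, 60.0,
              {"k8s.namespace.name": "team-a", "k8s.pod.name": "train-0"}),
    GpuSample("GPU-1", 0.25, 60.0,
              {"k8s.namespace.name": "team-a", "k8s.pod.name": "train-1"}),
    GpuSample("GPU-2", 1.00, 60.0,
              {"k8s.namespace.name": "team-b", "k8s.pod.name": "infer-0"}),
    GpuSample("GPU-3", 0.50, 60.0, {"slurm.job.id": "1234"}),  # assumed Slurm key
]

totals = gpu_seconds_by(samples, "k8s.namespace.name")
print(totals)
```

Grouping by a different key (for example the assumed `slurm.job.id`) reuses the same function, which is the practical benefit of attaching attribution labels at emission time rather than joining them in afterwards.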
Install & Platforms
Install via:
- pip
- docker