This release adds 2 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Topics
+5 more
Summary
AI summaryRate-limit gauges now show empirical Bayes forecasts with confidence tags.
Full changelog
Forecasts on the rate-limit gauges
Each gauge now shows a projection of where utilization will land at reset, with an 80% credible band, an ETA to threshold (default 100%), and a LOW/MED/HIGH confidence pill. This is an empirical Bayes forecast: the prior on the rate and the path-noise variance are both estimated from past data and plugged in to a posterior update, rather than being marginalized under a hyperprior. The full spec lives in internal/forecast/MODEL.pdf; the gist:
-
Rate posterior. Inside the open window, utilization is modeled as
u(t) = u_now + r·t + W(t), withran unknown per-hour rate andWBrownian path noise with per-hour varianceσ². The current rate is fit by OLS on the last 30 minutes of snapshots; that OLS slope and its standard error are treated as a Gaussian likelihoodr̂_OLS | r ~ N(r, SE²_OLS), then fused with a Gaussian empirical prior onr(meanμ₀, varianceτ₀²) via the standard conjugate normal-normal update. The prior is refit daily from up to 200 completed past windows. -
Closed-form projection. Both the rate-uncertainty and path-noise pieces are Gaussian, so projected utilization at reset is
F ~ N(u_now + r̂·ΔT, ΔT²·τ_post² + ΔT·σ²). The 80% credible interval isF ± z₀.₉·σ_F, clipped to[0%, 100%]for display. (Surfaced to users as "80% CI" for brevity.) -
Path-noise calibration.
σ²is recovered by replaying the forecaster across past windows: at each replay point, the squared forecast errore²against the actualu_finalis regressed against[ΔT, ΔT²]with no intercept. The linear coefficient is path noise; the quadratic coefficient absorbs rate-uncertainty contamination and is discarded. This is a method-of-moments estimator, then plugged into the posterior; the prior is refit once more with the noise correction applied to its sample variance. -
Monte Carlo ETA. For each threshold, 500 trajectories of the SDE are simulated at 5-minute steps, drawing one rate sample per trajectory from the posterior. The reported ETA is the median first-passage time with the 80% CI from the 10th/90th percentiles. If at least half the trajectories never cross before reset, the threshold is reported as unreachable (
p_infis exposed on the API payload). -
Confidence tag. Derived from effective sample size
n_eff = min(n_recent, τ₀²/SE_OLS² + N_sessions).n_eff ≥ 50 → HIGH,≥ 15 → MEDIUM, elseLOW. Falls back to the prior alone when fewer than three recent snapshots exist.
The forecast is computed once per poll and shipped inside the existing usage SSE event; the same payload is also exposed at GET /api/forecast for pull-style clients. It is suppressed when fewer than two completed past windows exist, when the window has just reset, or when a drop in the recent snapshot series indicates a missed reset between polls.
Fixes
- Canonicalize
reset_attimestamps at write time. The Claude API returnsreset_atrecomputed asnow + remainingon each poll, so the same nominal window drifts by hundreds of milliseconds across polls and occasionally straddles a minute boundary. Snapshots were being written with these drifting strings, soGROUP BY session_reset_atshattered every window into singletons and downstream aggregation lost track of session boundaries. Reset times are now rounded to the nearest minute on write, and a one-time idempotent migration canonicalizes existing rows on startup.
Weekly OSS security release digest.
The CVE patches and breaking changes that affected production tools this week. One email, every Sunday.
No spam, unsubscribe anytime.
Share this release
About Claumon
All releases →Related context
Related tools
Earlier breaking changes
- v0.14.0 Forecast model v2.1 removes CI inversion cap; interval now raw p10/p90 uncapped.
Beta — feedback welcome: [email protected]