Downloads
| Platform | Binary | Architecture |
|----------|--------|-------------|
| macOS (Apple Silicon) | onwatch-darwin-arm64 | ARM64 (M1/M2/M3/M4) |
| macOS (Intel) | onwatch-darwin-amd64 | x86_64 |
| Linux | onwatch-linux-amd64 | x86_64 |
| Linux (ARM) | onwatch-linux-arm64 | ARM64 (Raspberry Pi 4+, AWS Graviton) |
| Windows | onwatch-windows-amd64.exe | x86_64 |
One-line install (macOS/Linux):
curl -fsSL https://raw.githubusercontent.com/onllm-dev/onwatch/main/install.sh | bash
Existing users: Click the update badge in the dashboard footer, or run onwatch update.
What's New
Runtime Memory Optimization
onWatch now actively manages its memory footprint through Go runtime tuning. Three techniques work together to keep RSS low:
| Technique | What It Does |
|-----------|-------------|
| GOMEMLIMIT=40MiB | Sets a soft limit on the memory managed by the Go runtime. As usage approaches the limit, the GC runs more often and the runtime's scavenger returns freed pages to the OS instead of holding them for reuse |
| GOGC=50 | Triggers garbage collection at 50% heap growth instead of the default 100%, running GC more frequently with negligible CPU cost for an I/O-bound daemon |
| FreeOSMemory() goroutine | Every 5 minutes, forces the Go runtime to return unused memory to the OS, preventing RSS from permanently ratcheting up after transient allocations |
GOMEMLIMIT is a soft limit, so it cannot crash the application. If memory use needs to exceed 40 MiB, the Go runtime simply exceeds the limit rather than failing with an out-of-memory error.
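The three techniques above can also be applied programmatically instead of through environment variables. The following is a minimal sketch, assuming Go 1.19 or later; the function names `tuneRuntime`, `currentMemoryLimit`, and `startMemoryReleaser` are hypothetical and are not taken from onWatch's `main.go`.

```go
package main

import (
	"fmt"
	"runtime/debug"
	"time"
)

// tuneRuntime applies the two limits from the table and returns the values
// that were in effect before the call. (Hypothetical helper name.)
func tuneRuntime() (prevLimit int64, prevGC int) {
	prevLimit = debug.SetMemoryLimit(40 << 20) // same effect as GOMEMLIMIT=40MiB
	prevGC = debug.SetGCPercent(50)            // same effect as GOGC=50
	return prevLimit, prevGC
}

// currentMemoryLimit reads the soft limit without changing it:
// a negative argument to SetMemoryLimit only queries the current value.
func currentMemoryLimit() int64 {
	return debug.SetMemoryLimit(-1)
}

// startMemoryReleaser calls debug.FreeOSMemory on every tick until stop is closed.
func startMemoryReleaser(interval time.Duration, stop <-chan struct{}) {
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				debug.FreeOSMemory() // forces a GC and returns unused memory to the OS
			case <-stop:
				return
			}
		}
	}()
}

func main() {
	tuneRuntime()

	stop := make(chan struct{})
	defer close(stop)
	startMemoryReleaser(5*time.Minute, stop)

	fmt.Println("soft memory limit (bytes):", currentMemoryLimit())
}
```

Setting the values in code keeps the defaults with the binary, while a user can still override them by changing the call sites or, in a variant of this sketch, by checking the environment first.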
SQLite Connection Pool Tuning
The default database/sql pool was creating multiple connections, each with its own 2 MB page cache. Now tuned to:
- MaxOpenConns=2: allows one read and one write to run concurrently
- MaxIdleConns=1: keeps only one connection cached when idle
- cache_size=-500 (500 KiB, about 512 KB): reduced from 2 MB, sufficient for sequential inserts
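The pool limits above map directly onto `database/sql` setters. The sketch below is illustrative only: `tunePool` and `openTuned` are hypothetical names, and a stub driver stands in for the real SQLite driver so the example runs with the standard library alone. The `cache_size` pragma is not shown because it is applied per connection, usually through driver-specific DSN options.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
)

// stubDriver is a placeholder for a real SQLite driver. It never opens a
// connection; it exists only so sql.Open succeeds in this sketch.
type stubDriver struct{}

func (stubDriver) Open(string) (driver.Conn, error) {
	return nil, errors.New("stub driver: no real connections")
}

func init() { sql.Register("stub", stubDriver{}) }

// tunePool applies the connection limits from the release notes.
func tunePool(db *sql.DB) {
	db.SetMaxOpenConns(2) // one reader plus one writer
	db.SetMaxIdleConns(1) // keep a single warm connection when idle
}

// openTuned opens a handle with the stub driver and tunes its pool.
// With a real driver, the driver name and DSN would replace "stub" and "".
func openTuned() (*sql.DB, error) {
	db, err := sql.Open("stub", "")
	if err != nil {
		return nil, err
	}
	tunePool(db)
	return db, nil
}

func main() {
	db, err := openTuned()
	if err != nil {
		fmt.Println("open failed:", err)
		return
	}
	defer db.Close()
	fmt.Println("max open connections:", db.Stats().MaxOpenConnections)
}
```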
HTTP Transport Optimization
All four HTTP clients (Synthetic, Z.ai, Anthropic, and the updater) now use MaxIdleConns=1, MaxIdleConnsPerHost=1. Each client only talks to a single host, so the default pool of 100 idle connections was wasteful.
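A transport with these limits could be built as follows. This is a sketch under assumptions rather than the project's actual client code: `newSingleHostTransport` and `newSingleHostClient` are hypothetical names, and cloning `http.DefaultTransport` is one reasonable way to keep the standard proxy and timeout defaults while shrinking the idle pool.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newSingleHostTransport copies the default transport settings and limits
// the idle pool to one connection, which is enough for a single upstream host.
func newSingleHostTransport() *http.Transport {
	tr := http.DefaultTransport.(*http.Transport).Clone()
	tr.MaxIdleConns = 1        // total idle connections across all hosts
	tr.MaxIdleConnsPerHost = 1 // idle connections kept for each host
	return tr
}

// newSingleHostClient wraps the transport in a client with a request timeout.
// The timeout value is an assumption chosen for illustration.
func newSingleHostClient(timeout time.Duration) *http.Client {
	return &http.Client{
		Timeout:   timeout,
		Transport: newSingleHostTransport(),
	}
}

func main() {
	client := newSingleHostClient(30 * time.Second)
	tr := client.Transport.(*http.Transport)
	fmt.Println("idle limits:", tr.MaxIdleConns, tr.MaxIdleConnsPerHost)
}
```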
Server-Side Chart Downsampling
History endpoints now downsample large datasets to a maximum of 500 data points before sending to the browser. This keeps chart rendering fast without truncating your data:
- Algorithm: Even step sampling that always preserves the first and last data points
- Threshold: Only activates when the dataset exceeds 500 points (e.g., 7-day view at 1-minute polling = ~10,080 points → 500)
- No data loss: The full dataset is still queried from SQLite; only the JSON response is thinned
- Time ranges unaffected: No artificial SQL LIMIT on chart/history endpoints — the time range remains the natural bound
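The sampling rule described above can be sketched as a small generic function. This is one possible reading of "even step sampling", not the actual `downsampleStep()` from `internal/web/handlers.go`; the signature and the index formula are assumptions.

```go
package main

import "fmt"

// downsampleStep returns at most max points taken at evenly spaced indices,
// always including the first and last element. Inputs that already fit, and
// max values below 2 (which cannot hold both endpoints), are returned unchanged.
func downsampleStep[T any](points []T, max int) []T {
	n := len(points)
	if max < 2 || n <= max {
		return points
	}
	out := make([]T, 0, max)
	for i := 0; i < max; i++ {
		// Spread max indices evenly over [0, n-1]; i = max-1 maps exactly to n-1.
		idx := i * (n - 1) / (max - 1)
		out = append(out, points[idx])
	}
	return out
}

func main() {
	// 7 days at 1-minute polling: 7 * 24 * 60 = 10,080 points.
	points := make([]int, 7*24*60)
	for i := range points {
		points[i] = i
	}
	thinned := downsampleStep(points, 500)
	fmt.Println(len(thinned), thinned[0], thinned[len(thinned)-1])
}
```

Because the input has more points than the cap, consecutive indices are always more than one apart before truncation, so no point is emitted twice.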
Anthropic Query Optimization
The raw_json column, which is stored on every Anthropic snapshot INSERT for debugging, is no longer selected by read queries. This avoids unnecessary memory allocations when rendering the dashboard.
Bounded Cycle Table Queries
Cycle history endpoints now cap results at 200 per quota type, and insight cycle queries cap at 50. The frontend already paginates client-side, so this has zero visual impact while preventing unbounded memory growth as cycle history accumulates.
Performance
Measured with tools/perf-monitor while all three provider agents (Synthetic, Z.ai, Anthropic) ran in parallel:
| Metric | v2.3.4 | v2.4.0 | Budget |
|--------|--------|--------|--------|
| Idle RSS (avg) | 27.5 MB | 28.0 MB | 30 MB |
| Idle RSS (P95) | 27.5 MB | 28.0 MB | 30 MB |
| Load RSS (avg) | 28.5 MB | 35.1 MB | 50 MB |
| Load RSS (P95) | 29.0 MB | 35.9 MB | 50 MB |
| Load delta | +0.9 MB | +7.1 MB | <20 MB |
| Throughput | 1,160 reqs/15s | 1,104 reqs/15s | — |
| Avg API response | 0.28 ms | 0.75 ms | <5 ms |
| Avg dashboard | 0.69 ms | 2.58 ms | <10 ms |
Why is load RSS higher? Downsampling allocates temporary slices to hold the full dataset before thinning it. This is a deliberate tradeoff: the v2.3.4 numbers reflect hardcoded LIMIT 200 SQL queries that truncated 7-day charts, whereas v2.4.0 queries the full time range and downsamples server-side, preserving data fidelity. Idle RSS, the number that matters most for a background daemon, stays within the 30 MB budget at 28.0 MB.
Backend Changes
| File | Change |
|------|--------|
| main.go | Added GOMEMLIMIT, GOGC, FreeOSMemory goroutine |
| internal/store/store.go | MaxOpenConns=2, MaxIdleConns=1, cache_size=-500 |
| internal/api/client.go | MaxIdleConns=1 on HTTP transport |
| internal/api/zai_client.go | MaxIdleConns=1 on HTTP transport |
| internal/api/anthropic_client.go | MaxIdleConns=1 on HTTP transport |
| internal/update/update.go | MaxIdleConns=1 on HTTP transport |
| internal/store/anthropic_store.go | Removed raw_json from SELECT queries |
| internal/web/handlers.go | Added downsampleStep() + maxChartPoints=500, applied to all 4 history handlers; cycle queries capped at 200, insight cycles at 50 |
Upgrade Guide
From v2.3.x: Update via dashboard or onwatch update. No configuration changes needed — all optimizations are internal.
From source:
git pull origin main
make build
./onwatch stop && ./onwatch
Full Changelog
https://github.com/onllm-dev/onWatch/compare/v2.3.4...v2.4.0