Skip to content

Fono

v0.2.0 Breaking

This release includes breaking changes for platform teams planning a safe upgrade.

✓ No known CVEs patched
Read the diff → Tool health → What is this tool? →

✓ No known CVEs patched in this version

Topics

assistant dictation linux llm local-first rust
+5 more
speach-to-text stt vulkan whisper wyoming

Summary

AI summary

Broad release touches Added — hardware-adaptive setup, Added — tray, Documented, and Added — providers.

Full changelog

0.2.0 — 2026-04-27

Single-binary local stack: STT (whisper.cpp) and LLM cleanup
(llama.cpp) now ship together in one statically-linked fono binary,
out of the box, with hardware-accelerated CPU SIMD selected at runtime.

Added — single-binary local STT + LLM

  • llama-local is now part of the default features set. The previous
    compile_error! guard in crates/fono/src/lib.rs is gone — both
    whisper-rs and llama-cpp-2 link into the same ELF.
  • .cargo/config.toml adds -Wl,--allow-multiple-definition to
    deduplicate the otherwise-colliding ggml symbols vendored by both sys
    crates. Both copies originate from the same ggerganov upstream and
    are ABI-compatible; the linker keeps one set, no UB at runtime.
  • New accel-cuda / accel-metal / accel-vulkan / accel-rocm /
    accel-coreml / accel-openblas features on crates/fono that
    forward to matching whisper-rs / llama-cpp-2 features for opt-in
    GPU acceleration.
  • Startup banner prints a new hw accel : <accelerators> + CPU <SIMD>
    line (runtime SIMD probe: AVX512 / AVX2 / AVX / SSE4.2 + FMA + F16C on
    x86; NEON + DotProd + FP16 on aarch64).
  • LlamaLocal::run_inference redirects llama.cpp / ggml's internal
    printf-style logging through tracing (matches the existing
    whisper_rs::install_logging_hooks pattern). Default verbosity now
    emits a single LLM ready: <model> (<MB>, <threads> threads, ctx=<n>) in <ms> line; cosmetic load-time warnings (control-token type,
    n_ctx_seq < n_ctx_train) are silenced. Re-enable on demand with
    FONO_LOG=llama-cpp-2=info.
  • New smoke test crates/fono/tests/local_backends_coexist.rs boots
    WhisperLocal and LlamaLocal in the same process to lock in the
    no-collision contract.

Added — wizard local LLM path

  • First-run wizard now offers Local LLM cleanup (qwen2.5, private, offline) as a top-level option in both the Local and Mixed paths, in
    addition to Skip and Cloud. New configure_local_llm helper picks
    a tier-aware model: qwen2.5-3b-instruct (HighEnd),
    qwen2.5-1.5b-instruct (Recommended/Comfortable),
    qwen2.5-0.5b-instruct (Minimum/Unsuitable). All Apache-2.0 per
    ADR 0004.
  • The wizard's auto-download now fires for either local STT or local
    LLM (was STT-only).

Added — tray UX

  • Tray STT and LLM submenus now show a marker beside the active
    backend (was missing — active_backends() returned the trait name()
    while the comparison logic expected the canonical config-string
    identifier).
  • Switching to the local STT or LLM backend from the tray now ensures
    the corresponding model file is on disk first, with a "downloading…"
    notification, a "ready" notification on completion, and a clear error
    notification on failure (with the orchestrator reload skipped to keep
    the user on a working backend).

Changed — hotkey defaults

  • toggle = "F9" (was Ctrl+Alt+Space). Single key, no default
    binding on any major desktop, easy to fire blind.
  • hold = "F8" (was Ctrl+Alt+Grave). Adjacent to F9 for natural
    push-to-talk muscle memory.
  • cancel = "Escape" unchanged (only grabbed while recording).
  • paste_last hotkey removed. The tray's "Recent transcriptions"
    submenu and the fono paste-last CLI cover the same need with a
    better UX (re-paste any of the last 10, not just the newest).
    Request::PasteLast IPC and Cmd::PasteLast CLI are preserved and
    now route directly to orch.on_paste_last().

Changed — release profile size

  • [profile.release] now sets strip = "symbols" and lto = "thin",
    trimming the dev cargo build --release artifact from ~23 MB → ~19 MB
    (no code removal — only .symtab / .strtab deduplication).
    release-slim (used by packaging CI) is unchanged at ~15 MB.

Documented

  • docs/status.md — new entries for hotkey ergonomics and the
    single-binary local-stack resolution.
  • docs/troubleshooting.md, docs/wayland.md, README.md updated for
    the new default hotkeys.
  • New plans: plans/2026-04-27-shared-ggml-static-binary-v1.md (the
    shared-ggml strategy that informed the linker-dedupe shortcut),
    plans/2026-04-27-llama-dynamic-link-sota-v1.md,
    plans/2026-04-27-candle-backend-benchmark-v1.md,
    plans/2026-04-27-local-stt-llm-resolution-v1.md.

0.1.0 — 2026-04-25

First public release. Pipeline (audio → STT → LLM → inject) is fully wired
end-to-end; default release ships local whisper.cpp out of the box.

Added — pipeline

  • SessionOrchestrator (crates/fono/src/session.rs) glues hotkey FSM →
    cpal capture → silence trim → STT → optional LLM cleanup → text injection
    → SQLite history. Hot-swappable backends behind RwLock<Arc<dyn …>>.
  • fono record — one-shot CLI dictation (microphone → stdout / inject).
  • fono transcribe <wav> — runs a WAV file through the same pipeline; useful
    for verifying API keys without a microphone.

Added — providers

  • STT: local whisper.cpp (small / base / medium models), Groq cloud
    (whisper-large-v3-turbo), OpenAI cloud, optional Deepgram / AssemblyAI /
    Cartesia stubs.
  • LLM cleanup: optional, off-by-default. OpenAI-compatible endpoints
    (Cerebras, Groq, OpenAI, OpenRouter, Ollama) and Anthropic.
  • STT and TextFormatter traits with prewarm() so the first dictation
    after daemon start is not cold (latency plan L2/L3).
  • fono use {stt,llm,cloud,local,show} — one-command provider switching;
    rewrites config atomically and hot-reloads the orchestrator (no restart).
  • fono keys {list,add,remove,check} — multi-provider API-key vault with
    reachability probes.
  • Per-call overrides: fono record --stt openai --llm anthropic.

Added — hardware-adaptive setup

  • crates/fono-core/src/hwcheck.rs — pure-Rust probe of physical/logical
    cores, RAM, free disk, and CPU features (AVX2/NEON/FMA). Maps to a
    five-level LocalTier (Unsuitable, Minimum, Comfortable,
    Recommended, High-end).
  • Wizard prints the live tier and steers the user toward local vs cloud
    based on what the machine can sustain.
  • fono hwprobe [--json] exposes the snapshot for scripts.
  • fono doctor shows the active hardware tier alongside provider
    reachability and the chosen injector.

Added — input / output

  • Default key-injection backend Injector::XtestPaste — pure-Rust X11 XTEST
    paste via x11rb + xsel/wl-copy/xclip clipboard write. No system
    dependencies beyond a clipboard tool. Shift+Insert is the default paste
    shortcut (universal X11 binding).
  • Override paste shortcut via [inject].paste_shortcut = "ctrl-v" in config
    or FONO_PASTE_SHORTCUT=ctrl-shift-v env var.
  • Always-clipboard safety net: every successful dictation also writes to both
    CLIPBOARD and PRIMARY selections (general.also_copy_to_clipboard = true).
  • Always-notify: notify-rust toast on every dictation
    (general.notify_on_dictation = true).
  • fono test-inject "<text>" [--shortcut <variant>] — smoke-tests injection
    and clipboard delivery without speaking.

Added — tray

  • Recent transcriptions ▸ submenu with the last 10 dictations; click to
    re-paste.
  • STT: <active> ▸ and LLM: <active> ▸ submenus for live provider
    switching from the tray (same code path as fono use).
  • Open history folder (was misrouted to Dolphin in pre-release; now opens
    the directory itself via xdg-open).

Added — safety + observability

  • Per-stage tracing breadcrumbs at info: capture=…ms trim=…ms stt=…ms llm=…ms inject=…ms (raw_chars → cleaned_chars).
  • Pipeline in-flight guard refuses concurrent recordings with a toast.
  • Skip-LLM-when-short heuristic (configurable llm.skip_if_words_lt) saves
    150–800 ms per short dictation.
  • Trim leading/trailing silence pre-STT (audio.trim_silence); ~30 % faster
    STT on 5 s utterances with 1.5 s of tail silence.

Added — benchmark harness

  • New crates/fono-bench/ crate: 6-language LibriVox fixture set (en, es,
    fr, de, it, ro), Word Error Rate + per-stage latency report, criterion
    benchmark, regression gate. CI-fast (network-free) and full-stack modes.

Documented

  • docs/plans/2026-04-25-fono-pipeline-wiring-v1.md (W1–W22, all landed).
  • docs/plans/2026-04-25-fono-latency-v1.md (L1–L30, 17 landed, 13
    deferred-to-v0.2 with rationale).
  • docs/plans/2026-04-25-fono-local-default-v1.md (H1–H25).
  • docs/plans/2026-04-25-fono-provider-switching-v1.md (S1–S27).
  • docs/plans/2026-04-25-fono-roadmap-v2.md (post-v0.1 roadmap).
  • ADR docs/decisions/0007-local-models-build.md — glibc-linked default
    release vs musl-slim cloud-only artifact.

Models locked in v0.1.0

| Provider | Model | License | First-run download |
|---|---|---|---|
| Whisper local | ggml-small.bin (multilingual) | MIT | ~466 MB |
| Whisper local (light) | ggml-base.bin | MIT | ~142 MB |
| Groq cloud STT | whisper-large-v3-turbo | (cloud, no license) | n/a |
| OpenAI cloud STT | whisper-1 | (cloud) | n/a |
| Cerebras cloud LLM | llama-3.3-70b | (cloud) | n/a |
| Groq cloud LLM | llama-3.3-70b-versatile | (cloud) | n/a |

Local LLM (Qwen2.5 / SmolLM2) is opt-in behind the llama-local Cargo
feature and ships fully wired in v0.2.

Verification

  • 86 unit + integration tests; 2 latency-smoke #[ignore] tests.
  • cargo clippy --workspace --no-deps -- -D warnings clean (pedantic +
    nursery).
  • DCO sign-off enforced on every commit.

Known limitations

  • No streaming STT/LLM yet (latency plan L6/L7/L8 deferred to v0.2). Latency
    on cloud Groq+Cerebras is ~1 s end-to-end on a 5 s utterance.
  • Wayland global hotkey requires compositor binding to fono toggle
    (org.freedesktop.portal.GlobalShortcuts not yet stable in upstream
    compositors).
  • Local LLM cleanup (Qwen / SmolLM) is opt-in / preview.
  • Real winit + softbuffer overlay window is a stub (event channel only).

Weekly OSS security release digest.

The CVE patches and breaking changes that affected production tools this week. One email, every Sunday.

No spam, unsubscribe anytime.

Share this release

Track Fono

Get notified when new releases ship.

Sign up free

Beta — feedback welcome: [email protected]