
Fono

v0.3.5 Breaking

This release includes 5 breaking changes. Platform teams should review them before upgrading.

✓ No known CVEs patched

Topics

assistant dictation linux llm local-first rust
speach-to-text stt vulkan whisper wyoming

Summary

AI summary

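Fixes Whisper trailing-closer hallucinations on silent tails, removes three config fields and several redundant toasts, and routes Linux desktop notifications through notify-send.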

Full changelog

Fixed

  • Whisper trailing-closer hallucinations ("Thank you", "Bye", "Thanks
    for watching") on silent tails. Three layers, root-cause-first:
    • Layer A — local whisper-rs now opts in to the four
      hallucination guards that FullParams::new() leaves disabled by
      default: set_no_speech_thold(0.6), set_logprob_thold(-1.0),
      set_compress_thold(2.4), set_temperature_inc(0.2). Matches
      the canonical whisper.cpp CLI defaults.
    • Layer B — new [stt.prompts] config: a per-language
      HashMap<bcp47, String> whose entry for the request's resolved
      language is sent as the Whisper initial_prompt (local) or
      prompt (Groq + OpenAI form-data field). When no entry matches
      the resolved language, no prompt is sent — preserving today's
      unbiased behaviour for languages the user hasn't configured.
      English-only Whisper variants (e.g. tiny.en, small.en,
      *-en-q5_1) auto-seed prompts.en with a neutral professional-
      dictation default unless the user already set one.
    • Layer C — interactive.hold_release_grace_ms default
      lowered from 300 ms to 150 ms. Halves the silent tail Whisper
      sees on F8 release. Smoke-test: if trailing words get truncated,
      raise back to 300.
  • LLM cleanup observability: new INFO line
    llm: cleanup added=N removed=M chars after each successful cleanup
    so users can see whether the LLM is doing real work or operating
    as a near-no-op pass-through.
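
The Layer B lookup described above can be sketched roughly as follows. This is a minimal illustration, not Fono's actual code: the function names, the English-only heuristic, and the default prompt text are all assumptions made for the example (the release notes do not state the real default wording).

```rust
use std::collections::HashMap;

/// Hypothetical neutral default; the actual text is not given in the notes.
const DEFAULT_EN_PROMPT: &str = "Professional dictation.";

/// Heuristic (assumed for this sketch) for English-only Whisper variants
/// such as `tiny.en`, `small.en`, or names containing `-en-`.
fn is_english_only(model: &str) -> bool {
    model.ends_with(".en") || model.contains(".en-") || model.contains("-en-")
}

/// Seed `prompts["en"]` for English-only models unless the user set one.
fn seed_english_default(model: &str, prompts: &mut HashMap<String, String>) {
    if is_english_only(model) {
        prompts
            .entry("en".to_string())
            .or_insert_with(|| DEFAULT_EN_PROMPT.to_string());
    }
}

/// Look up the prompt for the resolved language.
/// `None` means no prompt is sent, keeping the unbiased behaviour.
fn prompt_for<'a>(prompts: &'a HashMap<String, String>, lang: &str) -> Option<&'a str> {
    prompts.get(lang).map(String::as_str)
}
```

Returning `Option<&str>` keeps the "no entry, no prompt" rule explicit at the call site, so a backend cannot accidentally send an empty string as a prompt.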

Removed

  • [stt.cloud].streaming config field. Streaming for cloud Groq is
    now derived from [interactive].enabled — the master live-
    dictation switch — so there is no separate per-backend opt-in. A
    user who picks Groq and turns on live mode gets the pseudo-stream
    client automatically; cost can be bounded via
    interactive.streaming_interval > 3.0 (finalize-only mode) or
    interactive.budget_ceiling_per_minute_umicros. Existing configs
    with streaming = true parse without warning (serde silently
    ignores unknown fields); the value is no longer consulted. Plan:
    plans/2026-04-29-streaming-config-collapse-v1.md.

  • [interactive].overlay config field. The live-dictation overlay
    is now always shown when [interactive].enabled = true — it is
    the only feedback surface for live previews, so a per-section
    toggle was incoherent. The previous warn-and-ignore code path
    (added in v0.3.3) is gone. [overlay].enabled continues to
    control the passive recording indicator in batch mode.

  • Wizard's third question on the cloud-STT path ("Enable Groq
    streaming dictation?"). Live-mode users on Groq now go straight
    through; users who want batch-only Groq just leave
    [interactive].enabled = false.

  • general.notify_on_dictation config field. Redundant with the
    existing clipboard-fallback notification: when injection works, the
    cleaned text is already at the cursor (the actual feedback); when
    it falls back to the clipboard, the dedicated "Fono — copied to
    clipboard" toast at session.rs:171 fires with a Ctrl+V hint.
    The per-dictation toast just duplicated the first case.

  • "Fono — live dictation active" toast on first F9 toggle-on.
    The on-screen overlay is the user-visible indicator.

  • "Fono — STT switched" / "Fono — LLM switched" tray success toasts.
    The user just clicked the tray menu and the tray label updates to
    reflect the change. Switch failures still fire critical-urgency
    notifications.
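
As a rough illustration of the first removal above, the cloud STT mode can be derived from the [interactive] settings alone. The enum and function below are hypothetical names for this sketch; only the `enabled` switch and the `streaming_interval > 3.0` finalize-only rule come from the notes.

```rust
/// How the cloud STT backend is driven (names assumed for this sketch).
#[derive(Debug, PartialEq)]
enum CloudSttMode {
    /// [interactive].enabled = false: one request per dictation.
    Batch,
    /// Live mode with streaming_interval > 3.0: no preview ticks.
    FinalizeOnly,
    /// Live mode with previews plus finalize requests.
    PreviewAndFinalize,
}

/// Derive the mode from the master live-dictation switch and the
/// streaming interval; no per-backend `streaming` flag is consulted.
fn derive_mode(interactive_enabled: bool, streaming_interval: f32) -> CloudSttMode {
    if !interactive_enabled {
        CloudSttMode::Batch
    } else if streaming_interval > 3.0 {
        CloudSttMode::FinalizeOnly
    } else {
        CloudSttMode::PreviewAndFinalize
    }
}
```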

Changed

  • Linux desktop notifications now route through notify-send (libnotify
    CLI) instead of notify-rust's pure-Rust zbus path. Fixes a class of
    "no notification appeared" bugs in non-canonical environments (root
    sessions without XDG_RUNTIME_DIR/DBUS_SESSION_BUS_ADDRESS,
    systemd --user units without PassEnvironment=, container
    desktops, Flatpak/Snap launchers, etc.) where libnotify's autolaunch
    succeeds but zbus fails with "No such file or directory". notify-rust
    is retained behind cfg(any(target_os = "macos", target_os = "windows"))
    for the future cross-platform ports. New
    fono_core::notify::send() helper funnels every notification through
    one code path; ~40 inline notify_rust::Notification::new() call
    sites in daemon.rs/session.rs removed.
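
A single-helper design like the one described might look like the sketch below. It is not the real fono_core::notify::send(); the signature, the `Urgency` enum, and the app name are assumptions. The `--app-name` and `--urgency` options are standard notify-send flags.

```rust
use std::io;
use std::process::{Command, ExitStatus};

/// Urgency levels accepted by notify-send.
#[derive(Clone, Copy)]
enum Urgency {
    Low,
    Normal,
    Critical,
}

/// Build the notify-send argument list (kept separate so it can be tested
/// without a notification daemon).
fn notify_send_args(summary: &str, body: &str, urgency: Urgency) -> Vec<String> {
    let level = match urgency {
        Urgency::Low => "low",
        Urgency::Normal => "normal",
        Urgency::Critical => "critical",
    };
    vec![
        "--app-name=Fono".to_string(),
        format!("--urgency={level}"),
        summary.to_string(),
        body.to_string(),
    ]
}

/// Hypothetical single entry point: spawn notify-send and wait for it.
fn send(summary: &str, body: &str, urgency: Urgency) -> io::Result<ExitStatus> {
    Command::new("notify-send")
        .args(notify_send_args(summary, body, urgency))
        .status()
}
```

Funnelling every call through one function means a missing notify-send binary surfaces as a single `io::Error` path that the caller can log, instead of dozens of call sites each handling failure differently.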

Added

  • interactive.hold_release_grace_ms config (default 300). On F8
    release (and F9 toggle-off), the orchestrator now waits this many
    milliseconds before signalling the capture thread to stop. Closes a
    truncation bug where the last 100–300 ms of audio buffered in the
    cpal host callback were abandoned when the user released the hotkey
    early on a short utterance.
  • Desktop notification on cloud STT rate-limit (HTTP 429), deduped to
    at most once per dictation session (per F8/F9 press). Surfaces via
    notify-rust in the default build; slim builds without the notify
    feature still emit a tracing::warn! line. A defensive 120 s
    auto-reset re-arms the flag if the orchestrator's reset path is
    skipped (e.g. by panic).
  • 60-second preview-lane throttle after any cloud STT 429. The
    streaming pseudo-stream loop checks
    rate_limit_notify::is_throttled() before each preview tick and
    skips it; only VAD-boundary finalize requests fire during the
    throttle window. Self-clears after 60 s.
  • Single-instance guard via the IPC socket. The daemon now probes the
    Unix socket on startup with UnixStream::connect; if a previous
    daemon answers, we bail before duplicating hotkey grabs and model
    loads. Stale sockets from crashed prior runs yield
    ConnectionRefused and proceed normally. No PID file parsing, no
    process probing — the socket itself is the source of truth.
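
The single-instance guard in the last item can be sketched with the standard library alone. The `Probe` enum and `probe_socket` name are hypothetical; treating a missing socket file (`NotFound`) the same as a stale one is an assumption of this sketch, since the notes only mention `ConnectionRefused`.

```rust
use std::io::{self, ErrorKind};
use std::os::unix::net::UnixStream;
use std::path::Path;

/// Result of probing the daemon's IPC socket (names assumed for this sketch).
#[derive(Debug, PartialEq)]
enum Probe {
    /// A live daemon accepted the connection: bail out.
    AlreadyRunning,
    /// No listener (stale or missing socket): safe to start.
    Free,
}

/// Connect to the socket to find out whether another daemon owns it.
fn probe_socket(path: &Path) -> io::Result<Probe> {
    match UnixStream::connect(path) {
        Ok(_) => Ok(Probe::AlreadyRunning),
        // ConnectionRefused: socket file left behind by a crashed run.
        // NotFound: first start, no socket file yet (assumed handling).
        Err(e) if matches!(e.kind(), ErrorKind::ConnectionRefused | ErrorKind::NotFound) => {
            Ok(Probe::Free)
        }
        Err(e) => Err(e),
    }
}
```

A caller that sees `Probe::Free` would typically remove any stale socket file before binding its own listener.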

Changed

  • Hotkey dispatch and live-dictation start/stop now log at DEBUG —
    the existing pipeline ok: capture=… stt=… llm=… inject=…
    summary at INFO is enough at default verbosity. Bump
    RUST_LOG=fono=debug to see the per-event detail. 429 sites
    upgraded from tracing::info! to tracing::warn! so they
    appear at default log level, with the verbose JSON body now
    compacted to a single human-readable line (model + RPM ceiling
    + retry-in seconds) instead of being dumped raw. Streaming
    finalize and preview lanes detect 429 in the closure-error
    string and trip the same warn + notification + throttle path
    the batch backend uses.
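
The error-string detection and the 60-second preview throttle can be sketched together as below. The `PreviewThrottle` type is hypothetical and takes the current time as a parameter for testability; the real check is exposed as rate_limit_notify::is_throttled(), whose internals are not shown in the notes. Matching on the substring "429" is a simplification.

```rust
use std::time::{Duration, Instant};

/// Preview ticks are skipped for this long after a 429.
const THROTTLE_WINDOW: Duration = Duration::from_secs(60);

/// Hypothetical throttle state for the preview lane.
#[derive(Default)]
struct PreviewThrottle {
    tripped_at: Option<Instant>,
}

impl PreviewThrottle {
    /// Inspect an error string; trip the throttle if it mentions 429.
    /// Returns whether the error was treated as a rate limit.
    fn observe_error(&mut self, err: &str, now: Instant) -> bool {
        let is_429 = err.contains("429");
        if is_429 {
            self.tripped_at = Some(now);
        }
        is_429
    }

    /// True while inside the throttle window; self-clears afterwards.
    fn is_throttled(&self, now: Instant) -> bool {
        matches!(self.tripped_at, Some(t) if now.duration_since(t) < THROTTLE_WINDOW)
    }
}
```

The preview loop would call `is_throttled` before each tick and skip the request when it returns true, while finalize requests bypass the check.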

Fixed

  • Hotkey-grab conflicts on X11 no longer print the bare
    X Error of failed request: BadAccess … X_GrabKey to stderr.
    A custom XSetErrorHandler is installed at daemon startup that
    converts BadAccess-on-XGrabKey into an actionable
    tracing::error! message naming the conflict and pointing at
    [hotkeys].hold / [hotkeys].toggle in the config. Other X11
    errors are surfaced at WARN with their numeric codes instead of
    being printed by libxlib's default handler.
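
The classification step inside such an error handler might look like the sketch below. It covers only the decision logic, not the Xlib FFI registration; the function name, `Severity` enum, and message wording are assumptions. The numeric constants are the standard X11 protocol values for BadAccess and X_GrabKey.

```rust
/// X11 protocol error code for BadAccess (X.h).
const BAD_ACCESS: u8 = 10;
/// X11 core request opcode for GrabKey (Xproto.h).
const X_GRAB_KEY: u8 = 33;

/// Log level the handler would use (names assumed for this sketch).
#[derive(Debug, PartialEq)]
enum Severity {
    Error,
    Warn,
}

/// Map an X error's codes to a log level and a readable message.
fn describe_x_error(error_code: u8, request_code: u8) -> (Severity, String) {
    if error_code == BAD_ACCESS && request_code == X_GRAB_KEY {
        (
            Severity::Error,
            "hotkey is already grabbed by another application; \
             change [hotkeys].hold or [hotkeys].toggle in the config"
                .to_string(),
        )
    } else {
        (
            Severity::Warn,
            format!("X11 error: error_code={error_code} request_code={request_code}"),
        )
    }
}
```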

Breaking Changes

  • [stt.cloud].streaming config field removed; streaming now derived from [interactive].enabled
  • [interactive].overlay config field removed; overlay always shown when [interactive].enabled = true
  • general.notify_on_dictation config field removed (redundant with clipboard-fallback notification)
  • "Fono — live dictation active" toast on first F9 toggle-on removed
  • "Fono — STT switched" and "LLM switched" tray success toasts removed
