Fono

v0.8.1 Breaking

This release includes 1 breaking change for platform teams planning a safe upgrade.

✓ No known CVEs patched

Topics

assistant dictation linux llm local-first rust speach-to-text stt vulkan whisper wyoming

Summary

AI summary

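A mixed release: two more cloud speech providers, consistent pause-indicator and auto-stop behaviour (including the F8 assistant flow and streaming dictation), installer improvements, and one breaking config change.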

Full changelog

A quality-of-life release: two more cloud providers, polish on the
"Pondering" pause UI, headless servers install themselves, and a handful
of papercuts gone.

Added

  • Deepgram speech-to-text now actually works. Picking Deepgram in
    fono setup (or running fono use stt deepgram) had been broken
    since v0.8.0 — it offered the option but failed at startup. The full
    pipeline is now wired: both the batch endpoint and a real WebSocket
    for live dictation, with the newer Nova-3 model as the default
    (Nova-2 is still selectable for languages Nova-3 doesn't cover yet).
  • Cartesia speech-to-text. Same story — was advertised, now
    delivered. Batch transcription via the ink-whisper family;
    realtime ink-2 will follow in a future release.
  • Cartesia text-to-speech now picks a native voice per language.
    Speak Romanian, hear a Romanian voice; switch to English in the
    same session, hear an English voice. No more one-voice-fits-all.
  • Auto-stop on silence is now wired end-to-end. If you enable
    "Auto-stop after pause" in the tray, dictation actually stops once
    you've been quiet for the configured time — previously the
    PONDERING label appeared but nothing committed.
  • sudo fono install is friendlier on servers. Headless boxes
    (no graphical session, multi-user systemd target) are now detected
    and the systemd lane runs by default — no --server flag needed.
    A new --desktop flag forces the desktop lane on hosts that just
    look headless.
  • Server installs auto-enable LAN sharing. fono install --server
    now turns on the Wyoming STT listener on port 10300 out of the box,
    probes that it actually bound, and prints the address so other
    machines on the LAN can discover it immediately. fono uninstall
    on a server also cleans up /var/cache/fono (multi-GB model
    blobs).
  • A diagnostic VU bar. [overlay].volume_bar = "advanced" paints
    a dBFS-axis meter with reference ticks for your speaking level and
    the silence threshold — useful for tuning auto-stop without
    guesswork. The default simple bar is unchanged.
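
To make the dBFS meter and the silence threshold concrete, here is a minimal sketch of how a level reading and an auto-stop decision could be computed. It is not Fono's implementation: the function names, the frame length, and the -45 dBFS threshold are assumptions chosen for illustration, and the sketch does no debouncing of short noises.

```python
import math


def rms_dbfs(samples):
    """RMS level of float samples in [-1.0, 1.0], in dBFS.

    Convention used here: 0 dBFS is an RMS of 1.0 (a full-scale square wave).
    """
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    # 10*log10(mean square) equals 20*log10(RMS).
    return 10.0 * math.log10(mean_square)


def should_auto_stop(frame_levels_dbfs, frame_seconds, threshold_dbfs, pause_seconds):
    """True once the trailing run of below-threshold frames spans pause_seconds."""
    quiet = 0.0
    for level in reversed(frame_levels_dbfs):
        if level >= threshold_dbfs:
            break  # most recent speech frame reached; stop counting
        quiet += frame_seconds
    return quiet >= pause_seconds


# Hypothetical values: 1 s frames, -45 dBFS threshold, 3 s pause preset.
levels = [-20.0, -60.0, -60.0, -60.0]
print(should_auto_stop(levels, 1.0, -45.0, 3.0))
```

Watching where your speaking level and room noise land relative to the threshold is what the advanced bar is described as helping with; the numbers above are placeholders, not recommended settings.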

Changed

  • The "PONDERING" pause indicator is consistent everywhere.
    • It now shows up on the assistant flow (F8) too, in the green
      assistant palette, with the same auto-stop behaviour as
      dictation.
    • It only appears when you've actually enabled auto-stop — no
      more PONDERING under your finger if you've opted out.
    • It works in live (streaming) dictation, not just batch.
    • It doesn't flicker on a single breath, chair creak, or mouse
      click during a real pause.
  • Tray "Auto-stop after pause" presets reworked from
    Off / 0.8 s / 1.5 s / 3 s (chat-app numbers) to Off / 3 s / 5 s
    (prose-dictation numbers). Default stays Off.
  • Tray "Visualization" picker now turns the VU bar on automatically
    for the Transcript style and off for the others — sensible default,
    still overridable from config.toml.
  • fono hwprobe matches what the setup wizard actually picks.
    The recommendation table used to promise large-v3-turbo on
    CPU-only boxes that the wizard would then quietly downgrade. Now
    the report and the wizard agree.
  • Hotkey reliability on Wayland. Switching the overlay style from
    the tray now takes effect on the very next hotkey press (no
    restart). GNOME 47's portal hotkey rejection is detected upfront so
    Fono falls back to gsettings/X11 instead of silently dropping
    presses.
  • Local Whisper picks better defaults out of the box. Model names
    now resolve through a quality-tested quantization ladder
    (tiny → q5_1, small → q5_1, small.en → q8_0,
    large-v3-turbo → q8_0); CPU threads default to the physical core
    count, which doubles throughput on Zen 3 / Zen 4 SMT systems where
    the previous default over-subscribed logical threads.
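
The quantization ladder and the physical-core default can be sketched as follows. The mapping is taken from the list above; everything else (function names, returning `None` for models outside the ladder, and counting cores from `/proc/cpuinfo` text) is an assumption for illustration and not necessarily how Fono does it.

```python
# Ladder as listed in the release notes: model name -> quantization.
QUANT_LADDER = {
    "tiny": "q5_1",
    "small": "q5_1",
    "small.en": "q8_0",
    "large-v3-turbo": "q8_0",
}


def resolve_quantization(model_name):
    """Return the ladder's quantization, or None if the model is not listed."""
    return QUANT_LADDER.get(model_name)


def physical_core_count(cpuinfo_text):
    """Count unique (physical id, core id) pairs in Linux /proc/cpuinfo text.

    SMT siblings share both ids, so each physical core is counted once.
    Returns 0 when the fields are absent (common on non-x86 systems), in
    which case a caller would need some other fallback.
    """
    cores = set()
    physical_id = core_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "physical id":
            physical_id = value
        elif key == "core id":
            core_id = value
        elif not line.strip():
            # A blank line ends one logical-CPU record.
            if physical_id is not None and core_id is not None:
                cores.add((physical_id, core_id))
            physical_id = core_id = None
    # Final record may not be followed by a blank line.
    if physical_id is not None and core_id is not None:
        cores.add((physical_id, core_id))
    return len(cores)
```

On a 2-core, 4-thread machine this yields 2 rather than 4, which is the distinction the changelog entry draws between physical cores and logical threads.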

Fixed

  • Wayland overlay no longer steals keyboard focus, paints as an
    opaque rectangle, or lands top-left on GNOME / Mutter. The overlay
    now runs through a pluggable backend layer: native
    wlr-layer-shell on KDE / wlroots / COSMIC / Hyprland; X11 via
    Xwayland on GNOME (which doesn't implement layer-shell). Set
    FONO_OVERLAY_BACKEND=… to force a specific backend.
  • PipeWire audio playback (pw-play) no longer fails on every
    assistant reply — the --raw flag was missing.
  • LAN dictation against a Wyoming peer that advertises IPv6 no
    longer fails with EINVAL when the peer's first-listed address is
    link-local. Discovery now prefers routable IPv4 / IPv6 addresses.
  • History database rebuilds itself when it carries an older
    schema, instead of warning on every dictation.
  • The dictation key held down while pausing no longer flips the
    overlay into PONDERING and (with auto-stop on) no longer ends the
    session out from under you.
  • Shipped binaries no longer SIGILL on pre-VNNI / pre-AVX-512 CPUs.
    The release build inherited ggml's GGML_NATIVE=ON default, which
    appends -march=native to the C/C++ compile line. On the GitHub
    Actions Linux runner (AMD EPYC 7763, Zen 3) the C compiler's
    auto-vectoriser baked VPDPBUSD (AVX-VNNI) into the binary, causing
    immediate SIGILLs on users' Kaby Lake, 8th-gen Intel, and earlier
    laptops. The shipped binary now pins an explicit
    AVX2 / FMA / F16C / BMI2 baseline (Intel Haswell ≥ 2013, AMD
    Excavator ≥ 2015) via .cargo/config.toml, so what CI builds is
    what users download — regardless of which CPU GitHub puts in its
    runner pool. A/B verified on Lunar Lake: zero throughput loss
    (±7% noise) because ggml's hand-written VNNI kernels are separately
    gated by GGML_AVX_VNNI (also off by default), so -march=native
    was costing portability without delivering any actual VNNI speedup.
  • Hotkeys work immediately after sudo fono install on
    GNOME-Wayland.
    The post-install autostart spawned the daemon via
    runuser -u $SUDO_USER, which inherited the sudo-scrubbed
    environment: DISPLAY=:0 was preserved but WAYLAND_DISPLAY,
    XDG_RUNTIME_DIR, and DBUS_SESSION_BUS_ADDRESS were not. With
    only DISPLAY set, the daemon's hotkey-backend detector picked
    the X11 listener, the GNOME-gsettings shim never ran, and F7 / F8
    fell through in every native-Wayland app — users only saw working
    hotkeys after logging out and back in (when the XDG autostart entry
    fired with a real session env). The installer now reconstructs the
    graphical-session env from /run/user/$(id -u) inside the spawn
    command — after the user-switch — so the first daemon launched by
    sudo fono install is identical to what the next-login autostart
    would have produced. Drive-by: shutdown_existing_daemon no
    longer panics with "Cannot start a runtime from within a runtime"
    when re-running install while a previous daemon is still alive.
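
As an illustration of the "prefer routable addresses" fix for Wyoming peers, here is a small sketch using Python's standard `ipaddress` module. The function name and the exact ranking are assumptions; the changelog only says that discovery now prefers routable IPv4 / IPv6 over a link-local first entry. A link-local IPv6 address generally needs an interface scope to be connectable, which is a plausible reason the unscoped address failed.

```python
import ipaddress


def pick_peer_address(advertised):
    """Return the first routable address, else the first listed one.

    `advertised` is a list of address strings in the order the peer
    announced them. Link-local, loopback, and unspecified addresses are
    ranked below everything else; order is otherwise preserved.
    """
    if not advertised:
        return None

    def rank(text):
        addr = ipaddress.ip_address(text)
        if addr.is_link_local or addr.is_loopback or addr.is_unspecified:
            return 1
        return 0

    # min() returns the earliest item among ties, so listing order is kept.
    return min(advertised, key=rank)


print(pick_peer_address(["fe80::1", "192.168.1.20", "2001:db8::5"]))
```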

Removed

  • 14 inert config keys (the always-warm-mic flag, eight commit-tuning
    knobs, three session-budget knobs, and two more) — all of them
    were silently ignored at runtime. Defaults are unchanged.

Breaking

  • [overlay].volume_bar is now "off" | "simple" | "advanced"
    instead of a boolean, and defaults to "off". Existing configs
    need a one-line edit: volume_bar = true → "simple",
    volume_bar = false → "off". The tray picker handles new
    installs automatically.
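
For anyone who would rather script the edit than make it by hand, the following is a hedged sketch of a one-off migration. It is not a tool shipped with Fono; the function name is hypothetical, and it assumes the key is written on a single line under an `[overlay]` table in `config.toml`.

```python
import re

# Old boolean values mapped to the new string values from the release notes.
_BOOL_TO_MODE = {"true": '"simple"', "false": '"off"'}

_SECTION = re.compile(r"^\s*\[\[?([^\[\]]+)\]\]?\s*(#.*)?$")
_VOLUME_BAR = re.compile(r"^(\s*volume_bar\s*=\s*)(true|false)(\s*(#.*)?)$")


def migrate_volume_bar(config_text):
    """Rewrite a boolean volume_bar under [overlay] to its string equivalent.

    Works line by line so comments and formatting elsewhere are untouched.
    Keys named volume_bar in other tables are left alone.
    """
    section = None
    out = []
    for line in config_text.split("\n"):
        header = _SECTION.match(line)
        if header:
            section = header.group(1).strip()
        elif section == "overlay":
            m = _VOLUME_BAR.match(line)
            if m:
                line = m.group(1) + _BOOL_TO_MODE[m.group(2)] + m.group(3)
        out.append(line)
    return "\n".join(out)


old = '[overlay]\nvolume_bar = true  # show meter\n'
print(migrate_volume_bar(old))
```

Already-migrated values such as `volume_bar = "advanced"` do not match the boolean pattern, so running the sketch twice leaves the file unchanged.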

Full Changelog: https://github.com/bogdanr/fono/compare/v0.8.0...v0.8.1

Breaking Changes

  • [overlay].volume_bar is now "off" | "simple" | "advanced" instead of a boolean; existing configs must be edited accordingly.
