Fono

v0.9.0 Feature

This release adds 3 notable features for engineering teams evaluating rollout.

✓ No known CVEs patched
Topics

assistant dictation linux llm local-first rust speach-to-text stt vulkan whisper wyoming

Summary

Early‑preview voice loop lets coding agents speak and listen through Fono.

Changes in this release

Feature Medium

Adds early-preview voice loop for MCP-capable coding agents.

Source: llm_adapter@2026-05-26

Confidence: high

Feature Medium

Adds one-command `fono agent-setup <name>` to wire MCP server and config.

Source: llm_adapter@2026-05-26

Confidence: high

Feature Medium

Adds background-speech filter with configurable relevance settings.

Source: llm_adapter@2026-05-26

Confidence: high

Feature Medium

Adds `fono speak --stream` for line‑by‑line TTS streaming.

Source: llm_adapter@2026-05-26

Confidence: high

Feature Medium

Adds "Coding agents" section to `fono doctor` output.

Source: llm_adapter@2026-05-26

Confidence: high

Feature Medium

Adds voice-friendly overlay during `fono.listen` calls.

Source: llm_adapter@2026-05-26

Confidence: low

Feature Medium

Adds tray icon amber indicator while voice turn is active.

Source: llm_adapter@2026-05-26

Confidence: low

Feature Medium

Adds `fono use mcp-server on|off` toggle command.

Source: llm_adapter@2026-05-26

Confidence: low

Feature Low

Changes MCP listen default silence to 10 s (was 2 s).

Source: llm_adapter@2026-05-26

Confidence: high

Feature Low

Shows voice-friendly overlay while `fono.listen` is active, with waveform/transcript UI.

Source: granite4.1:30b@2026-05-26-audit

Confidence: low

Feature Low

Changes tray icon to amber while a voice turn (speak/listen/confirm) is in progress.

Source: granite4.1:30b@2026-05-26-audit

Confidence: low

Feature Low

Provides `fono use mcp-server on|off` toggle to enable/disable the MCP server without editing config.

Source: granite4.1:30b@2026-05-26-audit

Confidence: low

Feature Low

Lowers MCP listen `max_seconds` from 60 s to 45 s for tighter turn‑taking budget.

Source: granite4.1:30b@2026-05-26-audit

Confidence: low

Bugfix Medium

Fixes overlay not appearing on first run for Debian/Ubuntu installs.

Source: llm_adapter@2026-05-26

Confidence: high

Full changelog

Talk to your coding agent. The headline feature is an early-preview voice
loop that lets any MCP-capable coding agent — Forge, Claude Code, Cursor,
Codex CLI, Gemini CLI, and others — speak and listen through Fono. Plus a
Debian/Ubuntu install fix so the on-screen overlay shows up on first run
instead of after a manual restart.

Added

  • Talk to your coding agent (early preview). Fono now ships an
    MCP server that lets any MCP-capable coding agent (Forge, Claude
    Code, Cursor, Codex CLI, Gemini CLI, and others) drive a voice
    loop through three tools: `fono.speak` (the agent speaks a reply),
    `fono.listen` (the agent asks a free-form question and gets your
    spoken answer back as text), and `fono.confirm` (the agent offers
    A/B/C choices and matches your spoken pick). Verified end-to-end
    against Forge and Claude Code; best-effort for the rest. Disabled
    by default; opt in with `fono use mcp-server on`. This is an
    early preview: the protocol, defaults, and tool surface may
    still shift before the feature graduates.
  • One-command setup for your coding agent.
    `fono agent-setup <name>` wires everything in one shot: it enables
    the MCP server, merges the right `mcpServers.fono` entry into
    your agent's MCP config, and appends the shared voice-mode preset
    to your project's `AGENTS.md` / `CLAUDE.md`. It is idempotent and
    supports `--dry-run`; `--list` shows every registered agent. After
    setup, launch your agent the normal way and it can speak and
    listen.
  • Voice-friendly overlay while the agent is talking with you.
    When the agent calls `fono.listen`, the same overlay you see for
    F7 dictation pops up (waveform/transcript while you speak, a
    PONDERING animation while it waits), so you always know whether
    Fono is listening. The overlay is scoped strictly to the
    microphone-open phase (not while the agent is speaking its
    prompt) and is skipped when a regular Fono daemon is already
    running, so a daemon-paired environment never double-paints.
  • Background-speech filter. When the agent asks a question,
    Fono now filters out chatter that doesn't look like an answer:
    radio, TV, a side conversation in the room, or the agent's own
    prompt echoing back through the speakers. Tunable via
    `[mcp].relevance_filter` (`"off"` | `"heuristic"` | `"llm"`, default
    `"heuristic"`) and `[mcp].relevance_max_rejections` (default
    `2`). The optional `"llm"` mode uses the configured polish
    backend as a one-shot classifier with a 1.5 s timeout; on
    timeout or parse failure it fails open and accepts the
    utterance. Each rejection flashes a dim grey "Ignoring" badge in
    the overlay so you can see that Fono heard you but is still
    waiting for a real answer.
  • Tray icon turns amber while the agent is in a voice turn.
    Same colour Fono uses while STT or polish is running, so you can
    tell at a glance that a `fono.listen` / `fono.speak` /
    `fono.confirm` call is in flight. The previous tray state is
    restored when the call ends; nested spans (a prompt that speaks
    before listening) keep the icon steady. No configuration needed.
  • `fono speak --stream` reads stdin line by line, segments it
    into sentences, strips markdown, and speaks each sentence
    through the configured TTS backend. Backpressure prevents a
    fast producer from outrunning playback; SIGINT flushes cleanly.
    Pipe-friendly: `echo "Hello. World." | fono speak --stream`.
  • `fono use mcp-server on|off` toggles `[mcp.server].enabled`
    without editing config by hand.
  • `fono doctor` "Coding agents" section: reports whether the
    MCP server is enabled, the last MCP handshake timestamp, the
    advertised tools, and the last tool-call result.
  • Tray "MCP server" submenu (visible only when enabled):
    enable/disable toggle, last-connected timestamp, per-tool
    enable/disable rows.
  • Docs. New `docs/coding-agents.md` integration guide covering
    Forge, Claude Code and Cursor (verified) plus Codex CLI, Gemini
    CLI, Cline, Continue, Windsurf and Goose (best-effort), with an
    "Adding your own agent" section. Shared voice-mode preset at
    `assets/agent-presets/voice.md`.
  • ADR 0030: records the Fono-as-MCP-server decision, the
    three-tool surface, the agent-agnostic design principle, and the
    `agents.toml` registry.
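
The fail-open behaviour described for the optional "llm" relevance mode can be sketched roughly as follows. This is a minimal Python illustration, not Fono's actual code: the function name `accept_utterance`, the `classify` callback, and the "relevant" / "irrelevant" verdict strings are all hypothetical. Only the 1.5 s timeout and the rule "accept the utterance on timeout or parse failure" come from the notes above.

```python
import concurrent.futures


def accept_utterance(utterance, classify, timeout_s=1.5):
    """Return True if the utterance should be treated as an answer.

    `classify` is a hypothetical callback standing in for the polish
    backend; it is assumed to return "relevant" or "irrelevant".
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(classify, utterance)
    try:
        raw = future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return True  # classifier too slow: fail open, accept the utterance
    except Exception:
        return True  # classifier crashed: fail open as well (assumption)
    finally:
        pool.shutdown(wait=False)  # do not block the voice turn on cleanup

    verdict = raw.strip().lower() if isinstance(raw, str) else ""
    # Only an explicit "irrelevant" rejects; anything unparseable is accepted.
    return verdict != "irrelevant"
```

For example, `accept_utterance("the second one", lambda text: "relevant")` returns `True`, and so does any call whose classifier takes longer than the timeout. Failing open means a broken or slow classifier can let background chatter through, but it never blocks a genuine answer.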

Changed

  • MCP listen default silence is now 10 s (was 2 s) so an
    agent turn can pause for thought without being cut off
    mid-sentence.
  • MCP listen default `max_seconds` lowered from 60 s to 45 s.
    Combined with the multi-utterance relevance loop, this gives a
    responsive turn-taking budget without stranding you.
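
To make the two budgets concrete, here is a small Python sketch of how a listen turn might decide when to stop. It is an illustration under assumptions, not Fono's implementation: the function and parameter names are hypothetical, and it assumes the silence limit applies to trailing silence while `max_seconds` caps the whole turn. Only the default values (10 s and 45 s) and the previous ones (2 s and 60 s) come from the notes above.

```python
def listen_should_stop(elapsed_s, trailing_silence_s,
                       silence_s=10.0, max_seconds=45.0):
    """Return the reason a listen turn should end, or None to keep going.

    elapsed_s          -- seconds since the microphone opened
    trailing_silence_s -- seconds of continuous silence so far
    """
    if elapsed_s >= max_seconds:
        return "max_seconds"  # hard cap on the whole turn
    if trailing_silence_s >= silence_s:
        return "silence"      # speaker has stopped talking
    return None
```

With the new defaults a 3 s pause for thought keeps the turn open (`listen_should_stop(5, 3)` returns `None`), whereas the old 2 s silence limit would have ended it (`listen_should_stop(5, 3, silence_s=2.0)` returns `"silence"`).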

Fixed

  • Overlay now appears on first run on Debian/Ubuntu desktops.
    Installing via `curl https://fono.page/install | sh` silently
    skipped the post-install prompt that offers to add
    `libxkbcommon-x11` and `xdotool`, so the on-screen recording
    overlay fell back to noop and only appeared after a manual
    restart. The prompt now reads from `/dev/tty` directly so it
    survives `curl|sh` + sudo PTY allocation, and the background
    daemon spawn also reconstructs `DISPLAY` and `XAUTHORITY` when
    sudo strips them. Server installs are unaffected: they never
    pull X11 libraries.
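
The reason reading from `/dev/tty` helps: under `curl ... | sh`, standard input is the pipe carrying the script, so a prompt that reads stdin never reaches the user. The Python sketch below illustrates the technique only. The real installer is a shell script whose code is not shown here, and the function name, the `[y/N]` wording, and the "default to no when there is no terminal" fallback are assumptions.

```python
import sys


def ask_on_terminal(question, tty_path="/dev/tty", default="n"):
    """Ask a yes/no question on the controlling terminal instead of stdin."""
    try:
        with open(tty_path, "r") as tty:
            # Prompt on stderr so stdout stays clean for pipelines.
            sys.stderr.write(question + " [y/N] ")
            sys.stderr.flush()
            answer = tty.readline().strip().lower()
    except OSError:
        # No controlling terminal (e.g. CI or a headless server):
        # fall back to the default instead of failing (assumption).
        answer = ""
    if not answer:
        answer = default
    return answer in ("y", "yes")
```

Because the terminal path is a parameter, the function can be exercised without a real terminal by pointing `tty_path` at a file containing the answer.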

Full Changelog: https://github.com/bogdanr/fono/compare/v0.8.2...v0.9.0
