Fono

v0.7.0 Feature

This release adds 3 notable features for engineering teams evaluating rollout.

Published 2mo AI Agents & Assistants


✓ No known CVEs patched


Topics

assistant dictation linux llm local-first rust speach-to-text stt vulkan whisper wyoming

Summary

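Adds an F10 push-to-talk voice assistant with streaming chat and TTS playback, new cloud assistant and TTS backends, and runtime overlay style switching; fixes a stuck FSM, a dying audio playback worker, and a frozen overlay.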

Full changelog

Added

Voice assistant — F10 hold-to-talk, streaming chat, TTS playback.
A second push-to-talk key (F10 by default) turns Fono into an
offline-capable voice assistant. The pipeline diverges after STT:
instead of cleaning the transcript and injecting it, Fono asks a
chat-capable LLM, streams the reply sentence-by-sentence into a TTS
backend, and plays the audio. The first sentence starts playing before
the model finishes generating, so time-to-first-audio is bounded by
one sentence's synthesis latency rather than the full reply.
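The sentence-by-sentence hand-off described above can be sketched as a small buffer that sits between the streamed LLM output and the TTS backend. This is a minimal illustration, not Fono's actual code: the `SentenceSplitter` type, its method names, and the "terminator followed by whitespace" rule are all assumptions.

```rust
/// Hypothetical helper: buffers streamed text and yields each sentence as
/// soon as it is complete, so TTS can start before the reply is finished.
#[derive(Default)]
pub struct SentenceSplitter {
    buf: String,
}

/// Byte offset just past the first '.', '!' or '?' that is followed by
/// whitespace. A terminator at the very end of the buffer does not count
/// yet, because the next chunk might continue it (for example "3." + "14").
fn find_boundary(s: &str) -> Option<usize> {
    let mut chars = s.char_indices().peekable();
    while let Some((i, c)) = chars.next() {
        if matches!(c, '.' | '!' | '?') {
            if let Some(&(_, next)) = chars.peek() {
                if next.is_whitespace() {
                    return Some(i + c.len_utf8());
                }
            }
        }
    }
    None
}

impl SentenceSplitter {
    /// Feed one streamed chunk; returns the sentences it completed.
    pub fn push(&mut self, chunk: &str) -> Vec<String> {
        self.buf.push_str(chunk);
        let mut out = Vec::new();
        while let Some(end) = find_boundary(&self.buf) {
            let rest = self.buf.split_off(end);
            let sentence = std::mem::replace(&mut self.buf, rest);
            let sentence = sentence.trim();
            if !sentence.is_empty() {
                out.push(sentence.to_string());
            }
        }
        out
    }

    /// Call once the stream ends to flush any trailing partial sentence.
    pub fn finish(&mut self) -> Option<String> {
        let rest = std::mem::take(&mut self.buf);
        let rest = rest.trim();
        if rest.is_empty() {
            None
        } else {
            Some(rest.to_string())
        }
    }
}
```

Each string returned by `push` would be sent to the TTS backend immediately, which is what keeps time-to-first-audio close to one sentence's synthesis latency.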
[assistant] and [tts] config blocks. Independent backend
selection from the [llm] cleanup pipeline — pick a fast local 3B
for cleanup and a bigger cloud model for the assistant, or any
mix-and-match. Multi-turn rolling history with a configurable time
window (default 5 minutes) and max-turn cap (default 12). Pressing
the dictation key clears assistant context (configurable);
pressing F10 again mid-reply barges in with history retained;
Escape stops playback ("shut up") without forgetting.
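One way to picture the rolling history is a queue pruned by both age and length. The sketch below assumes the stated defaults (5 minutes, 12 turns); the `RollingHistory` and `Turn` types and their methods are hypothetical and are not taken from the Fono source.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// One user/assistant exchange (hypothetical shape).
pub struct Turn {
    pub at: Instant,
    pub user: String,
    pub reply: String,
}

/// Rolling multi-turn context bounded by a time window and a turn cap.
pub struct RollingHistory {
    turns: VecDeque<Turn>,
    window: Duration,
    max_turns: usize,
}

impl RollingHistory {
    pub fn new(window: Duration, max_turns: usize) -> Self {
        Self { turns: VecDeque::new(), window, max_turns }
    }

    /// Defaults quoted in the changelog: 5 minutes, 12 turns.
    pub fn with_defaults() -> Self {
        Self::new(Duration::from_secs(5 * 60), 12)
    }

    /// Record a finished turn, then prune.
    pub fn push(&mut self, now: Instant, user: &str, reply: &str) {
        self.turns.push_back(Turn { at: now, user: user.into(), reply: reply.into() });
        self.prune(now);
    }

    /// Drop turns older than the window, then enforce the turn cap.
    pub fn prune(&mut self, now: Instant) {
        loop {
            let expired = match self.turns.front() {
                Some(t) => now.duration_since(t.at) > self.window,
                None => false,
            };
            if !expired {
                break;
            }
            self.turns.pop_front();
        }
        while self.turns.len() > self.max_turns {
            self.turns.pop_front();
        }
    }

    /// What a "forget" action (or the dictation key, if so configured) would call.
    pub fn clear(&mut self) {
        self.turns.clear();
    }

    pub fn len(&self) -> usize {
        self.turns.len()
    }
}
```

In this model a barge-in or Escape simply does not call `clear`, which matches the "history retained" and "without forgetting" behaviour described above.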
Cloud assistant backends. Anthropic (Claude Haiku 4.5) and the
full OpenAI-compatible family — OpenAI (gpt-5.4-mini), Cerebras
(qwen-3-235b-a22b-instruct-2507), Groq (openai/gpt-oss-120b),
OpenRouter, Ollama. Each
ships in the default binary; one feature flag per family lets slim
builds drop unused providers.
Cloud cleanup model defaults refreshed to replace retired models and
pick up newly released ones: Cerebras llama3.1-8b, Groq
openai/gpt-oss-20b, OpenAI gpt-5.4-nano, Anthropic
claude-haiku-4-5-20251001. The OpenAI-compat client now sends
max_completion_tokens, the field name that newer OpenAI models
require and that older models still accept.
TTS backends. Wyoming protocol client (any
wyoming-piper-style server on the LAN), the OpenAI
/v1/audio/speech API (24 kHz PCM stream), and an in-process
Piper stub that points users at wyoming-piper for now (the
static-musl ship build can't yet pull in onnxruntime). Audio
playback uses paplay on the Linux release variant (no libasound
link, matches the existing parec capture path) or cpal behind the
cpal-backend feature.
CLI surface. fono use assistant <backend>,
fono use tts <backend> [--uri tcp://host:port],
fono assistant {press,release,stop} for scripted end-to-end
testing.
Tray. New Stop assistant and Forget conversation entries;
Assistant backend and TTS backend submenus mirror the existing
STT/LLM submenus and switch backends live via Reload. Tray icon
flips amber while the assistant is active.
Wizard. fono setup ends with an opt-in assistant + TTS step;
reuses any cloud key already entered earlier in the flow so a
single OPENAI_API_KEY powers both chat and TTS without a second
prompt.
Doctor. fono doctor exercises both factories at startup so a
missing API key or unreachable Wyoming server surfaces in one
place; new Providers (assistant) and Providers (TTS) tables
show key/URI status per backend with an active marker.
Overlay feedback for the assistant flow. Recording paints
green ("ASSISTANT") with the chosen waveform style; the post-
release thinking + speaking phase paints amber ("THINKING") with
per-style synthetic animations distinct from the real-audio
recording shape:
- FFT — Gaussian "scanner" (σ ≈ 8 bins out of 100) sweeps
  across the panel; per-bin breathing baseline blends in via
  a screen composite so the bell emerges smoothly.
- Bars — symmetric centre-out, peak at midline rippling
  outward.
- Oscilloscope — two interfering sine waves with edge taper
  pinning x = 0 / x = 1 to the centerline; central antinode
  reaches ±1.0 without clipping.
- Heatmap — two anti-phased Gaussian "neural strands"
  crossing over the rolling 6 s window; transitions seamlessly
  from recording-FFT data without clearing the cache.
Default [overlay].style flipped from Bars to FFT, the most active
visualisation across both phases.
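As an illustration of the FFT "scanner" described in the list above, the sketch below computes one frame of bin heights. Only the bin count (100), σ ≈ 8, and the screen composite come from the changelog; the sweep period, the baseline amplitude, the breathing phase, and the bounce-at-the-edges motion are assumptions, and `scanner_frame` is a hypothetical name.

```rust
/// Values quoted in the changelog.
const BINS: usize = 100;
const SIGMA: f32 = 8.0;

/// Screen blend of two values in [0, 1]: 1 - (1 - a)(1 - b).
fn screen(a: f32, b: f32) -> f32 {
    1.0 - (1.0 - a) * (1.0 - b)
}

/// One frame of a synthetic "thinking" animation at time `t` (seconds).
pub fn scanner_frame(t: f32) -> [f32; BINS] {
    // Assumed constants, not taken from Fono.
    const SWEEP_PERIOD: f32 = 2.0; // seconds for a full there-and-back sweep
    const BASELINE: f32 = 0.15; // peak height of the breathing baseline

    // Triangle wave in [0, 1] so the bell bounces between the panel edges.
    let phase = (t / SWEEP_PERIOD).rem_euclid(1.0);
    let pos = if phase < 0.5 { phase * 2.0 } else { 2.0 - phase * 2.0 };
    let center = pos * (BINS as f32 - 1.0);

    let mut frame = [0.0; BINS];
    for (i, v) in frame.iter_mut().enumerate() {
        let d = i as f32 - center;
        // Gaussian bell centred on the scanner position.
        let bell = (-(d * d) / (2.0 * SIGMA * SIGMA)).exp();
        // Slow per-bin oscillation so idle bins are never completely flat.
        let breath = BASELINE * (0.5 + 0.5 * (t * 1.5 + i as f32 * 0.2).sin());
        // Screen composite: the bell rises out of the baseline without
        // exceeding 1.0 and without a hard edge where the two meet.
        *v = screen(bell, breath);
    }
    frame
}
```

The screen blend is used instead of a plain sum or `max` because it stays within [0, 1] and has no visible seam where the bell meets the baseline.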
Runtime overlay style swap. Changing [overlay].style via
fono use, the tray Waveform style submenu, or fono config edit
now applies on the next frame instead of waiting for a
daemon restart.
Smoke-test binary
(cargo run --release --example smoke_assistant -p fono)
exercises each cloud assistant backend and the OpenAI TTS path
end-to-end. The release CI's new cloud-assistant job runs the
--ci subset (Groq + Cerebras, the providers whose API keys are
stored as GitHub Secrets).

Fixed

FSM stuck on a sub-300 ms F10 tap. Brief F10 taps released
before MIN_RECORDING left the orchestrator's
on_assistant_hold_release early-returning without firing
ProcessingDone; the FSM sat in AssistantThinking forever and
silently rejected subsequent F8/F9/F10 presses. Every early-return
path now emits ProcessingDone; AssistantRecording also accepts
ProcessingDone as a safety net.
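The shape of this fix can be sketched with a toy state machine. The state and event names follow the changelog where it gives them (AssistantRecording, AssistantThinking, ProcessingDone, MIN_RECORDING, on_assistant_hold_release); everything else, including the 300 ms value inferred from "sub-300 ms", the `Idle` state, and the function signatures, is an assumption.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Idle,
    AssistantRecording,
    AssistantThinking,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    AssistantPress,
    AssistantRelease,
    ProcessingDone,
}

/// Assumed threshold; the changelog only says taps under 300 ms were affected.
const MIN_RECORDING: Duration = Duration::from_millis(300);

/// Pure transition function for the assistant part of the FSM.
pub fn step(state: State, event: Event) -> State {
    use Event::*;
    use State::*;
    match (state, event) {
        (Idle, AssistantPress) => AssistantRecording,
        (AssistantRecording, AssistantRelease) => AssistantThinking,
        (AssistantThinking, ProcessingDone) => Idle,
        // Safety net: a stray ProcessingDone while still recording also
        // returns to Idle instead of being ignored.
        (AssistantRecording, ProcessingDone) => Idle,
        // Anything else leaves the state unchanged.
        (s, _) => s,
    }
}

/// Events the orchestrator emits after the key is released.
/// Every path, including the early return, ends with ProcessingDone.
pub fn on_assistant_hold_release(held: Duration) -> Vec<Event> {
    if held < MIN_RECORDING {
        // Too short to transcribe. Before the fix, a path like this
        // returned without telling the FSM that processing was over.
        return vec![Event::ProcessingDone];
    }
    // STT, the chat request, and TTS would run here.
    vec![Event::ProcessingDone]
}
```

With this shape, a 120 ms tap walks Idle → AssistantRecording → AssistantThinking → Idle instead of parking in AssistantThinking.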
Audio playback worker dying after every cancel. pb.stop()
used to send Cmd::Stop which made the worker break out of its
loop; the next turn's enqueue then failed with "audio playback
worker stopped". Cmd::Drain now drains queued items + clears the
abort flag without exiting the worker, so multi-turn conversations
keep working across barge-ins, Forget, and dictation pivots.
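The difference between stopping and draining can be shown with a small worker thread. This sketch borrows the names `Cmd::Drain` and the error string from the changelog, but the `Playback` type, its methods, and the use of a `Vec<String>` as a stand-in for audio output are assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

pub enum Cmd {
    /// A chunk of audio to play (a string stands in for PCM here).
    Enqueue(String),
    /// Marks the end of a cancel: clear the abort flag and keep running.
    Drain,
}

pub struct Playback {
    tx: Sender<Cmd>,
    abort: Arc<AtomicBool>,
}

impl Playback {
    /// Spawns the worker; `played` records what would have been output.
    pub fn spawn(played: Arc<Mutex<Vec<String>>>) -> (Self, JoinHandle<()>) {
        let (tx, rx) = channel();
        let abort = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&abort);
        let handle = thread::spawn(move || {
            // The loop ends only when every Sender has been dropped,
            // never because of a cancel.
            for cmd in rx {
                match cmd {
                    Cmd::Enqueue(chunk) => {
                        // While aborting, queued chunks are discarded.
                        if !flag.load(Ordering::SeqCst) {
                            played.lock().unwrap().push(chunk);
                        }
                    }
                    Cmd::Drain => flag.store(false, Ordering::SeqCst),
                }
            }
        });
        (Self { tx, abort }, handle)
    }

    pub fn enqueue(&self, chunk: &str) -> Result<(), String> {
        self.tx
            .send(Cmd::Enqueue(chunk.to_string()))
            .map_err(|_| "audio playback worker stopped".to_string())
    }

    /// Cancel the current reply without killing the worker.
    pub fn stop(&self) -> Result<(), String> {
        self.abort.store(true, Ordering::SeqCst);
        self.tx
            .send(Cmd::Drain)
            .map_err(|_| "audio playback worker stopped".to_string())
    }
}
```

Because the channel is FIFO, anything enqueued before `stop` is skipped while the flag is set, and anything enqueued afterwards is processed normally once the worker reaches `Cmd::Drain`.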
Frozen overlay during the post-release phase. The level task
was aborted in stop_and_drain the moment capture ended, leaving
the waveform on its last pre-release frame for 4–5 s while STT +
LLM ran. The overlay now switches into the synthetic thinking
animation as soon as F10 is released, and the FFT thinking
visualisation gets evenly spaced inter-bar gaps via integer-aligned
slot widths.
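The changelog does not spell out the layout maths, but one plausible reading of "integer-aligned slot widths" is sketched below: every bar gets the same whole-pixel slot, so rounding never makes one gap wider than its neighbours. The function name, parameters, and the choice to centre the leftover pixels are assumptions.

```rust
/// Returns (x, width) in whole pixels for each bar.
/// Hypothetical helper; not taken from the Fono source.
pub fn bar_slots(panel_width: u32, bars: u32, gap: u32) -> Vec<(u32, u32)> {
    if bars == 0 {
        return Vec::new();
    }
    // Integer slot width: bar plus its trailing gap.
    let slot = panel_width / bars;
    if slot <= gap {
        // Not enough room for even a 1-pixel bar.
        return Vec::new();
    }
    let bar = slot - gap;
    // Pixels actually covered by the grid (the last bar has no trailing gap).
    let used = slot * bars - gap;
    // Leftover pixels go to the outer margins instead of between bars.
    let left = (panel_width - used) / 2;
    (0..bars).map(|i| (left + i * slot, bar)).collect()
}
```

With fractional slot widths (for example 320 px / 100 bars = 3.2 px), rounding each bar position independently would yield a mix of 1-pixel and 2-pixel gaps; fixing the slot at 3 px keeps every gap identical.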

Full Changelog: https://github.com/bogdanr/fono/compare/v0.6.1...v0.7.0
