This release includes 2 breaking changes for platform teams planning a safe upgrade.
✓ No known CVEs patched in this version
Summary
AI summary: Updates docs/decisions/0026-live-preview-as-overlay-style.md, greyed-out, and docs/decisions/0025-cloud-provider-catalogue.md across a mixed release.
Full changelog
Changed
- Live preview is now a waveform style, not a separate toggle. The
tray "Waveform style" submenu gains a fifth entry,
`Transcript (live preview — more CPU / tokens)`, that replaces the old
config-file-only `[interactive].enabled` flag. Picking Transcript
both swaps the overlay to streaming text and routes the
dictation hotkey through the live pipeline (this is the fix for
"live transcription only worked for the assistant, not for
dictation"). `Fft` remains the first-run default; live preview stays
opt-in because it costs more CPU on local STT and more tokens on
any cloud backend that bills per second of streamed audio.
Internally, `Config::live_preview()` is the single source of truth,
defined as `overlay.style == Transcript`. See
ADR 0026.
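A minimal sketch of the "single source of truth" idea above. The `OverlayStyle` variant names and the trimmed-down `Config` / `OverlayConfig` structs are assumptions made for illustration; only `Config::live_preview()` and the `overlay.style == Transcript` definition come from the changelog.

```rust
/// Overlay styles from the tray's "Waveform style" submenu.
/// Variant names are assumptions based on the style names in the changelog.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OverlayStyle {
    Bars,
    Oscilloscope,
    Fft,
    Heatmap,
    Transcript,
}

/// Hypothetical, trimmed-down stand-ins for the real config structs.
pub struct OverlayConfig {
    pub style: OverlayStyle,
}

pub struct Config {
    pub overlay: OverlayConfig,
}

impl Default for Config {
    fn default() -> Self {
        // Fft is the first-run default, so live preview starts off.
        Config { overlay: OverlayConfig { style: OverlayStyle::Fft } }
    }
}

impl Config {
    /// Live preview is on exactly when the overlay style is Transcript.
    pub fn live_preview(&self) -> bool {
        self.overlay.style == OverlayStyle::Transcript
    }
}
```

Deriving the flag from the style means there is no second field that could drift out of sync with the tray selection.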
Removed
- `[interactive].enabled` config field (Fono has no users yet, so no
migration is provided; the field is just gone). The rest of the
`[interactive]` block (boundary heuristics, drain grace,
`cleanup_on_finalize`, prosody/filler vocab, chunk timing) stays
put as streaming-pipeline tuning that applies whenever Transcript is
active.
Added
-
`scripts/capture-overlay.sh`: reproducible overlay screencast
helper for the README. Three modes: `overlay` (tight 640×≤240 crop),
`paste` (overlay + target-app window for "lands in a real app"
demos), and `gallery` (records each waveform style (bars,
oscilloscope, FFT, heatmap), labels them, and stitches the clips
via `ffmpeg -f concat` or a 2×2 `xstack` grid). Detects
X11 vs Wayland, resolves monitor geometry via xrandr / wlr-randr /
swaymsg, encodes MP4 + GIF (palette pipeline with 5 MB soft / 9.5 MB
hard budget auto-tiering) + animated WebP, and probes deps with
per-distro install hints. Dev-only; not part of the shipped binary.
See `docs/troubleshooting.md` → "Capturing screencasts".
-
Onboarding auto-start and contextual tray left-click. Three
small UX changes that turn the first-launch path into a one-command
experience:
  - `sudo fono install` (and therefore
    `curl -fsSL https://fono.page/install | sh`) now starts `fono` in the
    background as the invoking user (picked up from `$SUDO_USER`
    and launched via `runuser` / `sudo` with `setsid` detachment) and
    then runs the `fono setup` wizard interactively in the same
    terminal (also as `$SUDO_USER`, with stdio inherited so the
    prompts reach the user). Running the installer as bare root (no
    `sudo` wrapper) is a fully supported path: fono spawns and the
    wizard runs as root, writing under `/root/.config/fono/`; fono
    is allowed to run as root if that's what you want.
    `packaging/install.sh` re-attaches `</dev/tty` to the install
    invocation under the `curl | sh` transport so the wizard's
    stdin still has a real terminal when curl is piping the script
    in. The backgrounded daemon's stdout/stderr now append to
    `$XDG_STATE_HOME/fono/fono.log` (typically
    `~/.local/state/fono/fono.log`, or `/root/.local/state/fono/fono.log`
    for the bare-root install path), matching `Paths::log_file()`
    so `tail -f` and what fono itself considers its log path are the
    same file. Previously the spawn redirected to `/dev/null`, which
    made post-install troubleshooting needlessly hard. Each step now
    reports a precise outcome (started / setup completed / skipped
    because headless / spawn failed) so users always know exactly
    what happened. Skipped on headless boxes (no
    `DISPLAY` / `WAYLAND_DISPLAY` / `XDG_RUNTIME_DIR`) and bypassable
    with `FONO_INSTALL_NO_START=1` for packagers and CI. The XDG
    autostart entry still handles next-login start. The server-mode
    install path is unchanged: systemd's `systemctl enable --now`
    was already starting the unit (logs via `journalctl -u fono.service`).
  - The daemon now fires a single low-urgency desktop notification
    on startup when no TTS backend is configured, prompting the user
    to run `fono setup`. Once per process; suppressed once setup
    completes (the daemon's IPC `Reload` hook refreshes the
    onboarding snapshot atomically so no restart is required).
  - The tray icon's SNI left-click is now contextual: when TTS is
    not yet configured it nudges toward `fono setup`; once configured
    it shows the current hotkey cheat sheet (dictation / assistant /
    cancel). The "Show last transcription" menu entry continues to
    work for users who want it; the left-click no longer fires that
    action.
Implemented without adding any config field: the question "is setup
finished?" is answered by the new `Config::tts_configured(&Secrets)`
helper, which folds the existing `configured_tts_backends` logic.
`packaging/install.sh` is now the canonical source for the
https://fono.page/install one-liner and lives next to the binary
it ships.
-
Unified log file at `/var/log/fono.log`. Single-user-box
convention: every fono process writes there (world-writable 0666,
pre-created by `fono install`). `Paths::log_file()` now points at
that path. The daemon's `tracing` formatter forces ANSI on, so the
file preserves colors. `fono doctor` appends the last 10 log lines
to its report; `fono doctor -f` (or `--follow`) streams the file in
real time via `tail -F`, ANSI escapes intact. The background spawn
in `fono install` falls back to `/dev/null` if `/var/log/fono.log`
is not writable, so a permissions hiccup never blocks startup.
-
Colorized `fono doctor` output. Section headers in bold cyan,
`ready` / `present` / `exists` in green, `FAIL` / `MISSING` /
`FAILED TO LOAD` / `NONE` in bold red, `disabled` / `(unset)` /
`(fallback)` dimmed, active-provider `*` highlighted. Auto-disabled
when stdout is not a TTY (pipes, redirects, CI) and when `NO_COLOR`
is set, so scripts parsing the output remain unaffected.
-
Animated "POLISHING" overlay for local STT/LLM. The
standalone-waveform overlay's post-release phase used to show a
static "POLISHING" panel while STT (and optional LLM cleanup) ran;
with a local whisper.cpp backend that's a 1–3 s dead patch where
the user has no signal the dictation is actually progressing. The
overlay now reuses the assistant's per-style thinking animation
(FFT bell sweep, neural-strand heatmap, oscilloscope standing
wave, centre-out bars) during that phase whenever the active STT
backend reports `is_local()`, or whenever LLM cleanup is enabled
and the LLM is local. Cloud STT+LLM (sub-second) keep the static
panel so it doesn't just flash. Implemented via a new
`OverlayState::Polishing` variant that shares the amber accent +
"POLISHING" label with the existing `Processing` state but is
consumed by the same synthetic-frame renderer path as
`AssistantThinking`. New default `is_local()` method on both the
`SpeechToText` and `StreamingStt` traits (also `TextFormatter`),
overridden to `true` only in the `whisper-local` and `llama-local`
backends.
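The `is_local()` default-method pattern from the "POLISHING" overlay entry can be sketched roughly as follows. The traits here are stripped-down stand-ins (the real ones carry transcription and formatting methods), and `post_release_state`, `WhisperLocal`, `CloudStt` and `LlamaLocal` are hypothetical names used only for this illustration.

```rust
/// Simplified stand-in for the speech-to-text trait.
pub trait SpeechToText {
    /// Default: assume a cloud backend.
    fn is_local(&self) -> bool {
        false
    }
}

/// Simplified stand-in for the LLM cleanup trait.
pub trait TextFormatter {
    fn is_local(&self) -> bool {
        false
    }
}

pub struct WhisperLocal;
pub struct CloudStt;
pub struct LlamaLocal;

impl SpeechToText for WhisperLocal {
    fn is_local(&self) -> bool {
        true
    }
}

// Inherits the `false` default, so nothing to override.
impl SpeechToText for CloudStt {}

impl TextFormatter for LlamaLocal {
    fn is_local(&self) -> bool {
        true
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum OverlayState {
    /// Static "POLISHING" panel.
    Processing,
    /// Animated "POLISHING" panel.
    Polishing,
}

/// Hypothetical helper: pick the post-release overlay state.
/// `cleanup` is `None` when LLM cleanup is disabled.
pub fn post_release_state(
    stt: &dyn SpeechToText,
    cleanup: Option<&dyn TextFormatter>,
) -> OverlayState {
    let local_llm = cleanup.map_or(false, |llm| llm.is_local());
    if stt.is_local() || local_llm {
        OverlayState::Polishing
    } else {
        OverlayState::Processing
    }
}
```

Because the default returns `false`, every existing cloud backend keeps the static panel without any code change; only the local backends opt in.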
Fixed
-
OpenRouter TTS default swapped from `openai/gpt-4o-mini-tts-…` to
`openai/tts-1` (default voice `alloy`). The LLM-based
`gpt-4o-mini-tts` model produced higher-quality voices but its
streaming output was not reliably forwarded by OpenRouter's
`/audio/speech` proxy: the proxy flushed an ~9.6 KB preamble and
then buffered the rest of the synthesised body until upstream
finished (~30+ s for a typical 200-character reply), exceeding
every reasonable client timeout. Verified via the `fono.http`
instrumentation's one-shot stall hex dump: bytes were valid PCM,
just never delivered. Classical `tts-1` produces audio in
~0.5-2 s regardless of length and the whole body is forwarded in
one go, sidestepping the proxy-buffering problem entirely. Users
who want the LLM-based voice can pin
`[tts.cloud] model = "openai/gpt-4o-mini-tts-2025-12-15"` in
`config.toml` and accept the failure mode on long replies, or
switch to OpenAI direct (where streaming works correctly).
-
OpenRouter TTS second-sentence stalls eliminated by disabling
HTTP/2 connection-pool reuse on the TTS client. Previously, the
first sentence of an assistant turn synthesised correctly but every
subsequent sentence stalled identically (~9.6 KB chunk arrived,
then 15 s of silence, then watchdog fired) — symptomatic of
OpenRouter's `/audio/speech` proxy mishandling multiplexed HTTP/2
streams. The TTS reqwest client now runs with
`pool_max_idle_per_host(0)` and `http1_only()`, forcing a fresh
TCP+TLS handshake per request (~200-400 ms overhead, negligible
against multi-second LLM-based synthesis). Other backends (LLM,
STT, assistant chat) keep their HTTP/2 pooling because no
equivalent stall pattern was observed there.
-
TTS inter-chunk watchdog set to 20 s. Empirically OpenRouter's
`/audio/speech` proxy delivers a small preamble (~9.6 KB across ~8
chunks) and then pauses for several seconds before resuming the
audio stream proper. The previous 5 s watchdog tripped during that
pause and produced false-stall failures on otherwise-healthy
synthesis; 20 s keeps headroom for that pause while still catching
genuinely wedged connections far faster than the overall 30 s
request timeout. A one-shot `warn!`-level hex dump of the partial
body fires on the first TTS stall per process lifetime, surfacing
whether the preamble bytes are SSE framing, JSON metadata, or
genuine PCM: diagnostic data for the next round of investigation.
-
Structured-log `chunks` field now reports the truth on stalled
/ transport-error outcomes. Previously hardcoded to `0` in the TTS,
LLM, and STT consumers, which made it impossible to distinguish
"proxy sent one chunk then hung" from "nothing ever arrived" in
`fono.http=debug` logs. New `BodyError::chunks()` and
`BodyError::after_ms()` accessors expose the underlying watchdog
state to all consumers uniformly.
-
OpenRouter TTS time-to-first-audio collapsed from ~30 s to ~2-4 s
by sending `stream_format: "audio"` on `/audio/speech` requests for
models that benefit from it (OpenRouter's `gpt-4o-mini-tts` and
OpenAI direct). Without this field, OpenAI's LLM-based TTS models
buffer the entire synthesis server-side before opening the response
body, visible in the `fono.http` instrumentation as a ~30 s
`headers_ms` followed by a ~200 ms `body_ms`. With it, the upstream
streams raw audio bytes as they are generated and `headers_ms` drops
to sub-second. The catalogue gates the new field per provider:
enabled for OpenAI and OpenRouter, intentionally omitted for Groq's
Orpheus deployment (which is conservative about unknown request
fields). Classical models like `tts-1` are unaffected: they already
stream by default and accept the field as a no-op.
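To illustrate the inter-chunk watchdog and the `chunks` reporting described in this section, here is a standard-library-only sketch. It models the response body as a channel of byte chunks instead of a real HTTP stream; `read_chunks_with_watchdog` and the fields of this `BodyError` are assumptions for illustration and are not Fono's actual signatures.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Hypothetical error type exposing the two accessors the changelog names.
#[derive(Debug)]
pub struct BodyError {
    chunks: usize,
    after_ms: u128,
}

impl BodyError {
    /// Number of chunks received before the stall.
    pub fn chunks(&self) -> usize {
        self.chunks
    }

    /// Milliseconds elapsed since the read started.
    pub fn after_ms(&self) -> u128 {
        self.after_ms
    }
}

/// Drain body chunks from `rx`, failing if the gap between two chunks
/// exceeds `max_gap`. A closed channel marks the end of the body.
pub fn read_chunks_with_watchdog(
    rx: &Receiver<Vec<u8>>,
    max_gap: Duration,
) -> Result<Vec<u8>, BodyError> {
    let started = Instant::now();
    let mut body = Vec::new();
    let mut chunks = 0;
    loop {
        match rx.recv_timeout(max_gap) {
            Ok(chunk) => {
                chunks += 1;
                body.extend_from_slice(&chunk);
            }
            // Sender dropped: the body is complete.
            Err(RecvTimeoutError::Disconnected) => return Ok(body),
            // No chunk within the allowed gap: report how far we got.
            Err(RecvTimeoutError::Timeout) => {
                return Err(BodyError {
                    chunks,
                    after_ms: started.elapsed().as_millis(),
                });
            }
        }
    }
}
```

Carrying the chunk count inside the error is what lets a log line distinguish "one chunk then hung" from "nothing ever arrived".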
Added
-
Structured HTTP instrumentation across every cloud-backed
pipeline (STT transcribe, LLM cleanup chat, voice-assistant
streaming chat, TTS `/audio/speech`, wizard key validation). A new
`fono-http` crate provides a single per-stage stopwatch
(`RequestTimings`), an inter-chunk body watchdog
(`read_body_with_watchdog`), and one chokepoint
(`emit_http_debug`) that funnels every consumer through the same
schema (`stage`, `provider`, `endpoint`, `status`, `headers_ms`,
`ttfb_ms`, `body_ms`, `decode_ms`, `total_ms`, `body_bytes`,
`content_length`, `chunks`, `request_id`, `attempt`, `outcome`).
Silent by default; opt in per session with
`RUST_LOG=info,fono.http=debug fono daemon`. Detects stalled
bodies in 15-30 s (per stage) rather than waiting for the global
60 s reqwest timeout, surfaces the upstream `x-request-id` /
`request-id` on every response (success and failure), and on TTS
retries once automatically when the upstream stalls mid-body
(typical OpenRouter proxy hiccup). The improved error surface for
stalled TTS now reads e.g.
`openrouter TTS body read failed (request_id=or-…, attempt=2)`
instead of the previous bare
`reading openrouter TTS response body`. Per-stage chunk watchdogs:
TTS 15 s (overall cap reduced from 60 s to 30 s), STT 30 s, LLM
cleanup 30 s, assistant SSE 20 s inter-event.
-
OpenRouter app attribution is now sent on every outbound
request to `openrouter.ai` (STT transcribe + prewarm, LLM chat +
prewarm, voice-assistant chat stream + prewarm, TTS
`/audio/speech`, and the wizard's `validate_cloud_key` probe),
not just from the STT backend as before. The three static headers
are `HTTP-Referer: https://fono.page`,
`X-OpenRouter-Title: Fono`, and
`X-OpenRouter-Categories: personal-agent,writing-assistant`,
identical across every install, no per-user or per-machine
identifier embedded, no request body changes. Fono now appears on
https://openrouter.ai/rankings, in the "Apps" tab of each model
it routes through, and gets a public dashboard at
https://openrouter.ai/apps?url=https://fono.page. The previous
STT-only attribution used the GitHub repo URL as the Referer; the
switch to `fono.page` is a deliberate one-time reset onto the
canonical project homepage. See
https://openrouter.ai/docs/app-attribution and the new
`fono_core::openrouter_attribution` module.
-
`fono setup` now hot-reloads the daemon when it finishes.
Previously, running the wizard while `fono` was already running
saved the new config but the daemon kept using the old one until
manually restarted. The wizard now sends `Request::Reload` over
IPC after `config.toml` / `secrets.toml` are written, and prints
`Daemon reloaded — new settings are live.` (or a friendly
fallback hint when no daemon is running).
-
Desktop notification when a configured backend's API key is
missing at startup or after a config reload. Previously, a
rotated key or a wizard pick whose secret hadn't been added yet
surfaced only as a single `tracing::WARN` line (e.g.
`TTS unavailable; assistant replies will be silent: Cartesia TTS API key "CARTESIA_API_KEY" not found in secrets.toml or environment`).
A new `ErrorClass::MissingKey` variant is now classified from
reload errors and fired as a Critical-urgency popup with copy
that names the env var and the `fono keys add <KEY>` command.
Wired through the LLM / TTS / Assistant reload paths; subject to
the existing session cascade cap.
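As an illustration of the OpenRouter attribution entry in this section, the sketch below keeps the three static headers in one table and hands them out only for requests whose host is `openrouter.ai`. The header names and values are the ones listed in the changelog; the `attribution_headers` helper and its simple host parsing are assumptions, not the contents of the `fono_core::openrouter_attribution` module.

```rust
/// The three static attribution headers listed in the changelog.
pub static OPENROUTER_ATTRIBUTION: [(&str, &str); 3] = [
    ("HTTP-Referer", "https://fono.page"),
    ("X-OpenRouter-Title", "Fono"),
    ("X-OpenRouter-Categories", "personal-agent,writing-assistant"),
];

/// Hypothetical helper: headers to attach for a given request URL.
/// Only requests to openrouter.ai carry the attribution headers.
pub fn attribution_headers(url: &str) -> &'static [(&'static str, &'static str)] {
    // Naive host extraction, good enough for a sketch with https URLs.
    let host = url
        .strip_prefix("https://")
        .unwrap_or(url)
        .split('/')
        .next()
        .unwrap_or("");
    if host == "openrouter.ai" {
        &OPENROUTER_ATTRIBUTION[..]
    } else {
        &[]
    }
}
```

Keeping the values in a single static table is one way to guarantee that every call site sends identical headers with no per-user data.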
Changed
-
OpenRouter TTS default swapped from `hexgrad/kokoro-82m` to
`openai/gpt-4o-mini-tts-2025-12-15` for native multilingual output
(default voice `coral`, $0.60 / 1 M characters). Kokoro voices are
monolingual and prefixed by language code, so every non-English
synthesis was routed through an American-English voice; OpenAI Mini
TTS speaks French, German, Spanish, Romanian, Mandarin, etc.
natively with no per-call language argument or per-language voice
map needed. Existing users who prefer Kokoro can pin
`[tts.cloud] model = "hexgrad/kokoro-82m"` and
`voice = "af_heart"` in `config.toml`; full Kokoro support is
deferred to a future local+cloud-symmetric backend (see
`plans/2026-05-14-kokoro-local-and-cloud-parity-v1.md`).
-
Voice assistant wizard step now renders as an aligned
three-column table (Provider · Model · Key). Model names are
human-readable (`GPT-5.4 mini`, `Claude Haiku 4.5`,
`GPT-OSS 120B`, `Qwen 3 235B`, …) rather than raw catalogue ids,
and the key-status column reads `set` / `missing` instead of the
earlier `(key already set)` / `(will ask for key)` parenthetical.
-
Assistant TTS auto-picked from the same key. When the chosen
assistant chat provider also offers TTS (e.g. OpenAI for both),
the wizard reuses the same provider + key for the spoken reply
and prints
`TTS: <provider> (same key as the assistant — no extra prompt).`
instead of running the explicit TTS picker. The
picker still runs when the chat provider has no TTS capability.
-
Comfortable-tier first-run latency budget bumped from 1500 ms
to 2000 ms. The earlier 1.5 s ceiling tripped first-dictation
warnings on perfectly usable mid-range hardware (laptops on
battery, slower SSDs). 2.0 s reflects measured p50 latency on
the lower end of the Comfortable tier; tiers above it (HighEnd
600 ms / Recommended 1000 ms) are unchanged.
-
Tray TTS submenu drops the redundant `cloud,` prefix and greys
out unavailable backends. Every cloud backend was annotated
`(cloud, will ask for key)` or `(cloud, key already set)`, but
clicking the entry never asked for a key, so the message was
misleading. The submenu now shows backends whose key is missing
as non-clickable (greyed-out) rows with a plain `(no key)`
suffix; backends with a configured key remain clickable. A new
`DISABLED_SENTINEL` prefix in `fono-tray` lets daemon submenus
opt rows out of activation without per-row plumbing.
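One way a sentinel prefix like `DISABLED_SENTINEL` could work is sketched below: the daemon prepends the sentinel to labels that must not be clickable, and the tray strips it while rendering. The sentinel value, `MenuRow`, `backend_label` and `parse_row` are all assumptions for illustration; the changelog only states that such a prefix exists in `fono-tray`.

```rust
/// Assumed sentinel value; the real constant in `fono-tray` may differ.
pub const DISABLED_SENTINEL: &str = "\u{0}disabled:";

/// A submenu row as the tray would render it.
#[derive(Debug, PartialEq, Eq)]
pub struct MenuRow {
    pub label: String,
    pub enabled: bool,
}

/// Daemon side: build the label for a TTS backend row.
pub fn backend_label(name: &str, has_key: bool) -> String {
    if has_key {
        name.to_string()
    } else {
        format!("{}{} (no key)", DISABLED_SENTINEL, name)
    }
}

/// Tray side: strip the sentinel and grey the row out when it is present.
pub fn parse_row(raw: &str) -> MenuRow {
    match raw.strip_prefix(DISABLED_SENTINEL) {
        Some(rest) => MenuRow { label: rest.to_string(), enabled: false },
        None => MenuRow { label: raw.to_string(), enabled: true },
    }
}
```

Encoding the state in the label string means the submenu API can keep passing plain strings, which matches the "without per-row plumbing" remark above.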
Fixed
-
Groq TTS rejected `response_format: pcm` with HTTP 400
(`response_format must be one of [wav]`). Groq's Orpheus
deployment only emits WAV-wrapped audio. The OpenAI-compat TTS
client now reads its `response_format` from the catalogue
(`OpenAiCompat { base_url, response_format }`) and strips the
RIFF/WAVE header transparently when the provider returns WAV,
yielding the same raw 24 kHz int16 LE PCM the playback path
expects. OpenAI and OpenRouter keep `pcm` (lowest latency).
-
Groq TTS rejected the default voice (`tara`) with HTTP 400
(`voice must be one of the following voices: [autumn diana hannah austin daniel troy]`).
Fono's catalogue defaulted to `tara`,
which is part of Canopy Labs' open-source Orpheus voice set but
not part of Groq's hosted six-voice subset for
`canopylabs/orpheus-v1-english`. The Groq TTS default voice is
now `hannah` (neutral female, in Groq's curated set). Users with
an explicit `[tts.cloud.groq].voice` override pinned to a
Canopy-only voice (`tara` / `leah` / `jess` / `leo` / `dan` / `mia` / `zac` / `zoe`)
must edit to one of `autumn` / `diana` / `hannah` / `austin` / `daniel` /
`troy` to get audio out of Groq.
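The RIFF/WAVE header stripping mentioned in the first Groq fix of this section can be sketched as a chunk walk that returns the payload of the `data` chunk. This is a minimal illustration operating on a fully buffered response; `strip_wav_header` is a hypothetical name, and the real client may handle streaming bodies and edge cases differently.

```rust
/// Return the PCM payload of a RIFF/WAVE buffer. Input that does not start
/// with a RIFF/WAVE header is assumed to be raw PCM already and is returned
/// unchanged. A WAVE buffer without a `data` chunk yields an empty slice.
pub fn strip_wav_header(bytes: &[u8]) -> &[u8] {
    if bytes.len() < 12 || &bytes[0..4] != b"RIFF" || &bytes[8..12] != b"WAVE" {
        return bytes;
    }
    // Walk the chunk list that follows the 12-byte RIFF header.
    let mut pos = 12;
    while bytes.len().saturating_sub(pos) >= 8 {
        let id = &bytes[pos..pos + 4];
        let size = u32::from_le_bytes([
            bytes[pos + 4],
            bytes[pos + 5],
            bytes[pos + 6],
            bytes[pos + 7],
        ]) as usize;
        let start = pos + 8;
        if id == b"data" {
            // Streaming encoders may write a placeholder size, so clamp
            // the end of the payload to the end of the buffer.
            let end = start.saturating_add(size).min(bytes.len());
            return &bytes[start..end];
        }
        // Skip this chunk; chunks are padded to an even number of bytes.
        pos = start.saturating_add(size).saturating_add(size & 1);
    }
    &[]
}
```

Walking chunks instead of skipping a fixed 44 bytes keeps the sketch correct when a provider inserts extra chunks (for example `LIST`) before the audio data.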
Added
- Desktop notification when a TTS/STT/LLM/assistant model requires
terms acceptance. Providers like Groq return HTTP 400 with
`model_terms_required` when an org admin hasn't accepted a model's
terms (e.g. Orpheus, PlayAI). The critical-notify classifier now
recognises that shape as a new `TermsRequired` class, and the
notification body embeds the acceptance URL extracted from the
provider response so the user can click straight through to the
console. Subject to the existing session cascade cap.
Fixed
-
Anthropic LLM cleanup 400
`stop_sequences: each stop sequence must contain non-whitespace`.
The client was sending
`stop_sequences = ["\n\n"]`, which Anthropic now rejects. The
blank-line heuristic is dropped; cleanup output length is bounded by
`max_tokens = 512` alone.
-
Groq assistant returned 404 (`model_not_found`) because the
catalogue advertised `llama-4-maverick-17b-128e-instruct` as Groq's
multimodal model and the new default of `prefer_vision = true`
caused the runtime to swap to it. That model isn't available on
Groq today. Groq's `multimodal_model` is now `None`; the assistant
uses `openai/gpt-oss-120b` (the existing `text_model`) for every
Groq request.
-
Groq TTS model decommissioned. The previously catalogued
`playai-tts` model (voice `Fritz-PlayAI`) was retired by Groq and
now returns `model_not_found`. Groq's catalogue entry now points
at `canopylabs/orpheus-v1-english` (Canopy Labs' Orpheus,
OpenAI-compatible audio/speech on Groq) with default voice `tara`. The
endpoint URL and auth header are unchanged.
-
OpenAI assistant requests rejected by chat/completions when
`prefer_web_search` was on (`Invalid value: 'web_search_preview'`).
The `web_search_preview` tool descriptor is Responses-API-only;
chat/completions rejects unknown tool types with a 400. OpenAI's
catalogue entry now advertises `web_search = None`; the default of
`[assistant].prefer_web_search` has been flipped to `false`.
Anthropic's `web_search_20250305` (Messages API) is unaffected. A
future commit will re-enable OpenAI web search via the Responses
API migration. As a belt-and-braces defence, the OpenAI
chat/completions client now drops any web-search tool descriptor
at request build time and emits a one-shot `tracing::warn!` so a
hand-edited `prefer_web_search = true` no longer surfaces a 400 to
the user.
-
Cloud STT clients (OpenAI, Deepgram) were missing from the default
build. `crates/fono/Cargo.toml` listed `fono-stt` and `fono-llm`
with no feature selection, so the default release shipped only the
per-crate `default` features (Groq + Wyoming STT, OpenAI-compat +
Groq LLM). A user picking OpenAI as primary in the wizard hit a
`STT not compiled in` warning at daemon startup. `fono-stt` is now
built with `groq + openai + deepgram + wyoming`; `fono-llm` is
built with `cerebras + openai-compat + anthropic`. The `cloud-all`
meta-feature is widened to match. (Cartesia / AssemblyAI STT
clients are not yet wired as `fono-stt` features; tracked
separately.)
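The defensive filter from the OpenAI web-search fix in this section (drop web-search tool descriptors at request build time, warn once) might look roughly like this. `ToolDescriptor` is a simplified stand-in for the JSON the real client builds, `drop_web_search_tools` is a hypothetical name, and `eprintln!` stands in for `tracing::warn!` so the sketch stays dependency-free.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical, simplified tool descriptor.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToolDescriptor {
    pub kind: String,
}

/// Set once the warning has been printed for this process.
static WARNED: AtomicBool = AtomicBool::new(false);

/// Remove web-search tools before a chat/completions request is built.
/// Returns how many descriptors were dropped and warns at most once.
pub fn drop_web_search_tools(tools: &mut Vec<ToolDescriptor>) -> usize {
    let before = tools.len();
    tools.retain(|t| !t.kind.starts_with("web_search"));
    let dropped = before - tools.len();
    // `swap` returns the previous value, so only the first caller warns.
    if dropped > 0 && !WARNED.swap(true, Ordering::Relaxed) {
        eprintln!("warning: web search is not supported on chat/completions; tool dropped");
    }
    dropped
}
```

Filtering at build time means a hand-edited config degrades to a plain chat request instead of failing with a 400.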
Added
- Cloud provider capability catalogue. A single
`fono_core::provider_catalog::CLOUD_PROVIDERS` table is the source of
truth for which cloud providers offer STT / LLM cleanup / assistant
chat / vision / web search / TTS. The wizard, tray, `fono use cloud`,
and `fono doctor` all consume the catalogue, eliminating the five
duplicated `match` blocks the wizard used to carry. (Phase A, #9; see
ADR 0025.)
- Multi-provider TTS for the voice assistant (#11). The assistant
audio path now supports Groq (PlayAI `playai-tts`), OpenRouter
(Kokoro `hexgrad/kokoro-82m`), Cartesia (`sonic-2`), and Deepgram
(`aura-2-thalia-en`) in addition to OpenAI and Wyoming. Users on a
non-OpenAI primary can run the full record → STT → LLM → TTS loop
without obtaining a second key. `CARTESIA_API_KEY` and
`DEEPGRAM_API_KEY` already present in `secrets.toml` from STT usage
are reused automatically; the wizard's TTS picker orders providers
with stored keys first.
- Optional assistant extras. Two new `[assistant]` toggles surface
in the wizard's Optional extras MultiSelect when the chosen primary
supports them: `prefer_vision` swaps the assistant chat model for the
provider's multimodal variant (OpenAI / Anthropic / Groq / Gemini),
and `prefer_web_search` attaches the provider's native web-search
tool to every assistant request (OpenAI's `web_search_preview`,
Anthropic's `web_search_20250305`; Gemini's `google_search` is
catalogued for forward compatibility). Both default to `false`.
- Desktop notifications for critical pipeline failures. Total STT
pipeline failures (auth errors, network errors, 5xx) and LLM-cleanup
auth-class failures now fire a Critical-urgency desktop notification
in addition to the existing `error!` / `warn!` log line, so an
expired API key is no longer silently buried in journalctl. Dedup
is per-session and per `(stage, provider, error class)`, so a stuck
key pops exactly once per F8/F9 press and an STT-auth + LLM-auth
failure in the same session each get their own surface. LLM
transient errors (network blips, 5xx) keep the existing silent
fallback to the raw STT transcript; only configuration-class
failures pop a notification.
- Critical-failure notification coverage extended (issue #8). TTS
(assistant-mode reply playback), Assistant chat (both stream-open
and mid-stream errors), and text-injection failures now route
through the same `critical_notify` surface as STT/LLM, so a
rotated API key in any stage produces a Critical-urgency popup
instead of journal-only output. The LLM cleanup path also now
notifies on `Network`-class failures (previously `Auth`-only), so
an offline endpoint is visibly surfaced.
- Daemon-startup-failure notification. When `fono daemon` exits
with an error (bad config, locked single-instance socket, hotkey
backend init failure), a one-shot Critical-urgency notification
fires before the process exits, pointing the user at
`journalctl --user -u fono` and `fono doctor`. This addresses the
systemd `--user` autostart case where stderr is invisible.
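The per-session dedup on `(stage, provider, error class)` described in this section can be sketched with a set keyed by that triple. `SessionDedup`, the variant names of `Stage` and `ErrorClass`, and the `reset` hook are assumptions for illustration only.

```rust
use std::collections::HashSet;

/// Pipeline stages; variant names are assumed for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Stage {
    Stt,
    Llm,
    Tts,
    Assistant,
}

/// Error classes; only a subset is modelled here.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ErrorClass {
    Auth,
    Network,
}

/// Hypothetical per-session dedup set keyed by (stage, provider, error class).
#[derive(Default)]
pub struct SessionDedup {
    seen: HashSet<(Stage, String, ErrorClass)>,
}

impl SessionDedup {
    /// Returns true only the first time a key is seen in this session.
    pub fn should_notify(&mut self, stage: Stage, provider: &str, class: ErrorClass) -> bool {
        // `HashSet::insert` returns false when the key was already present.
        self.seen.insert((stage, provider.to_string(), class))
    }

    /// Called when a new hotkey press starts a fresh session.
    pub fn reset(&mut self) {
        self.seen.clear();
    }
}
```

With this shape, an STT auth failure and an LLM auth failure in the same session produce two notifications, while a repeated STT auth failure produces one.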
Changed
-
Assistant extras default policy. `prefer_vision` stays
default-on (no API impact: the multimodal model is the same model
on OpenAI/Anthropic, just with image input capability advertised).
`prefer_web_search` now defaults off: the only provider whose
chat API supports it natively today is Anthropic, and
OpenAI's chat/completions endpoint hard-rejects the
`web_search_preview` descriptor. The default flips back to `true`
once the OpenAI client migrates to the Responses API.
-
Wizard first-run UX corrections (pre-release polish).
  - The step-1 path picker is now a fixed-order two-column table
    (`Local` / `Cloud` / `Customize`) instead of a tier-dependent
    paragraph-shaped list. Column padding is computed from the
    longest option name + 2 spaces so future variants stay aligned.
  - The language picker is skipped entirely when the OS reports at
    least one detected language; the picker only renders for the
    zero-detection fallback. A one-line info trace records the
    detected codes and points the user at the tray's Languages
    submenu for editing.
  - The "Enable live dictation?" question is dropped from every
    branch: the tray's existing toggle is the editing surface, and
    `config.interactive.enabled` already defaults to `false`.
  - The cloud-assistant fast-path is now automatic: when the chosen
    primary covers chat, the assistant is enabled without a Confirm.
    Two info lines state the configuration; `pick_tts_for_assistant`
    still runs when no TTS was set, and `prompt_assistant_extras`
    keeps vision / web-search as explicit opt-ins. The legacy
    `Confirm("Enable the voice assistant?")` survives only for the
    local-LLM branch where no catalogue primary matches.
-
Wizard cloud branch collapsed onto a single primary-provider
picker (#9). Picking OpenAI or Groq now configures STT, LLM
cleanup, the voice assistant, and TTS from one API-key prompt;
picking Anthropic / Cerebras / OpenRouter configures LLM + Assistant
and asks an opt-in follow-up only for the capabilities the primary
doesn't cover. The wizard label list shows runtime-derived capability
badges (STT · LLM · Assistant · TTS · Vision · Search), capped at
six per row.
-
`PathChoice::Mixed` renamed to `PathChoice::Customize`. The
advanced wizard branch now appears in the top-level menu as
"Customize each capability (advanced)". Legacy configs that still
carry `mixed` semantics continue to load; there is no on-disk
enum to migrate.
-
Re-running the wizard reuses stored keys silently. Every
cloud-key prompt now routes through `prompt_or_reuse_key`, which
prints a single `reusing <KEY> from secrets.toml` line instead of
re-asking. A returning user with a populated `secrets.toml` sees
zero key prompts on a wizard re-run.
-
Cascade cap on critical notifications (issue #8). When a single
root cause (e.g. a rotated cloud API key) cascade-fails through
STT → LLM → Assistant → TTS in the same dictation session, the
user now sees exactly one notification — the first stage to
fail — instead of one per stage. Downstream failures still go to
the journal at `warn!`. The cap auto-resets at the start of each
new F8/F9/F10 press and after 120 s of dictation inactivity.
`Stage` is now `#[non_exhaustive]` so future stages can be added
without breaking matches.
-
Hotkeys auto-detect toggle vs push-to-talk per press. A short tap
(under one second) on the dictation or assistant hotkey toggles
recording on; pressing-and-holding for at least a second flips the
same key into push-to-talk and recording stops on release. The
global `[hotkeys].mode = "toggle" | "hold"` setting is removed;
there is now one consistent behaviour across both keys with no
configuration required.
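The tap-versus-hold detection from the hotkey entry in this section can be modelled as a small state machine. This is a sketch under stated assumptions: the one-second threshold comes from the changelog, but the `Hotkey` type, its method names, and the choice that a second tap stops recording immediately on press are illustrative guesses rather than Fono's implementation.

```rust
use std::time::Duration;

/// Threshold from the changelog: a press shorter than this is a tap.
pub const HOLD_THRESHOLD: Duration = Duration::from_secs(1);

/// Hypothetical per-key state machine.
#[derive(Default)]
pub struct Hotkey {
    recording: bool,
    started_by_this_press: bool,
}

impl Hotkey {
    /// Key down: toggle recording and remember whether this press started it.
    pub fn on_press(&mut self) {
        self.started_by_this_press = !self.recording;
        self.recording = !self.recording;
    }

    /// Key up: a long hold that started recording behaves as push-to-talk,
    /// so recording stops on release. A short tap leaves the state alone.
    pub fn on_release(&mut self, held: Duration) {
        if self.started_by_this_press && held >= HOLD_THRESHOLD {
            self.recording = false;
        }
        self.started_by_this_press = false;
    }

    pub fn is_recording(&self) -> bool {
        self.recording
    }
}
```

Deciding at release time is what removes the need for a `mode` setting: the same key press is interpreted after the fact from how long it was held.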
Removed
- `[hotkeys].mode` configuration field. Old configs that still set
`mode = "toggle"` or `mode = "hold"` continue to load (serde
silently ignores the unknown field); the value has no effect. The
`HotkeyMode` enum is dropped from `fono_core::config`.
Full Changelog: https://github.com/bogdanr/fono/compare/v0.7.1...v0.8.0
Breaking Changes
- `[interactive].enabled` config field removed; no migration provided. All related logic is now driven by `overlay.style == Transcript`.
- `[hotkeys].mode` configuration field removed; hotkey behavior is now auto-detected and cannot be overridden.