Fono

v0.3.0 Breaking

This release includes two breaking changes that platform teams should review before upgrading.

Published 2mo AI Agents & Assistants

✓ No known CVEs patched

Topics

assistant dictation linux llm local-first rust speach-to-text stt vulkan whisper wyoming

Summary

AI summary

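Reworks cloud STT language handling (see docs/decisions/0017-cloud-stt-language-stickiness.md), adds a Linux tray Languages submenu, hardens LLM cleanup against clarification replies, and deprecates two language-selection APIs.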

Full changelog

Cloud STT now self-heals from one-off language misdetections, the LLM
cleanup stage stops occasionally replying with a question instead of
the cleaned text, and every release tag is gated on a real Groq
equivalence check across five languages.

Added

Cloud equivalence gate at release time: a new cloud-equivalence
job in .github/workflows/release.yml calls Groq's
whisper-large-v3-turbo against the existing multilingual fixture
set (en × 4, ro × 3, es × 1, fr × 1, zh × 1; ~110 audio-seconds
total) and diffs the per-fixture verdicts against a committed
baseline at docs/bench/baseline-cloud-groq.json. Blocks artefact
production on failure. Auto-skipped when GROQ_API_KEY is unset
(forks, bootstrap tags) or the tag carries the -no-cloud-gate
suffix (operator escape hatch). Cost per release: < 0.5 % of
Groq's free-tier daily cap. See ADR
0021-cloud-equivalence-via-real-api.md
and docs/dev/release-checklist.md.

fono-bench equivalence --stt groq accepts cloud Groq as an STT
backend. Reads GROQ_API_KEY from env; default model
whisper-large-v3-turbo, overridable via --model. New
--rate-limit-ms <ms> flag (default 250 ms for --stt groq, 0
otherwise) paces requests under Groq's 30-req/min ceiling. HTTP
429 is a hard fail with code 3 and an explanatory message; never
retried.

New docs/dev/release-checklist.md documenting the bootstrap
command for the cloud-equivalence baseline, the regenerate
conditions, and the -no-cloud-gate override.
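
The skip conditions and pacing defaults described above can be sketched as two small functions. This is an illustration only, not the code in release.yml or fono-bench: the function names are hypothetical, and treating an empty GROQ_API_KEY the same as an unset one is an assumption.

```rust
/// Decide whether the cloud-equivalence gate should run for a release tag.
/// `api_key` stands in for the value of the GROQ_API_KEY environment variable.
/// Hypothetical helper; treating an empty key as unset is an assumption.
pub fn should_run_cloud_gate(api_key: Option<&str>, tag: &str) -> bool {
    let has_key = api_key.map_or(false, |k| !k.trim().is_empty());
    // Operator escape hatch: tags carrying the -no-cloud-gate suffix skip the gate.
    let opted_out = tag.ends_with("-no-cloud-gate");
    has_key && !opted_out
}

/// Default pacing between requests, mirroring the documented
/// --rate-limit-ms defaults: 250 ms for the groq backend, 0 otherwise.
pub fn default_rate_limit_ms(stt_backend: &str) -> u64 {
    if stt_backend == "groq" {
        250
    } else {
        0
    }
}
```

Under these assumptions a fork without the secret, or a tag such as v0.3.0-no-cloud-gate, skips the gate, while a normal tag with the key present runs it.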

Fixed

LLM cleanup occasionally returned a clarification reply
(“It seems like you're describing a situation, but the details are
incomplete. Could you provide the full text you're referring to…”)
instead of the cleaned transcript. Reproducible across every
cleanup backend — Cerebras, Groq, OpenAI, OpenRouter, Ollama,
Anthropic, and the local llama.cpp path — because the failure mode
is a property of how chat-trained LLMs interpret a bare short
utterance, not of any single provider. The fix is correspondingly
universal: the default cleanup prompt was rewritten with hard
“never ask for clarification” rules; every backend now wraps the
user message in unambiguous <<< / >>> delimiters so the
transcript cannot be mistaken for a chat message; and a refusal
detector rejects clarification-shaped replies and falls back to the
raw STT text. Applied identically to OpenAiCompat, AnthropicLlm,
and LlamaLocal. See
plans/2026-04-28-llm-cleanup-clarification-refusal-fix-v1.md.
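
A rough sketch of the delimiter wrapping and refusal fallback described above. The function names, the newline placement around the delimiters, and the marker phrases are all assumptions made for illustration; Fono's actual detector and prompt are not shown in this changelog.

```rust
/// Wrap the raw transcript in <<< / >>> delimiters so the model treats it
/// as data to clean rather than as a chat message. Placing the delimiters
/// on their own lines is an assumption.
pub fn wrap_transcript(raw: &str) -> String {
    format!("<<<\n{}\n>>>", raw.trim())
}

/// Heuristic check for clarification-shaped replies.
/// The marker phrases are illustrative, not Fono's real list.
pub fn looks_like_clarification(reply: &str) -> bool {
    const MARKERS: [&str; 4] = [
        "could you provide",
        "could you clarify",
        "it seems like you're",
        "please provide",
    ];
    let lower = reply.to_lowercase();
    MARKERS.iter().any(|m| lower.contains(m))
}

/// Return the cleaned text, or fall back to the raw STT text when the
/// reply is empty or looks like a request for clarification.
pub fn accept_or_fallback(raw_stt: &str, llm_reply: &str) -> String {
    if llm_reply.trim().is_empty() || looks_like_clarification(llm_reply) {
        raw_stt.to_string()
    } else {
        llm_reply.trim().to_string()
    }
}
```

Falling back to the raw STT text means the worst case is an uncleaned transcript, never a chat-style question pasted into the user's document.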

Changed

[llm].skip_if_words_lt default raised from 0 to 3. One- and
two-word captures (“yes”, “okay”, “send it”) now bypass the LLM
cleanup roundtrip entirely, regardless of whether the configured
backend is cloud or local, saving 150–800 ms and avoiding the
short-utterance clarification failure mode at the source. Override
in config.toml if you want every utterance cleaned.

[stt.cloud].cloud_rerun_on_language_mismatch default flipped from
false to true. Combined with the new in-memory language cache,
cloud STT now self-heals from one-off language misdetections (e.g.
Groq Turbo flagging accented English as Russian) at the cost of one
extra round-trip per misfire. Set false to opt out.
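
The effect of the new skip_if_words_lt default can be illustrated with a one-line check. This is a sketch: the function name is hypothetical, and counting words by splitting on whitespace is an assumption about how Fono counts them.

```rust
/// Return true when the transcript has fewer words than the threshold and
/// should bypass LLM cleanup. Whitespace-based word counting is an assumption.
pub fn should_skip_cleanup(transcript: &str, skip_if_words_lt: usize) -> bool {
    transcript.split_whitespace().count() < skip_if_words_lt
}
```

With the new default of 3, "yes" and "send it" skip cleanup while "send it now" does not. With the old default of 0, nothing is skipped, which is the behaviour to restore if every utterance should be cleaned.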

Added

In-memory per-backend language cache
(crates/fono-stt/src/lang_cache.rs). Records the most recently
correctly-detected language code per cloud STT backend; consulted
only as a rerun target when post-validation fires. No file I/O,
no persistence; daemon restarts rebuild within one or two
utterances. OS locale (LANG / LC_ALL) seeds the cache at start
if and only if its alpha-2 code is in general.languages.

New crates/fono-core/src/locale.rs: a POSIX-locale → BCP-47 alpha-2
parser; used by both the cache bootstrap and the wizard.

Tray Languages submenu (Linux): read-only checkbox display of
the configured peer set plus a "Clear language memory" item that
drops every entry from the in-memory cache.

New ADR
docs/decisions/0017-cloud-stt-language-stickiness.md
documenting why the cache is rerun-only, in-memory only, and
peer-symmetric (no primary/secondary).
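
A minimal sketch of how such a cache and locale seed could fit together. The type, method, and function names here are hypothetical and are not taken from lang_cache.rs or locale.rs; the parser handles only the simple two-letter case.

```rust
use std::collections::HashMap;

/// Extract a lowercase two-letter language code from a POSIX locale string
/// such as "ro_RO.UTF-8". Returns None for "C", "POSIX", or anything whose
/// language part is not exactly two ASCII letters. Hypothetical helper.
pub fn alpha2_from_posix_locale(locale: &str) -> Option<String> {
    let lang = locale
        .split(|c: char| matches!(c, '_' | '.' | '@'))
        .next()?;
    if lang.len() == 2 && lang.chars().all(|c| c.is_ascii_alphabetic()) {
        Some(lang.to_ascii_lowercase())
    } else {
        None
    }
}

/// In-memory map from backend name to the last correctly detected language.
/// Nothing is written to disk, so a restart starts from an empty map.
#[derive(Default)]
pub struct LangCache {
    last_good: HashMap<String, String>,
}

impl LangCache {
    /// Seed from the OS locale only if its code is among the configured languages.
    pub fn seed_from_locale(&mut self, backend: &str, locale: &str, configured: &[&str]) {
        if let Some(code) = alpha2_from_posix_locale(locale) {
            if configured.iter().any(|c| *c == code) {
                self.last_good.insert(backend.to_string(), code);
            }
        }
    }

    /// Record a language that passed post-validation for this backend.
    pub fn record(&mut self, backend: &str, code: &str) {
        self.last_good.insert(backend.to_string(), code.to_string());
    }

    /// Language to retry with when post-validation flags a misdetection.
    pub fn rerun_target(&self, backend: &str) -> Option<&str> {
        self.last_good.get(backend).map(String::as_str)
    }

    /// Drop every entry, as the "Clear language memory" tray item does.
    pub fn clear(&mut self) {
        self.last_good.clear();
    }
}
```

Keeping the map keyed by backend means a misdetection on one cloud provider does not change the rerun target for another.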

Deprecated

[stt.cloud].cloud_force_primary_language: superseded by the
in-memory language cache. Field still parses for one release; will
be removed in v0.5.

LanguageSelection::primary(): renamed to fallback_hint(). The
alias is retained as #[deprecated] for one release; usage is
scope-restricted in its doc-comment to single-language transports.

See plans/2026-04-28-multi-language-stt-no-primary-v3.md.
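
For callers migrating off primary(), the rename pattern looks roughly like the sketch below. Only the method names primary() and fallback_hint() come from the changelog; the struct layout, the constructor, and the Option<&str> return type are assumptions made for illustration.

```rust
/// Stand-in for Fono's LanguageSelection; the field and constructor are hypothetical.
pub struct LanguageSelection {
    languages: Vec<String>,
}

impl LanguageSelection {
    pub fn new(languages: &[&str]) -> Self {
        Self {
            languages: languages.iter().map(|l| l.to_string()).collect(),
        }
    }

    /// New name. Returning the first configured language is an assumption.
    pub fn fallback_hint(&self) -> Option<&str> {
        self.languages.first().map(String::as_str)
    }

    /// Old name, kept as a deprecated alias that forwards to the new one.
    #[deprecated(note = "renamed to fallback_hint()")]
    pub fn primary(&self) -> Option<&str> {
        self.fallback_hint()
    }
}
```

Existing call sites keep compiling but emit a deprecation warning, which gives downstream code one release to switch to fallback_hint().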

Breaking Changes

[stt.cloud].cloud_force_primary_language will be removed in v0.5
LanguageSelection::primary() is deprecated; use fallback_hint()
