This release includes 2 breaking changes for platform teams planning a safe upgrade.
✓ No known CVEs patched in this version
Summary
AI summary: Updates Deprecated, Linux, and docs/decisions/0017-cloud-stt-language-stickiness.md across a mixed release.
Full changelog
Cloud STT now self-heals from one-off language misdetections, the LLM
cleanup stage stops occasionally replying with a question instead of
the cleaned text, and every release tag is gated on a real Groq
equivalence check across five languages.
Added
- Cloud equivalence gate at release time: a new cloud-equivalence
job in .github/workflows/release.yml calls Groq's
whisper-large-v3-turbo against the existing multilingual fixture
set (en × 4, ro × 3, es × 1, fr × 1, zh × 1; ~110 audio-seconds
total) and diffs the per-fixture verdicts against a committed
baseline at docs/bench/baseline-cloud-groq.json. Blocks artefact
production on failure. Auto-skipped when GROQ_API_KEY is unset
(forks, bootstrap tags) or the tag carries the -no-cloud-gate
suffix (operator escape hatch). Cost per release: < 0.5 % of
Groq's free-tier daily cap. See ADR
0021-cloud-equivalence-via-real-api.md
and docs/dev/release-checklist.md.
- fono-bench equivalence --stt groq accepts cloud Groq as an STT
backend. Reads GROQ_API_KEY from env; default model
whisper-large-v3-turbo, overridable via --model. New
--rate-limit-ms <ms> flag (default 250 ms for --stt groq, 0
otherwise) paces requests under Groq's 30-req/min ceiling. HTTP
429 is a hard fail with code 3 and an explanatory message; never
retried.
- New docs/dev/release-checklist.md documenting the bootstrap
command for the cloud-equivalence baseline, the regenerate
conditions, and the -no-cloud-gate override.
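The baseline comparison behind the gate can be pictured roughly as follows. This is a minimal sketch, not the project's implementation: the Verdict enum, the diff_verdicts function, and the fixture names in the usage note are hypothetical, and the real baseline file format is not described in these notes.

```rust
use std::collections::BTreeMap;

/// Per-fixture outcome of an equivalence run (hypothetical shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Fail,
}

/// Returns the fixtures whose verdict differs from the baseline, followed by
/// fixtures that appear only in the current run. An empty result means the
/// gate would pass under this sketch's assumptions.
pub fn diff_verdicts(
    baseline: &BTreeMap<String, Verdict>,
    current: &BTreeMap<String, Verdict>,
) -> Vec<String> {
    let mut drift = Vec::new();
    // Fixtures that changed verdict or disappeared from the current run.
    for (name, expected) in baseline {
        match current.get(name) {
            Some(actual) if actual == expected => {}
            _ => drift.push(name.clone()),
        }
    }
    // Fixtures that exist now but are missing from the committed baseline.
    for name in current.keys() {
        if !baseline.contains_key(name) {
            drift.push(name.clone());
        }
    }
    drift
}
```

A release job built on this idea would fail the tag whenever diff_verdicts returns a non-empty list, and would skip the comparison entirely when GROQ_API_KEY is unset or the tag ends in -no-cloud-gate.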
Fixed
- LLM cleanup occasionally returned a clarification reply
(“It seems like you're describing a situation, but the details are
incomplete. Could you provide the full text you're referring to…”)
instead of the cleaned transcript. Reproducible across every
cleanup backend — Cerebras, Groq, OpenAI, OpenRouter, Ollama,
Anthropic, and the local llama.cpp path — because the failure mode
is a property of how chat-trained LLMs interpret a bare short
utterance, not of any single provider. The fix is correspondingly
universal: the default cleanup prompt was rewritten with hard
“never ask for clarification” rules; every backend now wraps the
user message in unambiguous <<< / >>> delimiters so the
transcript cannot be mistaken for a chat message; and a refusal
detector rejects clarification-shaped replies and falls back to the
raw STT text. Applied identically to OpenAiCompat, AnthropicLlm,
and LlamaLocal. See
plans/2026-04-28-llm-cleanup-clarification-refusal-fix-v1.md.
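The delimiter wrapping and refusal fallback described above could look something like the sketch below. The function names and the marker phrases are illustrative assumptions; the actual detector rules and the exact delimiter layout live in the project source and are not reproduced in these notes.

```rust
/// Wrap the raw transcript in <<< / >>> delimiters so the model treats it as
/// data to clean rather than a chat message to answer. The newline layout is
/// an assumption of this sketch.
pub fn wrap_transcript(raw: &str) -> String {
    format!("<<<\n{}\n>>>", raw.trim())
}

/// Heuristic check for clarification-shaped replies. The marker phrases are
/// illustrative assumptions, not the project's actual list.
pub fn looks_like_clarification(reply: &str) -> bool {
    const MARKERS: [&str; 4] = [
        "could you provide",
        "could you clarify",
        "it seems like you",
        "please provide",
    ];
    let lower = reply.to_lowercase();
    MARKERS.iter().any(|marker| lower.contains(marker))
}

/// Return the cleaned reply, unless it is empty or looks like a request for
/// clarification, in which case fall back to the raw STT text.
pub fn cleaned_or_raw(raw: &str, reply: &str) -> String {
    let reply = reply.trim();
    if reply.is_empty() || looks_like_clarification(reply) {
        raw.to_string()
    } else {
        reply.to_string()
    }
}
```

Falling back to the raw text keeps the failure mode benign: the user gets an uncleaned transcript instead of a chatbot question pasted into their document.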
Changed
- [llm].skip_if_words_lt default raised from 0 to 3. One- and
two-word captures (“yes”, “okay”, “send it”) now bypass the LLM
cleanup roundtrip entirely, regardless of whether the configured
backend is cloud or local, saving 150–800 ms and avoiding the
short-utterance clarification failure mode at the source. Override
in config.toml if you want every utterance cleaned.
- [stt.cloud].cloud_rerun_on_language_mismatch default flipped from
false to true. Combined with the new in-memory language cache,
cloud STT now self-heals from one-off language misdetections (e.g.
Groq Turbo flagging accented English as Russian) at the cost of one
extra round-trip per misfire. Set false to opt out.
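The effect of the new skip_if_words_lt default can be illustrated with a small sketch. The function name is hypothetical, and counting words by splitting on whitespace is an assumption; the project may count words differently, particularly for languages written without spaces.

```rust
/// True when the capture has fewer words than `skip_if_words_lt`, meaning the
/// LLM cleanup roundtrip would be bypassed. Whitespace-based word counting is
/// an assumption of this sketch.
pub fn should_skip_cleanup(transcript: &str, skip_if_words_lt: usize) -> bool {
    transcript.split_whitespace().count() < skip_if_words_lt
}
```

With the new default of 3, "yes" and "send it" skip cleanup while "send it now" does not. With the old default of 0 nothing is ever skipped, because no word count is below zero.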
Added
- In-memory per-backend language cache
(crates/fono-stt/src/lang_cache.rs). Records the most recently
correctly-detected language code per cloud STT backend; consulted
only as a rerun target when post-validation fires. No file I/O,
no persistence; daemon restarts rebuild within one or two
utterances. OS locale (LANG/LC_ALL) seeds the cache at start
if and only if its alpha-2 code is in general.languages.
- New crates/fono-core/src/locale.rs: POSIX-locale → BCP-47 alpha-2
parser; used by both the cache bootstrap and the wizard.
- Tray Languages submenu (Linux): read-only checkbox display of
the configured peer set plus a "Clear language memory" item that
drops every entry from the in-memory cache.
- New ADR docs/decisions/0017-cloud-stt-language-stickiness.md
documenting why the cache is rerun-only, in-memory only, and
peer-symmetric (no primary/secondary).
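As a rough illustration of how the locale parser and the language cache fit together, consider the sketch below. All names here (alpha2_from_posix_locale, LangCache, and its methods) are hypothetical and do not mirror the real lang_cache.rs or locale.rs; the sketch also only handles two-letter language codes.

```rust
use std::collections::HashMap;

/// Extract the alpha-2 language code from a POSIX locale string such as
/// "ro_RO.UTF-8". Returns None for "C", "POSIX", or anything that does not
/// start with a two-letter language code.
pub fn alpha2_from_posix_locale(locale: &str) -> Option<String> {
    let lang = locale
        .split(|c| c == '_' || c == '.' || c == '@')
        .next()?
        .to_ascii_lowercase();
    if lang.len() == 2 && lang.bytes().all(|b| b.is_ascii_lowercase()) {
        Some(lang)
    } else {
        None
    }
}

/// In-memory map from backend name to the most recent correctly-detected
/// language. Nothing is written to disk.
#[derive(Default)]
pub struct LangCache {
    last_good: HashMap<String, String>,
}

impl LangCache {
    /// Seed a backend from the OS locale, but only when its alpha-2 code is
    /// one of the configured languages.
    pub fn seed(&mut self, backend: &str, os_locale: &str, configured: &[&str]) {
        if let Some(code) = alpha2_from_posix_locale(os_locale) {
            if configured.contains(&code.as_str()) {
                self.last_good.insert(backend.to_string(), code);
            }
        }
    }

    /// Record a language that passed post-validation.
    pub fn record(&mut self, backend: &str, lang: &str) {
        self.last_good.insert(backend.to_string(), lang.to_string());
    }

    /// Language to request on a rerun after a detection mismatch, if known.
    pub fn rerun_target(&self, backend: &str) -> Option<&str> {
        self.last_good.get(backend).map(String::as_str)
    }

    /// Drop every entry, as the "Clear language memory" tray item does.
    pub fn clear(&mut self) {
        self.last_good.clear();
    }
}
```

Because the cache is consulted only as a rerun target, a stale entry costs at most one extra round-trip and never overrides a detection that passes validation.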
Deprecated
- [stt.cloud].cloud_force_primary_language: superseded by the
in-memory language cache. Field still parses for one release; will
be removed in v0.5.
- LanguageSelection::primary(): renamed to fallback_hint(). The
alias is retained as #[deprecated] for one release; usage is
scope-restricted in its doc-comment to single-language transports.
See plans/2026-04-28-multi-language-stt-no-primary-v3.md.
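For callers migrating off primary(), the rename pattern is roughly the one sketched here. The struct below is a hypothetical stand-in: its field, its constructor, the return type, and the choice of the first configured language as the hint are all assumptions, since the real LanguageSelection definition is not shown in these notes.

```rust
/// Hypothetical stand-in for the real type; the field is an assumption.
pub struct LanguageSelection {
    languages: Vec<String>,
}

impl LanguageSelection {
    /// Hypothetical constructor used only for this illustration.
    pub fn new(languages: &[&str]) -> Self {
        Self {
            languages: languages.iter().map(|l| l.to_string()).collect(),
        }
    }

    /// Hint for transports that accept only a single language. Returning the
    /// first configured language is an assumption of this sketch.
    pub fn fallback_hint(&self) -> Option<&str> {
        self.languages.first().map(String::as_str)
    }

    /// Old name, kept as a thin alias so existing callers still compile but
    /// see a deprecation warning.
    #[deprecated(note = "renamed to fallback_hint()")]
    pub fn primary(&self) -> Option<&str> {
        self.fallback_hint()
    }
}
```

Callers replace sel.primary() with sel.fallback_hint(); the behaviour is unchanged and the compiler warning disappears.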
Breaking Changes
- [stt.cloud].cloud_force_primary_language will be removed in v0.5
- LanguageSelection::primary() is deprecated; use fallback_hint()