Fono

v0.3.0 Breaking

This release includes 2 breaking changes that platform teams should review before upgrading.

✓ No known CVEs patched

Topics

assistant dictation linux llm local-first rust speach-to-text stt vulkan whisper wyoming

Summary

AI summary

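Adds a release-time cloud equivalence gate and an in-memory language cache for cloud STT, fixes clarification-style replies from the LLM cleanup stage, changes two config defaults, and deprecates [stt.cloud].cloud_force_primary_language and LanguageSelection::primary().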

Full changelog

Cloud STT now self-heals from one-off language misdetections, the LLM
cleanup stage stops occasionally replying with a question instead of
the cleaned text, and every release tag is gated on a real Groq
equivalence check across five languages.

Added

  • Cloud equivalence gate at release time: a new cloud-equivalence
    job in .github/workflows/release.yml calls Groq's
    whisper-large-v3-turbo against the existing multilingual fixture
    set (en × 4, ro × 3, es × 1, fr × 1, zh × 1; ~110 audio-seconds
    total) and diffs the per-fixture verdicts against a committed
    baseline at docs/bench/baseline-cloud-groq.json. Blocks artefact
    production on failure. Auto-skipped when GROQ_API_KEY is unset
    (forks, bootstrap tags) or the tag carries the -no-cloud-gate
    suffix (operator escape hatch). Cost per release: < 0.5 % of
    Groq's free-tier daily cap. See ADR
    0021-cloud-equivalence-via-real-api.md
    and docs/dev/release-checklist.md.
  • fono-bench equivalence --stt groq accepts cloud Groq as an STT
    backend. Reads GROQ_API_KEY from env; default model
    whisper-large-v3-turbo, overridable via --model. New
    --rate-limit-ms <ms> flag (default 250 ms for --stt groq, 0
    otherwise) paces requests under Groq's 30-req/min ceiling. HTTP
    429 is a hard fail with code 3 and an explanatory message; never
    retried.
  • New docs/dev/release-checklist.md documenting the bootstrap
    command for the cloud-equivalence baseline, the regenerate
    conditions, and the -no-cloud-gate override.
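The core of the gate, diffing per-fixture verdicts against a committed baseline, might look roughly like the sketch below. The Verdict enum, the fixture names, and diff_against_baseline are hypothetical. The actual schema of docs/bench/baseline-cloud-groq.json is not described in these notes.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-fixture outcome; the real verdict type is not shown in the changelog.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Fail,
}

/// Returns one message per fixture whose verdict differs from the baseline,
/// or which is present on only one side. An empty result means the gate passes.
pub fn diff_against_baseline(
    baseline: &BTreeMap<String, Verdict>,
    current: &BTreeMap<String, Verdict>,
) -> Vec<String> {
    let mut problems = Vec::new();
    // Every baseline fixture must appear in the current run with the same verdict.
    for (fixture, expected) in baseline {
        match current.get(fixture) {
            Some(actual) if actual == expected => {}
            Some(actual) => problems.push(format!(
                "{fixture}: baseline {expected:?}, got {actual:?}"
            )),
            None => problems.push(format!("{fixture}: missing from current run")),
        }
    }
    // Fixtures that exist only in the current run mean the baseline is stale.
    for fixture in current.keys() {
        if !baseline.contains_key(fixture) {
            problems.push(format!("{fixture}: not in baseline"));
        }
    }
    problems
}
```

A release job built on this would block artefact production whenever the returned list is non-empty.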

Fixed

  • LLM cleanup occasionally returned a clarification reply
    (“It seems like you're describing a situation, but the details are
    incomplete. Could you provide the full text you're referring to…”)
    instead of the cleaned transcript. Reproducible across every
    cleanup backend — Cerebras, Groq, OpenAI, OpenRouter, Ollama,
    Anthropic, and the local llama.cpp path — because the failure mode
    is a property of how chat-trained LLMs interpret a bare short
    utterance, not of any single provider. The fix is correspondingly
    universal: the default cleanup prompt was rewritten with hard
    “never ask for clarification” rules; every backend now wraps the
    user message in unambiguous <<< / >>> delimiters so the
    transcript cannot be mistaken for a chat message; and a refusal
    detector rejects clarification-shaped replies and falls back to the
    raw STT text. Applied identically to OpenAiCompat, AnthropicLlm,
    and LlamaLocal. See
    plans/2026-04-28-llm-cleanup-clarification-refusal-fix-v1.md.
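Two parts of the fix described above, delimiter wrapping and the refusal detector with its raw-text fallback, can be illustrated with a minimal sketch. All function names and the marker phrases are assumptions made for illustration. Fono's actual prompt framing and detection rules are not shown in this changelog.

```rust
/// Wraps the raw transcript in `<<<` / `>>>` so the model treats it as data
/// to clean rather than as a chat message. The exact framing Fono uses may differ.
pub fn wrap_transcript(raw: &str) -> String {
    format!("<<<\n{}\n>>>", raw.trim())
}

/// Hypothetical heuristic: flags replies that look like a request for
/// clarification. The marker phrases are illustrative, not Fono's actual list.
pub fn looks_like_clarification(reply: &str) -> bool {
    const MARKERS: [&str; 4] = [
        "could you provide",
        "could you clarify",
        "it seems like you're",
        "please provide",
    ];
    let lower = reply.to_lowercase();
    MARKERS.iter().any(|m| lower.contains(m))
}

/// Returns the model's reply, or the raw STT text when the reply is empty
/// or clarification-shaped.
pub fn cleaned_or_raw<'a>(raw: &'a str, reply: &'a str) -> &'a str {
    let reply = reply.trim();
    if reply.is_empty() || looks_like_clarification(reply) {
        raw
    } else {
        reply
    }
}
```

Falling back to the raw transcript means the worst case is uncleaned text, never a chatbot-style question inserted into the user's document.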

Changed

  • [llm].skip_if_words_lt default raised from 0 to 3. One- and
    two-word captures (“yes”, “okay”, “send it”) now bypass the LLM
    cleanup roundtrip entirely — regardless of whether the configured
    backend is cloud or local — saving 150–800 ms and avoiding the
    short-utterance clarification failure mode at the source. Override
    in config.toml if you want every utterance cleaned.

  • [stt.cloud].cloud_rerun_on_language_mismatch default flipped from
    false to true. Combined with the new in-memory language cache,
    cloud STT now self-heals from one-off language misdetections (e.g.
    Groq Turbo flagging accented English as Russian) at the cost of one
    extra round-trip per misfire. Set false to opt out.
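The short-utterance bypass can be pictured as a simple word-count check. The sketch below assumes words are counted by splitting on whitespace and uses a hypothetical function name. Fono's real counting rule may differ.

```rust
/// Returns true when the transcript has fewer words than `skip_if_words_lt`,
/// meaning the LLM cleanup roundtrip would be skipped.
/// Whitespace splitting is an assumption made for this sketch.
pub fn should_skip_cleanup(transcript: &str, skip_if_words_lt: usize) -> bool {
    transcript.split_whitespace().count() < skip_if_words_lt
}
```

With the new default of 3, "yes" and "send it" skip cleanup while "send it now" does not. With the old default of 0 nothing is skipped, because no word count is below zero.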

Added

  • In-memory per-backend language cache
    (crates/fono-stt/src/lang_cache.rs). Records the most recently
    correctly-detected language code per cloud STT backend; consulted
    only as a rerun target when post-validation fires. No file I/O,
    no persistence — daemon restarts rebuild within one or two
    utterances. OS locale (LANG / LC_ALL) seeds the cache at start
    if and only if its alpha-2 code is in general.languages.
  • New crates/fono-core/src/locale.rs — POSIX-locale → BCP-47 alpha-2
    parser; used by both the cache bootstrap and the wizard.
  • Tray Languages submenu (Linux): read-only checkbox display of
    the configured peer set plus a "Clear language memory" item that
    drops every entry from the in-memory cache.
  • New ADR
    docs/decisions/0017-cloud-stt-language-stickiness.md
    documenting why the cache is rerun-only, in-memory only, and
    peer-symmetric (no primary/secondary).
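As a rough illustration of how the locale parser and the cache fit together, the sketch below parses a POSIX locale into a two-letter code and seeds a per-backend map only when that code is among the configured languages. The LangCache type, its methods, and locale_to_alpha2 are hypothetical names. The real code in lang_cache.rs and locale.rs is not shown in these notes.

```rust
use std::collections::HashMap;

/// Extracts a lowercase two-letter language code from a POSIX locale string
/// such as `en_US.UTF-8` or `ro_RO@euro`. Returns None for `C`, `POSIX`, or
/// anything whose language part is not exactly two ASCII letters.
pub fn locale_to_alpha2(locale: &str) -> Option<String> {
    let lang = locale
        .split(|c: char| c == '_' || c == '.' || c == '@')
        .next()?;
    if lang.len() == 2 && lang.chars().all(|c| c.is_ascii_alphabetic()) {
        Some(lang.to_ascii_lowercase())
    } else {
        None
    }
}

/// Hypothetical in-memory cache: backend name -> last correctly detected language.
/// Nothing is written to disk, so a restart starts from an empty (or seeded) map.
#[derive(Debug, Default)]
pub struct LangCache {
    last_good: HashMap<String, String>,
}

impl LangCache {
    /// Seeds one backend from the OS locale, but only if the locale's
    /// language is one of the configured languages.
    pub fn seed_from_locale(&mut self, backend: &str, locale: &str, configured: &[&str]) {
        if let Some(lang) = locale_to_alpha2(locale) {
            if configured.contains(&lang.as_str()) {
                self.last_good.insert(backend.to_string(), lang);
            }
        }
    }

    /// Records a language that passed post-validation for this backend.
    pub fn record(&mut self, backend: &str, lang: &str) {
        self.last_good.insert(backend.to_string(), lang.to_string());
    }

    /// Language to retry with when post-validation rejects a detection.
    pub fn rerun_target(&self, backend: &str) -> Option<&str> {
        self.last_good.get(backend).map(String::as_str)
    }

    /// Drops every entry, as the tray's "Clear language memory" item is described to do.
    pub fn clear(&mut self) {
        self.last_good.clear();
    }
}
```

Keeping the cache as a plain map with no persistence matches the rerun-only, in-memory design the ADR is said to document: a wrong entry costs at most a few utterances before it is overwritten.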

Deprecated

  • [stt.cloud].cloud_force_primary_language — superseded by the
    in-memory language cache. Field still parses for one release; will
    be removed in v0.5.
  • LanguageSelection::primary() — renamed to fallback_hint(). The
    alias is retained as #[deprecated] for one release; usage is
    scope-restricted in its doc-comment to single-language transports.

See plans/2026-04-28-multi-language-stt-no-primary-v3.md.
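For callers of the renamed method, the deprecation pattern described above typically looks like the sketch below. The struct here is a stand-in: its field, its constructor, and the choice to return the first configured language are assumptions, since the real LanguageSelection definition is not shown in these notes.

```rust
/// Hypothetical stand-in for Fono's `LanguageSelection`.
pub struct LanguageSelection {
    languages: Vec<String>,
}

impl LanguageSelection {
    pub fn new(languages: Vec<String>) -> Self {
        Self { languages }
    }

    /// Hint for single-language transports. Returning the first configured
    /// language is an assumption made for this sketch.
    pub fn fallback_hint(&self) -> Option<&str> {
        self.languages.first().map(String::as_str)
    }

    /// Old name, kept as a thin alias so existing callers still compile,
    /// with a deprecation warning pointing at the new name.
    #[deprecated(note = "renamed to fallback_hint()")]
    pub fn primary(&self) -> Option<&str> {
        self.fallback_hint()
    }
}
```

Migrating is a mechanical rename from primary() to fallback_hint(); doing it during the deprecation window avoids a compile error once the alias is removed.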

Breaking Changes

  • [stt.cloud].cloud_force_primary_language will be removed in v0.5
  • LanguageSelection::primary() is deprecated; use fallback_hint()
