This release adds 2 notable features for engineering teams evaluating rollout.
✓ No known CVEs patched in this version
Summary
AI summary: Screen‑pointing support, three new recording overlay visualizations, and dictation cleanup fixes for non‑English accents.
Changes in this release
| Type | Severity | Summary | Source | Confidence | CVE |
|---|---|---|---|---|---|
| Feature | Medium | Adds screen‑capture ability for voice assistant and coding agents. | llm_adapter@2026-05-31 | high | — |
| Feature | Low | Adds three new recording overlay visual styles: Aurora Beziers, System/360, Terrain 3D. | llm_adapter@2026-05-31 | high | — |
| Feature | Low | Enables voice assistant pipeline by default; respects prior explicit disablement. | llm_adapter@2026-05-31 | high | — |
| Feature | Low | Rewrites voice mode to listen by default, ask bounded questions only when helpful, and never request risky approvals via voice. | llm_adapter@2026-05-31 | high | — |
| Bugfix | Medium | Fixes dictation cleanup dropping words and losing accents on non‑English input. | llm_adapter@2026-05-31 | high | — |
| Bugfix | Medium | Fixes assistant providing placeholder responses for screen content; now describes actual screen. | llm_adapter@2026-05-31 | low | — |
| Bugfix | Low | Improves escape key cancellation and Ctrl‑C handling during voice sessions. | llm_adapter@2026-05-31 | high | — |
| Bugfix | Low | Makes the assistant provide actual screen descriptions instead of placeholder responses. | granite4.1:30b@2026-05-31-audit | low | — |
Full changelog
Show your screen, dictate in any language. This release teaches the voice
assistant and your coding agents to look at what you're pointing at, fixes
AI cleanup so it stops dropping text and accents on non-English dictation, and
adds a few new looks for the recording overlay.
Added
- Point at your screen and ask. The F8 voice assistant and any
connected coding agent can now see your screen when you reference
something on it — "what does this error mean?", "read this dialog to
me". Fono grabs the focused window automatically, or opens your
desktop's region picker so you can frame exactly what to share, then
hands the picture to the model. Private windows (KeePassXC, Bitwarden,
1Password) are never captured. Works out of the box with whatever
screenshot tool you already have (scrot, grim, maim, spectacle,
gnome-screenshot, …), with no new required dependencies. Running fono doctor
shows whether capture is ready.
- New looks for the recording overlay. Three fresh visualisation
styles join the picker: Aurora Beziers (Siri-style glowing
ribbons), System/360 (a retro mainframe console-lamp spectrum),
and Terrain 3D (your voice as a flowing 3D landscape). Pick one
from the tray's Visualization menu.
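
The "works with whatever screenshot tool you already have" behaviour and the private-window rule can be pictured with a small sketch. This is only an illustration in Python, not Fono's actual code: the function names, the probing order, and the title-matching rule are all assumptions; only the tool and app names come from the notes above.

```python
import shutil
from typing import Callable, Optional, Sequence

# Tool names taken from the release notes; the probing order is an assumption.
CANDIDATE_TOOLS = ("grim", "scrot", "maim", "spectacle", "gnome-screenshot")

# Apps the release notes list as never captured. Matching on the window
# title is a hypothetical rule chosen for this sketch.
PRIVATE_APPS = ("keepassxc", "bitwarden", "1password")


def find_screenshot_tool(
    candidates: Sequence[str] = CANDIDATE_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first candidate found on PATH, or None if none is installed."""
    for name in candidates:
        if which(name) is not None:
            return name
    return None


def is_private_window(title: str) -> bool:
    """Return True if the window title mentions one of the private apps."""
    lowered = title.lower()
    return any(app in lowered for app in PRIVATE_APPS)


if __name__ == "__main__":
    tool = find_screenshot_tool()
    print("capture ready with:", tool if tool else "no supported tool found")
```

Probing PATH at runtime is what lets a design like this avoid a hard dependency on any single screenshot utility. The `which` parameter exists only so the lookup can be exercised without any of the tools installed.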
Changed
- The voice assistant is on by default. The pipeline that powers F8
and the coding-agent voice loop now works without extra setup. If you
had explicitly turned it off, that choice is respected.
- Voice mode talks more naturally. The built-in voice preset for
coding agents was rewritten: agents now listen by default, only ask
bounded A/B/C questions when it actually helps, never ask you to
approve risky actions by voice, and open each spoken turn with a short
cue so you have a moment to refocus before the answer.
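
"On by default, unless you explicitly turned it off" is commonly modelled as a three-state setting: unset, explicitly on, explicitly off. The sketch below shows that pattern in Python; the function and parameter names are hypothetical and do not describe how Fono actually stores its configuration.

```python
from typing import Optional


def assistant_enabled(user_setting: Optional[bool], default: bool = True) -> bool:
    """Resolve the effective value: an explicit choice wins, otherwise the default.

    user_setting is None when the user never touched the option, so a new
    default can change behaviour without overriding an explicit False.
    """
    return default if user_setting is None else user_setting


if __name__ == "__main__":
    print(assistant_enabled(None))   # user never set the option
    print(assistant_enabled(False))  # user explicitly disabled it
```

Keeping "unset" distinct from "false" is what allows a release to flip the default while leaving earlier opt-outs untouched.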
Fixed
- Dictation cleanup no longer drops your words — or your accents.
On non-English dictation, the AI cleanup step could silently come back
empty and inject the raw, unpolished transcript instead; diacritics
(ă, î, ș, ț, é, ñ, …) could also get lost on the way to the cursor.
Both are fixed: cleanup now reliably tidies up non-English text and
restores the correct accented characters. When a coding agent is in
focus in a terminal, dictation is framed as prose (capitalisation and
punctuation) rather than shell commands.
- The assistant now actually answers about your screen. Previously
it captured the screen but spoke a placeholder instead of describing
what it saw. It now sends the image to the model and reads back the
real answer.
- Escape reliably cancels while the agent is listening, and Ctrl-C
restores the tray icon cleanly when you stop a voice session.
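
For teams that want to check the diacritics fix on their own transcripts, one rough approach is to count combining marks before and after cleanup. The Python sketch below uses only the standard `unicodedata` module; the helper names are hypothetical, the check is a heuristic, and it is not how Fono itself processes text.

```python
import unicodedata


def count_marks(text: str) -> int:
    """Count combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return sum(1 for ch in decomposed if unicodedata.combining(ch))


def lost_diacritics(raw: str, cleaned: str) -> bool:
    """Heuristic: True if the cleaned text carries fewer accents than the raw text.

    Cleanup may legitimately add accents or drop filler words, so treat a
    True result as a prompt for manual review, not as proof of a bug.
    """
    return count_marks(cleaned) < count_marks(raw)


def to_composed(text: str) -> str:
    """Normalize to NFC so accented letters are single code points where possible."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    print(lost_diacritics("țară", "tara"))
    print(lost_diacritics("tara", "țară"))
```

Normalizing to NFC before text is injected at the cursor is a common precaution, because some applications render decomposed sequences poorly.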
Full Changelog: https://github.com/bogdanr/fono/compare/v0.9.0...v0.9.1