This release includes 1 breaking change for platform teams planning a safe upgrade.
✓ No known CVEs patched in this version
Gilbert Codex v0.3.0 is the major public alpha build-prep update after 09d34f17 (feat: expand tool runtime and utility surfaces). This release focuses on making the desktop agent runtime cleaner, more inspectable, easier to test, and safer to ship to a much wider group of users.
The headline is not one flashy feature. The headline is that the tool system has been reorganized into a production-shaped runtime: a modular executor, explicit tool registry, workflow layer, stronger provider normalization, better tool-call UI, more targeted tests, and clearer release automation.
This is still an alpha desktop agent. It is ready for broader public use and mass testing, but it is not being presented as a fully enterprise-hardened release. The known issue to keep in front of users is tool reliability with some hosted models: some models follow the shared XML tool protocol cleanly, some models drift or emit malformed calls, and local models through LM Studio or Ollama continue to be a working path when configured correctly.
Release Status
- Version: 0.3.0
- Tag: v0.3.0
- Previous public tag: v0.2.3
- Previous origin/main commit before this update: 09d34f17
- Primary packaged target: Windows x64
- Release vehicle: GitHub Actions Release workflow plus Tauri NSIS installer
- Installer family: Gilbert-Codex-0.3.0-x64-setup.exe
- Update feed asset: latest.json
- Updater signature asset: Gilbert-Codex-0.3.0-x64-setup.exe.sig
- Checksum asset: Gilbert-Codex-0.3.0-x64-setup.exe.sha256
- Published SHA-256: 369a9524bc1cb8da95a460f27438b0e8b3d8a87a16f30eb5fbee18b9a56dc026
The GitHub release should publish the Windows installer, updater JSON, updater signature, and checksum together. The release workflow now reads this release note from docs/releases/v0.3.0.md so the public GitHub release body stays aligned with the repository docs.
Who This Build Is For
This build is for people who want to try Gilbert Codex as a local-first desktop coding agent workspace:
- Developers who want a GUI around chat, local files, terminal sessions, browser preview, and source control.
- Testers who want to exercise real tool use across cloud models and local models.
- Contributors who want a cleaner source tree for tool runtime work.
- Early adopters who understand that tool-calling behavior can vary by provider and model.
- Maintainers who need clearer test commands and release automation before larger public distribution.
This build is not pretending every provider/model/tool combination is perfect. The release is explicit about the model/tool compatibility surface so users can report the right failure instead of treating every tool issue as the same bug.
Major Runtime Upgrade
The local computer tool runtime has been split out of the old monolithic executor path and reorganized into focused modules under src/tools/computer/executor/.
The new structure makes the tool layer easier to reason about:
- orchestrator.ts owns the pass-level execution flow.
- parser.ts owns local tool-call parsing and visible-text stripping.
- registry.ts maps tool names to real handlers.
- policy.ts keeps standard and deep-research execution policies explicit.
- approvals.ts owns approval classification, risk, and session decisions.
- fileChanges.ts summarizes file effects for the UI.
- fuseMutations.ts merges adjacent safe file mutations.
- results.ts formats and recovers tool results.
- shell.ts, terminalPolicy.ts, and syntaxCheck.ts isolate terminal and verification behavior.
- workspacePolicy.ts and workspaceFormatters.ts isolate workspace boundary logic and output formatting.
- Individual tool handlers now live under src/tools/computer/executor/tools/.
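To make the registry idea concrete, here is a minimal sketch of a name-to-handler map. It is not the contents of registry.ts: the ToolArgs, ToolResult, and ToolHandler types and the ToolRegistry class are hypothetical names chosen for illustration, and the real handler signature is not shown in these notes.

```typescript
// Hypothetical types: the real handler signature in registry.ts is not documented here.
type ToolArgs = Record<string, unknown>;
type ToolResult = { ok: boolean; output: string };
type ToolHandler = (args: ToolArgs) => ToolResult;

class ToolRegistry {
  private readonly handlers = new Map<string, ToolHandler>();

  // Registering the same name twice is treated as a programming error.
  register(name: string, handler: ToolHandler): void {
    if (this.handlers.has(name)) {
      throw new Error(`Tool already registered: ${name}`);
    }
    this.handlers.set(name, handler);
  }

  // Unknown tools come back as a failed result instead of an exception,
  // so the caller can pass the failure along as evidence.
  run(name: string, args: ToolArgs): ToolResult {
    const handler = this.handlers.get(name);
    if (!handler) {
      return { ok: false, output: `Unknown tool: ${name}` };
    }
    return handler(args);
  }
}

const registry = new ToolRegistry();
// Stub handler for illustration only; it does not call Git.
registry.register("git_status", () => ({ ok: true, output: "clean" }));
```

With one map as the lookup point, a handler bug stays inside its own module, which is the ownership benefit the split is aiming for.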
That change matters because tool bugs now have real owners. A Git bug does not require reading the browser handler. A terminal policy bug does not require touching file creation. A provider parser bug does not require editing all local command execution code.
Workflow Layer
v0.3.0 introduces the first workflow layer through workflow_run.
The workflow layer is intentionally practical. It does not replace primitive tools. It sequences existing primitives into higher-level routines, keeps approval policy inherited from the active workspace mode, and returns evidence the assistant can use before it edits, commits, or publishes.
Included workflow definitions:
- agent-workflow-audit: inspects workflow, tool, approval, and runtime surfaces before proposing changes.
- plan-patch-verify: gathers local evidence before guarded edits.
- research-backed-patch: combines web evidence with local context for docs-sensitive or API-sensitive changes.
- repo-health-sweep: checks repo state and likely validation commands without mutating.
- branch-pr-prep: gathers Git evidence before staging, committing, pushing, or opening a PR.
- mcp-tool-usage: discovers MCP servers and tools before calls.
- monitor-brief: creates a repeatable monitor contract without silently scheduling background jobs.
The workflow engine uses xstate for sequence, parallel, branch, and retry behavior. That is a deliberate step away from prompt-only orchestration. The app can now represent workflow execution as a real runtime concept with testable behavior.
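The sequence-and-retry behavior can be illustrated without the state machine library. The sketch below is plain TypeScript and deliberately does not use xstate, so it shows the idea rather than the engine's implementation. WorkflowStep, StepOutcome, and runSequence are hypothetical names, and the steps are synchronous stubs.

```typescript
// Hypothetical step shape; the real engine models this with xstate instead.
interface WorkflowStep {
  name: string;
  maxAttempts: number;
  run: () => string; // returns evidence text, throws on failure
}

interface StepOutcome {
  name: string;
  ok: boolean;
  attempts: number;
  evidence: string;
}

function runSequence(steps: WorkflowStep[]): StepOutcome[] {
  const outcomes: StepOutcome[] = [];
  for (const step of steps) {
    let attempts = 0;
    let ok = false;
    let evidence = "";
    // Retry the step until it succeeds or its attempt budget is spent.
    while (attempts < step.maxAttempts && !ok) {
      attempts += 1;
      try {
        evidence = step.run();
        ok = true;
      } catch (err) {
        evidence = err instanceof Error ? err.message : String(err);
      }
    }
    outcomes.push({ name: step.name, ok, attempts, evidence });
    if (!ok) break; // stop the sequence at the first step that exhausts its retries
  }
  return outcomes;
}

let flakyCalls = 0;
const outcomes = runSequence([
  { name: "read-status", maxAttempts: 1, run: () => "status ok" },
  {
    name: "flaky-check",
    maxAttempts: 3,
    run: () => {
      flakyCalls += 1;
      if (flakyCalls < 2) throw new Error("transient failure");
      return "check ok";
    },
  },
]);
```

Because each outcome records its attempt count and evidence, the behavior can be asserted in a unit test, which is the practical difference from prompt-only orchestration.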
Provider And Model Tool Compatibility
The release makes one important compatibility decision: provider-native local tool calling is disabled for now, and every provider uses the same XML tool_call protocol until Gilbert can persist provider-native tool-call IDs and return provider-native tool_result content on the next turn.
Why this is safer right now:
- Different providers use different function-call envelopes.
- Some models emit partial or malformed native tool calls.
- Some OpenAI-compatible routes behave differently than OpenAI itself.
- Some OpenRouter models are aliases or routed through upstream providers with their own tool behavior.
- XML tool calls can be normalized consistently across OpenAI, Anthropic, Gemini-compatible routes, OpenRouter, DeepSeek, Groq, xAI, Mistral, LM Studio, and Ollama.
The known issue is still real: some models have tool problems. A model may answer normally but fail when asked to call tools, may place tool calls in reasoning text, may send the wrong argument shape, or may ignore the tool protocol. Other models work fine. Local models work when the selected local server and model follow the shared prompt/tool protocol well enough.
The release adds tests around:
- Forced XML protocol behavior.
- Disabled provider-native local tool path.
- Disabled OpenAI Responses MCP passthrough while XML is forced.
- OpenAI tool schema envelope shape.
- Anthropic input schema shape.
- Gemini OpenAI-compatible schema constraints.
- OpenRouter free-model recognition.
- Provider-emitted edit_files aliases.
Tool Parsing And Recovery
The parser now accepts more real-world model output shapes:
- Standard <tool_call> XML.
- Direct XML tool tags such as <edit_file>...</edit_file>.
- Anthropic-style <function_calls><invoke name="...">...</invoke></function_calls>.
- JSON fenced code blocks with tool or name.
- Placeholder names like function or tool_call when the arguments clearly indicate the intended real tool.
- inline_edit, old_string/new_string, old_str/new_str, and other compatibility aliases.
- edit_files batch shapes using edits, edits_json, paths, old_texts, new_texts, or broadcast text.
The visible assistant bubble strips executable tool markup before users see it. That should reduce confusing moments where the assistant appears to debate tool syntax instead of actually using tools.
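A minimal sketch of the stripping step, assuming calls arrive wrapped in <tool_call>...</tool_call> and treating the payload as opaque text. splitAssistantText is a hypothetical helper, not the function in parser.ts, and the payload in the sample is a placeholder rather than the documented wire format. It handles only the standard wrapper, none of the other accepted shapes.

```typescript
// Assumption: executable markup is wrapped in <tool_call>...</tool_call>.
// The payload inside the wrapper is left unparsed in this sketch.
const TOOL_CALL_BLOCK = /<tool_call>([\s\S]*?)<\/tool_call>/g;

function splitAssistantText(raw: string): { visible: string; calls: string[] } {
  const calls: string[] = [];
  const visible = raw
    .replace(TOOL_CALL_BLOCK, (_match: string, body: string) => {
      calls.push(body.trim()); // keep the payload for the executor
      return "";               // remove the markup from what the user sees
    })
    .replace(/\n{3,}/g, "\n\n") // collapse blank runs left behind by removed blocks
    .trim();
  return { visible, calls };
}

// Placeholder payload for illustration only.
const sample = "I will check the repo.\n<tool_call>git_status</tool_call>\nDone.";
const parsed = splitAssistantText(sample);
```

Here parsed.visible keeps only the prose and parsed.calls holds the single extracted payload.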
The recovery layer now makes failures easier to classify:
- Edit retry.
- Read retry.
- Write retry.
- Terminal structured-edit guidance.
- Create retry.
- Mutation retry.
- Syntax retry.
Recoverable failures are returned as evidence for the next pass instead of disappearing into generic error text.
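One way to picture "failures as evidence" is a classifier that turns a failed call into a typed recovery hint. The sketch below is an illustration under assumptions: the tool-to-kind mapping, the syntax-message check, and the names RecoveryKind, ToolFailure, and classifyFailure are hypothetical. Only the tool names and the retry categories come from these notes, and the read and terminal categories are left unmapped because no matching tool names are given here.

```typescript
type RecoveryKind =
  | "edit-retry"
  | "read-retry"
  | "write-retry"
  | "terminal-structured-edit-guidance"
  | "create-retry"
  | "mutation-retry"
  | "syntax-retry";

interface ToolFailure {
  tool: string;
  message: string;
}

interface RecoveryEvidence {
  kind: RecoveryKind;
  tool: string;
  hint: string;
}

// Hypothetical mapping from tool name to recovery category.
const KIND_BY_TOOL = new Map<string, RecoveryKind>([
  ["edit_file", "edit-retry"],
  ["edit_files", "edit-retry"],
  ["inline_edit", "edit-retry"],
  ["write_file", "write-retry"],
  ["create_files", "create-retry"],
  ["rename_path", "mutation-retry"],
  ["move_path", "mutation-retry"],
]);

// Returns structured evidence for the next pass, or null when the failure
// does not fit a known recovery category.
function classifyFailure(failure: ToolFailure): RecoveryEvidence | null {
  if (/syntax/i.test(failure.message)) {
    return { kind: "syntax-retry", tool: failure.tool, hint: failure.message };
  }
  const kind = KIND_BY_TOOL.get(failure.tool);
  return kind ? { kind, tool: failure.tool, hint: failure.message } : null;
}

const evidence = classifyFailure({ tool: "edit_file", message: "old text not found" });
```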
File Editing And Creation
This build tightens the editing path around the way users actually ask for changes.
Improved behaviors:
- edit_file remains the preferred tool for precise source edits.
- edit_files supports batch edits when the model has multiple safe edits to make.
- inline_edit routes to the same precise edit behavior.
- write_file stays create-first and requires an explicit replacement path with a fresh checksum when replacing an existing file.
- create_files accepts richer batch shapes and reports malformed items instead of failing the whole idea silently.
- Duplicate-safe creation remains the default.
- rename_path and move_path handle path changes explicitly.
- Shell commands that appear to write source files through here-strings, Set-Content, Out-File, Tee-Object, raw redirection, or direct file writes are blocked when structured edit tools are available.
This release is intentionally pushing source mutation toward structured, inspectable tool calls. Terminal remains for builds, tests, package installs, formatters, project setup, and command evidence.
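The shell-write guard can be sketched as a small pattern check. The patterns below are assumptions written for illustration, not the rules in terminalPolicy.ts, and findShellWrite is a hypothetical name. The redirection pattern is intentionally conservative and will flag some harmless commands, such as an arrow function passed to node -e.

```typescript
// Hypothetical patterns for commands that look like they write files.
// PowerShell cmdlet names are case-insensitive, hence the "i" flag.
const WRITE_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "Set-Content", pattern: /\bSet-Content\b/i },
  { label: "Out-File", pattern: /\bOut-File\b/i },
  { label: "Tee-Object", pattern: /\bTee-Object\b/i },
  // PowerShell here-string opener (@" or @') at the end of a line.
  { label: "here-string", pattern: /@["']\s*$/m },
  // > or >> followed by a path-like token; skips stream merges like 2>&1.
  { label: "redirection", pattern: /(?:^|[^0-9&>])>{1,2}\s*[\w.\/\\-]/ },
];

// Returns the label of the first matching pattern, or null if none match.
function findShellWrite(command: string): string | null {
  for (const { label, pattern } of WRITE_PATTERNS) {
    if (pattern.test(command)) return label;
  }
  return null;
}

const blockedBy = findShellWrite('Set-Content -Path src/app.ts -Value "x"');
const allowed = findShellWrite("npm.cmd run check");
```

A runtime built this way would return the matched label with the block decision, so the model can be pointed at edit_file or write_file instead.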
Terminal And Dev Server Handling
Terminal behavior continues to be a critical part of Gilbert Codex because real coding work needs real command output.
v0.3.0 includes the runtime split that supports:
- Buffered fast-path terminal commands.
- Longer timeouts for package setup.
- Shorter evidence timeouts for quick searches.
- Long-running dev-server detection.
- Background terminal sessions with attachable metadata.
- Managed local preview URL detection.
- Process-management command detection.
- Workspace-bound terminal working directories.
- Terminal policy that respects read-only, ask-first, Gilbert review, workspace, and full-computer modes.
The app should be much clearer about whether a terminal command ran, was blocked, is still running in the background, or needs a user decision.
Tool Activity UI
The activity surface now favors a tool-call ledger over raw thinking traces.
That means users should see:
- Which tool ran.
- Whether it is active, waiting for approval, skipped, failed, or complete.
- The compact input summary.
- Useful output detail when expanded.
- File-change summaries where available.
- Terminal metadata where available.
- Approval cards for risky operations.
This is a product direction choice. Users need to understand what the agent did, what changed, and what still needs review. Raw chain-of-thought style traces are not the right public surface for that.
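As a rough picture of a ledger row, the sketch below models the states and fields listed above as a TypeScript type plus a one-line formatter. LedgerEntry, ToolStatus, and summarizeEntry are hypothetical names; the actual UI data model is not described in these notes.

```typescript
// Status values mirror the states named in the list above.
type ToolStatus = "active" | "waiting-for-approval" | "skipped" | "failed" | "complete";

// Hypothetical ledger row; field names are illustrative.
interface LedgerEntry {
  tool: string;
  status: ToolStatus;
  inputSummary: string;
  outputDetail?: string;   // shown only when the row is expanded
  changedFiles?: string[]; // present only when the tool touched files
}

// Builds the compact, collapsed form of a ledger row.
function summarizeEntry(entry: LedgerEntry): string {
  const count = entry.changedFiles?.length ?? 0;
  const files = count > 0 ? ` (${count} file(s) changed)` : "";
  return `[${entry.status}] ${entry.tool}: ${entry.inputSummary}${files}`;
}

const ledgerLine = summarizeEntry({
  tool: "edit_file",
  status: "complete",
  inputSummary: "src/app.ts",
  changedFiles: ["src/app.ts"],
});
```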
Local Git And GitHub Tooling
The local Git tool family remains first-class:
- git_status
- git_init
- git_diff
- git_log
- git_stage
- git_unstage
- git_commit
- git_push
- git_pull
- git_fetch
- git_branch
- git_checkout
The GitHub tool family remains first-class:
- repository status and listing
- branch and tree reads
- file reads
- code search
- API-backed commits
- draft pull requests
- release-note generation
- release creation
- release listing
- workflow listing
- workflow dispatch
- workflow run inspection
For users, the important distinction is still simple: local Git tools operate on the selected clone, while GitHub tools operate through the connected GitHub account in Settings.
MCP Groundwork
The MCP surface is represented in the tool registry and workflow layer:
- mcp_list_servers
- mcp_list_tools
- mcp_call_tool
- mcp_set_server
- mcp_remove_server
- mcp-tool-usage workflow
Provider-native MCP passthrough is intentionally disabled while XML tool mode is forced. That keeps MCP behavior under the same local tool execution and approval path instead of splitting the product into provider-specific tool result formats too early.
Web, Weather, Browser, And Color Tools
The broader utility tool surface remains available through the same registry:
- DuckDuckGo-backed web_search for current web evidence and source-backed answers.
- NOAA/NWS-oriented weather tool support.
- open_browser_preview for local app previews.
- browser_automation for controlled browser interactions.
- lookup_color for CSS named colors and extended color lookup.
The release keeps web and weather tools behind Toolbox settings. Models should use them when they need current facts, provider docs, API behavior, weather data, color references, or browser-preview evidence.
Documentation Updates
The documentation pass for this release refreshes:
- README.md
- PROGRESS.md
- docs/CODING_TOOLS.md
- docs/github/README.md
- docs/discord/README.md
- docs/platform/README.md
- docs/promo/README.md
- docs/INSTALLER.md
- .github/workflows/release.yml
- docs/releases/v0.3.0.md
The goal is to keep public docs honest about what the app can do, what is still alpha-grade, and where users should report problems.
Release Automation Update
The GitHub Release workflow now resolves release notes from docs/releases/<tag>.md before calling the Tauri release action.
For v0.3.0, the workflow should use:
docs/releases/v0.3.0.md
That prevents the automated GitHub Release from collapsing a major build into a one-line body.
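The tag-to-path rule is simple enough to state as a function. This is a sketch of the mapping only: the real lookup lives in the release workflow, and releaseNotesPath plus the strict v<major>.<minor>.<patch> check are assumptions added for illustration.

```typescript
// Assumption: release tags look like v<major>.<minor>.<patch>.
// Rejecting anything else also keeps stray path segments out of the result.
function releaseNotesPath(tag: string): string {
  if (!/^v\d+\.\d+\.\d+$/.test(tag)) {
    throw new Error(`Unexpected tag format: ${tag}`);
  }
  return `docs/releases/${tag}.md`;
}

const notesPath = releaseNotesPath("v0.3.0");
```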
Expected automation path:
- Commit the v0.3.0 update to main.
- Push main.
- Create and push tag v0.3.0.
- Let GitHub Actions run CI on main.
- Let GitHub Actions run the Release workflow on the tag.
- Confirm the Release workflow publishes the installer, updater signature, updater JSON, and release body.
- Attach or publish the SHA-256 checksum.
- Verify the public release page and updater feed.
Download
Expected Windows assets:
- Gilbert-Codex-0.3.0-x64-setup.exe
- Gilbert-Codex-0.3.0-x64-setup.exe.sha256
- Gilbert-Codex-0.3.0-x64-setup.exe.sig
- latest.json
This build is still not signed with a Windows Authenticode certificate. Windows SmartScreen may show an extra confirmation prompt before install.
macOS and Linux release artifacts are not official yet. The source tree has partial platform support, but those operating systems still need native maintainers to run, package, and fix platform-specific behavior before official downloads are promised.
Upgrade Notes For Users
- Install the Windows x64 setup executable from GitHub Releases.
- If Windows SmartScreen appears, review the publisher and source before continuing. This alpha is unsigned.
- Configure provider keys, local endpoints, GitHub OAuth, Discord, and workspace permissions from Settings.
- For local models, start LM Studio or Ollama first, confirm the model endpoint is reachable, then select the matching provider/model in Gilbert.
- If a cloud model answers normally but tools fail, try a different model or local model and file an issue with the model ID, provider, prompt, and tool call that failed.
- Keep local databases, logs, screenshots, provider errors, and workspace paths out of public issues unless they are sanitized.
Upgrade Notes For Maintainers
- Keep version fields aligned across package.json, package-lock.json, src-tauri/Cargo.toml, src-tauri/Cargo.lock, and src-tauri/tauri.conf.json.
- Keep docs/releases/v0.3.0.md aligned with the GitHub Release body.
- Use npm.cmd on Windows PowerShell if the npm shim is blocked by script policy.
- Run the focused JavaScript tool-runtime tests before release.
- Run the full build and Rust check before pushing a release tag.
- Confirm the release workflow has TAURI_SIGNING_PRIVATE_KEY and TAURI_SIGNING_PRIVATE_KEY_PASSWORD when needed.
- Confirm generated artifacts, logs, local database files, secrets, and dependency folders stay out of Git.
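The version-alignment note lends itself to a small pre-tag check. The sketch below assumes the version strings have already been read out of each file; findVersionMismatches is a hypothetical helper, not a script that ships with the repository, and the sample values are made up to show a mismatch.

```typescript
// Compares already-extracted version strings against the expected release version.
// Reading and parsing the JSON, TOML, and lock files is out of scope for this sketch.
function findVersionMismatches(
  expected: string,
  versions: Record<string, string>,
): string[] {
  return Object.entries(versions)
    .filter(([, version]) => version !== expected)
    .map(([file, version]) => `${file}: found ${version}, expected ${expected}`);
}

// Illustrative input only; the stale value is invented for the example.
const mismatches = findVersionMismatches("0.3.0", {
  "package.json": "0.3.0",
  "src-tauri/Cargo.toml": "0.3.0",
  "src-tauri/tauri.conf.json": "0.2.3",
});
```

An empty result means every listed file agrees with the release version. Any entry names the file to fix before the tag is pushed.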
Validation Commands
Recommended local checks:
npm.cmd run test:local-runtime
npm.cmd run test:workflow-layer
npm.cmd run test:tool-recovery
npm.cmd run test:provider-roundtrip
npm.cmd run check
npm.cmd run audit:prod
git diff --check
Recommended release build:
npm.cmd run app:release
The GitHub release workflow performs the signed Windows updater release build when the tag is pushed or the workflow is dispatched with a version input.
Known Issues
Model And Tool Compatibility
Some models have tool problems. This is the top known issue for v0.3.0.
Observed or expected failure modes include:
- A model responds with normal text but ignores the tool protocol.
- A model emits a tool name that is close but not canonical.
- A model places tool calls inside reasoning text.
- A model sends malformed XML.
- A model sends JSON with the wrong argument shape.
- A model starts a tool call and then continues prose as if the tool already ran.
- A provider route supports chat completion but not reliable function/tool behavior.
- A provider-compatible endpoint accepts OpenAI-like requests but handles tools differently.
Workarounds:
- Use a model known to follow the shared XML tool protocol.
- Try a local model through LM Studio or Ollama when cloud model tool behavior is unstable.
- Retry with a more direct instruction if the model drifted.
- Switch to a provider/model with stronger structured-output behavior.
- Include the exact model ID, provider, prompt, and failed tool text when reporting.
Provider-Native Tools Disabled
Provider-native local tools are disabled in this build on purpose. The app forces the XML tool protocol for every provider until native tool-call IDs and provider-native tool-result messages can be persisted and replayed safely across turns.
This may feel conservative, but it prevents different providers from splitting the runtime into incompatible result contracts.
MCP Passthrough Disabled
OpenAI Responses MCP passthrough is disabled while XML mode is forced. MCP tools still belong to the local Gilbert tool/approval path for now.
Windows-Only Packaged Release
Windows x64 is the only official packaged target for v0.3.0. macOS and Linux source support exists, but official release artifacts require native validation.
Unsigned Windows Installer
The installer is not Authenticode code-signed yet. SmartScreen warnings are expected on some machines.
Tooling Still Needs More Real-World Coverage
The runtime has focused tests now, but broad real-world coverage is still needed across:
- long-running terminal sessions
- package managers beyond npm
- very large repositories
- provider-specific model catalogs
- local-model server restarts
- GitHub OAuth scope edge cases
- Discord bridge behavior during long agent runs
- browser preview behavior across local frameworks
- MCP server failures and tool schema mismatches
Security And Privacy Notes
Gilbert Codex is local-first. Provider keys, GitHub tokens, Discord settings, local accounts, local databases, logs, workspace files, scan artifacts, release signing credentials, and updater private keys are local user data. They should not be committed or attached to public issues.
The release continues the repository hygiene posture:
- .env stays ignored.
- Local SQLite databases stay ignored.
- SQLite journal and WAL files stay ignored.
- Local scan outputs stay ignored.
- Dependency folders stay ignored.
- Build outputs stay ignored.
- Release signing credentials stay out of source control.
Production Readiness Posture
This build is prepared for larger public testing and public release distribution. It has a cleaner architecture, more tests, better docs, and stronger automation than the previous public build.
It is not claiming perfect model/tool reliability. The mass-use path for v0.3.0 is to ship the build with explicit known issues, gather provider/model compatibility reports, and continue tightening the runtime with real evidence.
The release is clean enough to publish. It is honest enough to trust.
Breaking Changes
- Provider-native local tool calling is disabled; all providers must use the XML `tool_call` protocol until native IDs can be persisted.
About UrbanWafflezz/GilbertCodex