UrbanWafflezz/GilbertCodex

v0.3.0 Breaking

This release includes 1 breaking change for platform teams planning a safe upgrade.

Published 2mo Developer Productivity

✓ No known CVEs patched

Summary

AI summary

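Reorganizes the local tool runtime into a modular executor, tool registry, and workflow layer; forces the shared XML tool_call protocol for every provider; and documents known issues and upgrade notes for users and maintainers.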

Full changelog

Gilbert Codex v0.3.0 is the major public alpha build-prep update after 09d34f17 (feat: expand tool runtime and utility surfaces). This release focuses on making the desktop agent runtime cleaner, more inspectable, easier to test, and safer to ship to a much wider group of users.

The headline is not one flashy feature. The headline is that the tool system has been reorganized into a production-shaped runtime: a modular executor, explicit tool registry, workflow layer, stronger provider normalization, better tool-call UI, more targeted tests, and clearer release automation.

This is still an alpha desktop agent. It is ready for broader public use and mass testing, but it is not being presented as a fully enterprise-hardened release. The known issue to keep in front of users is tool reliability with some hosted models: some models follow the shared XML tool protocol cleanly, some models drift or emit malformed calls, and local models through LM Studio or Ollama continue to be a working path when configured correctly.

Release Status

Version: 0.3.0
Tag: v0.3.0
Previous public tag: v0.2.3
Previous origin/main commit before this update: 09d34f17
Primary packaged target: Windows x64
Release vehicle: GitHub Actions Release workflow plus Tauri NSIS installer
Installer family: Gilbert-Codex-0.3.0-x64-setup.exe
Update feed asset: latest.json
Updater signature asset: Gilbert-Codex-0.3.0-x64-setup.exe.sig
Checksum asset: Gilbert-Codex-0.3.0-x64-setup.exe.sha256
Published SHA-256: 369a9524bc1cb8da95a460f27438b0e8b3d8a87a16f30eb5fbee18b9a56dc026

The GitHub release should publish the Windows installer, updater JSON, updater signature, and checksum together. The release workflow now reads this release note from docs/releases/v0.3.0.md so the public GitHub release body stays aligned with the repository docs.
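As an illustration of how the published checksum can be used, here is a minimal sketch that hashes a downloaded file's bytes and compares them with the SHA-256 listed above. The helper names (sha256Hex, matchesChecksum, installerIsAuthentic) are hypothetical and are not part of Gilbert Codex. The sketch assumes a Node.js runtime for node:crypto.

```typescript
import { createHash } from "node:crypto";

// Published SHA-256 for Gilbert-Codex-0.3.0-x64-setup.exe, copied from these notes.
const PUBLISHED_SHA256 =
  "369a9524bc1cb8da95a460f27438b0e8b3d8a87a16f30eb5fbee18b9a56dc026";

// Hypothetical helper: hex-encoded SHA-256 digest of the given bytes or string.
function sha256Hex(data: Uint8Array | string): string {
  return createHash("sha256").update(data).digest("hex");
}

// Hypothetical helper: compare a computed digest with a published value,
// ignoring case and surrounding whitespace in the published value.
function matchesChecksum(data: Uint8Array | string, published: string): boolean {
  return sha256Hex(data) === published.trim().toLowerCase();
}

// Hypothetical helper: check installer bytes against the v0.3.0 checksum.
function installerIsAuthentic(installerBytes: Uint8Array): boolean {
  return matchesChecksum(installerBytes, PUBLISHED_SHA256);
}
```

In practice the bytes would come from reading the downloaded installer, for example with readFileSync from node:fs. A matching checksum only shows the download matches what was published; it does not replace the updater signature check.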

Who This Build Is For

This build is for people who want to try Gilbert Codex as a local-first desktop coding agent workspace:

Developers who want a GUI around chat, local files, terminal sessions, browser preview, and source control.
Testers who want to exercise real tool use across cloud models and local models.
Contributors who want a cleaner source tree for tool runtime work.
Early adopters who understand that tool-calling behavior can vary by provider and model.
Maintainers who need clearer test commands and release automation before larger public distribution.

This build is not pretending every provider/model/tool combination is perfect. The release is explicit about the model/tool compatibility surface so users can report the right failure instead of treating every tool issue as the same bug.

Major Runtime Upgrade

The local computer tool runtime has been split out of the old monolithic executor path and reorganized into focused modules under src/tools/computer/executor/.

The new structure makes the tool layer easier to reason about:

orchestrator.ts owns the pass-level execution flow.
parser.ts owns local tool-call parsing and visible-text stripping.
registry.ts maps tool names to real handlers.
policy.ts keeps standard and deep-research execution policies explicit.
approvals.ts owns approval classification, risk, and session decisions.
fileChanges.ts summarizes file effects for the UI.
fuseMutations.ts merges adjacent safe file mutations.
results.ts formats and recovers tool results.
shell.ts, terminalPolicy.ts, and syntaxCheck.ts isolate terminal and verification behavior.
workspacePolicy.ts and workspaceFormatters.ts isolate workspace boundary logic and output formatting.
Individual tool handlers now live under src/tools/computer/executor/tools/.

That change matters because tool bugs now have real owners. A Git bug does not require reading the browser handler. A terminal policy bug does not require touching file creation. A provider parser bug does not require editing all local command execution code.
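To make the registry idea concrete, here is a minimal sketch of a name-to-handler map. It is not the code in registry.ts, whose handler signature is not shown in these notes; ToolArgs, ToolResult, ToolHandler, and ToolRegistry are hypothetical names, and real handlers are likely asynchronous.

```typescript
// Hypothetical shapes; the real types in src/tools/computer/executor/ may differ.
type ToolArgs = Record<string, unknown>;
type ToolResult = { ok: boolean; output: string };
type ToolHandler = (args: ToolArgs) => ToolResult;

class ToolRegistry {
  private readonly handlers = new Map<string, ToolHandler>();

  // Each tool name maps to exactly one handler, so a bug has one owner.
  register(name: string, handler: ToolHandler): void {
    if (this.handlers.has(name)) {
      throw new Error(`Tool already registered: ${name}`);
    }
    this.handlers.set(name, handler);
  }

  // Unknown tools come back as a failed result rather than an exception,
  // so the caller can pass the failure on as evidence.
  run(name: string, args: ToolArgs): ToolResult {
    const handler = this.handlers.get(name);
    if (handler === undefined) {
      return { ok: false, output: `Unknown tool: ${name}` };
    }
    return handler(args);
  }
}

// Usage: a stub handler standing in for the real git_status tool.
const registry = new ToolRegistry();
registry.register("git_status", () => ({ ok: true, output: "clean" }));
```

Keeping lookup separate from handler logic is what lets a Git handler change without touching browser or terminal code.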

Workflow Layer

v0.3.0 introduces the first workflow layer through workflow_run.

The workflow layer is intentionally practical. It does not replace primitive tools. It sequences existing primitives into higher-level routines, keeps approval policy inherited from the active workspace mode, and returns evidence the assistant can use before it edits, commits, or publishes.

Included workflow definitions:

agent-workflow-audit: inspects workflow, tool, approval, and runtime surfaces before proposing changes.
plan-patch-verify: gathers local evidence before guarded edits.
research-backed-patch: combines web evidence with local context for docs-sensitive or API-sensitive changes.
repo-health-sweep: checks repo state and likely validation commands without mutating.
branch-pr-prep: gathers Git evidence before staging, committing, pushing, or opening a PR.
mcp-tool-usage: discovers MCP servers and tools before calls.
monitor-brief: creates a repeatable monitor contract without silently scheduling background jobs.

The workflow engine uses xstate for sequence, parallel, branch, and retry behavior. That is a deliberate step away from prompt-only orchestration. The app can now represent workflow execution as a real runtime concept with testable behavior.
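The sequence and retry behavior can be illustrated without a state-machine library. The sketch below deliberately does not use xstate. It is a plain loop showing how a workflow might run steps in order, retry a flaky step, and collect evidence. Step, StepResult, WorkflowReport, and runSequence are hypothetical names, and the steps are synchronous for brevity.

```typescript
// Hypothetical shapes for illustration only.
type StepResult = { ok: boolean; evidence: string };
type Step = { name: string; run: () => StepResult; maxAttempts?: number };
type WorkflowReport = { completed: boolean; evidence: string[] };

// Run steps in order. Each step may be retried up to maxAttempts times.
// The workflow stops at the first step that never succeeds.
function runSequence(steps: Step[]): WorkflowReport {
  const evidence: string[] = [];
  for (const step of steps) {
    const attempts = step.maxAttempts ?? 1;
    let succeeded = false;
    for (let attempt = 1; attempt <= attempts && !succeeded; attempt++) {
      const result = step.run();
      evidence.push(`${step.name} (attempt ${attempt}): ${result.evidence}`);
      succeeded = result.ok;
    }
    if (!succeeded) {
      return { completed: false, evidence };
    }
  }
  return { completed: true, evidence };
}
```

Every attempt is recorded, including failed ones, which mirrors the idea of returning evidence the assistant can inspect before it edits, commits, or publishes.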

Provider And Model Tool Compatibility

The release makes one important compatibility decision: provider-native local tool calling is disabled for now. Every provider uses the same XML tool_call protocol until Gilbert can persist provider-native tool-call IDs and return provider-native tool_result content on the next turn.

Why this is safer right now:

Different providers use different function-call envelopes.
Some models emit partial or malformed native tool calls.
Some OpenAI-compatible routes behave differently than OpenAI itself.
Some OpenRouter models are aliases or routed through upstream providers with their own tool behavior.
XML tool calls can be normalized consistently across OpenAI, Anthropic, Gemini-compatible routes, OpenRouter, DeepSeek, Groq, xAI, Mistral, LM Studio, and Ollama.

The known issue is still real: some models have tool problems. A model may answer normally but fail when asked to call tools, may place tool calls in reasoning text, may send the wrong argument shape, or may ignore the tool protocol. Other models work fine. Local models work when the selected local server and model follow the shared prompt/tool protocol well enough.

The release adds tests around:

Forced XML protocol behavior.
Disabled provider-native local tool path.
Disabled OpenAI Responses MCP passthrough while XML is forced.
OpenAI tool schema envelope shape.
Anthropic input schema shape.
Gemini OpenAI-compatible schema constraints.
OpenRouter free-model recognition.
Provider-emitted edit_files aliases.

Tool Parsing And Recovery

The parser now accepts more real-world model output shapes:

Standard <tool_call> XML.
Direct XML tool tags such as <edit_file>...</edit_file>.
Anthropic-style <function_calls><invoke name="...">...</invoke></function_calls>.
JSON fenced code blocks with tool or name.
Placeholder names like function or tool_call when the arguments clearly indicate the intended real tool.
inline_edit, old_string/new_string, old_str/new_str, and other compatibility aliases.
edit_files batch shapes using edits, edits_json, paths, old_texts, new_texts, or broadcast text.

The visible assistant bubble strips executable tool markup before users see it. That should reduce confusing moments where the assistant appears to debate tool syntax instead of actually using tools.
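Two of the behaviors above, alias normalization and visible-text stripping, can be sketched in a few lines. This is not the code in parser.ts. The canonical argument names old_text and new_text are an assumption inferred from the old_texts and new_texts batch fields, and the sketch handles only the standard <tool_call> wrapper, not the other accepted shapes.

```typescript
// Assumed canonical names; these notes list the aliases but not the canonical spelling.
const ARG_ALIASES = new Map<string, string>([
  ["old_string", "old_text"],
  ["old_str", "old_text"],
  ["new_string", "new_text"],
  ["new_str", "new_text"],
]);

// Rename known alias keys to their canonical form and keep all other keys as-is.
function normalizeArgs(args: Record<string, unknown>): Record<string, unknown> {
  const normalized: Record<string, unknown> = {};
  for (const key of Object.keys(args)) {
    normalized[ARG_ALIASES.get(key) ?? key] = args[key];
  }
  return normalized;
}

// Remove <tool_call>...</tool_call> blocks from text shown to the user,
// then collapse the blank lines left behind.
function stripToolMarkup(text: string): string {
  return text
    .replace(/<tool_call>[\s\S]*?<\/tool_call>/g, "")
    .replace(/\n{3,}/g, "\n\n")
    .trim();
}
```

A regex is enough for a sketch, but it will not cope with nested or unterminated tags, which is one reason malformed calls need their own recovery path.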

The recovery layer now makes failures easier to classify:

Edit retry.
Read retry.
Write retry.
Terminal structured-edit guidance.
Create retry.
Mutation retry.
Syntax retry.

Recoverable failures are returned as evidence for the next pass instead of disappearing into generic error text.

File Editing And Creation

This build tightens the editing path around the way users actually ask for changes.

Improved behaviors:

edit_file remains the preferred tool for precise source edits.
edit_files supports batch edits when the model has multiple safe edits to make.
inline_edit routes to the same precise edit behavior.
write_file stays create-first and requires an explicit replacement path with a fresh checksum when replacing an existing file.
create_files accepts richer batch shapes and reports malformed items instead of silently failing the whole batch.
Duplicate-safe creation remains the default.
rename_path and move_path handle path changes explicitly.
Shell commands that appear to write source files through here-strings, Set-Content, Out-File, Tee-Object, raw redirection, or direct file writes are blocked when structured edit tools are available.

This release is intentionally pushing source mutation toward structured, inspectable tool calls. Terminal remains for builds, tests, package installs, formatters, project setup, and command evidence.
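The shell-write guard described above can be approximated with pattern matching. The sketch below is a deliberately naive heuristic, not the logic in terminalPolicy.ts. WRITE_PATTERNS and detectSourceWrite are hypothetical names, and the regular expressions will miss some write forms and may flag some harmless commands.

```typescript
// Hypothetical patterns for commands that look like they write files.
const WRITE_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "Set-Content", pattern: /\bSet-Content\b/i },
  { label: "Out-File", pattern: /\bOut-File\b/i },
  { label: "Tee-Object", pattern: /\bTee-Object\b/i },
  // PowerShell here-string opener: @" or @' at the end of a line.
  { label: "here-string", pattern: /@["']\s*$/m },
  // Output redirection to a path, e.g. "echo hi > file". Requires whitespace
  // or start of line before ">" so that "2>&1" and "=>" are not flagged.
  { label: "redirection", pattern: /(?:^|\s)>>?\s*[\w.\\\/~$-]/ },
];

// Return the label of the first matching pattern, or null if none match.
function detectSourceWrite(command: string): string | null {
  for (const { label, pattern } of WRITE_PATTERNS) {
    if (pattern.test(command)) {
      return label;
    }
  }
  return null;
}
```

A non-null result would be the point at which a runtime could block the command and steer the model toward edit_file or write_file instead.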

Terminal And Dev Server Handling

Terminal behavior continues to be a critical part of Gilbert Codex because real coding work needs real command output.

v0.3.0 includes the runtime split that supports:

Buffered fast-path terminal commands.
Longer timeouts for package setup.
Shorter evidence timeouts for quick searches.
Long-running dev-server detection.
Background terminal sessions with attachable metadata.
Managed local preview URL detection.
Process-management command detection.
Workspace-bound terminal working directories.
Terminal policy that respects read-only, ask-first, Gilbert review, workspace, and full-computer modes.

The app should be much clearer about whether a terminal command ran, was blocked, is still running in the background, or needs a user decision.

Tool Activity UI

The activity surface now favors a tool-call ledger over raw thinking traces.

That means users should see:

Which tool ran.
Whether it is active, waiting for approval, skipped, failed, or complete.
The compact input summary.
Useful output detail when expanded.
File-change summaries where available.
Terminal metadata where available.
Approval cards for risky operations.

This is a product direction choice. Users need to understand what the agent did, what changed, and what still needs review. Raw chain-of-thought style traces are not the right public surface for that.

Local Git And GitHub Tooling

The local Git tool family remains first-class:

git_status
git_init
git_diff
git_log
git_stage
git_unstage
git_commit
git_push
git_pull
git_fetch
git_branch
git_checkout

The GitHub tool family remains first-class:

repository status and listing
branch and tree reads
file reads
code search
API-backed commits
draft pull requests
release-note generation
release creation
release listing
workflow listing
workflow dispatch
workflow run inspection

For users, the important distinction is still simple: local Git tools operate on the selected clone, while GitHub tools operate through the connected GitHub account in Settings.

MCP Groundwork

The MCP surface is represented in the tool registry and workflow layer:

mcp_list_servers
mcp_list_tools
mcp_call_tool
mcp_set_server
mcp_remove_server
mcp-tool-usage workflow

Provider-native MCP passthrough is intentionally disabled while XML tool mode is forced. That keeps MCP behavior under the same local tool execution and approval path instead of splitting the product into provider-specific tool result formats too early.

Web, Weather, Browser, And Color Tools

The broader utility tool surface remains available through the same registry:

DuckDuckGo-backed web_search for current web evidence and source-backed answers.
NOAA/NWS-oriented weather tool support.
open_browser_preview for local app previews.
browser_automation for controlled browser interactions.
lookup_color for CSS named colors and extended color lookup.

The release keeps web and weather tools behind Toolbox settings. Models should use them when they need current facts, provider docs, API behavior, weather data, color references, or browser-preview evidence.

Documentation Updates

The documentation pass for this release refreshes:

README.md
PROGRESS.md
docs/CODING_TOOLS.md
docs/github/README.md
docs/discord/README.md
docs/platform/README.md
docs/promo/README.md
docs/INSTALLER.md
.github/workflows/release.yml
docs/releases/v0.3.0.md

The goal is to keep public docs honest about what the app can do, what is still alpha-grade, and where users should report problems.

Release Automation Update

The GitHub Release workflow now resolves release notes from docs/releases/<tag>.md before calling the Tauri release action.

For v0.3.0, the workflow should use:

docs/releases/v0.3.0.md

That prevents the automated GitHub Release from collapsing a major build into a one-line body.
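The tag-to-file mapping is simple enough to show directly. This sketch assumes tags always look like vMAJOR.MINOR.PATCH, which these notes do not state; releaseNotesPath is a hypothetical name, and the real workflow step in release.yml is not reproduced here.

```typescript
// Map a release tag to its notes file under docs/releases/.
// The strict tag pattern is an assumption made for this sketch.
function releaseNotesPath(tag: string): string {
  if (!/^v\d+\.\d+\.\d+$/.test(tag)) {
    throw new Error(`Unexpected release tag: ${tag}`);
  }
  return `docs/releases/${tag}.md`;
}
```

Rejecting unexpected tags before building the path keeps a malformed tag from pointing the workflow at an arbitrary file.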

Expected automation path:

Commit the v0.3.0 update to main.
Push main.
Create and push tag v0.3.0.
Let GitHub Actions run CI on main.
Let GitHub Actions run the Release workflow on the tag.
Confirm the Release workflow publishes the installer, updater signature, updater JSON, and release body.
Attach or publish the SHA-256 checksum.
Verify the public release page and updater feed.

Download

Expected Windows assets:

Gilbert-Codex-0.3.0-x64-setup.exe
Gilbert-Codex-0.3.0-x64-setup.exe.sha256
Gilbert-Codex-0.3.0-x64-setup.exe.sig
latest.json
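A release check could derive the four expected asset names from the version and report any that are missing. The sketch below assumes the naming pattern shown in the list above; expectedWindowsAssets and missingAssets are hypothetical helpers, not part of the release workflow.

```typescript
// Build the expected Windows asset names for a version, following the
// Gilbert-Codex-<version>-x64-setup.exe pattern listed in these notes.
function expectedWindowsAssets(version: string): string[] {
  const installer = `Gilbert-Codex-${version}-x64-setup.exe`;
  return [installer, `${installer}.sha256`, `${installer}.sig`, "latest.json"];
}

// Return the expected assets that are absent from the published list.
function missingAssets(version: string, published: string[]): string[] {
  return expectedWindowsAssets(version).filter(
    (name) => published.indexOf(name) === -1
  );
}
```

An empty result from missingAssets means the installer, checksum, signature, and update feed were all published together.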

This build is still not signed with a Windows Authenticode certificate. Windows SmartScreen may show an extra confirmation prompt before install.

macOS and Linux release artifacts are not official yet. The source tree has partial platform support, but those operating systems still need native maintainers to run, package, and fix platform-specific behavior before official downloads are promised.

Upgrade Notes For Users

Install the Windows x64 setup executable from GitHub Releases.
If Windows SmartScreen appears, review the publisher and source before continuing. This alpha is unsigned.
Configure provider keys, local endpoints, GitHub OAuth, Discord, and workspace permissions from Settings.
For local models, start LM Studio or Ollama first, confirm the model endpoint is reachable, then select the matching provider/model in Gilbert.
If a cloud model answers normally but tools fail, try a different model or local model and file an issue with the model ID, provider, prompt, and tool call that failed.
Keep local databases, logs, screenshots, provider errors, and workspace paths out of public issues unless they are sanitized.

Upgrade Notes For Maintainers

Keep version fields aligned across package.json, package-lock.json, src-tauri/Cargo.toml, src-tauri/Cargo.lock, and src-tauri/tauri.conf.json.
Keep docs/releases/v0.3.0.md aligned with the GitHub Release body.
Use npm.cmd on Windows PowerShell if the npm shim is blocked by script policy.
Run the focused JavaScript tool-runtime tests before release.
Run the full build and Rust check before pushing a release tag.
Confirm the release workflow has TAURI_SIGNING_PRIVATE_KEY and TAURI_SIGNING_PRIVATE_KEY_PASSWORD when needed.
Confirm generated artifacts, logs, local database files, secrets, and dependency folders stay out of Git.
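The version-alignment check in the first note can be sketched as a comparison over already-extracted version strings. Reading and parsing the five files is left out, and findVersionMismatches is a hypothetical helper rather than an existing project script.

```typescript
// Compare each file's version with the expected release version and
// describe every mismatch. An empty result means all files agree.
function findVersionMismatches(
  expected: string,
  versions: Map<string, string>
): string[] {
  const mismatches: string[] = [];
  versions.forEach((version, file) => {
    if (version !== expected) {
      mismatches.push(`${file}: found ${version}, expected ${expected}`);
    }
  });
  return mismatches;
}
```

Running a check like this before tagging would catch the common case where one lockfile or the Tauri config still carries the previous version.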

Validation Commands

Recommended local checks:

npm.cmd run test:local-runtime
npm.cmd run test:workflow-layer
npm.cmd run test:tool-recovery
npm.cmd run test:provider-roundtrip
npm.cmd run check
npm.cmd run audit:prod
git diff --check

Recommended release build:

npm.cmd run app:release

The GitHub release workflow performs the signed Windows updater release build when the tag is pushed or the workflow is dispatched with a version input.

Known Issues

Model And Tool Compatibility

Some models have tool problems. This is the top known issue for v0.3.0.

Observed or expected failure modes include:

A model responds with normal text but ignores the tool protocol.
A model emits a tool name that is close but not canonical.
A model places tool calls inside reasoning text.
A model sends malformed XML.
A model sends JSON with the wrong argument shape.
A model starts a tool call and then continues prose as if the tool already ran.
A provider route supports chat completion but not reliable function/tool behavior.
A provider-compatible endpoint accepts OpenAI-like requests but handles tools differently.

Workarounds:

Use a model known to follow the shared XML tool protocol.
Try a local model through LM Studio or Ollama when cloud model tool behavior is unstable.
Retry with a more direct instruction if the model drifted.
Switch to a provider/model with stronger structured-output behavior.
Include the exact model ID, provider, prompt, and failed tool text when reporting.

Provider-Native Tools Disabled

Provider-native local tools are disabled in this build on purpose. The app forces the XML tool protocol for every provider until native tool-call IDs and provider-native tool-result messages can be persisted and replayed safely across turns.

This may feel conservative, but it prevents different providers from splitting the runtime into incompatible result contracts.

MCP Passthrough Disabled

OpenAI Responses MCP passthrough is disabled while XML mode is forced. MCP tools still belong to the local Gilbert tool/approval path for now.

Windows-Only Packaged Release

Windows x64 is the only official packaged target for v0.3.0. macOS and Linux source support exists, but official release artifacts require native validation.

Unsigned Windows Installer

The installer is not Authenticode code-signed yet. SmartScreen warnings are expected on some machines.

Tooling Still Needs More Real-World Coverage

The runtime has focused tests now, but broad real-world coverage is still needed across:

long-running terminal sessions
package managers beyond npm
very large repositories
provider-specific model catalogs
local-model server restarts
GitHub OAuth scope edge cases
Discord bridge behavior during long agent runs
browser preview behavior across local frameworks
MCP server failures and tool schema mismatches

Security And Privacy Notes

Gilbert Codex is local-first. Provider keys, GitHub tokens, Discord settings, local accounts, local databases, logs, workspace files, scan artifacts, release signing credentials, and updater private keys are local user data. They should not be committed or attached to public issues.

The release continues the repository hygiene posture:

.env stays ignored.
Local SQLite databases stay ignored.
SQLite journal and WAL files stay ignored.
Local scan outputs stay ignored.
Dependency folders stay ignored.
Build outputs stay ignored.
Release signing credentials stay out of source control.

Production Readiness Posture

This build is prepared for larger public testing and public release distribution. It has a cleaner architecture, more tests, better docs, and stronger automation than the previous public build.

It is not claiming perfect model/tool reliability. The mass-use path for v0.3.0 is to ship the build with explicit known issues, gather provider/model compatibility reports, and continue tightening the runtime with real evidence.

The release is clean enough to publish. It is honest enough to trust.

Breaking Changes

Provider-native local tool calling is disabled; all providers must use the XML tool_call protocol until native IDs can be persisted.
