WFGY

AI Agents & Assistants

WFGY is an open-source AI Troubleshooting Atlas for RAG, agents, and real-world AI workflows. Includes the 16-problem map, Global Debug Card, and WFGY 4.0. ⭐ Star to help more builders find this repo.

Jupyter Notebook · Latest v5.0.0-teaser-01 · 23d ago

Features

  • Staged functional rollout of the WFGY 5.0 Polaris Protocol
  • Provides a governed protocol layer for building, tuning, verifying, and carrying structured language systems
  • First public portable component: Polaris Goal Compiler

Recent releases

View all 7 releases →
v5.0.0-teaser-01 New feature
Notable features
  • Polaris Goal Compiler: portable human‑AI execution protocol for complex AI workflows
  • Public README, usage notes, and TXT‑based execution constitution included
Full changelog

WFGY 5.0 Teaser 01: Polaris Goal Compiler

This is the first small public teaser release in the staged WFGY 5.0 Polaris Protocol rollout.

WFGY 5.0 has grown beyond a single release package. Instead of waiting for one giant drop, we are now releasing useful components step by step so users can inspect, test, and understand each part properly.

This teaser introduces Polaris Goal Compiler, the first public portable protocol component under the WFGY 5.0 Polaris line.

What is included

  • Polaris Goal Compiler
  • Public README and usage notes
  • Portable TXT-based execution constitution
  • Links to the current Polaris public route
  • Links to the existing public evidence layer

What Polaris Goal Compiler does

Polaris Goal Compiler is a portable human-AI execution protocol for complex AI workflows.

It helps an AI assistant avoid treating raw natural language as if it were already an executable task.

It focuses on:

  • compiling user goals before construction
  • separating truth work from expression work
  • exposing the active task
  • exposing blocked downstream work
  • preventing premature completion claims
  • verifying before unlock
  • making long AI workflows easier to inspect
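
The focus points above can be sketched as a small dependency gate. This is only an illustration of the idea, not the Polaris Goal Compiler itself, which ships as a TXT-based protocol. The `Task` and `Plan` names, the example tasks, and the verification flag are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    depends_on: list[str] = field(default_factory=list)
    verified: bool = False  # set only through Plan.verify, never assumed


class Plan:
    """Hypothetical compiled goal: named tasks plus their dependencies."""

    def __init__(self, tasks: list[Task]):
        self.tasks = {t.name: t for t in tasks}

    def is_unlocked(self, name: str) -> bool:
        # A task unlocks only when every upstream task has been verified.
        return all(self.tasks[d].verified for d in self.tasks[name].depends_on)

    def active(self) -> list[str]:
        # Work that may proceed right now.
        return [n for n, t in self.tasks.items()
                if not t.verified and self.is_unlocked(n)]

    def blocked(self) -> list[str]:
        # Downstream work that is waiting on unverified upstream tasks.
        return [n for n, t in self.tasks.items()
                if not t.verified and not self.is_unlocked(n)]

    def verify(self, name: str, check_passed: bool) -> None:
        if not self.is_unlocked(name):
            raise RuntimeError(f"{name} is blocked by unverified upstream work")
        self.tasks[name].verified = check_passed

    def is_complete(self) -> bool:
        # Completion may be claimed only when every task is verified.
        return all(t.verified for t in self.tasks.values())


# Truth work (gathering facts) comes before expression work (wording).
plan = Plan([
    Task("gather_facts"),
    Task("draft_summary", depends_on=["gather_facts"]),
    Task("polish_wording", depends_on=["draft_summary"]),
])
```

In this sketch, `plan.active()` exposes the current task, `plan.blocked()` exposes the downstream work, and `plan.is_complete()` stays false until each task has passed an explicit `verify` call.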

Current compatibility

At this teaser stage, Polaris Goal Compiler is validated for ChatGPT use only.

Other AI assistants, agent systems, IDE agents, and workflow runners may be supported later, but they are not the official target of this teaser release yet.

Please treat this release as a ChatGPT-first public test component.

Important note

This is not the full WFGY 5.0 final release.

It is a teaser component released as part of the staged rollout.

The deeper WFGY 5.0 materials, runtime structures, reproduction workflows, and engine layers will be opened step by step.

Start here

  • Polaris Goal Compiler:
    https://github.com/onestardao/WFGY/blob/main/Polaris/protocols/goal-compiler/README.md

  • WFGY 5.0 Polaris Protocol:
    https://github.com/onestardao/WFGY/blob/main/Polaris/README.md

  • Polaris Experiments and public evidence packages:
    https://github.com/onestardao/WFGY/blob/main/Polaris/experiments/README.md

  • WFGY Discord:
    https://discord.gg/KRxBsr6GYx

Release status

  • Release type: WFGY 5.0 teaser
  • Component: Polaris Goal Compiler
  • Compatibility: ChatGPT only at this stage
  • Stability: public teaser
  • Full WFGY 5.0 release: staged rollout in progress

Thank you for following WFGY.

More components will be released step by step.

WFGY-Easter-Egg-CFV New feature
Notable features
  • Cite First Verification (CFV) adds compact confidence scores, stricter handling of weak claims, and an inspectable mechanism for overconfident answers
Full changelog

WFGY Easter Egg: Cite First Verification v0.1.0

Before WFGY 5.0 Polaris Protocol officially arrives, I want to share a small gift with everyone who has supported WFGY along the way.

WFGY 5.0 is a major release built from the full path of WFGY 1.0, 2.0, 3.0, and 4.0, plus everything I, PSBigBig, have learned through this long journey.

This road has not been easy.

To make WFGY 5.0 stronger, cleaner, and more polished, the official launch will take a little more time.

So before the main release goes online, here is a small Easter Egg:

Cite First Verification

Cite First Verification, or CFV, is a lightweight WFGY mechanism that makes AI answers expose confidence before they sound certain.

It can add compact confidence scores, soften weak statements, and help prevent low-confidence claims from sounding stronger than they should.

Try it:

Use WFGY with CFV Strict. Show only surface CI.

What this Easter Egg gives you

  • compact confidence signals for AI answers
  • stricter handling of weak or uncertain claims
  • a simple way to make overconfident answers easier to inspect
  • an early playable mechanism from the WFGY line before 5.0 arrives
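
As a rough illustration of the behavior described above, the sketch below tags each claim with a compact score and softens claims that fall under a threshold. It is not the CFV mechanism itself. The `Claim` type, the `render` function, the 0.7 threshold, and the "Unverified:" wording are assumptions, and the "CI" label is borrowed from the prompt above without a confirmed definition.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    text: str
    confidence: float  # 0.0 to 1.0; how it is estimated is outside this sketch


def render(claim: Claim, strict_threshold: float = 0.7) -> str:
    """Prefix a compact confidence tag and soften claims below the threshold."""
    tag = f"[CI {claim.confidence:.2f}]"
    if claim.confidence < strict_threshold:
        # Weak claims are marked so they cannot read as settled facts.
        return f"{tag} Unverified: {claim.text}"
    return f"{tag} {claim.text}"


strong = render(Claim("The API returns JSON.", 0.92))
weak = render(Claim("The cache is the bottleneck.", 0.40))
```

The point of the design is ordering: the score is shown before the statement, so a reader sees the confidence before the wording has a chance to sound certain.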

Note

This is not the full WFGY 5.0 release.

It is a small thank-you gift before the larger work goes online.

Thank you for waiting, testing, starring, questioning, and supporting WFGY.

WFGY 5.0 Polaris Protocol is coming.

WFGY-4.0 New feature
Notable features
  • Twin Atlas flagship engine view with constitutional runtime direction
  • Inverse Atlas pre-generative governance methodology
  • Public evidence surface with inspectable proof and reproducibility
Full changelog

WFGY 4.0 is the first public release surface in this repository where the engine-level view becomes legible as one direction rather than a scattered set of ideas.

This release brings two major public surfaces into focus together:

  • Twin Atlas
  • Inverse Atlas

They are related, but they are not duplicates.

Twin Atlas is the flagship engine view.
Inverse Atlas is the legitimacy-first governance methodology that pushes the control point forward, before stronger public emission is allowed.

What this release is really about

Most AI systems still treat generation as if the main question were:

can the model produce an answer?

WFGY 4.0 pushes a stricter question:

has the system earned the right to conclude that strongly yet?

That shift matters because many serious AI failures do not begin when the final sentence is obviously false.

They begin earlier, when the system turns plausibility into public reality too easily:

  • over-committing under pressure
  • crossing evidence boundaries too early
  • collapsing live alternatives into one forced answer
  • treating surface appearance as if it were already proof
  • emitting stronger conclusions than the current support lawfully allows

WFGY 4.0 is built to reduce that failure class.

Twin Atlas

Twin Atlas is the flagship engine view of the WFGY atlas family.

It is not a larger prompt.
It is not a softer answer style.
It is not a decorative reasoning wrapper.

It is a constitutional runtime direction that separates three powers before stronger public emission is allowed:

  1. Forward Atlas
    route-first structural orientation

  2. Bridge
    advisory-only coupling and no-inflation handoff

  3. Inverse Atlas
    legitimacy-first generation governance

Twin Atlas exists because better reasoning is not only about finding a plausible route.
It is also about deciding whether that route has actually earned the right to become a stronger public conclusion.

Inverse Atlas

Inverse Atlas is a pre-generative governance methodology for AI output.

It does not begin from the assumption that every answer should be emitted first and corrected later.

It begins from a stricter question:

has this system earned the right to resolve strongly at all?

The current public MVP surface exposes that methodology through a prompt-based runtime layer so readers can inspect and reproduce the logic now.

In this release, Inverse Atlas should be understood as:

  • a legitimacy-first governance surface
  • a pre-emission control layer
  • a methodology for lawful resolution under real uncertainty
  • a system that treats AUTHORIZED as earned, not assumed
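
One way to picture a pre-emission control layer is a gate that runs before any strong conclusion is generated. The sketch below is an assumption-heavy illustration, not the Inverse Atlas runtime: only the `AUTHORIZED` name comes from these notes, while `DOWNGRADED`, `UNRESOLVED`, the `gate` function, and the numeric threshold are hypothetical.

```python
from enum import Enum


class Status(Enum):
    AUTHORIZED = "authorized"  # name taken from the release notes
    DOWNGRADED = "downgraded"  # hypothetical: emit a weaker, hedged conclusion
    UNRESOLVED = "unresolved"  # hypothetical: keep live alternatives open


def gate(support: float, live_alternatives: int,
         required_support: float = 0.8) -> Status:
    """Decide how strongly a conclusion may be emitted, before emitting it."""
    if live_alternatives > 1:
        # Do not collapse competing explanations into one forced answer.
        return Status.UNRESOLVED
    if support < required_support:
        # Support exists but does not justify a strong claim.
        return Status.DOWNGRADED
    return Status.AUTHORIZED
```

Note that `AUTHORIZED` is the last branch, reached only after both checks pass, which mirrors the idea that authorization is earned rather than assumed.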

What is now publicly available

This release makes the public proof surface much easier to inspect.

Public-facing materials now include:

  • Twin Atlas landing surface
  • Inverse Atlas README surface
  • public demo prompts
  • public screenshot layer
  • raw run layer
  • aggregate results summary
  • flagship case layer
  • shortest-path rerun entry

This means WFGY 4.0 is not being released as theory alone.
It is being released with an inspectable public evidence surface.

Public proof surface

The current public WFGY 4.0 release surface includes:

  • Twin Atlas Runtime TXT
  • Governance Stress Suite TXT
  • AI Eval
  • Raw Runs
  • Results Summary
  • Flagship Cases

The core public claim is intentionally narrower than a universal benchmark claim.

The strongest current headline is this:

WFGY 4.0 reduces unauthorized commitment under pressure.

That is the main result this release is willing to stand behind publicly.

This does not mean every model behaves identically.
It does not mean every domain is solved.
It does not mean the current public surface proves every future deployment environment.

It does mean the current release already shows a visible and inspectable shift away from pressure-driven closure and toward lawful downgrade, ambiguity preservation, and ceiling-respecting output.

Why this release matters

This release is important because it makes a design boundary public.

Instead of letting route discovery, handoff, repair language, and public conclusion blur together, WFGY 4.0 draws a harder line between:

  • plausible route
  • structural repair
  • legal authorization
  • public emission

That is the deeper shift behind both Twin Atlas and Inverse Atlas.

Recommended starting points

If you want the fastest visible entry:

  • start with Twin Atlas
  • then open AI Eval
  • then open Results Summary

If you want the shortest rerun path:

  • use the two public TXT files
  • run the public stress suite
  • compare BEFORE vs AFTER

If you want the governance-first methodology:

  • start with Inverse Atlas

Boundary and honesty note

This release does not claim universal superiority in every unknown production environment.

It presents WFGY 4.0 as:

  • a design-complete flagship direction
  • a coherent engine-level architecture
  • a public governance surface that can already be inspected and reproduced
  • an evidence-backed release that is still honest about its current boundary

That honesty boundary is not a weakness.
It is part of the release architecture.

Closing

WFGY 4.0 is not just about improving answers.

It is about changing the conditions under which an AI conclusion is allowed to enter public reality.

That is the release shift.

Not merely better wording.
Not merely safer tone.
But a stronger boundary between route, legitimacy, repair, and public emission.

WFGY-Troubleshooting-Atlas New feature
Notable features
  • Route-first debugging workflow structure
  • Diagnostic surface for RAG and agent systems
  • Failure classification and reproducibility framework
Full changelog

WFGY Atlas — AI Troubleshooting Atlas (Problem Map 3.0)

This release introduces the AI Troubleshooting Atlas, a new route-first diagnostic surface built on top of the WFGY Problem Map system.

The Atlas is designed for teams whose AI systems “run” but still fail in practice — hallucination, drift, collapse, unstable reasoning, or inconsistent behavior across runs.

Instead of debugging from symptoms, the Atlas helps you:

→ route the failure
→ inspect the broken invariant
→ choose the correct first repair move


What’s new

1. AI Troubleshooting Atlas (Problem Map 3.0)

A structured troubleshooting surface for:

  • broken RAG pipelines
  • unstable agent systems
  • multi-step AI workflows
  • reasoning failures that infra metrics cannot explain

The Atlas provides a route-first workflow that turns vague incidents into reproducible failure classes.


2. Route-first debugging model

Unlike traditional AI debugging approaches that rely on ad-hoc patching, the Atlas introduces a structured flow:

symptom → case → family → invariant → first repair move

This makes failures:

  • legible
  • classifiable
  • reproducible
  • fixable
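
The flow above maps naturally onto a small record type. The sketch below is a minimal illustration under stated assumptions: the `Route` fields follow the five stages named in the flow, but the `ROUTES` entries and the `route` lookup are invented examples and are not taken from the Atlas.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    symptom: str
    case: str
    family: str
    invariant: str
    first_repair_move: str


# Illustrative entries only; these are NOT taken from the Atlas itself.
ROUTES = [
    Route(
        symptom="answer cites a document that was never retrieved",
        case="ungrounded citation",
        family="retrieval",
        invariant="every cited source appears in the retrieved set",
        first_repair_move="log retrieved IDs and diff them against cited IDs",
    ),
    Route(
        symptom="agent repeats the same tool call",
        case="loop without progress",
        family="state",
        invariant="each step changes the recorded state",
        first_repair_move="record state hashes per step and stop on a repeat",
    ),
]


def route(symptom: str) -> Optional[Route]:
    """Return the route whose symptom text matches, or None if unrouted."""
    needle = symptom.strip().lower()
    for r in ROUTES:
        if r.symptom == needle:
            return r
    return None
```

Returning `None` for an unknown symptom is deliberate in this sketch: an unrouted incident stays visibly unrouted instead of being patched ad hoc.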

3. Designed for real production systems

The Atlas is intended for:

  • RAG and agent teams shipping real workloads
  • infra and platform owners diagnosing model behavior across deployments
  • evaluation and research teams studying reasoning robustness
  • builders facing recurring unexplained AI failures

WFGY is not another prompt recipe. It is a structured reasoning and debugging system for real-world AI operations.


4. Drop-in diagnostic entry point

The Atlas works as:

  • a standalone troubleshooting surface
  • a companion to the WFGY Global Debug Card
  • an entry layer into the broader WFGY ecosystem

Most real-world integrations currently start with the ProblemMap-style diagnostic layer before expanding into deeper WFGY tooling.


Who should use this

Start here if:

  • your pipeline “works” but outputs are unreliable
  • debugging cycles feel random and repetitive
  • root causes remain unclear after logs + metrics
  • teams cannot agree on failure naming

Getting started

Start with the Atlas:

👉 https://github.com/onestardao/WFGY/blob/Atlas/ProblemMap/wfgy-ai-problem-map-troubleshooting-atlas.md


Notes

This release focuses on clarity, structure, and reproducibility rather than automation.

The Atlas does not attempt to “auto-fix” systems.
It ensures teams fix the right layer first.


If this helps your workflow, consider starring the repo to support the public ecosystem.

WFGY-Global-Debug-Card New feature
Notable features
  • Q/E/P/A semantic gap checking framework
  • Diagnostic card workflow with four observable objects
  • WFGY RAG 16 Problem Map integration
Full changelog

WFGY introduces the Global Debug Card as a public, framework-agnostic debug entry for RAG systems.

This release turns the WFGY RAG 16 Problem Map into a compact visual protocol that can be used by humans, LLMs, and tooling to localize failure patterns, classify likely root causes, and propose small structural fixes.

Live page:
Open the Global Debug Card

What this release is

The Global Debug Card is a single-card workflow for diagnosing failing RAG runs.

Instead of treating every bad answer as “the model is weak,” this card breaks the run into four observable objects:

  • Q: user question
  • E: retrieved evidence
  • P: final prompt sent to the model
  • A: model answer

It then uses semantic gap checks across these four objects to help identify where the break most likely happens.
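
To make the idea of a gap check concrete, the sketch below scores the distance between pairs of the four objects. It uses token-set Jaccard overlap as a crude stand-in for a real semantic measure; the card does not specify a metric here, and the choice of pairs (`Q-E`, `E-P`, `P-A`, `Q-A`) and the function names are assumptions.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens; a rough proxy for content."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def gap(a: str, b: str) -> float:
    """1 - Jaccard overlap: 0.0 means same vocabulary, 1.0 means disjoint."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def gaps(q: str, e: str, p: str, a: str) -> dict[str, float]:
    """Gap scores between question, evidence, prompt, and answer."""
    return {
        "Q-E": gap(q, e),  # did retrieval address the question?
        "E-P": gap(e, p),  # did the evidence reach the final prompt?
        "P-A": gap(p, a),  # did the answer stay close to the prompt?
        "Q-A": gap(q, a),  # did the answer address the question at all?
    }
```

A large `Q-E` score with small scores elsewhere would point toward retrieval, for example. In practice an embedding-based similarity would likely replace the token overlap used here.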

Core diagnostic structure

This release centers on a simple but strict debug flow:

  1. Inspect the four objects: Q / E / P / A
  2. Estimate semantic gaps between them
  3. Classify the failure into one of four fix families:
    • R: retrieval
    • P: prompt / reasoning
    • S: state / memory
    • I: infra / deployment
  4. Map the case to the WFGY RAG 16 Problem Map
  5. Apply a small structural fix
  6. Run a tiny verification test

This is designed to reduce random debugging and turn RAG failure analysis into a repeatable protocol.
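
The six steps can be tracked with a small record that refuses to skip ahead. This is a hypothetical sketch of the protocol's ordering only: the four fix families come from the list above, while `DebugCase`, `STEPS`, and the step names are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FixFamily(Enum):
    R = "retrieval"
    P = "prompt / reasoning"
    S = "state / memory"
    I = "infra / deployment"


# Hypothetical step names, in the order the flow lists them.
STEPS = ("inspect", "estimate_gaps", "classify",
         "map_to_problem", "apply_fix", "verify")


@dataclass
class DebugCase:
    question: str
    evidence: str
    prompt: str
    answer: str
    family: Optional[FixFamily] = None
    problem_number: Optional[int] = None  # position in the 16 Problem Map
    done: int = 0  # number of completed steps

    def next_step(self) -> Optional[str]:
        return STEPS[self.done] if self.done < len(STEPS) else None

    def complete(self, step: str) -> None:
        # Steps must be completed in order; no fix before classification.
        if step != self.next_step():
            raise ValueError(f"expected {self.next_step()!r}, got {step!r}")
        self.done += 1
```

Enforcing the order is the point of the sketch: a fix cannot be recorded until the case has been inspected, measured, classified, and mapped.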

What it helps classify

The card organizes failures across four lanes:

  • IN: input / retrieval
  • RE: reasoning / planning
  • ST: state / context
  • OP: infra / deploy

Across these lanes, it maps cases into the WFGY 16 recurring failure modes, including retrieval miss, evidence misread, chain drift, logic collapse, broken memory, bootstrap mistakes, deadlocks, and bad deployment states.

Why this matters

Most RAG failures are not caused by one single issue.

They usually come from mixed failures across retrieval, prompt construction, memory state, and system readiness.

The goal of this release is to provide one clean diagnostic surface that helps you:

  • see where the break is happening
  • avoid fixing the wrong layer
  • make smaller, testable changes
  • reuse the same debug logic across different stacks

Intended usage

This release is especially useful when:

  • your RAG system “works,” but answers are still wrong
  • retrieval looks normal, but reasoning goes off track
  • long context causes drift, noise, or memory instability
  • a deployment appears healthy, but behavior is inconsistent
  • you want a portable debug checklist that is not tied to one framework

You can use the card as:

  • a visual debug guide
  • an LLM input artifact
  • a shared troubleshooting reference
  • a future machine-readable protocol entry point

Included in this release

  • The WFGY 3.1 Global Debug Card release entry
  • Public hosted debug card page
  • A cleaner external entry point for sharing and onboarding
  • A more direct way to introduce the WFGY RAG 16 Problem Map

Position in the WFGY line

WFGY 3.1 is a focused release.

It does not replace the broader WFGY ecosystem.
It acts as a sharper front door for one specific use case:

Use one card to inspect one failing run, classify the break, and decide what to fix next.

If you are debugging real RAG failures, this is the fastest place to start.

Link

https://onestardao.github.io/debug-card/

About

Stars
1,753
Forks
162
Languages
Jupyter Notebook HTML Python
