Resurf
AI Agents & Assistants
A realistic, reproducible test framework for AI browser agents using synthetic, stateful websites with failure‑mode injection
Python · Latest v0.1.1 · 27d ago
Features
- Provides a production-shaped synthetic e-commerce site (FastAPI + React + SQLite) in Docker
- Injects deterministic failure modes such as latency, payment errors, and server failures per task
- Offers auditable success evaluation via database state rather than LLM judgment
- Supports multiple adapters (DOM/AX-tree native, Stagehand Node bridge, vision-only baseline)
- Records full trajectory data including DOM snapshots, screenshots, actions, token counts, and latencies
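To illustrate what evaluating success from database state (rather than asking an LLM to judge a transcript) might look like, here is a minimal sketch using only the standard-library `sqlite3` module. The `orders` and `order_items` tables, their columns, and the `order_placed` helper are hypothetical and are not taken from Resurf's actual schema or API.

```python
import sqlite3


def order_placed(conn: sqlite3.Connection, user_id: int, sku: str, quantity: int) -> bool:
    """Return True if the database holds a paid order with the expected line item.

    Hypothetical schema: orders(id, user_id, status), order_items(order_id, sku, quantity).
    """
    row = conn.execute(
        """
        SELECT COUNT(*)
        FROM orders AS o
        JOIN order_items AS i ON i.order_id = o.id
        WHERE o.user_id = ? AND o.status = 'paid' AND i.sku = ? AND i.quantity = ?
        """,
        (user_id, sku, quantity),
    ).fetchone()
    return row[0] > 0


def demo() -> tuple:
    """Build a tiny in-memory database and run the check against it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT);
        CREATE TABLE order_items (order_id INTEGER, sku TEXT, quantity INTEGER);
        INSERT INTO orders VALUES (1, 7, 'paid');
        INSERT INTO order_items VALUES (1, 'MUG-01', 2);
        """
    )
    # The first check matches the stored order; the second asks for a different quantity.
    return order_placed(conn, 7, "MUG-01", 2), order_placed(conn, 7, "MUG-01", 3)
```

Because the verdict is a SQL query over final state, anyone can rerun it and audit the result, which is the property the feature list emphasizes.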
v0.1.0 · New feature
Notable features
- resurf SDK with Environment, Task, Runner, Trajectory and adapter ABCs
- shop_v1 synthetic e-commerce site (FastAPI + React + SQLite) with modifier middleware
- Adapters: browser-use, stagehand, vision_baseline
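One way a modifier such as `server_error_rate` could be made deterministic per task is to derive each decision from a hash of the task identifier and the request index instead of a global random source. The sketch below shows that idea in plain Python. The `Modifiers` dataclass, `should_fail`, and `plan_responses` are hypothetical names chosen for illustration, and this is not Resurf's middleware implementation.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Modifiers:
    """Hypothetical per-task settings, loosely named after the listed modifiers."""
    latency_ms: int = 0
    server_error_rate: float = 0.0


def should_fail(task_id: str, request_index: int, rate: float) -> bool:
    """Decide deterministically whether a given request should fail."""
    digest = hashlib.sha256(f"{task_id}:{request_index}".encode()).digest()
    draw = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    # Compare in integer space so rate=0.0 never fails and rate=1.0 always fails.
    return draw < int(rate * 2**64)


def plan_responses(task_id: str, n_requests: int, mods: Modifiers) -> list:
    """Return (status_code, latency_ms) for each request index of a task."""
    return [
        (500 if should_fail(task_id, i, mods.server_error_rate) else 200, mods.latency_ms)
        for i in range(n_requests)
    ]
```

Replaying the same task identifier with the same settings yields the same sequence of failures, so an agent's run can be reproduced exactly.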
Full changelog
Initial public release.
Added
- resurf SDK (import resurf): Environment, Task, Runner, Trajectory, adapter ABC.
- shop_v1 synthetic e-commerce site (FastAPI + React + SQLite) with modifier middleware (latency, payment_outcome, server_error_rate, session_ttl, frozen_time).
- Adapters: browser-use, stagehand, vision_baseline.
- ~10 failure-mode templates and ~10 bundled tasks across find / cart / checkout / account / adversarial / mobile.
- CLI: resurf task new | from-template | validate | try | list.
- resurf-models: shared SQLModel schema + deterministic seed_database.
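The changelog mentions a deterministic `seed_database` but does not show how it works. As a rough sketch of the general technique, the example below seeds a SQLite table from a `random.Random` instance with a fixed seed, so that every run starts from identical data. The `seed_products` and `snapshot` functions and the `products` table are hypothetical. The real package is described as using SQLModel, whereas this sketch uses only the standard library.

```python
import random
import sqlite3


def seed_products(conn: sqlite3.Connection, seed: int, count: int = 5) -> None:
    """Fill a hypothetical products table using a locally seeded generator."""
    rng = random.Random(seed)  # isolated from the global random state
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, sku TEXT, price_cents INTEGER)"
    )
    rows = [(i, f"SKU-{i:03d}", rng.randrange(100, 10_000)) for i in range(1, count + 1)]
    conn.executemany("INSERT INTO products VALUES (?, ?, ?)", rows)
    conn.commit()


def snapshot(seed: int) -> list:
    """Seed a fresh in-memory database and return its rows in a stable order."""
    conn = sqlite3.connect(":memory:")
    seed_products(conn, seed)
    return conn.execute(
        "SELECT id, sku, price_cents FROM products ORDER BY id"
    ).fetchall()
```

Two databases built from the same seed hold the same rows, which is what lets a task's expected end state be written down ahead of time.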
About
Languages: Python · TypeScript · Jinja