Resurf

AI Agents & Assistants

A realistic, reproducible test framework for AI browser agents using synthetic, stateful websites with failure‑mode injection

Python · Latest v0.1.1 · 27d ago

Features

  • Provides a production‑shaped synthetic e-commerce site (FastAPI + React + SQLite) in Docker
  • Injects deterministic, per-task failure modes such as latency, payment errors, and server failures
  • Offers auditable success evaluation via database state rather than LLM judgment
  • Supports multiple adapters (DOM/AX‑tree native, Stagehand Node bridge, vision‑only baseline)
  • Records full trajectory data including DOM snapshots, screenshots, actions, token counts, and latencies
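
To illustrate what evaluating success "via database state" can look like, here is a minimal sketch using only Python's standard-library sqlite3 module. The orders table, its columns, and the order_was_placed helper are hypothetical and chosen for illustration; they are not taken from the shop_v1 schema or the resurf SDK.

```python
import sqlite3


def order_was_placed(conn: sqlite3.Connection, user_id: int, sku: str) -> bool:
    """Return True if a paid order for `sku` exists for `user_id`.

    Hypothetical check: the table and column names are assumptions.
    """
    row = conn.execute(
        "SELECT COUNT(*) FROM orders "
        "WHERE user_id = ? AND sku = ? AND status = 'paid'",
        (user_id, sku),
    ).fetchone()
    return row[0] > 0


# A throwaway in-memory database stands in for the site's SQLite file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, sku TEXT, status TEXT)"
)
# Simulate the state an agent would leave behind after a successful checkout.
conn.execute(
    "INSERT INTO orders (user_id, sku, status) VALUES (1, 'MUG-001', 'paid')"
)
conn.commit()

print(order_was_placed(conn, 1, "MUG-001"))
print(order_was_placed(conn, 1, "HAT-002"))
```

Because the verdict is a plain SQL query over final state, the same run yields the same answer every time and can be audited by re-running the query, which is the property the feature list contrasts with LLM judgment.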

Recent releases

v0.1.0 New feature
Notable features
  • resurf SDK with Environment, Task, Runner, Trajectory and adapter ABCs
  • shop_v1 synthetic e-commerce site (FastAPI + React + SQLite) with modifier middleware
  • Adapters: browser-use, stagehand, vision_baseline
Full changelog

Initial public release.

Added

  • resurf SDK (import resurf): Environment, Task, Runner, Trajectory, adapter ABC.
  • shop_v1 synthetic e-commerce site (FastAPI + React + SQLite) with
    modifier middleware (latency, payment_outcome, server_error_rate,
    session_ttl, frozen_time).
  • Adapters: browser-use, stagehand, vision_baseline.
  • ~10 failure-mode templates and ~10 bundled tasks across find / cart /
    checkout / account / adversarial / mobile.
  • CLI: resurf task new | from-template | validate | try | list.
  • resurf-models: shared SQLModel schema + deterministic seed_database.
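
One way a modifier such as server_error_rate could be made deterministic is to draw from a privately seeded random generator, so that the same seed always fails the same requests. The sketch below shows that idea with the standard library only. The Modifiers dataclass, its seed field, and plan_responses are hypothetical names for illustration; the changelog does not say how shop_v1 implements its middleware.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Modifiers:
    """Hypothetical per-task settings; only the field name
    server_error_rate is borrowed from the changelog."""
    server_error_rate: float = 0.0
    seed: int = 0  # assumption: determinism comes from a fixed seed


def plan_responses(mods: Modifiers, n_requests: int) -> list[int]:
    """Return the HTTP status each of the first n_requests would get."""
    # A private Random instance keeps the sequence independent of any
    # other code that uses the global random module.
    rng = random.Random(mods.seed)
    return [
        500 if rng.random() < mods.server_error_rate else 200
        for _ in range(n_requests)
    ]


flaky = Modifiers(server_error_rate=0.3, seed=42)
first_run = plan_responses(flaky, 10)
second_run = plan_responses(flaky, 10)
print(first_run == second_run)  # same seed, same failure pattern
```

With this design a failing agent run can be replayed exactly: the task records the seed and rate, and the injected 500 responses land on the same requests each time.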

About

Stars: 5
Forks: 0
Languages: Python, TypeScript, Jinja

Install & Platforms

Install via: pip