0xMassi/webclaw

v0.1.4 Feature

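This release adds three notable features: QuickJS-based extraction of data embedded in inline scripts, smart text filtering of the extracted strings, and a feature gate that lets WASM builds opt out.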

Published 2mo MCP Developer Tools
✓ No known CVEs patched

Topics

ai ai-agents ai-scraping cli crawler data-extraction firecrawl-alternative html-to-markdown llm markdown mcp mcp-server rust self-hosted tls-fingerprinting web-crawler web-extraction web-scraper web-scraping

Affected surfaces

rce_ssrf

Summary

AI summary

Adds a QuickJS integration that extracts JavaScript-embedded data. It is enabled by default and adds under 15ms of overhead per page.

Full changelog

Added

  • QuickJS integration: embeds a sandboxed JavaScript engine to execute inline <script> tags and extract data hidden in JS variable assignments
  • Captures window.__preloadedData (NYTimes), window.__PRELOADED_STATE__ (Wired/Conde Nast), self.__next_f (Next.js RSC), and any window.__* data blobs
  • Smart text filtering: rejects CSS, base64, file paths, code — only keeps readable prose
  • Feature-gated: enabled by default, disable with --no-default-features for WASM builds
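
The text-filtering bullet above can be illustrated with a minimal sketch. This is not webclaw's actual filter: the function name `looks_like_prose` and every threshold in it are assumptions chosen for illustration, and the real heuristics may differ substantially. It only shows the general idea of rejecting CSS, base64, file paths, and code while keeping readable sentences.

```rust
/// Hypothetical heuristic (not webclaw's code): returns true when `s`
/// looks like readable prose rather than CSS, base64, a path, or code.
fn looks_like_prose(s: &str) -> bool {
    let text = s.trim();
    let words: Vec<&str> = text.split_whitespace().collect();

    // Prose needs several words; a lone token is usually an ID, a path, or base64.
    if words.len() < 5 {
        return false;
    }

    // Very long unbroken tokens suggest base64 or minified code (assumed limit: 40 chars).
    if words.iter().any(|w| w.chars().count() > 40) {
        return false;
    }

    let total = text.chars().count();

    // Reject text where more than 5% of characters are typical code/CSS punctuation.
    let symbols = text
        .chars()
        .filter(|c| "{}[];=<>\\/_|".contains(*c))
        .count();
    if symbols * 100 > total * 5 {
        return false;
    }

    // Require at least 80% letters or whitespace.
    let readable = text
        .chars()
        .filter(|c| c.is_alphabetic() || c.is_whitespace())
        .count();
    readable * 100 >= total * 80
}
```

Under these assumed thresholds, a sentence such as "The senator said on Tuesday that the bill would pass." is kept, while a CSS rule, a single base64 token, or a bare file path is rejected.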

Results

| Site | Before | After | Gain |
|---|---|---|---|
| NYTimes | 1,552 words | 4,162 words | +168% |
| Wired | 1,459 words | 9,937 words | +581% |

Sites that already SSR well (BBC, GitHub, HN) are unaffected — QuickJS only adds content when there's hidden data in scripts.
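
To make "hidden data in scripts" concrete, the sketch below locates `window.__NAME = { ... }` assignments in a script body and returns the raw object literal. It is a simplified static scan, not the QuickJS approach this release uses: webclaw executes the inline scripts in a sandboxed engine, whereas this example only matches text and would miss data built at runtime. The function names `find_window_blobs` and `balanced_object_len` are hypothetical.

```rust
/// Hypothetical helper: byte length of the balanced `{...}` literal at the
/// start of `s`, ignoring braces that appear inside quoted strings.
fn balanced_object_len(s: &str) -> Option<usize> {
    let mut depth = 0usize;
    let mut quote: Option<char> = None;
    let mut escaped = false;
    for (i, c) in s.char_indices() {
        if let Some(q) = quote {
            // Inside a string: track escapes and look for the closing quote.
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == q {
                quote = None;
            }
            continue;
        }
        match c {
            '"' | '\'' => quote = Some(c),
            '{' => depth += 1,
            '}' => {
                depth = depth.checked_sub(1)?;
                if depth == 0 {
                    return Some(i + 1);
                }
            }
            _ => {}
        }
    }
    None // unbalanced input
}

/// Hypothetical static scan: returns (variable name, raw object literal)
/// for each `window.__NAME = {...}` assignment found in `script`.
fn find_window_blobs(script: &str) -> Vec<(String, String)> {
    let mut found = Vec::new();
    let mut pos = 0;
    while let Some(offset) = script[pos..].find("window.__") {
        let name_start = pos + offset + "window.".len();
        // The name runs until the first non-identifier character.
        let name_len = script[name_start..]
            .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_' || c == '$'))
            .unwrap_or(script.len() - name_start);
        let name_end = name_start + name_len;
        pos = name_end;

        // Expect `=` followed by an object literal.
        let rest = script[name_end..].trim_start();
        let rest = match rest.strip_prefix('=') {
            Some(r) => r.trim_start(),
            None => continue,
        };
        if !rest.starts_with('{') {
            continue;
        }
        if let Some(len) = balanced_object_len(rest) {
            found.push((
                script[name_start..name_end].to_string(),
                rest[..len].to_string(),
            ));
        }
    }
    found
}
```

Calling `find_window_blobs` on a script containing `window.__preloadedData = {...};` yields the name `__preloadedData` and the object text, which a JSON parser could then walk for prose strings. A server-rendered page with no such assignments yields an empty list, which matches the statement that those sites are unaffected.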

Performance: <15ms overhead per page. Binary size +1-2MB.

Full changelog: https://github.com/0xMassi/webclaw/blob/main/CHANGELOG.md

Full Changelog: https://github.com/0xMassi/webclaw/compare/v0.1.3...v0.1.4

About 0xMassi/webclaw

Web content extraction for AI agents. 10 tools: scrape, crawl, map, batch, extract, summarize, diff, brand, search, research. TLS fingerprinting bypasses anti-bot without a browser. 67% fewer tokens than raw HTML. `npx create-webclaw` auto-configures Claude, Cursor, Windsurf, Codex, OpenCode.
