The unified web layer for AI agents. One package. Browse, search, post, stay logged in.
A self-hosted Node.js package that gives AI agents complete web access through one API. No duct-taping Camoufox + PinchTab + Tavily + Scrapy + Playwright together. One tool, everything works.
| Feature | Browserbase | Hyperbrowser | Steel.dev | Spectrawl |
|---|---|---|---|---|
| Self-hosted | ❌ | ❌ | ✅ | ✅ |
| Search built-in | ❌ | ❌ | ❌ | ✅ |
| Auth lifecycle | ❌ | ❌ | Partial | ✅ |
| Multi-account | ❌ | ❌ | ❌ | ✅ |
| Cookie auto-refresh | ❌ | ❌ | ❌ | ✅ |
| MCP server | ❌ | ❌ | ❌ | ✅ |
| Fingerprint mgmt | ❌ | ✅ | ❌ | ✅ |
| Per-second billing | ✅ | ✅ | ❌ | Free |
Free Tavily replacement. Aggregates free search APIs, scrapes the top results, and optionally summarizes them with an LLM.
agent.search("dental implants seoul", { summarize: true, scrapeTop: 3 })
Search cascade (saves quota):
SearXNG (self-hosted, 70+ engines)
→ not enough? → Brave (2000/mo free)
→ still not enough? → Serper (2500/mo free)
→ Google CSE (100/day free)
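The cascade above can be sketched as a loop that stops as soon as enough unique results have accumulated. This is a minimal sketch, not Spectrawl's actual implementation: `runCascade`, the provider objects, and the `minResults` threshold are hypothetical names, and the providers are stubs rather than real API clients.

```javascript
// Hypothetical sketch of a search cascade: query providers in order and
// stop as soon as enough unique results have been collected.
async function runCascade(query, providers, minResults = 5) {
  const seen = new Set();
  const results = [];
  const used = [];

  for (const provider of providers) {
    let hits = [];
    try {
      hits = await provider.search(query);
    } catch (err) {
      // A failing provider should not end the cascade; try the next one.
      continue;
    }
    used.push(provider.name);
    for (const hit of hits) {
      if (!seen.has(hit.url)) {
        seen.add(hit.url);
        results.push(hit);
      }
    }
    if (results.length >= minResults) break; // enough: save quota on the rest
  }
  return { results, used };
}

// Stub providers standing in for SearXNG, Brave, and Serper (assumed shapes).
const stubProvider = (name, urls) => ({
  name,
  search: async () => urls.map((url) => ({ url, title: url })),
});

const cascadeProviders = [
  stubProvider("searxng", ["https://a.example", "https://b.example"]),
  stubProvider("brave", ["https://b.example", "https://c.example"]),
  stubProvider("serper", ["https://d.example"]),
];

const cascadeDemo = runCascade("dental implants seoul", cascadeProviders, 3);
cascadeDemo.then(({ results, used }) => {
  console.log(used, results.length); // serper is never called in this run
});
```

Deduplicating by URL before counting matters here: overlapping results from two engines should not satisfy the threshold early.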
Scrape escalation:
fetch + readability (fast, free, 90% of pages)
→ blocked/JS-heavy? → stealth browser (Camoufox)
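A rough sketch of the escalation step, assuming the cheap path and the browser path are injected as functions. The status codes treated as "blocked" and the 200-character threshold for "JS-heavy" are illustrative guesses, and `fetchPage` / `browsePage` are hypothetical stand-ins for fetch + readability and a Camoufox session.

```javascript
// Hypothetical sketch: try a cheap HTTP fetch first, and escalate to a
// stealth browser only when the page is blocked or looks JS-rendered.
const BLOCKED_STATUSES = new Set([401, 403, 429, 503]); // assumed heuristic

function needsBrowser(response, minTextLength = 200) {
  if (BLOCKED_STATUSES.has(response.status)) return true;
  // Very little readable text often means the content is rendered by JS.
  return response.text.trim().length < minTextLength;
}

async function scrape(url, { fetchPage, browsePage }) {
  try {
    const response = await fetchPage(url);
    if (!needsBrowser(response)) {
      return { url, content: response.text, via: "fetch" };
    }
  } catch (err) {
    // Network-level failure: fall through to the browser path.
  }
  const content = await browsePage(url);
  return { url, content, via: "browser" };
}

// Stubs so the example runs without network access or a real browser.
const fetchPage = async (url) =>
  url.includes("blocked")
    ? { status: 403, text: "" }
    : { status: 200, text: "x".repeat(500) };
const browsePage = async () => "rendered content";

const scrapeDemo = Promise.all([
  scrape("https://example.com/article", { fetchPage, browsePage }),
  scrape("https://example.com/blocked", { fetchPage, browsePage }),
]);
scrapeDemo.then((pages) => console.log(pages.map((p) => p.via)));
// → [ 'fetch', 'browser' ]
```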
Output:
{
"answer": "LLM-generated summary with citations",
"sources": [
{ "url": "...", "title": "...", "snippet": "...", "fullContent": "..." }
],
"cached": false
}
Stealth web browsing with anti-detection. Two modes: headless (fast) and visible (for manual login).
agent.browse("https://reddit.com/r/dentistry", { auth: "reddit" })
Browser selection:
Playwright (fast, cheap, default)
→ detected/blocked? → Camoufox (stealth, anti-fingerprint)
Features:
- Residential proxy rotation (first-class config, not per-tool)
- Canvas/WebGL/font fingerprint management
- Human-like timing (random delays, natural scroll patterns)
- Screenshot + DOM extraction in one call
- Form filling that handles contentEditable, shadow DOM, React controlled inputs
Persistent authenticated sessions with automatic cookie lifecycle.
agent.auth.add("x", { account: "@xanlens__" }) // opens browser for manual login
agent.auth.add("reddit", { clientId: "...", secret: "..." }) // OAuth, fully automated
Auth priority per platform:
OAuth token (if platform supports it) → auto-refresh built in
→ not available? → Cookie capture (one-time manual login)
→ creds provided? → Headless login (no manual step)
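One possible reading of this priority list, as a small selection function. It assumes stored credentials are preferred over a manual cookie capture when both are possible; the field names (`supportsOAuth`, `clientId`, `secret`, `username`, `password`) and the returned method labels are hypothetical, not a confirmed Spectrawl schema.

```javascript
// Hypothetical sketch of choosing an auth method for one account.
function pickAuthMethod(platform, creds = {}) {
  if (platform.supportsOAuth && creds.clientId && creds.secret) {
    return "oauth"; // token refresh can be automated
  }
  if (creds.username && creds.password) {
    return "headless_login"; // no manual step needed
  }
  return "cookie_capture"; // one-time manual login in a visible browser
}

console.log(pickAuthMethod({ supportsOAuth: true }, { clientId: "id", secret: "s" }));
// → oauth
console.log(pickAuthMethod({ supportsOAuth: false }, { username: "u", password: "p" }));
// → headless_login
console.log(pickAuthMethod({ supportsOAuth: false }));
// → cookie_capture
```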
Cookie lifecycle:
cron checks validity every N hours
→ valid? → nothing
→ expiring? → event hook: "cookie_expiring"
→ expired? → auto re-login via stored method
→ re-login failed? → event hook: "auth_failed"
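A single pass of that check might look like the sketch below. The event names and payload fields come from this document; everything else (`classifySession`, `checkSessions`, the `expiresAt` timestamp, the 24-hour warning window, and `expiresIn` being in milliseconds) is assumed for illustration.

```javascript
const { EventEmitter } = require("node:events");

const HOUR_MS = 60 * 60 * 1000;

// Classify a stored session by how close its cookie is to expiry.
function classifySession(session, now, warnWindowMs) {
  if (session.expiresAt <= now) return "expired";
  if (session.expiresAt - now <= warnWindowMs) return "expiring";
  return "valid";
}

// One pass of the periodic check over all stored sessions.
async function checkSessions(sessions, hooks, options = {}) {
  const { now = Date.now(), warnWindowMs = 24 * HOUR_MS, relogin } = options;
  for (const session of sessions) {
    const { platform, account } = session;
    const state = classifySession(session, now, warnWindowMs);
    if (state === "expiring") {
      hooks.emit("cookie_expiring", { platform, account, expiresIn: session.expiresAt - now });
    } else if (state === "expired") {
      try {
        await relogin(session); // re-login via the stored method
      } catch (err) {
        hooks.emit("auth_failed", { platform, account, reason: err.message });
      }
    }
    // "valid" sessions need no action.
  }
}

// Demo with a re-login that always fails, to exercise both hooks.
const lifecycleHooks = new EventEmitter();
const firedEvents = [];
lifecycleHooks.on("cookie_expiring", (e) => firedEvents.push(`cookie_expiring:${e.account}`));
lifecycleHooks.on("auth_failed", (e) => firedEvents.push(`auth_failed:${e.account}`));

const checkTime = Date.now();
const lifecycleDemo = checkSessions(
  [
    { platform: "x", account: "@a", expiresAt: checkTime + 14 * 24 * HOUR_MS },
    { platform: "ih", account: "@b", expiresAt: checkTime + 2 * HOUR_MS },
    { platform: "x", account: "@c", expiresAt: checkTime - 2 * HOUR_MS },
  ],
  lifecycleHooks,
  { now: checkTime, relogin: async () => { throw new Error("CAPTCHA required"); } }
).then(() => firedEvents);

lifecycleDemo.then((events) => console.log(events));
// → [ 'cookie_expiring:@b', 'auth_failed:@c' ]
```

In a long-running server, a pass like this could be scheduled with `setInterval` at the configured refresh interval.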
Multi-account management:
agent.auth.list()
// → [{ platform: "x", account: "@xanlens__", status: "valid", expires: "..." },
// { platform: "x", account: "@xankriegor_", status: "valid", method: "oauth" },
// { platform: "reddit", account: "@EntrepreneurSharp538", status: "valid" }]
Authenticated actions on platforms. Agent says what to do, tool handles the how.
agent.act("x", "post", { account: "@xanlens__", text: "hello world" })
agent.act("reddit", "post", { account: "@EntrepreneurSharp538", subreddit: "SaaS", title: "...", body: "..." })
agent.act("reddit", "comment", { account: "@EntrepreneurSharp538", postId: "...", text: "..." })
Platform adapters: Each platform has quirks. One adapter per platform handles them:
| Platform | Primary Method | Fallback | Known Quirks |
|---|---|---|---|
| X | Cookie API (GraphQL) | Stealth browser | contentEditable compose, execCommand("insertText") |
| Reddit | Cookie API (OAuth) | Stealth browser | Blocks datacenter IPs on web, not on oauth.reddit.com |
| IH | Browser automation | — | No API, Camoufox-only |
| LinkedIn | Cookie API | Stealth browser | Aggressive bot detection |
| Dev.to | REST API | — | Official API, easy |
| Hashnode | GraphQL API | — | Official API, easy |
SQLite with TTL expiry. Same search query twice = instant, free.
cache: {
search: { ttl: "1h" }, // search results
scrape: { ttl: "24h" }, // scraped page content
screenshot: { ttl: "1h" } // page screenshots
}
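To illustrate how TTL strings like `"1h"` and `"24h"` could drive expiry, here is a minimal sketch. It uses an in-memory `Map` as a stand-in for the SQLite store, and `parseTtl` and `TtlCache` are hypothetical names; the supported units (`s`, `m`, `h`, `d`) are an assumption.

```javascript
const UNIT_MS = { s: 1000, m: 60 * 1000, h: 60 * 60 * 1000, d: 24 * 60 * 60 * 1000 };

// Convert a TTL string such as "1h" or "24h" into milliseconds.
function parseTtl(ttl) {
  const match = /^(\d+)([smhd])$/.exec(ttl);
  if (!match) throw new Error(`Unsupported TTL: ${ttl}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}

// In-memory stand-in for the SQLite-backed cache; entries expire lazily on read.
class TtlCache {
  constructor(ttl, clock = Date.now) {
    this.ttlMs = parseTtl(ttl);
    this.clock = clock; // injectable so expiry can be tested without waiting
    this.entries = new Map();
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.clock()) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.clock() + this.ttlMs });
  }
}

// Demo with a fake clock: a hit after 30 minutes, a miss after 61 minutes.
let fakeNow = 0;
const searchCache = new TtlCache("1h", () => fakeNow);
searchCache.set("dental implants seoul", { cached: true });

fakeNow = 30 * 60 * 1000;
console.log(searchCache.get("dental implants seoul")); // → { cached: true }

fakeNow = 61 * 60 * 1000;
console.log(searchCache.get("dental implants seoul")); // → undefined
```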
Built-in delays that mimic human browsing:
- Random intervals between actions (configurable min/max)
- Natural scroll patterns before clicking
- Typing delays for form filling
- Per-platform rate limits (X: max N posts/hour, Reddit: min 10min between posts)
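The timing rules above can be sketched with two small helpers: one for a random pause between actions, and one that computes how long to wait before the next post. The `postsPerHour` and `minDelayMs` keys follow the config shown later in this document; `randomDelayMs`, `waitBeforePost`, and the sliding one-hour window are assumptions made for illustration.

```javascript
const HOUR = 60 * 60 * 1000;

// Random pause in [minMs, maxMs); `random` is injectable for testing.
function randomDelayMs(minMs, maxMs, random = Math.random) {
  return Math.floor(minMs + random() * (maxMs - minMs));
}

// How long (ms) to wait before the next post is allowed on a platform,
// given timestamps of earlier posts and that platform's limits.
function waitBeforePost(postTimes, limits, now = Date.now()) {
  if (postTimes.length === 0) return 0;
  const sorted = [...postTimes].sort((a, b) => a - b);

  // Rule 1: minimum gap since the most recent post.
  const last = sorted[sorted.length - 1];
  let wait = last + limits.minDelayMs - now;

  // Rule 2: at most `postsPerHour` posts in any trailing hour.
  const recent = sorted.filter((t) => t > now - HOUR);
  if (recent.length >= limits.postsPerHour) {
    // Wait until the post that fills the quota drops out of the window.
    const blocking = recent[recent.length - limits.postsPerHour];
    wait = Math.max(wait, blocking + HOUR - now);
  }
  return Math.max(0, wait);
}

const redditLimits = { postsPerHour: 3, minDelayMs: 600000 };
const t0 = 10 * HOUR;
const minutes = (n) => n * 60 * 1000;

console.log(waitBeforePost([t0 - minutes(5)], redditLimits, t0));
// → 300000 (5 more minutes to satisfy the 10-minute gap)
console.log(waitBeforePost([t0 - minutes(50), t0 - minutes(30), t0 - minutes(5)], redditLimits, t0));
// → 600000 (hourly quota is full until the oldest post ages out)
```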
When something fails, return why — not just an error code:
{
"success": false,
"error": "auth_expired",
"detail": "X cookie for @xanlens__ expired 2h ago. Auto-refresh failed: CAPTCHA required.",
"suggestion": "Run `spectrawl login x --account @xanlens__` for manual re-login"
}
Agent fires multiple actions. Tool manages concurrency:
- Configurable concurrent browser sessions (default: 3)
- Exponential backoff on failures
- Dead letter queue for permanently failed actions
- Action deduplication (same post won't fire twice)
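A compact sketch of three of these behaviors: exponential backoff, a dead letter queue, and deduplication. It processes actions one at a time and leaves out concurrent sessions for brevity. `ActionQueue`, `backoffMs`, `actionKey`, and the retry defaults are hypothetical, not Spectrawl's actual queue.

```javascript
const crypto = require("node:crypto");

// Exponential backoff: base * 2^attempt, capped at capMs.
function backoffMs(attempt, baseMs = 1000, capMs = 60000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Dedup key from the action's JSON form. This assumes actions are built
// with a consistent property order; a canonical serializer would be safer.
function actionKey(action) {
  return crypto.createHash("sha256").update(JSON.stringify(action)).digest("hex");
}

class ActionQueue {
  constructor({ maxRetries = 3, sleep } = {}) {
    this.maxRetries = maxRetries;
    this.sleep = sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
    this.seen = new Set();
    this.pending = [];
    this.deadLetters = [];
  }

  // Returns false when an identical action was already enqueued.
  enqueue(action) {
    const key = actionKey(action);
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    this.pending.push(action);
    return true;
  }

  // Run every pending action, retrying failures with backoff.
  async drain(handler) {
    while (this.pending.length > 0) {
      const action = this.pending.shift();
      for (let attempt = 0; ; attempt++) {
        try {
          await handler(action);
          break;
        } catch (err) {
          if (attempt >= this.maxRetries) {
            this.deadLetters.push({ action, error: err.message });
            break;
          }
          await this.sleep(backoffMs(attempt));
        }
      }
    }
  }
}

// Demo: no-op sleep so the example finishes instantly.
const actionQueue = new ActionQueue({ maxRetries: 2, sleep: async () => {} });
const helloPost = { platform: "x", type: "post", text: "hello world" };
actionQueue.enqueue(helloPost);
const duplicateAccepted = actionQueue.enqueue({ ...helloPost }); // false
actionQueue.enqueue({ platform: "reddit", type: "post", title: "always fails" });

const queueDemo = actionQueue
  .drain(async (action) => {
    if (action.platform === "reddit") throw new Error("rate_limited");
  })
  .then(() => actionQueue.deadLetters.length);

queueDemo.then((count) => console.log(duplicateAccepted, count)); // → false 1
```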
Proactive notifications to the agent:
agent.on("cookie_expiring", ({ platform, account, expiresIn }) => { ... })
agent.on("auth_failed", ({ platform, account, reason }) => { ... })
agent.on("rate_limited", ({ platform, retryAfter }) => { ... })
agent.on("action_failed", ({ action, error, retryCount }) => { ... })
Built-in probe: "are my sessions still valid?"
spectrawl status
// x/@xanlens__ ✅ valid (expires in 14d)
// x/@xankriegor_ ✅ valid (OAuth, auto-refresh)
// reddit/@Entrep... ✅ valid (expires in 22d)
// ih/@Fay ⚠️ expiring (2h left)
Exposed as an MCP tool so any agent framework can use it natively:
{
"tools": [
{ "name": "web_search", "description": "Search the web with free APIs" },
{ "name": "web_browse", "description": "Browse a URL with stealth/auth" },
{ "name": "web_act", "description": "Perform authenticated action on a platform" },
{ "name": "web_auth", "description": "Manage platform authentication" },
{ "name": "web_status", "description": "Check auth health for all accounts" }
]
}
Any OpenClaw agent, Claude Code session, or MCP-compatible tool gets access for free. This is the distribution play.
┌──────────────────────────────────────────────────────┐
│ Spectrawl API │
│ (HTTP server + MCP server) │
├──────────┬──────────┬──────────┬─────────────────────┤
│ SEARCH │ BROWSE │ AUTH │ ACT │
├──────────┼──────────┼──────────┼─────────────────────┤
│ DDG │ Playwr. │ OAuth │ Platform Adapters │
│ Brave │ Camoufox │ Cookie │ ┌─────┬──────┬────┐ │
│ Serper │ │ Store │ │ X │Reddit│ IH │ │
│ Google │ Proxy │ (SQLite) │ ├─────┼──────┼────┤ │
│ CSE │ Pool │ │ │Dev │Hash │Link│ │
│ │ │ Refresh │ │.to │node │edIn│ │
│ Scraper │ Finger- │ Cron │ └─────┴──────┴────┘ │
│ (reada- │ print │ │ │
│ bility) │ Mgmt │ Event │ Form Filler │
│ │ │ Hooks │ (contentEditable, │
│ LLM │ Human- │ │ shadow DOM, React) │
│ Summary │ like │ │ │
│ (opt.) │ Timing │ │ Queue + Retry │
├──────────┴──────────┴──────────┴─────────────────────┤
│ CACHE (SQLite + TTL) │
├──────────────────────────────────────────────────────┤
│ CONFIG (spectrawl.json) │
│ - proxy settings │
│ - API keys (Brave, Serper — optional) │
│ - rate limits per platform │
│ - LLM provider for summaries │
│ - cache TTLs │
│ - concurrent sessions limit │
└──────────────────────────────────────────────────────┘
- Runtime: Node.js (matches xanOS)
- Stealth browser: Camoufox
- Fast browser: Playwright
- Storage: SQLite (cache, cookies, queue)
- Config: `spectrawl.json` in project root
- CLI: `spectrawl` command for login, status, health
- Server: HTTP + MCP dual server
- Package: npm (`@fayandxan/spectrawl` or `spectrawl`)
# Setup
spectrawl init # creates spectrawl.json
spectrawl login x --account @handle # opens browser for manual login
spectrawl login reddit --oauth # OAuth flow
# Status
spectrawl status # all accounts health
spectrawl status x # X accounts only
# Server
spectrawl serve # start HTTP + MCP server
spectrawl serve --port 3900 # custom port
# Test
spectrawl search "query" # test search from CLI
spectrawl browse https://example.com # test browse from CLI
{
"port": 3900,
"proxy": {
"type": "residential",
"provider": "proxycheap",
"key": "..."
},
"search": {
"brave_key": "...",
"serper_key": "...",
"cascade": ["ddg", "brave", "serper"],
"scrapeTop": 3,
"llm": {
"provider": "minimax",
"model": "m2.5",
"apiKey": "..."
}
},
"auth": {
"refreshInterval": "4h",
"cookieStore": "./data/cookies.db"
},
"rateLimit": {
"x": { "postsPerHour": 5, "minDelayMs": 30000 },
"reddit": { "postsPerHour": 3, "minDelayMs": 600000 }
},
"cache": {
"path": "./data/cache.db",
"searchTtl": "1h",
"scrapeTtl": "24h"
},
"concurrency": 3
}
- npm package: `npm install spectrawl`
- GitHub repo: open source, MIT
- MCP listing: submit to MCP registries
- OpenClaw skill: publish on ClawHub
- Post in OpenClaw GitHub Discussions: "built this to solve multi-tool browsing"
- xanOS uses it internally — dogfood + proof it works
Every user who installs Spectrawl discovers xanOS exists. That's the funnel.