Evidence-aware web search, browsing, and claim verification for AI agents.
CLI + MCP + skill surfaces for Gemini, Claude Code, OpenClaw, Manus, and other agent runtimes.
Cross-Validated Search is the flagship repo for source-backed agent answers. It combines live web search, page reading, and claim checking so an agent can surface supporting evidence, conflicting evidence, and source-backed confidence before presenting factual answers.
Canonical names in v16: package `cross-validated-search`, module `cross_validated_search`, and MCP command `cross-validated-search-mcp`. Legacy `free_web_search` imports and `free-web-search-mcp` remain available as compatibility aliases.
Recommended free path: ddgs + self-hosted SearXNG. Configure `CROSS_VALIDATED_SEARCH_SEARXNG_URL` to unlock a free second provider and stronger evidence reports.
- one install gives you `search-web`, `browse-page`, `verify-claim`, and `evidence-report`
- one repo covers CLI, MCP, Gemini, Claude Code, OpenClaw, Manus, and Copilot-adjacent workflows
- one flagship workflow turns raw search results into an evidence artifact with rationale, conflicts, and next steps
If you are reviewing this repo for collection or ecosystem inclusion, the fastest path is:
- verify the flagship workflow: `evidence-report "Python 3.13 stable release" --claim "Python 3.13 is the latest stable release" --deep --json`
- review the ecosystem contract: docs/ecosystem-readiness.md
- review the free dual-provider bootstrap: docs/searxng-self-hosted.md
- review Gemini gallery readiness: docs/gemini-submission-checklist.md
- review Claude Code and Manus setup notes: docs/claude-code.md, docs/manus.md
Install and verify the public surface:

```bash
pip install cross-validated-search
search-web "Python 3.13 release" --json
verify-claim "Python 3.13 is the latest stable release" --deep --max-pages 2 --json
evidence-report "Python 3.13 stable release" --claim "Python 3.13 is the latest stable release" --deep --json
```

Typical `evidence-report` JSON shape:
```json
{
  "verdict": "contested",
  "confidence": "MEDIUM",
  "coverage_warnings": [
    "Single-provider evidence path. Add another provider when possible."
  ],
  "analysis": {
    "report_model": "evidence-report-v2",
    "provider_diversity": 1,
    "page_aware": true,
    "support_score": 1.42,
    "conflict_score": 0.61,
    "coverage_warning_count": 1
  }
}
```

If you are evaluating the repo for ecosystem collection, start with docs/ecosystem-readiness.md.
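Downstream agent code can gate on this payload before presenting a factual answer. A minimal sketch — the helper name and gating policy are illustrative, not part of the package; the field names follow the sample shape above:

```python
import json

# Sample payload mirroring the evidence-report shape shown above.
payload = json.loads("""
{
  "verdict": "contested",
  "confidence": "MEDIUM",
  "coverage_warnings": ["Single-provider evidence path. Add another provider when possible."],
  "analysis": {"report_model": "evidence-report-v2", "provider_diversity": 1,
               "page_aware": true, "support_score": 1.42, "conflict_score": 0.61,
               "coverage_warning_count": 1}
}
""")

def should_present_answer(report: dict) -> bool:
    """Illustrative policy: only surface an answer when the verdict is
    positive and no coverage warnings are outstanding."""
    return (report["verdict"] in {"supported", "likely_supported"}
            and not report["coverage_warnings"])

print(should_present_answer(payload))  # → False: verdict is "contested"
```

An agent that gets `False` here might re-run with `--deep`, add a second provider, or present the conflicting sources instead of a flat answer.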
| Surface | Entry | Why it matters |
|---|---|---|
| CLI | `search-web`, `browse-page`, `verify-claim`, `evidence-report` | fastest way to verify the repo in under a minute |
| MCP | `cross-validated-search-mcp` | works with MCP-native agent runtimes |
| Gemini | GEMINI.md, .gemini/SKILL.md | gallery / extension readiness |
| Claude Code | .claude/skills/cross-validated-search/SKILL.md | local skill install path |
| OpenClaw | cross_validated_search/skills/SKILL.md | bundled skill surface |
| Manus | SKILL.md, docs/manus.md | GitHub-import-friendly skill |
```console
$ search-web "Python 3.13 release" --json
... ranked sources with citations ...
$ verify-claim "Python 3.13 is the latest stable release" --deep --max-pages 2 --json
{"verdict":"likely_supported","confidence":"MEDIUM", ...}
$ evidence-report "Python 3.13 stable release" --claim "Python 3.13 is the latest stable release" --deep --json
{"verdict":"contested","confidence":"MEDIUM","analysis":{"provider_diversity":1,"page_aware":true}}
```

Most search wrappers stop at "here are some results." This repository goes one step further:
- returns structured search results with citations
- reads full pages when snippets are not enough
- classifies evidence as supporting, conflicting, or neutral
- generates a higher-level evidence report with citation-ready sources and next steps
- exposes explainable confidence signals instead of a black-box claim
- works across CLI, MCP, Gemini, OpenClaw, and other agent workflows
Use live search for factual or time-sensitive queries.
```bash
search-web "Python 3.13 release"
search-web "OpenAI release news" --type news --timelimit w
search-web "人工智能最新进展" --region zh-cn --json
```

Read the full text of a URL when snippets are not enough.
```bash
browse-page "https://example.com/article"
browse-page "https://example.com/article" --json
```

Check whether a claim looks supported, contested, likely false, or still under-evidenced.
```bash
verify-claim "Python 3.13 is the latest stable release"
verify-claim "OpenAI released GPT-5 this week" --timelimit w --json
verify-claim "Python 3.13 is the latest stable release" --with-pages --max-pages 2
```

Generate a compact report that combines search, verification, citations, and follow-up guidance.
```bash
evidence-report "Python 3.13 stable release"
evidence-report "Python 3.13 stable release" --claim "Python 3.13 is the latest stable release" --deep --json
```

The report now includes:
- verdict rationale that explains why the score landed where it did
- stance summary buckets for supporting, conflicting, and neutral evidence
- coverage warnings when provider diversity, domain diversity, or page-aware depth look weak
- citation-ready source digests and recommended next steps
If you want the strongest free setup, self-host SearXNG and pair it with ddgs:
```bash
./scripts/start-searxng.sh
export CROSS_VALIDATED_SEARCH_SEARXNG_URL="http://127.0.0.1:8080"
./scripts/validate-free-path.sh
```

Or use the compose file directly:

```bash
cp .env.searxng.example .env
docker compose -f docker-compose.searxng.yml up -d
```

Full setup and validation guide: docs/searxng-self-hosted.md.
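The effect of that environment variable on provider selection can be pictured with a small hypothetical helper — this is not the package's internals, only an illustration of the documented behavior: ddgs is always available, and SearXNG joins as a second provider once the URL is set.

```python
import os

def active_providers(env: dict) -> list:
    """Illustrative: ddgs is the always-on default provider; SearXNG
    joins the pool only when CROSS_VALIDATED_SEARCH_SEARXNG_URL is set."""
    providers = ["ddgs"]
    if env.get("CROSS_VALIDATED_SEARCH_SEARXNG_URL"):
        providers.append("searxng")
    return providers

print(active_providers(dict(os.environ)))
print(active_providers({"CROSS_VALIDATED_SEARCH_SEARXNG_URL": "http://127.0.0.1:8080"}))
# → ['ddgs', 'searxng']
```

With two providers active, `provider_diversity` in the evidence report can reach 2, which clears the "Single-provider evidence path" coverage warning shown earlier.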
The current verification model is evidence-aware-heuristic-v3, and the flagship report surface is evidence-report-v2. Together they use:
- keyword overlap between the claim and returned evidence
- contradiction markers in titles and snippets
- source-quality heuristics
- source freshness when a parseable date exists
- domain diversity across the evidence set
- optional page text from top fetched sources
- optional provider diversity when a second provider is configured
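The first two signals — keyword overlap and contradiction markers — can be illustrated with a toy stance classifier. This is a simplified sketch, not the actual evidence-aware-heuristic-v3 implementation; the marker list and the 0.3 overlap threshold are invented for illustration:

```python
import re

# Hypothetical contradiction markers; the real model's list is not documented here.
CONTRADICTION_MARKERS = {"not", "no longer", "false", "debunked", "denies"}

def keyword_overlap(claim: str, snippet: str) -> float:
    """Fraction of the claim's keywords that also appear in the snippet."""
    claim_words = set(re.findall(r"[a-z0-9.]+", claim.lower()))
    snippet_words = set(re.findall(r"[a-z0-9.]+", snippet.lower()))
    return len(claim_words & snippet_words) / max(len(claim_words), 1)

def snippet_stance(claim: str, snippet: str) -> str:
    """Classify one snippet as supporting, conflicting, or neutral."""
    if keyword_overlap(claim, snippet) < 0.3:
        return "neutral"  # too little topical overlap to count either way
    text = snippet.lower()
    if any(marker in text for marker in CONTRADICTION_MARKERS):
        return "conflicting"
    return "supporting"

print(snippet_stance("Python 3.13 is the latest stable release",
                     "Python 3.13.0 is now the latest stable release of Python"))
# → supporting
```

The real model then aggregates per-snippet stances with source quality, freshness, domain diversity, and optional page text into the `support_score` and `conflict_score` seen in the report JSON.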
Details and limitations are documented in docs/trust-model.md. For a quick product-level comparison with plain search wrappers, see docs/why-not-just-search.md. The next calibration step is outlined in docs/benchmark-plan.md. Quick repository-level context lives in docs/use-cases.md, docs/benchmarks.md, and docs/external-threads.md.
```bash
pip install cross-validated-search
```

Or install from source:

```bash
git clone https://github.com/wd041216-bit/cross-validated-search.git
cd cross-validated-search
pip install -e .
```

Python 3.10+ is required.
| Surface | Status | Notes |
|---|---|---|
| CLI | Yes | search-web, browse-page, verify-claim, evidence-report |
| MCP | Yes | cross-validated-search-mcp |
| Gemini CLI | Yes | gemini-extension.json, root skills/, and .gemini/SKILL.md |
| OpenClaw | Yes | cross_validated_search/skills/SKILL.md |
| Claude Code | Yes | .claude/skills/cross-validated-search/SKILL.md |
| Cursor / Continue / Copilot | Yes | Bundled instruction and skill files |
| Manus | Yes | Root SKILL.md plus docs/manus.md |
`verify-claim` returns one of five verdicts:

| Verdict | Meaning |
|---|---|
| `supported` | Strong support, low conflict, and enough domain diversity |
| `likely_supported` | Evidence leans positive but is not decisive |
| `contested` | Support and conflict both carry meaningful weight |
| `likely_false` | Conflict is strong and support is weak |
| `insufficient_evidence` | Evidence exists but is too weak for a firmer claim |
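The mapping from aggregate scores to these five verdicts can be sketched as follows. The thresholds are hypothetical, chosen only to show the decision shape, and do not reflect the package's calibrated values:

```python
def verdict_from_scores(support: float, conflict: float, domains: int) -> str:
    """Illustrative mapping from aggregate evidence scores to a verdict.
    All cutoffs here are invented for the example."""
    if support < 0.5 and conflict < 0.5:
        return "insufficient_evidence"   # neither side carries real weight
    if conflict >= 0.5 and support >= 0.5:
        return "contested"               # both sides carry meaningful weight
    if conflict > support:
        return "likely_false"            # conflict dominates weak support
    if support >= 1.5 and conflict < 0.3 and domains >= 3:
        return "supported"               # strong, diverse, low-conflict support
    return "likely_supported"            # positive but not decisive

# Scores from the sample evidence-report JSON above.
print(verdict_from_scores(support=1.42, conflict=0.61, domains=2))  # → contested
```

Note how `supported` requires domain diversity on top of a high support score, matching the table's description.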
This is an evidence-aware heuristic system, not a fact-level proof engine. Today it still has three important limits:
- the default install still starts with `ddgs`; the recommended collection-grade free path is `ddgs` + self-hosted SearXNG
- page-aware verification is optional and still heuristic rather than full-document entailment
- no benchmark-driven confidence calibration yet
Add the server to your MCP client:
```json
{
  "mcpServers": {
    "cross-validated-search": {
      "command": "cross-validated-search-mcp",
      "args": []
    }
  }
}
```

Run the test suite:

```bash
python -m unittest discover -s tests -v
```

Build distributables:

```bash
python -m build
```

Run deterministic benchmark regressions:

```bash
python benchmarks/run_benchmark.py
```

The next upgrades needed for ecosystem-grade collection are:
- calibrate provider weighting and add stronger provider-specific tests
- improve page-aware verification beyond snippet and keyword heuristics
- add benchmark fixtures and regression scoring
- improve the flagship `evidence-report` workflow with richer source summarization and calibration
MIT License.