5 changes: 5 additions & 0 deletions Makefile
@@ -48,6 +48,11 @@ quick-doc: ## Build the doc & serve it locally
	uv run python -m http.server --directory ./script-docs/_build/html/
.PHONY: quick-doc

live-docs: ## Serve the doc locally with auto-reload
	cd ./script-docs && uv run sphinx-autobuild . _build/html --port 8000 --host 127.0.0.1
.PHONY: live-docs


test: ## Launch unit tests
	uv run pytest -vvv ./tests
.PHONY: test
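
For reference, the new target wraps the sphinx-autobuild command shown above. A minimal usage sketch, assuming the docs dependencies are already installed (see the pyproject.toml change below):

    # Serve the docs with auto-reload, then open http://127.0.0.1:8000
    make live-docs

    # Equivalent direct invocation, as defined in the target's recipe
    cd ./script-docs && uv run sphinx-autobuild . _build/html --port 8000 --host 127.0.0.1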
1 change: 1 addition & 0 deletions pyproject.toml
@@ -38,6 +38,7 @@ docs = [
    "ragas>=0.3.5,<=0.3.5",
    "ipywidgets>=8.1.7",
    "torch>=2.8.0",
    "sphinx-autobuild>=2024.10.3",
]

[project.urls]
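
Since sphinx-autobuild is added to the docs dependency list, it must be installed before make live-docs will work. A hedged sketch, assuming docs is declared as an optional extra under [project.optional-dependencies] (if it is a dependency group instead, uv sync --group docs is the equivalent):

    # Install the project together with its docs dependencies
    uv sync --extra docs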
18 changes: 13 additions & 5 deletions script-docs/hub/ui/index.rst
@@ -9,8 +9,21 @@ Quickstart & setup

The Hub is the user interface from which you can perform LLM evaluations. It implements the following 4-step workflow:


.. image:: /_static/images/hub/hub-workflow.png
    :align: center
    :alt: "The hub workflow"
    :width: 800


.. grid:: 1 1 2 2

    .. grid-item-card:: Scan your agent for vulnerabilities
        :link: scan/index
        :link-type: doc

        Automatically scan your agent for safety and security failures.

    .. grid-item-card:: Create test datasets
        :link: datasets/index
        :link-type: doc
@@ -41,11 +54,6 @@ The Hub is the user interface from which you can perform LLM evaluations. It imp

        Detect emerging vulnerabilities through proactive red teaming.

.. image:: /_static/images/hub/hub-workflow.png
    :align: center
    :alt: "The hub workflow"
    :width: 800

.. note::

    Throughout this user guide, we'll use a banking app called Zephyr Bank, designed by data scientists. The app's agent provides customer service support on their website, offering knowledge about the bank's products, services, and more.
123 changes: 123 additions & 0 deletions script-docs/hub/ui/scan/index.rst
@@ -0,0 +1,123 @@
:og:title: Giskard Hub - Enterprise AI Agent Testing - AI Vulnerability Scan
:og:description: Scan your AI agent for safety and security failures, including prompt injection, harmful content, excessive agency and other OWASP top 10 vulnerabilities.


===============================================
AI Vulnerability Scan
===============================================

Test your AI agent for safety and security vulnerabilities with automated red teaming attacks.

The vulnerability scan helps you identify weaknesses in your AI agent by testing it against common attack patterns. This includes:

* Prompt injection attempts
* Harmful content generation
* Data extraction attacks
* Other OWASP GenAI Top 10 risks

**How it works:**

The scan runs dozens of specialized red teaming probes that adapt to your agent's capabilities and use case. Each probe tests for specific vulnerabilities and provides detailed results.

**What you get:**

* A security grade (A-D) based on detected vulnerabilities
* Detailed breakdown by attack category and severity
* Conversation logs showing exactly how attacks were performed
* Actionable insights to improve your agent's defenses

.. image:: /_static/images/hub/scan/scan-results.png
    :align: center
    :alt: "Example of vulnerability scan results"
    :width: 800

Quick start
-----------

1. Go to **Scan** in the left sidebar
2. Click **Launch Scan**
3. Select your agent and vulnerability categories to test
4. Click **Launch Scan** to start the red teaming process
5. Review results and take action on detected vulnerabilities

.. toctree::
    :maxdepth: 2
    :hidden:

    launch-scan
    review-scan-results



Vulnerability categories
------------------------

The scan tests for these common AI security risks:

Security Risks
==============

.. grid:: 2

    .. grid-item-card:: 🔓 Prompt Injection
        :class-card: sd-border-1

        Malicious prompts that bypass your agent's safety instructions

    .. grid-item-card:: 📊 Training Data Extraction
        :class-card: sd-border-1

        Attempts to expose sensitive data from your model's training

    .. grid-item-card:: 🔍 Internal Information Exposure
        :class-card: sd-border-1

        Leakage of system configurations or internal data

    .. grid-item-card:: 🛡️ Data Privacy & Exfiltration
        :class-card: sd-border-1

        Unauthorized access to user data or privacy violations

Safety Risks
============

.. grid:: 2

    .. grid-item-card:: ⚠️ Harmful Content Generation
        :class-card: sd-border-1

        Toxic, offensive, or policy-violating content creation

    .. grid-item-card:: 🚫 Excessive Agency
        :class-card: sd-border-1

        Actions beyond intended scope or authority level

    .. grid-item-card:: 💥 Denial of Service
        :class-card: sd-border-1

        Resource exhaustion attacks that disable your system

Business Risks
==============

.. grid:: 2

    .. grid-item-card:: 🤔 Hallucination & Misinformation
        :class-card: sd-border-1

        False or misleading information that damages trust

    .. grid-item-card:: 📉 Brand Damaging & Reputation
        :class-card: sd-border-1

        Outputs that harm your brand or public perception

    .. grid-item-card:: ⚖️ Legal & Financial Risk
        :class-card: sd-border-1

        Content leading to legal liability or financial harm

    .. grid-item-card:: 💼 Misguidance & Unauthorized Advice
        :class-card: sd-border-1

        Advice outside your agent's intended expertise
40 changes: 40 additions & 0 deletions script-docs/hub/ui/scan/launch-scan.rst
@@ -0,0 +1,40 @@
===============================================
Launch a red teaming scan
===============================================

Start testing your AI agent for security vulnerabilities.

How to launch a scan
--------------------

1. **Navigate to the scan page**
   Click **Scan** in the left sidebar, then **Launch Scan**

2. **Select your agent**
   Choose which AI agent you want to test from the dropdown

3. **Choose vulnerability categories**
   Select which types of attacks to test (all categories are included by default)

4. **Add knowledge base (optional)**
   Select a knowledge base to enable more targeted testing scenarios

5. **Start the scan**
   Click **Launch Scan** to begin the red teaming process

.. image:: /_static/images/hub/scan/launch-scan.png
    :align: center
    :alt: "Launch scan configuration page"
    :width: 800

Monitor scan progress
---------------------

Once started, you can track the scan's progress in real time:

.. image:: /_static/images/hub/scan/scan-running.png
    :align: center
    :alt: "Scan progress view"
    :width: 800

The scan typically takes 5-15 minutes depending on your agent's complexity and the number of categories selected.
63 changes: 63 additions & 0 deletions script-docs/hub/ui/scan/review-scan-results.rst
@@ -0,0 +1,63 @@
===============================================
Review scan results
===============================================

Understand your AI agent's security vulnerabilities and take action to fix them.

Understanding your security grade
----------------------------------

Your scan results include a security grade from A to D:

* **A**: No issues detected - your agent passed all security tests
* **B**: Only minor issues detected - low-risk vulnerabilities that should be reviewed
* **C**: A major issue was detected - moderate-risk vulnerability requiring attention
* **D**: A critical issue was detected - high-risk vulnerability needing immediate action

.. image:: /_static/images/hub/scan/scan-results.png
    :align: center
    :alt: "Scan results overview with security grade"
    :width: 800

Explore attack details
----------------------

Scroll to any vulnerability category to see the specific attacks that were tested:

.. image:: /_static/images/hub/scan/probe-listing.png
    :align: center
    :alt: "List of probes and attack attempts"
    :width: 800

Analyze individual vulnerabilities
----------------------------------

Click **Review** next to any probe to see detailed attack results:

.. image:: /_static/images/hub/scan/attempt-successful.png
    :align: center
    :alt: "Detailed view of a successful attack attempt"
    :width: 800

This shows you:

* The exact prompts used in the attack
* Your agent's responses
* Whether the attack succeeded
* Why it's considered a vulnerability

Take action on findings
-----------------------

For each detected issue, you can:

**Mark as false positive**
    If the identified issue doesn't represent a real risk in your context, mark it as a false positive. This updates your security grade automatically.

**Convert to test case**
    Click **Send to dataset** to save the attack as a reproducible test case. This helps you:

    * Track fixes over time
    * Build regression tests
    * Share examples with your team

1 change: 1 addition & 0 deletions script-docs/index.rst
@@ -126,6 +126,7 @@ Some work has been funded by the `the European Commission <https://commission.eu
    :maxdepth: 2

    hub/ui/index
    hub/ui/scan/index
    hub/ui/datasets/index
    hub/ui/annotate
    hub/ui/evaluations