5 changes: 5 additions & 0 deletions Makefile
@@ -48,6 +48,11 @@ quick-doc: ## Build the doc & serve it locally
	uv run python -m http.server --directory ./script-docs/_build/html/
.PHONY: quick-doc

live-docs: ## Serve the doc locally with auto-reload
	cd ./script-docs && uv run sphinx-autobuild . _build/html --port 8000 --host 127.0.0.1
.PHONY: live-docs


test: ## Launch unit tests
	uv run pytest -vvv ./tests
.PHONY: test
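
For reference, the new target wraps the sphinx-autobuild command shown above. A minimal usage sketch, assuming the docs dependencies are already installed (see the pyproject.toml change below):

    # Serve the docs with auto-reload, then open http://127.0.0.1:8000
    make live-docs

    # Equivalent direct invocation, as defined in the target's recipe
    cd ./script-docs && uv run sphinx-autobuild . _build/html --port 8000 --host 127.0.0.1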
1 change: 1 addition & 0 deletions pyproject.toml
@@ -38,6 +38,7 @@ docs = [
    "ragas>=0.3.5,<=0.3.5",
    "ipywidgets>=8.1.7",
    "torch>=2.8.0",
    "sphinx-autobuild>=2024.10.3",
]

[project.urls]
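
Since sphinx-autobuild is added to the docs dependency list, it must be installed before make live-docs will work. A hedged sketch, assuming docs is declared as an optional extra under [project.optional-dependencies] (if it is a dependency group instead, uv sync --group docs is the equivalent):

    # Install the project together with its docs dependencies
    uv sync --extra docs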
18 changes: 13 additions & 5 deletions script-docs/hub/ui/index.rst
@@ -9,8 +9,21 @@ Quickstart & setup

The Hub is the user interface from which you can perform LLM evaluations. It implements the following 4-step workflow:


.. image:: /_static/images/hub/hub-workflow.png
    :align: center
    :alt: "The hub workflow"
    :width: 800


.. grid:: 1 1 2 2

    .. grid-item-card:: Scan your agent for vulnerabilities
        :link: scan/index
        :link-type: doc

        Automatically scan your agent for safety and security failures.

    .. grid-item-card:: Create test datasets
        :link: datasets/index
        :link-type: doc
@@ -41,11 +54,6 @@ The Hub is the user interface from which you can perform LLM evaluations. It imp

        Detect emerging vulnerabilities through proactive red teaming.

.. image:: /_static/images/hub/hub-workflow.png
    :align: center
    :alt: "The hub workflow"
    :width: 800

.. note::

    Throughout this user guide, we'll use a banking app called Zephyr Bank, designed by data scientists. The app's agent provides customer service support on their website, offering knowledge about the bank's products, services, and more.
123 changes: 123 additions & 0 deletions script-docs/hub/ui/scan/index.rst
@@ -0,0 +1,123 @@
:og:title: Giskard Hub - Enterprise AI Agent Testing - AI Vulnerability Scan
:og:description: Scan your AI agent for safety and security failures, including prompt injection, harmful content, excessive agency and other OWASP top 10 vulnerabilities.


===============================================
AI Vulnerability Scan
===============================================

Test your AI agent for safety and security vulnerabilities with automated red teaming attacks.

The vulnerability scan helps you identify weaknesses in your AI agent by testing it against common attack patterns. This includes:

* Prompt injection attempts
* Harmful content generation
* Data extraction attacks
* Other OWASP GenAI Top 10 risks

**How it works:**

The scan runs dozens of specialized red teaming probes that adapt to your agent's capabilities and use case. Each probe tests for specific vulnerabilities and provides detailed results.

**What you get:**

* A security grade (A-D) based on detected vulnerabilities
* Detailed breakdown by attack category and severity
* Conversation logs showing exactly how attacks were performed
* Actionable insights to improve your agent's defenses

.. image:: /_static/images/hub/scan/scan-results.png
    :align: center
    :alt: "Example of vulnerability scan results"
    :width: 800

Quick start
-----------

1. Go to **Scan** in the left sidebar
2. Click **Launch Scan**
3. Select your agent and vulnerability categories to test
4. Click **Launch Scan** to start the red teaming process
5. Review results and take action on detected vulnerabilities

.. toctree::
    :maxdepth: 2
    :hidden:

    launch-scan
    review-scan-results



Vulnerability categories
------------------------

The scan tests for these common AI security risks:

Security Risks
==============

.. grid:: 2

    .. grid-item-card:: 🔓 Prompt Injection
        :class-card: sd-border-1

        Malicious prompts that bypass your agent's safety instructions

    .. grid-item-card:: 📊 Training Data Extraction
        :class-card: sd-border-1

        Attempts to expose sensitive data from your model's training

    .. grid-item-card:: 🔍 Internal Information Exposure
        :class-card: sd-border-1

        Leakage of system configurations or internal data

    .. grid-item-card:: 🛡️ Data Privacy & Exfiltration
        :class-card: sd-border-1

        Unauthorized access to user data or privacy violations

Safety Risks
============

.. grid:: 2

    .. grid-item-card:: ⚠️ Harmful Content Generation
        :class-card: sd-border-1

        Toxic, offensive, or policy-violating content creation

    .. grid-item-card:: 🚫 Excessive Agency
        :class-card: sd-border-1

        Actions beyond intended scope or authority level

    .. grid-item-card:: 💥 Denial of Service
        :class-card: sd-border-1

        Resource exhaustion attacks that disable your system

Business Risks
==============

.. grid:: 2

    .. grid-item-card:: 🤔 Hallucination & Misinformation
        :class-card: sd-border-1

        False or misleading information that damages trust

    .. grid-item-card:: 📉 Brand Damaging & Reputation
        :class-card: sd-border-1

        Outputs that harm your brand or public perception

    .. grid-item-card:: ⚖️ Legal & Financial Risk
        :class-card: sd-border-1

        Content leading to legal liability or financial harm

    .. grid-item-card:: 💼 Misguidance & Unauthorized Advice
        :class-card: sd-border-1

        Advice outside your agent's intended expertise
40 changes: 40 additions & 0 deletions script-docs/hub/ui/scan/launch-scan.rst
@@ -0,0 +1,40 @@
===============================================
Launch a red teaming scan
===============================================

Start testing your AI agent for security vulnerabilities.

How to launch a scan
--------------------

1. **Navigate to the scan page**
   Click **Scan** in the left sidebar, then **Launch Scan**

2. **Select your agent**
   Choose which AI agent you want to test from the dropdown

3. **Choose vulnerability categories**
   Select which types of attacks to test (all categories are included by default)

4. **Add knowledge base (optional)**
   Select a knowledge base to enable more targeted testing scenarios

5. **Start the scan**
   Click **Launch Scan** to begin the red teaming process

.. image:: /_static/images/hub/scan/launch-scan.png
    :align: center
    :alt: "Launch scan configuration page"
    :width: 800

Monitor scan progress
---------------------

Once started, you can track the scan's progress in real time:

.. image:: /_static/images/hub/scan/scan-running.png
    :align: center
    :alt: "Scan progress view"
    :width: 800

The scan typically takes 5-15 minutes depending on your agent's complexity and the number of categories selected.
63 changes: 63 additions & 0 deletions script-docs/hub/ui/scan/review-scan-results.rst
@@ -0,0 +1,63 @@
===============================================
Review scan results
===============================================

Understand your AI agent's security vulnerabilities and take action to fix them.

Understanding your security grade
----------------------------------

Your scan results include a security grade from A to D:

* **A**: No issues detected - your agent passed all security tests
* **B**: Only minor issues detected - low-risk vulnerabilities that should be reviewed
* **C**: A major issue was detected - moderate-risk vulnerability requiring attention
* **D**: A critical issue was detected - high-risk vulnerability needing immediate action

.. image:: /_static/images/hub/scan/scan-results.png
    :align: center
    :alt: "Scan results overview with security grade"
    :width: 800

Explore attack details
----------------------

Scroll to any vulnerability category to see the specific attacks that were tested:

.. image:: /_static/images/hub/scan/probe-listing.png
    :align: center
    :alt: "List of probes and attack attempts"
    :width: 800

Analyze individual vulnerabilities
----------------------------------

Click **Review** next to any probe to see detailed attack results:

.. image:: /_static/images/hub/scan/attempt-successful.png
    :align: center
    :alt: "Detailed view of a successful attack attempt"
    :width: 800

This shows you:

* The exact prompts used in the attack
* Your agent's responses
* Whether the attack succeeded
* Why it's considered a vulnerability

Take action on findings
-----------------------

For each detected issue, you can:

**Mark as false positive**
    If the identified issue doesn't represent a real risk in your context, mark it as a false positive. This updates your security grade automatically.

**Convert to test case**
    Click **Send to dataset** to save the attack as a reproducible test case. This helps you:

    * Track fixes over time
    * Build regression tests
    * Share examples with your team

1 change: 1 addition & 0 deletions script-docs/index.rst
@@ -126,6 +126,7 @@ Some work has been funded by the `the European Commission <https://commission.eu
    :maxdepth: 2

    hub/ui/index
    hub/ui/scan/index
    hub/ui/datasets/index
    hub/ui/annotate
    hub/ui/evaluations