Merged
41 changes: 14 additions & 27 deletions README.md
@@ -6,13 +6,13 @@
<h3 align="center" weight='300' >Modular, Lightweight, Dynamic and Async-first </h3>
<div align="center">

[![GitHub release](https://img.shields.io/github/v/release/Giskard-AI/giskard)](https://github.com/Giskard-AI/giskard/releases)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://github.com/Giskard-AI/giskard/blob/main/LICENSE)
[![Downloads](https://static.pepy.tech/badge/giskard/month)](https://pepy.tech/project/giskard)
[![CI](https://github.com/Giskard-AI/giskard-oss/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/Giskard-AI/giskard-oss/actions/workflows/ci.yml/badge.svg?branch=main)
[![Giskard on Discord](https://img.shields.io/discord/939190303397666868?label=Discord)](https://gisk.ar/discord)
[![GitHub release](https://img.shields.io/github/v/release/Giskard-AI/giskard)](https://github.com/Giskard-AI/giskard/releases)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://github.com/Giskard-AI/giskard/blob/main/LICENSE)
[![Downloads](https://static.pepy.tech/badge/giskard/month)](https://pepy.tech/project/giskard)
[![CI](https://github.com/Giskard-AI/giskard-oss/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/Giskard-AI/giskard-oss/actions/workflows/ci.yml/badge.svg?branch=main)
[![Giskard on Discord](https://img.shields.io/discord/939190303397666868?label=Discord)](https://gisk.ar/discord)

<a rel="me" href="https://fosstodon.org/@Giskard"></a>
<a rel="me" href="https://fosstodon.org/@Giskard"></a>

</div>
<h3 align="center">
@@ -25,7 +25,7 @@
> [!IMPORTANT]
> **Giskard v3** is a fresh rewrite designed for dynamic, multi-turn testing of AI agents. This release drops heavy dependencies for better efficiency while introducing a more powerful AI vulnerability scanner and enhanced RAG evaluation capabilities. For now, the vulnerability scanner and RAG evaluation still rely on Giskard v2.
> **Giskard v2 remains available but is no longer actively maintained.**
> Follow progress → [Read the v3 Annoucement](https://github.com/orgs/Giskard-AI/discussions/2250) · [Roadmap](https://github.com/Giskard-AI/giskard-oss/issues/2252)
> Follow progress → [Read the v3 Announcement](https://github.com/orgs/Giskard-AI/discussions/2250) · [Roadmap](https://github.com/Giskard-AI/giskard-oss/issues/2252)

## Install

@@ -37,15 +37,15 @@ Requires Python 3.12+.

**Telemetry:** Libraries built on `giskard-core` (including `giskard-checks`) may send **optional, aggregated usage analytics** to help improve the product. No prompts, model outputs, or scenario text are included. See [what is collected and how to opt out](libs/giskard-core/README.md#telemetry).

______________________________________________________________________
---

Giskard is an open-source Python library for **testing and evaluating agentic systems**. The v3 architecture is a modular set of focused packages — each carrying only the dependencies it needs — built from scratch to wrap anything: an LLM, a black-box agent, or a multi-step pipeline.

| Status | Package | Description |
|--------|---------|-------------|
| ✅ Alpha | `giskard-checks` | Testing & evaluation — scenario API, built-in checks, LLM-as-judge |
| 🚧 In progress | `giskard-scan` | Agent vulnerability scanner — red teaming, prompt injection, data leakage (successor of [v2 Scan](https://legacy-docs.giskard.ai/en/stable/open_source/scan/index.html)) |
| 📋 Planned | `giskard-rag` | RAG evaluation & synthetic data generation (successor of [v2 RAGET](https://legacy-docs.giskard.ai/en/stable/open_source/testset_generation/index.html)) |
| Status | Package | Description |
| -------------- | ---------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| ✅ Beta | `giskard-checks` | Testing & evaluation — scenario API, built-in checks, LLM-as-judge |
| 🚧 In progress | `giskard-scan` | Agent vulnerability scanner — red teaming, prompt injection, data leakage (successor of [v2 Scan](https://legacy-docs.giskard.ai/en/stable/open_source/scan/index.html)) |
| 📋 Planned | `giskard-rag` | RAG evaluation & synthetic data generation (successor of [v2 RAGET](https://legacy-docs.giskard.ai/en/stable/open_source/testset_generation/index.html)) |

## Giskard Checks — create and apply evals for testing agents

@@ -56,6 +56,7 @@ pip install giskard-checks
**[Giskard Checks](https://docs.giskard.ai/oss/checks)** is a lightweight library for creating evaluations (evals) that test LLM-based systems — from simple assertions to LLM-as-judge assessments. Unlike traditional unit tests, evals are designed for **non-deterministic outputs** where the same input can produce different valid responses.

Use Giskard Checks to:

- **Catch regressions** — verify your system still behaves correctly after changes
- **Validate RAG quality** — check if answers are grounded in retrieved context
- **Enforce safety rules** — ensure outputs conform to your content policies
@@ -87,7 +88,6 @@ scenario = (
.check(
Groundedness(
name="answer is grounded",
answer_key="trace.last.outputs",
context="France is a country in Western Europe. Its capital is Paris.",
)
)
@@ -99,7 +99,6 @@ result.print_report()

> The `run()` method is async. In a script, wrap it with `asyncio.run()`. See the [full docs](https://docs.giskard.ai/oss/checks) for `Suites`, `LLMJudge`, multi-turn scenarios, and more.
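
The note on `asyncio.run()` can be illustrated with a minimal sketch. `StubScenario` below is a hypothetical stand-in with an async `run()` method; it is not part of Giskard and only mimics the shape of the call so the example runs without any model or API key. The returned dictionary is likewise an assumption, not the library's result type.

```python
import asyncio


class StubScenario:
    """Hypothetical stand-in for a scenario object; not part of Giskard."""

    async def run(self) -> dict:
        # Simulate awaiting a model call or an LLM-as-judge request.
        await asyncio.sleep(0)
        return {"passed": True}


async def main() -> dict:
    # Inside a coroutine, await run() directly.
    return await StubScenario().run()


# In a plain script there is no running event loop, so start one here.
result = asyncio.run(main())
print(result)
```

In a notebook or any other context where an event loop is already running, `asyncio.run()` raises a `RuntimeError`; there, `await` the coroutine directly instead.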


## Looking for Giskard v2?

Giskard v2 included **Scan** (automatic vulnerability detection) and **RAGET** (RAG evaluation test set generation) for both ML models and LLM applications. These features are not available in v3.
@@ -173,15 +172,3 @@ Follow the progress and share feedback:
🌟 [Leave us a star](https://github.com/Giskard-AI/giskard), it helps the project to get discovered by others and keeps us motivated to build awesome open-source tools! 🌟

❤️ If you find our work useful, please consider [sponsoring us](https://github.com/sponsors/Giskard-AI) on GitHub. With a monthly sponsoring, you can get a sponsor badge, display your company in this readme, and get your bug reports prioritized. We also offer one-time sponsoring if you want us to get involved in a consulting project, run a workshop, or give a talk at your company.

<h2 id="sponsors">💚 Current sponsors</h2>

We thank the following companies which are sponsoring our project with monthly donations:

**[Lunary](https://lunary.ai/)**

<img src="https://lunary.ai/logo-blue-bg.svg" alt="Lunary logo" width="100"/>

**[Biolevate](https://www.biolevate.com/)**

<img src="https://awsmp-logos.s3.amazonaws.com/seller-wgamx5z6umune/2d10badd2ccac49699096ea7fb986b98.png" alt="Biolevate logo" width="400"/>