5 changes: 3 additions & 2 deletions docs/getting_started/quickstart/quickstart_nlp.ipynb
@@ -49,7 +49,7 @@
},
"outputs": [],
"source": [
"! pip install giskard --upgrade"
"%pip install giskard --upgrade"
]
},
{
@@ -329,7 +329,8 @@
"metadata": {},
"source": [
"If you are running in a notebook, you can display the scan report directly in the notebook using `display(...)`; otherwise, you can export the report to an HTML file. Check the [API Reference](https://docs.giskard.ai/en/latest/reference/scan/report.html#giskard.scanner.report.ScanReport) for more details on the export methods available on the `ScanReport` class."
]
],
"id": "9dd5baaaa6a7ee62"
},
{
"cell_type": "code",
5 changes: 3 additions & 2 deletions docs/getting_started/quickstart/quickstart_tabular.ipynb
@@ -49,7 +49,7 @@
},
"outputs": [],
"source": [
"! pip install giskard --upgrade"
"%pip install giskard --upgrade"
]
},
{
@@ -294,7 +294,8 @@
"metadata": {},
"source": [
"If you are running in a notebook, you can display the scan report directly in the notebook using `display(...)`; otherwise, you can export the report to an HTML file. Check the [API Reference](https://docs.giskard.ai/en/latest/reference/scan/report.html#giskard.scanner.report.ScanReport) for more details on the export methods available on the `ScanReport` class."
]
],
"id": "28272f36e73f8a76"
},
{
"cell_type": "code",
44 changes: 29 additions & 15 deletions docs/open_source/customize_tests/data_slices/index.md
@@ -12,7 +12,7 @@ This section explains how to create your own slicing function, or customize the

The [Giskard catalog](../../catalogs/slicing-function-catalog/index.rst) provides you with different slicing functions for NLP such as sentiment, hate, and toxicity detectors:

```
```python
# Load the positive sentiment slicing function from the Giskard catalog
from giskard.ml_worker.testing.functions.slicing import positive_sentiment_analysis
```
@@ -27,17 +27,20 @@ To create a Giskard slicing function, you just need to decorate an existing Pyth

When `row_level=True`, you can decorate a function that takes a pandas dataframe **row** as input and returns a boolean. Make sure that the first argument of your function corresponds to the row you want to filter:

```
from giskard import slicing_function, demo
```python
import pandas as pd
from giskard import slicing_function, demo, Dataset


_, df = demo.titanic()
dataset = Dataset(df=df, target="Survived", cat_columns=['Pclass', 'Sex', "SibSp", "Parch", "Embarked"])


@slicing_function(row_level=True)
def my_func2(row: pd.Series, threshold: int):
    return row['Age'] > threshold


dataset.slice(my_func2(threshold=20))
```
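
To see which rows a row-level predicate keeps, the same logic can be reproduced with plain pandas. This is only a conceptual sketch on a small made-up dataframe, not how Giskard implements `slice` internally; `toy_df` and `is_older_than` are hypothetical names used for illustration.

```python
import pandas as pd

# Hypothetical toy data standing in for the Titanic dataframe
toy_df = pd.DataFrame({"Name": ["Ann", "Bob", "Cid"], "Age": [15, 30, 42]})


def is_older_than(row: pd.Series, threshold: int) -> bool:
    # Same predicate as my_func2: keep the row when Age exceeds the threshold
    return row["Age"] > threshold


# Evaluate the predicate row by row, then keep only the matching rows
mask = toy_df.apply(is_older_than, axis=1, threshold=20)
sliced_df = toy_df[mask]
print(sliced_df["Name"].tolist())
```

The predicate receives one `pd.Series` per row, which is why the first argument of a row-level function has to be the row itself.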

@@ -47,18 +50,21 @@ dataset.slice(my_func2(threshold=20))

When `row_level=False`, you can decorate a function that takes a full **pandas dataframe** as input and returns a filtered pandas dataframe. Make sure that the first argument of your function corresponds to the pandas dataframe you want to filter:

```
from giskard import slicing_function, demo
```python
from giskard import slicing_function, demo, Dataset
import pandas as pd


_, df = demo.titanic()
dataset = Dataset(df=df, target="Survived", cat_columns=['Pclass', 'Sex', "SibSp", "Parch", "Embarked"])


@slicing_function(row_level=False)
def my_func1(df: pd.DataFrame, threshold: int):
    # Keep only the rows whose Age exceeds the threshold
    return df[df['Age'] > threshold]


dataset.slice(my_func1(threshold=20))
```
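
A dataframe-level slicing function is expected to hand back a subset of the rows it receives. The sketch below shows that contract with plain pandas boolean indexing on a made-up dataframe; `toy_df` and `older_than` are hypothetical names, and the snippet does not depend on Giskard.

```python
import pandas as pd

# Hypothetical toy data standing in for the Titanic dataframe
toy_df = pd.DataFrame({"Name": ["Ann", "Bob", "Cid"], "Age": [15, 30, 42]})


def older_than(df: pd.DataFrame, threshold: int) -> pd.DataFrame:
    # Boolean indexing returns only the rows where the condition holds;
    # the columns and their values are left unchanged
    return df[df["Age"] > threshold]


filtered_df = older_than(toy_df, threshold=20)
print(len(toy_df), len(filtered_df))
```

Returning a filtered view rather than overwriting a column keeps the original values available to whatever test runs on the slice.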

@@ -68,18 +74,20 @@ dataset.slice(my_func1(threshold=20))

When `cell_level=True` (False by default), you can decorate a function that takes a **value** (string, numeric or text) as an argument and returns a boolean. Make sure that the first argument of your function corresponds to the value and that the second argument defines the **column name** where you want to filter the value:

```
from giskard import slicing_function, demo
import pandas as pd
```python
from giskard import slicing_function, demo, Dataset


_, df = demo.titanic()
dataset = Dataset(df=df, target="Survived", cat_columns=['Pclass', 'Sex', "SibSp", "Parch", "Embarked"])


@slicing_function(cell_level=True)
def my_func3(cell: int, threshold: int):
    return cell>threshold
    return cell > threshold


train_df.slice(my_func3(threshold=20), column_name='Age')
dataset.slice(my_func3(threshold=20), column_name='Age')
```
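
A cell-level predicate only ever sees a single value from the chosen column. As a rough pandas analogy (an illustration on made-up data, not Giskard's implementation), the column is mapped through the predicate and the resulting mask selects the rows; `toy_df`, `above` and `cell_mask` are hypothetical names.

```python
import pandas as pd

# Hypothetical toy data standing in for the Titanic dataframe
toy_df = pd.DataFrame({"Name": ["Ann", "Bob", "Cid"], "Age": [15, 30, 42]})


def above(cell: int, threshold: int) -> bool:
    # The predicate works on one value at a time, not on a row or dataframe
    return cell > threshold


column_name = "Age"
# Apply the predicate to every value of the selected column
cell_mask = toy_df[column_name].map(lambda cell: above(cell, threshold=20))
cell_sliced_df = toy_df[cell_mask]
print(cell_sliced_df["Name"].tolist())
```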

::::
@@ -89,7 +97,11 @@ train_df.slice(my_func3(threshold=20), column_name='Age')

Slicing functions can be very powerful to detect complex behaviour when they are used as fixtures inside your test suite. With the Giskard framework you can easily create complex slicing functions. For instance:

```
```python
import pandas as pd
from giskard import slicing_function


def _sentiment_analysis(x, column_name, threshold, model, emotion):
    from transformers import pipeline
    sentiment_pipeline = pipeline("sentiment-analysis", model=model)
@@ -98,6 +110,7 @@ def _sentiment_analysis(x, column_name, threshold, model, emotion):
    return x.iloc[list(
        map(lambda s: s['label'] == emotion and s['score'] >= threshold, sentiment_pipeline(sentences)))]


@slicing_function(name="Emotion sentiment", row_level=False, tags=["sentiment", "text"])
def emotion_sentiment_analysis(x: pd.DataFrame, column_name: str, emotion: str, threshold: float = 0.9) -> pd.DataFrame:
    """
@@ -110,15 +123,16 @@ def emotion_sentiment_analysis(x: pd.DataFrame, column_name: str, emotion: str,

Giskard enables you to automatically generate the slicing functions that are the most insightful for your ML models. You can easily extract the results of the [scan feature](../scan/index.rst) using the following code:

```
from giskard import Dataset, Model
```python
from giskard import Dataset, Model, scan


my_dataset = Dataset(...)
my_model = Model(...)

scan_result = giskard.scan(my_model, my_dataset)
scan_result = scan(my_model, my_dataset)
test_suite = scan_result.generate_test_suite("My first test suite")
test_suite.run()[1]
test_suite.run()
```

## Upload your slicing function to the Giskard hub
59 changes: 39 additions & 20 deletions docs/open_source/customize_tests/data_transformations/index.md
@@ -11,8 +11,8 @@ This section explains how to create your own transformation function, or customi

The [Giskard catalog](../../catalogs/transformation-function-catalog/index.rst) provides you with different transformation functions for NLP use cases such as *adding typos*, or *punctuation stripping*.

```
#Import keyboard typo transformations
```python
# Import keyboard typo transformations
from giskard.ml_worker.testing.functions.transformation import keyboard_typo_transformation
```

@@ -26,18 +26,21 @@ To create a Giskard transformation function, you just need to decorate an existi

When `row_level=True`, you can decorate a function that takes a pandas dataframe **row** as input and returns the transformed row. Make sure that the first argument of your function corresponds to the row you want to transform:

```
from giskard import transformation_function, demo
```python
import pandas as pd
from giskard import transformation_function, demo, Dataset


_, my_df = demo.titanic()
dataset = Dataset(df=my_df, target="Survived", cat_columns=['Pclass', 'Sex', "SibSp", "Parch", "Embarked"])


@transformation_function(row_level=True)
def my_func2(row: pd.Series, offset: int):
    row['Age'] = row['Age'] + offset
    return row


transformed_dataset = dataset.transform(my_func2(offset=20))
```
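
The effect of a row-level transformation can be previewed with plain pandas. The sketch below is an illustration on made-up data and not Giskard's own implementation of `transform`; `toy_df` and `shift_age` are hypothetical names.

```python
import pandas as pd

# Hypothetical toy data standing in for the Titanic dataframe
toy_df = pd.DataFrame({"Name": ["Ann", "Bob", "Cid"], "Age": [15, 30, 42]})


def shift_age(row: pd.Series, offset: int) -> pd.Series:
    # Work on a copy so the original dataframe is left untouched
    row = row.copy()
    row["Age"] = row["Age"] + offset
    return row


# Each row is passed to the function and replaced by the row it returns
shifted_df = toy_df.apply(shift_age, axis=1, offset=20)
print(shifted_df["Age"].tolist())
```

Unlike a slicing predicate, the function returns a full row, so the output has as many rows as the input.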

@@ -47,18 +50,21 @@ transformed_dataset = dataset.transform(my_func2(offset=20))

When `row_level=False`, you can decorate a function that takes a full **pandas dataframe** as input and returns the transformed pandas dataframe. Make sure that the first argument of your function corresponds to the pandas dataframe you want to transform:

```
from giskard import transformation_function, demo
```python
import pandas as pd
from giskard import transformation_function, demo, Dataset


_, df = demo.titanic()
dataset = Dataset(df=df, target="Survived", cat_columns=['Pclass', 'Sex', "SibSp", "Parch", "Embarked"])


@transformation_function(row_level=False)
def my_func1(df: pd.DataFrame, offset: int):
    df['Age'] = df['Age'] + offset
    return df


transformed_dataset = dataset.transform(my_func1(offset=20))
```

@@ -68,17 +74,19 @@ transformed_dataset = dataset.transform(my_func1(offset=20))

When `cell_level=True` (False by default), you can decorate a function that takes a **value** (numeric or text) as its argument and returns the transformed value. Make sure that the first argument of your function corresponds to the value, and that the second argument defines the **column name** where you want to transform the value:

```
from giskard import transformation_function, demo
import pandas as pd
```python
from giskard import transformation_function, demo, Dataset


_, df = demo.titanic()
dataset = Dataset(df=df, target="Survived", cat_columns=['Pclass', 'Sex', "SibSp", "Parch", "Embarked"])


@transformation_function(cell_level=True)
def my_func3(cell: int, offset: int):
    return cell + offset


transformed_dataset = dataset.transform(my_func3(offset=20), column_name='Age')
```
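
A cell-level transformation replaces each value of one column with the value the function returns. A minimal pandas analogy, assuming a made-up dataframe and the hypothetical names `toy_df` and `add_offset` (this is not Giskard's implementation):

```python
import pandas as pd

# Hypothetical toy data standing in for the Titanic dataframe
toy_df = pd.DataFrame({"Name": ["Ann", "Bob", "Cid"], "Age": [15, 30, 42]})


def add_offset(cell: int, offset: int) -> int:
    # Receives a single value and returns its replacement
    return cell + offset


column_name = "Age"
transformed_df = toy_df.copy()
# Only the selected column changes; every other column is kept as-is
transformed_df[column_name] = transformed_df[column_name].map(
    lambda cell: add_offset(cell, offset=20)
)
print(transformed_df["Age"].tolist())
```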

@@ -89,11 +97,22 @@ transformed_dataset = dataset.transform(my_func3(offset=20), column_name='Age')

Transformation functions can be very powerful to detect complex behaviour when they are used as fixtures inside your test suite. With the Giskard framework you can easily create complex transformation functions. For example:

```
```python
import os
import pandas as pd
from giskard import transformation_function


@transformation_function(name="Change writing style", row_level=False, tags=['text'])
def change_writing_style(x: pd.DataFrame, index: int, column_name: str, style: str,
                         OPENAI_API_KEY: str) -> pd.DataFrame:
    os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
def change_writing_style(
        x: pd.DataFrame,
        index: int,
        column_name: str,
        style: str,
        openai_api_key: str
) -> pd.DataFrame:
    os.environ["OPENAI_API_KEY"] = openai_api_key

    rewrite_prompt_template = """
As a text rewriting robot, your task is to rewrite a given text using a specified rewriting style. You will receive a prompt with the following format:
```
@@ -115,10 +134,9 @@ def change_writing_style(x: pd.DataFrame, index: int, column_name: str, style: s
```
"""

    from langchain import PromptTemplate
    from langchain import LLMChain
    from langchain import OpenAI
    from langchain import PromptTemplate, LLMChain, OpenAI


    rewrite_prompt = PromptTemplate(input_variables=['text', 'style'], template=rewrite_prompt_template)
    chain_rewrite = LLMChain(llm=OpenAI(), prompt=rewrite_prompt)

@@ -129,15 +147,16 @@ def change_writing_style(x: pd.DataFrame, index: int, column_name: str, style: s

Giskard enables you to automatically generate the transformation functions that are the most insightful for your ML models. You can easily extract the results of the [scan feature](../scan/index.rst) using the following code:

```
from giskard import Dataset, Model
```python
from giskard import Dataset, Model, scan


my_dataset = Dataset(...)
my_model = Model(...)

scan_result = giskard.scan(my_model, my_dataset)
scan_result = scan(my_model, my_dataset)
test_suite = scan_result.generate_test_suite("My first test suite")
test_suite.run()[1]
test_suite.run()
```

## Save your transformation function