Commit cb2a75a
rabah-khalek, AbSsEnT, and andreybavt authored
[GSK-1504] Integration with W&B (#1288)
* added wandb run contextmanager
* added to_wandb for scan results
* Added new method to the Dataset class to log dataset to the WandB run. (#1294)
  * Added new method to the Dataset class to log dataset to the WandB run.
  * updated to_wandb

  ---------

  Co-authored-by: Rabah Abdul Khalek <[email protected]>
* setting up the doc skeleton
* updated pyproject and pdm lock with wandb
* working on tests
* GSK-1531 (#1301)
  * Added new method to the TestSuiteResult class to log its execution results to the WandB run.
  * Resolved issues.
  * refactoring _parse_test_name

  ---------

  Co-authored-by: Rabah Abdul Khalek <[email protected]>
* functional tests implemented (GSK-1535)
* fixed code smell
* updated docs
* updated imports
* GSK-1533 (#1307)
  * Initial commit with the implementation of the SHAP explanation graphs logging to the WandB run.
  * Changed logic of obtaining feature names and types.
  * Removed redundant 'model.prepare_dataframe'. Small refactoring.
  * Added sorting of logged dataset, test suite result and scan result to distinct panels.
  * Moved 'explain' function below shap-related functions.
  * Code refactoring.
  * Changed naming for variables inside functions.
  * Removed explainer return, as it is not needed.
  * Moved 'prepare_df' to the separate utils.py file to avoid code duplication.
  * Added docstring to the '_get_cls_prediction_explanation'
  * Created dataclass ShapResult to store shap explanations there and encapsulate the logic of uploading SHAP charts to the WandB.
  * Refactoring of the 'background_example' function.
  * Refactoring.
  * Refactoring.
  * Changed enum class declaration.
  * Refactored model_explanation.py to be able to perform testing of explanation results equality. Added unit-tests for the SHAP logging to the WandB.
  * Small fix in comments.
  * Uncommented fixture.
  * Refactored "_get_highest_prob_shap" function. Made it more compact and self-explainable.
  * Removed #noqa options from the shap imports. Optimized imports.
  * Refactored _prepare_for_explanation function. Changed naming of the function output to highlight, that this data will be explained.
  * Renamed explain_full(one) to "_calculate_dataset(sample)_shap_values"
  * Refactored _get_background_example function.
  * Refactored 'explain_with_shap' function and 'ShapResult' dataclass for better handling classification models explanation.
  * Fixed bugs with unit-tests for wandb.
  * Transferred '_compare_explain_functions' to the 'test_model_explanation.py'
  * Refactoring. Renaming and functions replacement.
  * Renaming.
  * Transferred plotting functions from the shap_result.py to the wandb_utils.py to better handle wandb importing necessity.
  * small update to error msg
  * updated unit test
  * small update

  ---------

  Co-authored-by: Rabah Abdul Khalek <[email protected]>
* added errors and telemetry
* fixing code smells
* fixed indent
* turned off validation of Dataset in model_explanation
* exposed explain_with_shap
* converted error to warning
* updated tests
* restored fixtures
* New example notebook to show WandB integration functionality.
* WandB notebook refactoring. Committing images.
* Removed blank cell.
* Removed blank cell.
* updated docs
* Replaced screenshots with the giskard scan result.
* Fixed 'explain_with_shap' issue, when the model is the LGBM.
* Updated screenshot with test-suite results comparison for multiple runs.
* updated pdm lock
* implementing AA's feedback
* GSK-1565 (#1339)
  * Added docstrings to the "model_explanation.py".
  * Added docstrings to the "shap_result.py".
  * Fix in docstrings
  * Added docstring to the 'Dataset.to_wandb'.
  * Added docstring to the 'ScanResult.to_wandb'.
  * Added docstring to the 'TestSuiteResult.to_wandb'.
  * Resolved issues after PR review.
  * updated docstrings

  ---------

  Co-authored-by: Rabah Abdul Khalek <[email protected]>

---------

Co-authored-by: AbSsEnT <[email protected]>
Co-authored-by: Andrey Avtomonov <[email protected]>
1 parent 8f69e0f commit cb2a75a

22 files changed

Lines changed: 4520 additions & 3185 deletions
7 binary files changed (previews not loaded): 356 KB, 312 KB, 202 KB, 9.61 KB, 562 KB, 188 KB, 252 KB

python-client/docs/integrations/index.md

Lines changed: 8 additions & 0 deletions
@@ -7,6 +7,7 @@
77
:hidden:
88
99
mlflow/index
10+
wandb/index
1011
```
1112

1213
::::::{grid} 1 1 2 2
@@ -18,4 +19,11 @@ mlflow/index
1819
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<img src="../assets/integrations/mlflow/MLflow-logo-final-white-TM.png" alt="mlflow" width="82%">
1920
:::
2021
:::::
22+
23+
:::::{grid-item}
24+
:::{card}
25+
:link: wandb/index.md
26+
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<img src="../assets/integrations/wandb/wandb-logo-yellow-dots-black-wb.png" alt="wandb">
27+
:::
28+
:::::
2129
::::::
Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
1+
# Weights and Biases
2+
3+
Giskard can log SHAP plots, scan reports and test suites into Weights & Biases:
4+
- **Understand feature importance**: Giskard generates plots to highlight feature importance using the SHAP library.
5+
- **Scan your model to find dozens of hidden vulnerabilities**: The Giskard scan automatically detects vulnerabilities such as performance bias, data leakage, lack of robustness, spurious correlations, overconfidence, underconfidence, and ethical issues.
6+
- **Instantaneously generate domain-specific tests**: Giskard automatically generates relevant tests based on the vulnerabilities detected by the scan. You can easily customize the tests depending on your use case by defining domain-specific data slicers and transformers as fixtures of your test suites.
7+
8+
## Setup
9+
To use Giskard with Weights & Biases, you need to follow these steps:
10+
11+
1. Set up Weights & Biases:
12+
- sign up for a Weights & Biases account [here](https://wandb.ai/site).
13+
- install and open your Docker app.
14+
- install the `wandb` Python package and server:
15+
```shell
16+
pip install wandb
17+
wandb login --relogin # input the API key you get from the website
18+
wandb server start --upgrade # this will download the docker images if they're not already downloaded
19+
```
20+
21+
2. Set up Giskard:
22+
- install the Giskard library by following these [instructions](https://docs.giskard.ai/en/latest/guides/installation_library/index.html).
23+
24+
## Logging from Giskard to Weights & Biases
25+
To get the most out of this integration, follow these three steps to diagnose your ML model:
26+
- wrap your dataset by following this [guide](https://docs.giskard.ai/en/latest/guides/wrap_dataset/index.html).
27+
- wrap your ML model by following this [guide](https://docs.giskard.ai/en/latest/guides/wrap_model/index.html).
28+
- scan your ML model for vulnerabilities by following this [guide](https://docs.giskard.ai/en/latest/guides/scan/index.html).
29+
30+
Once the above steps are done, you can now log the results into Weights & Biases by doing the following:
31+
```python
32+
import giskard, wandb
33+
# [...] wrap model and dataset with giskard
34+
scan_results = giskard.scan(giskard_model, giskard_dataset)
35+
test_suite_results = scan_results.generate_test_suite().run()
36+
37+
wandb.login()
38+
giskard_dataset.to_wandb() # log your dataset as a table
39+
scan_results.to_wandb() # log scan results as an HTML report
40+
test_suite_results.to_wandb() # log test suite results as a table
41+
giskard.explain_with_shap(giskard_model, giskard_dataset).to_wandb() # log SHAP plots
42+
```
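The `to_wandb()` calls above each need a W&B run to log into, and the commit message mentions a "wandb run contextmanager". The following is a minimal sketch of that general pattern, not Giskard's actual implementation: it reuses the active run if there is one and starts a new one otherwise. The name `active_or_new_run` and the `wandb_module` parameter are hypothetical; the module is passed in only so the sketch runs without `wandb` installed, and it relies on `wandb.run` being `None` when no run is active.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def active_or_new_run(wandb_module, **init_kwargs):
    """Yield the active W&B run, starting one with ``init_kwargs`` if none exists.

    ``wandb_module`` is expected to be the imported ``wandb`` package
    (hypothetical parameter, used here so the sketch is testable offline).
    """
    run = wandb_module.run  # ``wandb.run`` is None when no run is active
    if run is None:
        run = wandb_module.init(**init_kwargs)
    # The run is deliberately left open on exit so later calls can reuse it;
    # the caller ends it explicitly (``wandb.finish()`` in real code).
    yield run


# Stand-in for the wandb module, for demonstration only.
fake_wandb = SimpleNamespace(run=None)


def _fake_init(**kwargs):
    fake_wandb.run = SimpleNamespace(config=kwargs)
    return fake_wandb.run


fake_wandb.init = _fake_init

with active_or_new_run(fake_wandb, project="demo") as first:
    pass
with active_or_new_run(fake_wandb, project="ignored") as second:
    pass

print(first is second)  # True: the second block reused the run started by the first
```

Under this assumption, several loggers can share one run, which is consistent with the notebook in this commit ending each run explicitly with `wandb.finish()`.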
43+
44+
```{eval-rst}
45+
.. note:: :code:`to_wandb()` accepts any argument that :code:`wandb.init()` accepts (see `here <https://docs.wandb.ai/ref/python/init>`_).
46+
```
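As a hedged illustration of the note above, the helper below forwards `wandb.init()` arguments such as `project` and `name` to the first `to_wandb()` call only, mirroring the notebook in this commit, where only the first call receives `name=...`. The helper name `log_with_run_settings` is hypothetical, and the idea that later calls reuse the run opened by the first one is an assumption drawn from that notebook, not a documented guarantee.

```python
def log_with_run_settings(loggables, **wandb_init_kwargs):
    """Call ``to_wandb()`` on each object, passing run settings to the first only.

    ``loggables`` is any iterable of objects exposing ``to_wandb(**kwargs)``,
    e.g. a Giskard dataset, scan result and test suite result.
    Returns the number of objects logged.
    """
    loggables = list(loggables)
    for index, obj in enumerate(loggables):
        if index == 0:
            # Assumption: this call opens the run using the given settings.
            obj.to_wandb(**wandb_init_kwargs)
        else:
            # Assumption: these calls log into the run that is already active.
            obj.to_wandb()
    return len(loggables)


# Intended usage with the objects from the snippet above (requires wandb and giskard):
#   log_with_run_settings(
#       [giskard_dataset, scan_results, test_suite_results],
#       project="titanic-audit",   # hypothetical project name
#       name="baseline",           # hypothetical run name
#   )
#   wandb.finish()
```

Keeping the run settings in one place avoids repeating them on every call and makes it harder to pass conflicting values by accident.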
47+
48+
49+
## Notebook examples
50+
::::::{grid} 1 1 2 2
51+
:gutter: 1
52+
53+
:::::{grid-item}
54+
:::{card} <br><h3><center>📊 Tabular</center></h3>
55+
:link: wandb-tabular-example.ipynb
56+
:::
57+
:::::
58+
::::::
59+
60+
```{toctree}
61+
:caption: Table of Contents
62+
:name: mastertoc
63+
:maxdepth: 2
64+
:hidden:
65+
66+
wandb-tabular-example
67+
```
Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"source": [
6+
"# Notebook Example - Tabular"
7+
],
8+
"metadata": {
9+
"collapsed": false
10+
}
11+
},
12+
{
13+
"cell_type": "markdown",
14+
"source": [
15+
"## Detecting tabular ML models vulnerabilities in W&B with Giskard\n",
16+
"This example demonstrates how to efficiently scan two tabular ML models for hidden vulnerabilities using Giskard, log the results and interpret them within the W&B framework in just a few lines of code. We will use the following two tabular ML models:\n",
17+
"\n",
18+
"| Model | Description | Training data |\n",
19+
"|----------|------------------------------------------------------------------------|-----------------|\n",
20+
"| `model1` | A simple sklearn `LogisticRegression` model limited to 5 solver iterations (`max_iter=5`). | Titanic dataset |\n",
21+
"| `model2` | A simple sklearn `LogisticRegression` model limited to 100 solver iterations (`max_iter=100`). | Titanic dataset |"
22+
],
23+
"metadata": {
24+
"collapsed": false
25+
}
26+
},
27+
{
28+
"cell_type": "code",
29+
"execution_count": null,
30+
"outputs": [],
31+
"source": [
32+
"import wandb\n",
33+
"\n",
34+
"from giskard import Model, Dataset, demo, explain_with_shap, scan\n",
35+
"\n",
36+
"model1, df = demo.titanic(max_iter=5)\n",
37+
"model2, __ = demo.titanic(max_iter=100)  # Datasets are identical.\n",
38+
"models = {\"titanic-max_iter=5\": model1, \"titanic-max_iter=100\": model2}\n",
39+
"\n",
40+
"wrapped_data = Dataset(df=df,\n",
41+
"                       target=\"Survived\",\n",
42+
"                       cat_columns=['Pclass', 'Sex', \"SibSp\", \"Parch\", \"Embarked\"])\n",
43+
"\n",
44+
"for model_name, model in models.items():\n",
45+
"    wrapped_model = Model(model=model.predict_proba,\n",
46+
"                          model_type=\"classification\",\n",
47+
"                          feature_names=['PassengerId', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare', 'Embarked'],\n",
48+
"                          classification_labels=model.classes_)\n",
49+
"\n",
50+
"    # Log results to the new W&B run.\n",
51+
"    wrapped_data.to_wandb(name=model_name)\n",
52+
"\n",
53+
"    shap_explanation_result = explain_with_shap(wrapped_model, wrapped_data)\n",
54+
"    shap_explanation_result.to_wandb()\n",
55+
"\n",
56+
"    scan_results = scan(wrapped_model, wrapped_data)\n",
57+
"    scan_results.to_wandb()\n",
58+
"\n",
59+
"    test_suite = scan_results.generate_test_suite()\n",
60+
"    test_suite.run().to_wandb()\n",
61+
"\n",
62+
"    # Finish the current run.\n",
63+
"    wandb.finish()"
64+
],
65+
"metadata": {
66+
"collapsed": false
67+
}
68+
},
69+
{
70+
"cell_type": "markdown",
71+
"source": [
72+
"After logging the results, start the local W&B server with `wandb server start` and open the W&B user interface at <http://localhost:8080>. There you can visualise the following:\n",
73+
" \n",
74+
"### The dataset\n",
75+
"<img src=\"../../assets/integrations/wandb/wandb-dataset.png\">\n",
76+
"\n",
77+
"### The SHAP bar plots for categorical features\n",
78+
"<img src=\"../../assets/integrations/wandb/wandb-categorical-chart.png\">\n",
79+
"\n",
80+
"### The SHAP scatter plots for numerical features\n",
81+
"<img src=\"../../assets/integrations/wandb/wandb-numerical-chart.png\">\n",
82+
"\n",
83+
"### The SHAP global feature importance plot\n",
84+
"<img src=\"../../assets/integrations/wandb/wandb-global-chart.png\">\n",
85+
"\n",
86+
"### The Giskard scan results\n",
87+
"<img src=\"../../assets/integrations/wandb/wandb-scanning-result.png\">\n",
88+
"\n",
89+
"### The Giskard test-suite results\n",
90+
"<img src=\"../../assets/integrations/wandb/wandb-test-suite-result.png\">"
91+
],
92+
"metadata": {
93+
"collapsed": false
94+
}
95+
}
96+
],
97+
"metadata": {
98+
"kernelspec": {
99+
"display_name": "Python 3",
100+
"language": "python",
101+
"name": "python3"
102+
},
103+
"language_info": {
104+
"codemirror_mode": {
105+
"name": "ipython",
106+
"version": 2
107+
},
108+
"file_extension": ".py",
109+
"mimetype": "text/x-python",
110+
"name": "python",
111+
"nbconvert_exporter": "python",
112+
"pygments_lexer": "ipython2",
113+
"version": "2.7.6"
114+
}
115+
},
116+
"nbformat": 4,
117+
"nbformat_minor": 0
118+
}
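The notebook above opens one W&B run per model and closes it with `wandb.finish()` at the end of each loop iteration. If any step in the loop body raises, that call is skipped and the run stays open in the kernel. A minimal sketch of a more defensive loop follows; the helper `log_one_run_per_model` and its callback parameters are hypothetical, introduced only so the pattern can run without `wandb` or `giskard` installed.

```python
def log_one_run_per_model(models, log_model, finish_run):
    """Log each model in its own run and always close the run afterwards.

    ``models`` maps a run name to a model, ``log_model(name, model)`` performs
    the wrapping and ``to_wandb()`` calls, and ``finish_run()`` ends the
    current run (``wandb.finish`` in real code).
    Returns a dict mapping the names of failed models to their exceptions.
    """
    failures = {}
    for name, model in models.items():
        try:
            log_model(name, model)
        except Exception as exc:  # keep going so one bad model does not block the rest
            failures[name] = exc
        finally:
            # Runs on success and on failure, so no run is left open.
            finish_run()
    return failures


# Intended usage with the notebook's objects (requires wandb and giskard):
#   def log_model(name, model):
#       ...  # wrap the model, then call to_wandb() on dataset, SHAP, scan, tests
#   failures = log_one_run_per_model(models, log_model, wandb.finish)
```

Collecting failures instead of raising immediately is a design choice suited to comparing several models in one session; re-raise inside the `except` branch if a failure should stop the loop.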
