feature/debug #968

rabah-khalek · 2023-04-25T14:40:26Z

ToDo

spec
check the outdated branch of this feature.
converge on the technical implementation
to be filled

comment (deprecated)

I found a couple of issues with the previous implementation:
https://github.com/Giskard-AI/giskard/tree/GSK-106_Reduce_the_number_of_test_outputs

Our tests should be agnostic to the type of dataset we use. Once we have CV, we might not have a pandas.DataFrame anymore, so the slicing operation shouldn't be valid only for DataFrames.
We should return a Dataset rather than a df, so that it is ready to be rendered in the UI (with a new name, uuid, etc.).
The debug_filters that define the slices_to_debug should not be duplicated and defined per test; they should be collected in one place for readability and easier handling.

Here's my first preliminary proposal:

a function that returns a fresh_copy of a dataset:
https://github.com/Giskard-AI/giskard/blob/3d9ae1e0b20185979b5fab8ec6d977147e2a23a1/python-client/giskard/ml_worker/testing/tests/debug_utils.py#L13-L17
creating a filter catalogue where each filter has the name of the test it belongs to (valid for any type of dataset we support):
https://github.com/Giskard-AI/giskard/blob/3d9ae1e0b20185979b5fab8ec6d977147e2a23a1/python-client/giskard/ml_worker/testing/tests/debug_utils.py#L20-L26
adding slices_to_debug to the TestResult by using the debug_filters catalogue:
https://github.com/Giskard-AI/giskard/blob/3d9ae1e0b20185979b5fab8ec6d977147e2a23a1/python-client/giskard/ml_worker/testing/tests/performance.py#L45-L49
so basically:
- debug_filters.get(test_name) will get the filter for the corresponding test.
- (gsk_dataset, prediction) are the parameters needed by this filter.
- returns a list of slices (Datasets) assigned to slices_to_debug.
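
A minimal sketch of the catalogue idea in this preliminary proposal, assuming a simplified stand-in for the dataset wrapper. The `Dataset` dataclass, `fresh_copy`, `register_debug_filter`, and the `"test_accuracy"` key are all hypothetical names for illustration and are not the code at the linked commit:

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class Dataset:
    """Hypothetical stand-in for a dataset wrapper (not the giskard class)."""
    rows: List[dict]
    name: str = "dataset"
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


def fresh_copy(dataset: Dataset, rows: Sequence[dict], suffix: str) -> Dataset:
    """Return a new Dataset with its own name and id, ready to be shown in a UI."""
    return Dataset(rows=list(rows), name=f"{dataset.name}_{suffix}")


# The catalogue: one filter per test name, all collected in a single place.
DebugFilter = Callable[[Dataset, Sequence], List[Dataset]]
debug_filters: Dict[str, DebugFilter] = {}


def register_debug_filter(test_name: str):
    """Decorator that stores a filter in the catalogue under the test's name."""
    def decorator(fn: DebugFilter) -> DebugFilter:
        debug_filters[test_name] = fn
        return fn
    return decorator


@register_debug_filter("test_accuracy")
def misclassified(dataset: Dataset, prediction: Sequence, target: str = "label") -> List[Dataset]:
    """Keep only the rows whose prediction differs from the target value."""
    wrong = [row for row, pred in zip(dataset.rows, prediction) if row[target] != pred]
    return [fresh_copy(dataset, wrong, "misclassified")]


gsk_dataset = Dataset(
    rows=[{"x": 1, "label": "a"}, {"x": 2, "label": "b"}, {"x": 3, "label": "a"}],
    name="demo",
)
prediction = ["a", "a", "a"]

# Look up the filter by test name and let it build the slices to debug.
slices_to_debug = debug_filters.get("test_accuracy")(gsk_dataset, prediction)
```

Because the filter only relies on the wrapper and never touches a DataFrame directly, the same catalogue entry could in principle serve other dataset types.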

This will result in:

@andreybavt and @jmsquare let me know what you think about this.

minutes of 12/04/2023

After the 12/04/2023 discussion with @jmsquare and @andreybavt:

add debug arg to the tests
The filter has to return a list of either booleans or ints
~~The name slice for Dataset might not be ideal -- to spec~~ (deprecated)
~~The catalogue idea to review by @andreybavt~~ (abandoned)
For next week we should have at least one representative slice per type of tests:
- performance
- statistical
- metamorphic
- drift
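
The rule from these minutes that a filter returns a list of either booleans or ints could be consumed as in the sketch below. `select_rows` is a hypothetical helper, and plain lists of dicts stand in for a real dataset:

```python
from typing import List, Sequence, Union


def select_rows(rows: Sequence[dict], selector: Sequence[Union[bool, int]]) -> List[dict]:
    """Apply a filter result that is either a boolean mask or a list of row positions."""
    if len(selector) == 0:
        return []
    # bool is a subclass of int in Python, so test for bool first.
    if all(isinstance(s, bool) for s in selector):
        if len(selector) != len(rows):
            raise ValueError("a boolean mask must have one entry per row")
        return [row for row, keep in zip(rows, selector) if keep]
    if all(isinstance(s, int) and not isinstance(s, bool) for s in selector):
        return [rows[i] for i in selector]
    raise TypeError("selector must contain only booleans or only ints")


rows = [{"id": 0}, {"id": 1}, {"id": 2}]
by_mask = select_rows(rows, [True, False, True])
by_index = select_rows(rows, [0, 2])
```

Both calls select the same two rows; mixing booleans and ints in one selector is rejected rather than silently interpreted.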

https://linear.app/giskard/project/test-debug-feature-0024fb6da8c5?filter=eyJhbmQiOlt7InN0YXRlIjp7Im5hbWUiOnsibmluIjpbIkRvbmUiXX19fV19

rabah-khalek · 2023-04-25T14:49:20Z

In this commit 19de6c6:

details (deprecated)

I added the option data.slice(..., get_mask=True), which returns a boolean mask as a numpy.array.
I chose this mask type because it was the object with the smallest memory footprint:
I dropped the idea of the catalogue in favour of gathering all the debugging slicing functions here (where we are already collecting them). The tag "hidden" is to be communicated to @kevinmessiaen in order not to show these functions in the UI (unless they're relevant).
Here's a small demo of what is expected:

We can see that caching the mask uses 40x less memory than caching the sliced dataset.
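
The get_mask behaviour described in this comment can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: the `Dataset` class here is a hypothetical stand-in, and a plain Python list of booleans takes the place of the numpy.array mask:

```python
from typing import Callable, Iterable, List, Union


class Dataset:
    """Hypothetical stand-in for a dataset wrapper (not the giskard class)."""

    def __init__(self, rows: Iterable[dict]):
        self.rows = list(rows)

    def slice(self, predicate: Callable[[dict], bool], get_mask: bool = False) -> Union["Dataset", List[bool]]:
        # Evaluate the predicate once per row to build the boolean mask.
        mask = [bool(predicate(row)) for row in self.rows]
        if get_mask:
            # Return only the mask, which is much cheaper to cache than the rows.
            return mask
        return Dataset(row for row, keep in zip(self.rows, mask) if keep)


data = Dataset([{"age": 25}, {"age": 41}, {"age": 67}])
mask = data.slice(lambda row: row["age"] > 40, get_mask=True)
older = data.slice(lambda row: row["age"] > 40)
```

Caching `mask` keeps one flag per row, whereas caching `older` keeps a copy of every selected row.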

Update 1

Update: regarding the second point, I think the following is a fairer comparison (one that favours int masks), assuming we expect the accuracy of inspected models to be > 50%.

Example:
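>>> import sys
>>> import numpy as np
>>> a = np.ones(100)
>>> b = np.zeros(100)
>>> c = (a == b)  # bool mask of 100 entries
>>> d = a.astype(int)[:50]  # int mask of 50 entries (assuming 50% accuracy); d is a view

>>> sys.getsizeof(c)
212
>>> sys.getsizeof(d)  # for a view, getsizeof counts only the array header, not the data buffer
112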

Update 2

A NumPy bool takes 1 byte and a 32-bit int takes 4 bytes. So if the model is more than 75% correct, it is cheaper to store the int line numbers of the failing rows than a bool map over all rows.
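
The break-even point in Update 2 follows from item sizes alone, so it can be checked without allocating any arrays. The sketch below assumes the int list stores only the positions of failing rows; the function names are hypothetical, and the 8-byte case is included because index arrays are 64-bit on many platforms:

```python
def bool_mask_bytes(n_rows: int, bool_itemsize: int = 1) -> int:
    """Data size of a boolean mask with one entry per row."""
    return n_rows * bool_itemsize


def int_index_bytes(n_rows: int, accuracy: float, int_itemsize: int = 4) -> int:
    """Data size of an int list holding only the failing rows (assumption)."""
    n_failing = round(n_rows * (1.0 - accuracy))
    return n_failing * int_itemsize


def break_even_accuracy(int_itemsize: int = 4, bool_itemsize: int = 1) -> float:
    """Accuracy above which int positions take less space than a bool mask."""
    return 1.0 - bool_itemsize / int_itemsize


n = 100
half_right = int_index_bytes(n, accuracy=0.5)   # 50 failing rows * 4 bytes
mostly_right = int_index_bytes(n, accuracy=0.9)  # 10 failing rows * 4 bytes
mask_size = bool_mask_bytes(n)                   # 100 rows * 1 byte
```

With 4-byte ints the threshold is 75% accuracy; with 8-byte ints it moves to 87.5%, so the choice of index dtype matters as much as the model's accuracy.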

andreybavt · 2023-05-16T12:15:00Z

common/proto/ml-worker.proto


  repeated uint32 actual_slices_size = 21;
  repeated uint32 reference_slices_size = 22;
+  string output_df_id = 23;


where is it used?

@Googleton could you answer this one?

According to @Googleton:

it’s what we use for the frontend to know which debugging session to open

frontend/src/views/main/project/SuiteTestExecutionList.vue

sonarqubecloud · 2023-07-21T14:58:05Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
14 Code Smells

55.2% Coverage
1.9% Duplication

rabah-khalek changed the title ~~first commit after latest version of api v2~~ feature/debug Apr 25, 2023

rabah-khalek marked this pull request as draft April 25, 2023 14:40

rabah-khalek self-assigned this Apr 25, 2023

rabah-khalek added feature v2.0 Created by Linear-GitHub Sync labels Apr 25, 2023

rabah-khalek added this to the 2.0.0 milestone Apr 25, 2023

rabah-khalek linked an issue Apr 25, 2023 that may be closed by this pull request

[GSK-883] Debug feature #969

Closed

andreybavt self-requested a review May 16, 2023 10:06

andreybavt suggested changes May 16, 2023

andreybavt added on hold and removed v2.0 Created by Linear-GitHub Sync labels May 26, 2023

andreybavt force-pushed the feature/ai-test-v2-merged branch from 80c1113 to be66b69 Compare June 7, 2023 15:24

rabah-khalek and others added 17 commits June 9, 2023 11:42

first commit after latest version of api v2

5335dd1

restored python-client/giskard/datasets/base/__init__.py to origin

170deac

implemented filter method in Dataset

816f283

refactoring and new mask

0c35723

fixed top_nper_abs_err_rows_mask

e30728b

better naming of mask

922bd1c

working on feedback of 26 Apr 2023

48c2110

modified runAdHocTest in ml_worker_service

b1cde17

restored debug to False by default in test_auc

4f99464

Work in progress over running tests via the debug button

8cacf7c

The debugging session is now created, just need to open it ...

1a328d7

Open sesame!

c784055

Debug arg passed (false by default)

206073d

Fix typing issue

a5ef36a

Slight cleanup

26a5697

Fix issue when dataset inputs come from above

c453dd7

updated performance tests

ae13e46

rabah-khalek and others added 27 commits July 5, 2023 15:24

Merge branch 'main' into feature/debug_output

9e05944

skipping data drift debug functional-tests

5c6d8c3

Merge branch 'feature/debug_output' of https://github.com/Giskard-AI/giskard into feature/debug_output

d461d56

fixed linter

a2fb296

Merge branch 'main' into feature/debug_output

0c26870

# Conflicts: # python-client/giskard/core/suite.py # python-client/giskard/ml_worker/server/ml_worker_service.py

Merge branch 'main' into feature/debug_output

720ce09

better if condition in ml worker service

aae788a

GSK-1387: better input validation

7207d55

Merge branch 'main' into feature/debug_output

301b399

# Conflicts: # frontend/src/views/main/project/Datasets.vue

fix dataset filtering

e06f9b4

Merge branch 'main' into feature/debug_output

25990fd

Add a loading indicator when "debug" is pressed. Also modal to select model if none found

402346b

re-enable debug for pure data drift tests

603f099

updated drift tests

6a40779

Make it a modal instead of just a loading button

ba76e67

updated drift tests

b1b48e7

Modal for model selection looks better

607fb47

added check if debuggable for drift tests

dc429dc

Proper reset the modals

0f20b1f

Fix an issue with argument generation on the frontend

f111468

error changed, docstrings populated

26fdf8a

Add regex to translate model names and datasets tags into their names

3d7c35d

Merge branch 'main' into feature/debug_output

36e7462

# Conflicts: # frontend/src/api.ts # frontend/src/views/main/project/Datasets.vue # frontend/src/views/main/project/modals/SuiteTestInfoModal.vue # python-client/giskard/datasets/base/__init__.py

fixed unit-test

8aad5a4

added missing docstring

ed1a7ba

Fix bug introduced by merge

34f111d

Merge remote-tracking branch 'origin/feature/debug_output' into feature/debug_output

75b2b7b

andreybavt merged commit 70dc79b into main Jul 21, 2023

Hartorn deleted the feature/debug_output branch September 22, 2023 10:47
