Skip to content

Conversation

@rabah-khalek
Copy link
Contributor

@rabah-khalek rabah-khalek commented Apr 25, 2023

ToDo
  • spec
  • check the outdated branch of this feature.
  • converge on the technical implementation
  • to be filled
comment (deprecated)

I found couple of issues with the previous implementation:
https://github.com/Giskard-AI/giskard/tree/GSK-106_Reduce_the_number_of_test_outputs

  1. Our test should be agnostic to the type of dataset we use. Once we have CV, we might not have pandas.DataFrame anymore. Therefore, the slicing operation shouldn't be only valid for the latter.
  2. We should return a dataset and not a df that can be ready to be rendered in the UI (with new name, uuid, etc.).
  3. The debug_filters along which we define the slices_to_debug should not be duplicated and defined per test, rather collected somewhere for readability and better handling.

Here's my first preliminary proposal:

This will result in:
Screenshot 2023-04-12 at 14 40 44

@andreybavt and @jmsquare let me know what you think about this.

minutes of 12/04/2023

After the 12/04/2023 discussion with @jmsquare and @andreybavt :

  • add debug arg to the tests
  • The filter has to return a list of either booleans or ints
  • The name slice for Dataset might not be ideal -- to spec (deprecated)
  • The catalogue idea to review by @andreybavt (abandonned)
  • For next week we should have at least one representative slice per type of tests:
    • performance
    • statistical
    • metamorphic
    • drift

https://linear.app/giskard/project/test-debug-feature-0024fb6da8c5?filter=eyJhbmQiOlt7InN0YXRlIjp7Im5hbWUiOnsibmluIjpbIkRvbmUiXX19fV19

@rabah-khalek rabah-khalek changed the title first commit after latest version of api v2 feature/debug Apr 25, 2023
@rabah-khalek rabah-khalek marked this pull request as draft April 25, 2023 14:40
@rabah-khalek rabah-khalek self-assigned this Apr 25, 2023
@rabah-khalek rabah-khalek added feature v2.0 Created by Linear-GitHub Sync labels Apr 25, 2023
@rabah-khalek rabah-khalek added this to the 2.0.0 milestone Apr 25, 2023
@rabah-khalek rabah-khalek linked an issue Apr 25, 2023 that may be closed by this pull request
@rabah-khalek
Copy link
Contributor Author

rabah-khalek commented Apr 25, 2023

In this commit 19de6c6:

details (deprecated)
  • I added the possibility of data.slice(..., get_mask=True) which returns a boolean mask as a numpy.array.
  • The decision on the type of mask output was based on the fact that it was the least memory occupying object:
    Screenshot 2023-04-25 at 16 11 18
  • I dumped the idea of the catalogue in favour of gathering all the debugging slicing functions here (where we are already collecting them) -- the tag "hidden" is to be communicated to @kevinmessiaen in order to now show these function in the UI (unless they're relevant).
  • Here's a small demo of what is expected:
    Screenshot 2023-04-25 at 16 35 57
    we can see that by caching the mask we're gaining 40x in memory (as opposed to caching the sliced dataset)
Update 1

Update: regarding the second point, I think that's a more fair comparison to do (that favours int masks):
--> assuming we expect the accuracy of inspected models to be > 50%.

Example:
>>> a = np.ones(100)
>>> b = np.zeros(100)
>>> c = (a == b) # bool mask of 100 entries
>>> d = a.astype(int)[:50] # int mask of 50 entries (assuming 50% accuracy)

>>> sys.getsizeof(c)
212
>>> sys.getsizeof(d)
112
Update 2

Python bool is 1 byte, int is 4 bytes. So if the model is at least 75% correct it’s cheaper to store int line numbers and not bool map.

@andreybavt andreybavt self-requested a review May 16, 2023 10:06

repeated uint32 actual_slices_size = 21;
repeated uint32 reference_slices_size = 22;
string output_df_id = 23;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is it used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Googleton could you answer this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to @Googleton :

it’s what we use for the frontend to know which debugging session to open

@andreybavt andreybavt added on hold and removed v2.0 Created by Linear-GitHub Sync labels May 26, 2023
@andreybavt andreybavt force-pushed the feature/ai-test-v2-merged branch from 80c1113 to be66b69 Compare June 7, 2023 15:24
rabah-khalek and others added 27 commits July 5, 2023 15:24
# Conflicts:
#	python-client/giskard/core/suite.py
#	python-client/giskard/ml_worker/server/ml_worker_service.py
# Conflicts:
#	frontend/src/views/main/project/Datasets.vue
# Conflicts:
#	frontend/src/api.ts
#	frontend/src/views/main/project/Datasets.vue
#	frontend/src/views/main/project/modals/SuiteTestInfoModal.vue
#	python-client/giskard/datasets/base/__init__.py
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 14 Code Smells

55.2% 55.2% Coverage
1.9% 1.9% Duplication

@andreybavt andreybavt merged commit 70dc79b into main Jul 21, 2023
@Hartorn Hartorn deleted the feature/debug_output branch September 22, 2023 10:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Development

Successfully merging this pull request may close these issues.

[GSK-883] Debug feature

4 participants