Fixed metamorphic tests for LLM #1185

kevinmessiaen · 2023-06-20T08:19:45Z

Description

Fixed metamorphic tests for LLM

Type of Change

📚 Examples / docs / tutorials / dependencies update
🔧 Bug fix (non-breaking change which fixes an issue)
🥂 Improvement (non-breaking change which improves an existing feature)
🚀 New feature (non-breaking change which adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to change)
🔐 Security fix

linear · 2023-06-20T08:19:47Z

GSK-1311 Error when running auto-generated test suite for LLM model

https://demo.giskard.ai/main/projects/1451/testing/1511/overview

2023-06-15 14:01:53,457 pid:801164 ml_worker_thread_0 giskard.ml_worker.server.ml_worker_service ERROR    An error occurred during the test suite execution: Invalid prediction task: SupportedModelTypes.TEXT_GENERATION
Traceback (most recent call last):
  File "/home/ubuntu/demo-venv/lib/python3.10/site-packages/giskard/ml_worker/server/ml_worker_service.py", line 269, in runTestSuite
    is_pass, results = suite.run(**global_arguments)
  File "/home/ubuntu/demo-venv/lib/python3.10/site-packages/giskard/core/suite.py", line 228, in run
    result = test_partial.giskard_test.get_builder()(**test_params).execute()
  File "/home/ubuntu/demo-venv/lib/python3.10/site-packages/giskard/ml_worker/testing/registry/giskard_test.py", line 132, in execute
    return self.test_fn(**self.params)
  File "/home/ubuntu/demo-venv/lib/python3.10/site-packages/giskard/testing/tests/metamorphic.py", line 219, in test_metamorphic_invariance
    return _test_metamorphic(
  File "/home/ubuntu/demo-venv/lib/python3.10/site-packages/giskard/testing/tests/metamorphic.py", line 155, in _test_metamorphic
    passed_idx, _ = _compare_prediction(results_df, model.meta.model_type, direction, output_sensitivity)
  File "/home/ubuntu/demo-venv/lib/python3.10/site-packages/giskard/ml_worker/utils/logging.py", line 76, in wrap
    result = fn(*args, **kw)
  File "/home/ubuntu/demo-venv/lib/python3.10/site-packages/giskard/testing/tests/metamorphic.py", line 71, in _compare_prediction
    raise ValueError(f"Invalid prediction task: {prediction_task}")
ValueError: Invalid prediction task: SupportedModelTypes.TEXT_GENERATION

The model was created using the llm_comment_generation.ipynb notebook.
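The failure pattern in the traceback can be reproduced with a minimal sketch. Everything below is a hypothetical stand-in, not the actual giskard code: the enum only mirrors the member names visible in the log, and the branch bodies are placeholders. It shows how a dispatch on model type that has no branch for text generation ends in the `ValueError` above.

```python
from enum import Enum


class SupportedModelTypes(Enum):
    # Hypothetical stand-in for the enum named in the traceback.
    CLASSIFICATION = "classification"
    REGRESSION = "regression"
    TEXT_GENERATION = "text_generation"


def compare_prediction(prediction_task: SupportedModelTypes) -> str:
    """Pick a comparison strategy per model type (placeholder strategies)."""
    if prediction_task == SupportedModelTypes.CLASSIFICATION:
        return "compare predicted labels"
    elif prediction_task == SupportedModelTypes.REGRESSION:
        return "compare relative change of predictions"
    else:
        # No branch handles TEXT_GENERATION, so LLM models end up here.
        raise ValueError(f"Invalid prediction task: {prediction_task}")
```

Adding a dedicated branch for `TEXT_GENERATION` is one way to avoid this error; the comparison it should use is a separate design decision.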

mattbit

Good for me!

mattbit · 2023-06-20T12:49:38Z

python-client/giskard/testing/tests/metamorphic.py

 from giskard.ml_worker.utils.logging import timer
 from giskard.models.base import BaseModel
 from giskard.models.utils import fix_seed
+from giskard.scanner.llm.utils import LLMImportError


Maybe we should move LLMImportError outside of scanner now that we are integrating it more with the rest of the codebase. But not super important.


mattbit · 2023-06-20T13:24:25Z

@kevinmessiaen would be nice to add a simple test if you have time, using FakeListLLM from langchain.
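As a rough illustration of what such a test could look like, here is a self-contained sketch. It does not use langchain or giskard: `CannedLLM` is a hypothetical stand-in for a fake LLM that replays a fixed list of responses, and `invariance_pass_rate` is a hypothetical helper that uses exact string equality as a simplification of the real output comparison.

```python
from typing import Callable, Sequence


class CannedLLM:
    """Hypothetical fake LLM: returns canned responses in order, cycling."""

    def __init__(self, responses: Sequence[str]):
        self.responses = list(responses)
        self._calls = 0

    def __call__(self, prompt: str) -> str:
        response = self.responses[self._calls % len(self.responses)]
        self._calls += 1
        return response


def invariance_pass_rate(
    model: Callable[[str], str],
    prompts: Sequence[str],
    transform: Callable[[str], str],
) -> float:
    """Fraction of prompts whose output is unchanged after the transformation."""
    passed = 0
    for prompt in prompts:
        original = model(prompt)
        perturbed = model(transform(prompt))
        if original == perturbed:  # exact match; a simplification
            passed += 1
    return passed / len(prompts)


# Two prompts consume four canned responses: the first pair matches,
# the second pair differs.
llm = CannedLLM(["same", "same", "first", "second"])
rate = invariance_pass_rate(llm, ["prompt one", "prompt two"], str.upper)
```

Because the responses are fixed in advance, the expected pass rate is known before the test runs, which keeps the test deterministic and independent of any real model.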

andreybavt · 2023-06-21T14:49:24Z

python-client/giskard/testing/tests/metamorphic.py

+            except ImportError as err:
+                raise LLMImportError() from err
+
+            scorer = evaluate.load("bertscore")


It would be better to move evaluate.load("bertscore") into the try-except block, since it relies on another extra dependency that we add. That way, a user who already has evaluate installed (from a previous install, for example) but not bert-score would also get the same LLMImportError with its explanation.
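A minimal sketch of this suggestion, under stated assumptions: `LLMImportError` below is a local stand-in (the real class lives in `giskard.scanner.llm.utils` and its message may differ), `load_bertscore_scorer` is a hypothetical helper name, and the sketch assumes that loading the metric raises `ImportError` when bert-score is missing. The `import_module` parameter exists only so the behaviour can be exercised without the optional packages installed.

```python
import importlib
from typing import Any, Callable


class LLMImportError(ImportError):
    """Stand-in for giskard's LLMImportError (hypothetical message)."""

    def __init__(self) -> None:
        super().__init__(
            "Optional LLM dependencies are missing; "
            "install `evaluate` and `bert-score`."
        )


def load_bertscore_scorer(
    import_module: Callable[[str], Any] = importlib.import_module,
) -> Any:
    """Import `evaluate` and load the bertscore metric in one guarded block."""
    try:
        evaluate = import_module("evaluate")
        # Kept inside the try block: assumed to raise ImportError
        # when bert-score is not installed.
        return evaluate.load("bertscore")
    except ImportError as err:
        raise LLMImportError() from err
```

With both steps in the same block, a missing `evaluate` and a missing bert-score surface as the same error type, and the original exception stays available as `__cause__`.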


sonarqubecloud · 2023-07-07T05:05:05Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells

No Coverage information
No Duplication information

Fixed metamorphic tests for LLM

eacd055

kevinmessiaen requested review from andreybavt and mattbit June 20, 2023 08:19

mattbit approved these changes Jun 20, 2023


Merge branch 'main' into feature/gsk-1311-error-when-running-auto-generated-test-suite-for-llm-model

17c079d

andreybavt suggested changes Jun 21, 2023


Handle missing bertscore

8a65972

kevinmessiaen requested a review from andreybavt June 22, 2023 01:59

kevinmessiaen and others added 5 commits June 22, 2023 15:05

Added test for LLM metamorphic tests

5cd3791

Merge branch 'main' into feature/gsk-1311-error-when-running-auto-generated-test-suite-for-llm-model

430592c

Merge branch 'main' into feature/gsk-1311-error-when-running-auto-generated-test-suite-for-llm-model

da5d262

Merge branch 'main' into feature/gsk-1311-error-when-running-auto-generated-test-suite-for-llm-model

ce0e6f0

Merge branch 'main' into feature/gsk-1311-error-when-running-auto-generated-test-suite-for-llm-model

ccd4f06

kevinmessiaen merged commit cc83df1 into main Jul 7, 2023

Hartorn deleted the feature/gsk-1311-error-when-running-auto-generated-test-suite-for-llm-model branch September 13, 2023 11:31
