@henchaves
Member

Description

Hallucination tests, such as test_llm_output_coherency and test_llm_output_plausibility, have an eval_prompt parameter that _BaseLLMEvaluator does not accept. These tests instantiate an evaluator and pass this parameter to it, which raises an error:

 Traceback (most recent call last):
  File "/Users/hchaves/GitHub/Giskard/giskard/giskard/core/suite.py", line 573, in run
    result = test_partial.giskard_test(**test_params).execute()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hchaves/GitHub/Giskard/giskard/giskard/registry/giskard_test.py", line 192, in execute
    return configured_validate_arguments(self.test_fn)(*self.args, **self.kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hchaves/GitHub/Giskard/giskard/.venv/lib/python3.11/site-packages/pydantic/validate_call_decorator.py", line 59, in wrapper_function
    return validate_call_wrapper(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hchaves/GitHub/Giskard/giskard/.venv/lib/python3.11/site-packages/pydantic/_internal/_validate_call.py", line 81, in __call__
    res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hchaves/GitHub/Giskard/giskard/giskard/testing/tests/llm/hallucination.py", line 41, in test_llm_output_coherency
    passed=eval_result.passed,
            ^^^^^^^^^^^^^^^^^^^
TypeError: _BaseLLMEvaluator.__init__() got an unexpected keyword argument 'eval_prompt'

To fix this issue, this PR removes the eval_prompt parameter from the hallucination tests.
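
For reviewers who want to see the failure mode in isolation, here is a minimal sketch. It does not use Giskard code: `BaseEvaluatorStandIn`, its `llm_client` argument, and the two `build_evaluator_*` helpers are hypothetical names made up for illustration, and the real `_BaseLLMEvaluator` signature may differ. It only shows that forwarding a keyword the constructor does not declare raises the same kind of `TypeError`, and that dropping the keyword avoids it.

```python
class BaseEvaluatorStandIn:
    """Hypothetical stand-in for _BaseLLMEvaluator (real signature not shown in this PR)."""

    def __init__(self, llm_client=None):
        # Note: no eval_prompt parameter is declared here.
        self.llm_client = llm_client


def build_evaluator_before(eval_prompt=None):
    # Pre-fix pattern: the test forwards a keyword the constructor does not accept.
    return BaseEvaluatorStandIn(eval_prompt=eval_prompt)


def build_evaluator_after():
    # Post-fix pattern: the unused parameter is removed from the call.
    return BaseEvaluatorStandIn()


try:
    build_evaluator_before(eval_prompt="Is the answer coherent?")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

evaluator = build_evaluator_after()
print(error_message)
```

The first call fails at construction time, before any evaluation runs, which matches the traceback above; the second call succeeds because nothing unexpected is passed.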

Related Issue

GSK-3495 (available on Linear)

Type of Change

  • 📚 Examples / docs / tutorials / dependencies update
  • 🔧 Bug fix (non-breaking change which fixes an issue)
  • 🥂 Improvement (non-breaking change which improves an existing feature)
  • 🚀 New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to change)
  • 🔐 Security fix

@henchaves henchaves requested a review from kevinmessiaen April 26, 2024 10:00
@linear

linear bot commented Apr 26, 2024

@kevinmessiaen kevinmessiaen enabled auto-merge April 30, 2024 03:03
@kevinmessiaen kevinmessiaen merged commit fc449c0 into main Apr 30, 2024
@kevinmessiaen kevinmessiaen deleted the feature/gsk-3495-unexpected-keyword-argument-eval_prompt branch April 30, 2024 13:13