Add hallucination judge check #2436
Code Review
This pull request introduces a new Hallucination check to the Giskard checks library, designed to detect fabricated facts in AI agent answers. The implementation includes the Hallucination class, a Jinja2 prompt template, and comprehensive unit tests. Feedback was provided to improve the get_inputs method by ensuring type consistency with TraceType, better formatting of list-based context inputs using newlines, and safer handling of None values to prevent the LLM from misinterpreting them as content.
    @override
    async def get_inputs(self, trace: Trace[InputType, OutputType]) -> dict[str, str]:
        inputs = {
            "answer": str(
                provided_or_resolve(
                    trace,
                    key=self.answer_key,
                    value=provide_not_none(self.answer),
                )
            ),
            "context": "",
        }
        if self.context is not None or self.context_key is not None:
            inputs["context"] = str(
                provided_or_resolve(
                    trace,
                    key=self.context_key,
                    value=provide_not_none(self.context),
                )
            )
        return inputs
The `get_inputs` implementation has a few areas for improvement:
- Type Consistency: It uses `Trace[InputType, OutputType]` instead of the generic `TraceType` defined in the class signature. While technically compatible here, using `TraceType` is consistent with the base class and allows for proper type resolution in subclasses.
- List Formatting: The `context` attribute supports `list[str]`. Using `str()` on a list results in a Python-style string representation (e.g., `"['chunk1', 'chunk2']"`), which is suboptimal for LLM prompts. Joining chunks with newlines is generally preferred.
- None Handling: If a value resolves to `None`, `str(None)` produces the string `"None"`. This can be misinterpreted by the LLM as actual content. Defaulting to an empty string is safer.
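The list-formatting and None-handling points can be checked in isolation with plain Python. This is a minimal sketch: `format_prompt_value` is a hypothetical stand-in for the kind of helper the review asks for, not part of the Giskard API, and the sample chunks are made up for illustration.

```python
def format_prompt_value(value: object) -> str:
    """Render a resolved value as prompt text.

    Hypothetical helper for illustration only; not a Giskard function.
    """
    if value is None:
        # Avoid sending the literal string "None" to the LLM.
        return ""
    if isinstance(value, list):
        # Separate chunks with blank lines instead of using the list repr.
        return "\n\n".join(str(item) for item in value)
    return str(value)


# Made-up context chunks, standing in for a list[str] context value.
chunks = ["Paris is the capital of France.", "The Seine flows through Paris."]

print(repr(str(chunks)))                  # list repr: brackets and quotes included
print(repr(str(None)))                    # the four-character string 'None'
print(repr(format_prompt_value(chunks)))  # chunks joined by a blank line
print(repr(format_prompt_value(None)))    # empty string
```

The `None` branch comes first so that a missing value never falls through to `str()`; whether a blank-line or single-newline separator reads better in the prompt is a judgment call.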
    @override
    async def get_inputs(self, trace: TraceType) -> dict[str, str]:
        def _fmt(v) -> str:
            if isinstance(v, list):
                return "\n\n".join(map(str, v))
            return str(v) if v is not None else ""

        inputs = {
            "answer": _fmt(
                provided_or_resolve(
                    trace,
                    key=self.answer_key,
                    value=provide_not_none(self.answer),
                )
            ),
            "context": "",
        }
        if self.context is not None or self.context_key is not None:
            inputs["context"] = _fmt(
                provided_or_resolve(
                    trace,
                    key=self.context_key,
                    value=provide_not_none(self.context),
                )
            )
        return inputs

@@ -0,0 +1,44 @@
Your role is to evaluate whether an AI agent's answer contains hallucinated or fabricated factual claims.
Hi, what did you base this prompt on? Any reference to research or other existing libraries would be great.
davidberenstein1957
left a comment
@mindbomber this looks nice. Would you be able to resolve the checks and conflict, and address my minor comment?
Thanks, addressed in 3e87688.
Verification:
The new CI run is queued now; the |
Summary
- `Hallucination` LLM judge check registered as `hallucination`.
- `answer`/`context` values, trace JSONPath extraction, and no-context mode.
- `giskard.checks` and `giskard.checks.judges`.

Fixes #2369.
Verification
- `uv run pytest libs\giskard-checks\tests\builtin\test_hallucination.py libs\giskard-checks\tests\builtin\test_groundedness.py libs\giskard-checks\tests\core\test_jsonpath_enforcement.py -q` -> 19 passed
- `uv run ruff check libs\giskard-checks\src\giskard\checks\judges\hallucination.py libs\giskard-checks\src\giskard\checks\judges\__init__.py libs\giskard-checks\src\giskard\checks\__init__.py libs\giskard-checks\tests\builtin\test_hallucination.py` -> passed
- `uv run ruff format --check libs\giskard-checks\src\giskard\checks\judges\hallucination.py libs\giskard-checks\src\giskard\checks\judges\__init__.py libs\giskard-checks\src\giskard\checks\__init__.py libs\giskard-checks\tests\builtin\test_hallucination.py` -> passed
- `git diff --check` -> passed

AANA gate
- accept
- accept

Scope note
This introduces the built-in judge interface and prompt behavior. It does not claim to certify factuality or replace external evidence verification; the check provides a reviewable hallucination signal for Giskard test workflows.