
checks: add readability NLP metric check #2412 (Open)

harsh21234i wants to merge 3 commits into Giskard-AI:main from harsh21234i:fix/readability-check

Conversation

@harsh21234i (Contributor)

Adds a new built-in Readability check backed by textstat, with
configurable metric, min_score, and max_score. Exposes it via
giskard.checks exports, wires it into giskard.checks.builtin, adds
textstat under the nlp optional extra (giskard-checks[nlp]), and includes
unit tests covering pass/fail cases, missing key/type errors, missing
dependency behavior, and invalid threshold ranges. Related to issue #2349.
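
For readers skimming the thread, usage along these lines is implied by the description. The Readability name and the metric/min_score/max_score parameters come from this PR; the run() call shape and the passed attribute below are assumptions for illustration, not taken from the diff:

```python
# Hypothetical usage sketch; requires the optional extra:
#   pip install "giskard-checks[nlp]"
from giskard.checks import Readability

# Require a Flesch Reading Ease score of at least 60 ("plain English").
check = Readability(key="answer", metric="flesch_reading_ease", min_score=60.0)

# The invocation interface below is assumed for illustration.
result = check.run({"answer": "Short sentences help. So do common words."})
print(result.passed)
```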


@gemini-code-assist (bot) left a comment


Code Review

This pull request adds a Readability check using the textstat library, allowing users to validate text quality metrics like Flesch Reading Ease. The implementation includes optional dependency handling and threshold validation. Reviewers suggested expanding the list of supported metrics, ensuring JSON serializability for error details, and adding error handling around the metric computation to improve robustness.

Comment on lines +13 to +17:

```python
ReadabilityMetric = Literal[
    "flesch_reading_ease",
    "flesch_kincaid_grade",
    "gunning_fog",
]
```

Severity: medium

The ReadabilityMetric literal is currently limited to only three metrics. To better align with the "configurable metric" goal mentioned in the PR description, consider expanding this list to include other common metrics supported by textstat, such as automated_readability_index, coleman_liau_index, and dale_chall_readability_score.
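
For reference, the expanded literal this review asks for could look like the sketch below; all six names are real textstat function names, though the final list is the author's call:

```python
from typing import Literal

ReadabilityMetric = Literal[
    "flesch_reading_ease",
    "flesch_kincaid_grade",
    "gunning_fog",
    # Additions suggested in this review comment.
    "automated_readability_index",
    "coleman_liau_index",
    "dale_chall_readability_score",
]
```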

Comment on lines +89 to +95:

```python
return CheckResult.failure(
    message=(
        f"Value for key '{self.key}' must be a string, but found "
        f"{type(text).__name__}."
    ),
    details={**details, "value": text},
)
```

Severity: medium

Including the raw text value in the details dictionary when it is not a string (e.g., a complex dictionary or custom object) may lead to serialization errors if the CheckResult is converted to JSON for the UI or storage. It is safer to store a string representation of the value.

Suggested change:

```diff
 if not isinstance(text, str):
     return CheckResult.failure(
         message=(
             f"Value for key '{self.key}' must be a string, but found "
             f"{type(text).__name__}."
         ),
-        details={**details, "value": text},
+        details={**details, "value": str(text)},
     )
```

Comment on lines +97 to +98:

```python
score_fn = getattr(textstat, self.metric)
score = float(score_fn(text))
```

Severity: medium

The call to textstat metrics can raise exceptions due to internal processing errors or missing NLTK data (e.g., the punkt tokenizer). Wrapping this in a try-except block and returning a CheckStatus.ERROR result makes the check more robust and provides better feedback than an unhandled exception.

```python
try:
    score = float(getattr(textstat, self.metric)(text))
except Exception as e:
    return CheckResult(
        status=CheckStatus.ERROR,
        message=f"Failed to compute readability score ({self.metric}): {e}",
        details={**details, "error": str(e)},
    )
```

@harsh21234i (Contributor, Author)

  • Expanded ReadabilityMetric with additional textstat metrics (automated_readability_index, coleman_liau_index, dale_chall_readability_score).
  • Made details["value"] JSON-safe for non-string extracted values by storing str(...).
  • Wrapped metric computation in try/except, returning CheckStatus.ERROR with details["error"] when textstat fails.

Updated/added unit tests; `uv run -m pytest -q libs/giskard-checks/ tests/builtin/test_nlp_metrics.py` passes.
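
A minimal sketch of what two of the described tests could look like; the import path, the run() interface, and the ValueError for invalid threshold ranges are assumptions, since the actual test file is not shown in this thread:

```python
import pytest

from giskard.checks import Readability  # name per the PR description


def test_readability_passes_above_min_score():
    check = Readability(key="text", metric="flesch_reading_ease", min_score=50.0)
    # run()/passed are assumed attribute names for illustration.
    result = check.run({"text": "The cat sat on the mat. It was a warm day."})
    assert result.passed


def test_readability_rejects_invalid_threshold_range():
    # min_score above max_score should fail at construction time (assumed).
    with pytest.raises(ValueError):
        Readability(key="text", metric="flesch_reading_ease", min_score=80.0, max_score=20.0)
```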

@davidberenstein1957 (Member) commented May 13, 2026

Hi @harsh21234i, can you resolve the conflict? Also, can we document some realistic and expected readability scores across the various proposed metrics, so the end user can understand the min-max paradigm? After that we should be good to merge.
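
For context on the ranges question, the commonly cited scales for these metrics are well established; a sketch of illustrative defaults (the thresholds here are editorial examples, not values from the PR) might look like:

```python
# Illustrative min/max pairs only; tune per audience.
# flesch_reading_ease: 0-100 scale, higher is easier (60-70 ~ "plain English").
# flesch_kincaid_grade / gunning_fog / automated_readability_index /
# coleman_liau_index: approximate U.S. school grade level, lower is easier.
# dale_chall_readability_score: its own scale (<=4.9 ~ 4th grade, 9.0-9.9 ~ college).
SUGGESTED_RANGES = {
    "flesch_reading_ease": {"min_score": 60.0, "max_score": 100.0},
    "flesch_kincaid_grade": {"min_score": 0.0, "max_score": 10.0},
    "gunning_fog": {"min_score": 0.0, "max_score": 12.0},
    "automated_readability_index": {"min_score": 0.0, "max_score": 12.0},
    "coleman_liau_index": {"min_score": 0.0, "max_score": 12.0},
    "dale_chall_readability_score": {"min_score": 0.0, "max_score": 9.0},
}
```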

"rich>=14.2.0,<15",
]

[project.optional-dependencies]

Member:

Can we make these checks more specific, like readability?
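
A sketch of what that more specific extra could look like in pyproject.toml; the version bounds here are illustrative, not taken from the diff:

```toml
[project.optional-dependencies]
# Named after the feature rather than a broad "nlp" bucket.
readability = [
    "textstat>=0.7,<1",
]
```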

@harsh21234i (Contributor, Author)

Done, can you recheck sir!!
