Tolerate minor text drift without weakening HTML checks #128

Merged
nicpottier merged 1 commit into main from nicpottier/tolerant-html-valid
Feb 26, 2026

@nicpottier
Contributor

This updates HTML validation to auto-correct text content to its expected value, failing only when a mismatch falls below a similarity threshold. It keeps security checks intact by validating descendants before text replacement, so nested unsafe tags or attributes cannot bypass validation. It also skips text substitution for image IDs/tags to avoid malformed output if expected-text maps are misconfigured. Tests were expanded to cover tolerant matching, the edit-distance helpers, nested unsafe-content regressions, and the image-ID substitution guard.
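
For readers without the diff open, here is a minimal sketch of what threshold-based tolerant matching could look like. It is an illustration, not the code from this PR: the names `edit_distance`, `similarity`, and `reconcile_text`, the normalization by the longer string's length, and the `0.9` default threshold are all assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Return 1.0 for identical strings, approaching 0.0 as they diverge."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(a, b) / longest


def reconcile_text(actual: str, expected: str, threshold: float = 0.9) -> str:
    """Hypothetical helper: return the expected text when the actual text is
    close enough to it, otherwise fail validation.

    The 0.9 threshold is an assumed value chosen for illustration only.
    """
    if similarity(actual, expected) < threshold:
        raise ValueError(
            f"text differs too much from expected: {actual!r} vs {expected!r}"
        )
    return expected
```

In this sketch, `reconcile_text("Hello wrld", "Hello world")` returns the expected string because a single missing character keeps the similarity above the assumed threshold, while `reconcile_text("abc", "xyz")` raises `ValueError`. Per the description above, a step like this would run only after descendants have been validated, so replacing text cannot mask unsafe nested content.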

@nicpottier nicpottier merged commit 63a73fc into main Feb 26, 2026
1 check passed