
Clam 2484: Fix warning when scanning some HTML files #1084

Merged
val-ms merged 1 commit into Cisco-Talos:main from val-ms:CLAM-2484-file-bytes-not-valid on Nov 21, 2023

@val-ms val-ms commented Nov 14, 2023

HTML files with <style> blocks containing non-UTF-8 byte sequences cause warnings when they are processed to extract base64-encoded images.

To resolve this, we can use the to_string_lossy() method, which allocates a sanitized copy of the content only if non-UTF-8 sequences are encountered.

Resolves: #1082
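
For illustration, here is a minimal sketch of the lossy conversion described above. It is not the code from this commit: the helper name style_text and the NUL-terminated input buffer are assumptions made for the example. It relies only on the standard library's CStr::to_string_lossy(), which returns a borrowed string when the bytes are already valid UTF-8 and an owned, sanitized copy (invalid sequences replaced with U+FFFD) otherwise.

```rust
use std::borrow::Cow;
use std::ffi::CStr;

/// Hypothetical helper (not from the ClamAV source): view a NUL-terminated
/// buffer as text without failing on invalid UTF-8.
///
/// Valid input is borrowed as-is; invalid sequences cause an owned copy to be
/// allocated with each bad sequence replaced by U+FFFD.
fn style_text(bytes_with_nul: &[u8]) -> Cow<'_, str> {
    // Assumption for this sketch: the buffer ends with exactly one NUL byte
    // and contains no interior NULs.
    let cstr = CStr::from_bytes_with_nul(bytes_with_nul)
        .expect("buffer must be NUL-terminated with no interior NUL");

    // Unlike to_str(), this never returns an error for non-UTF-8 content.
    cstr.to_string_lossy()
}
```

With a strict conversion such as to_str(), a single stray byte like 0xFF in a <style> block produces an error, which would have to be reported or handled; the lossy variant lets processing continue on a sanitized copy.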

@shutton shutton self-requested a review November 14, 2023 19:30

@shutton shutton left a comment

Much cleaner

@val-ms val-ms merged commit 86ba9bc into Cisco-Talos:main Nov 21, 2023
@val-ms val-ms deleted the CLAM-2484-file-bytes-not-valid branch November 21, 2023 22:25

Development

Successfully merging this pull request may close these issues.

Maybe Bug: LibClamAV Warning: file_bytes is not valid unicode: invalid utf-8 sequence of 1 bytes from index ...

2 participants