Clam 2586 save urls html #1281

Merged
val-ms merged 1 commit into Cisco-Talos:main from ragusaa:CLAM-2586-SaveUrlsHTML
Sep 12, 2024

@ragusaa ragusaa commented Jun 18, 2024

Add the ability to record URLs found in HTML when the generate JSON metadata option is enabled.

Also adds an option to disable this, in case you want the JSON metadata feature but don't want to record HTML URLs:

  • clamscan command-line option: --json-store-html-urls=no
  • clamd.conf config option: JsonStoreHTMLUrls no
  • libclamav general scan option: CL_SCAN_GENERAL_STORE_HTML_URLS

Note: Before the 1.5.0 release, these options were renamed to use "URI" instead of "URL":

  • clamscan command-line option: --json-store-html-uris=no
  • clamd.conf config option: JsonStoreHTMLURIs no
  • libclamav general scan option: CL_SCAN_GENERAL_STORE_HTML_URIS
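
As an illustration of the renamed options above, a clamd.conf fragment that keeps JSON metadata generation on while turning off HTML URI recording might look like the following. This is a sketch that assumes a ClamAV 1.5.0 or later build; `GenerateMetadataJson` is the existing clamd.conf setting for the metadata feature and is not named in this pull request.

```
# Sketch, assuming ClamAV 1.5.0+ (option names after the URL -> URI rename).
# Turn on the JSON metadata feature (off by default).
GenerateMetadataJson yes
# Opt out of recording URIs found in HTML (on by default when metadata is enabled).
JsonStoreHTMLURIs no
```

With clamscan, the equivalent would be to pass `--json-store-html-uris=no` alongside the option that enables JSON metadata.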

@ragusaa ragusaa force-pushed the CLAM-2586-SaveUrlsHTML branch from 638344d to 73a2786 Compare June 18, 2024 15:50
@ragusaa ragusaa force-pushed the CLAM-2586-SaveUrlsHTML branch 3 times, most recently from 685401d to 44832fc Compare June 25, 2024 18:08
@ragusaa ragusaa force-pushed the CLAM-2586-SaveUrlsHTML branch 4 times, most recently from 7243032 to 5ea9ec7 Compare July 10, 2024 22:08

@val-ms val-ms left a comment

I think I'm done with the code review part. Time to do some testing.

@ragusaa ragusaa force-pushed the CLAM-2586-SaveUrlsHTML branch 2 times, most recently from d6705c7 to 113412e Compare July 18, 2024 18:37
@ragusaa ragusaa force-pushed the CLAM-2586-SaveUrlsHTML branch from f8c043a to 2c9abc4 Compare July 18, 2024 20:54
@ragusaa ragusaa requested a review from val-ms July 18, 2024 20:54
@ragusaa ragusaa force-pushed the CLAM-2586-SaveUrlsHTML branch from eba2c5d to 44aa119 Compare July 19, 2024 19:52
val-ms previously approved these changes Jul 19, 2024

@val-ms val-ms left a comment

Looks great.

I also did some manual testing with a small selection of HTML files. That went well.

Approving for inclusion in 1.5.0

Store URLs found in HTML `<a>` and `<form>` tags during scan of HTML files
when recording scan metadata.

HTML URL recording will be ON by default, but is a part of the
generate-metadata-json feature.
The generate-metadata-json feature is OFF by default.

This introduces a new general scan option:
- libclamav: `CL_SCAN_GENERAL_STORE_HTML_URLS`.
- ClamD: `JsonStoreHTMLUrls`.
- ClamScan: `--json-store-html-urls`

Thank you Matt Jolly for the helpful comment on the pull request.

val-ms commented Sep 11, 2024

Rebased and removed the temporary function from the TODO. Ready to test and hopefully merge.

@val-ms val-ms merged commit 03d0481 into Cisco-Talos:main Sep 12, 2024