FIPS-mode support Re: md5 -> sha2-256 caching, FP signatures, legacy CVD dsig verification, and related improvements#1532
Conversation
846f128 to
c3ada9e
Compare
c3ada9e to
392095b
Compare
a8a308e to
dccb188
Compare
TheRaynMan
left a comment
There was a problem hiding this comment.
Finished full review together, and we did amendments and suggestions live.
|
It looks like I accidentally broke support for openssl 1.1 and also libjson 13.1. Our Jenkins AlmaLinux 8 build fails with: If we want to maintain support for those, will have to |
fc296fa to
98456e7
Compare
71d026c to
8307327
Compare
8307327 to
fda858e
Compare
csbg-draynor
left a comment
There was a problem hiding this comment.
All my consistency recommendations from review were resolved.
|
@val-ms My Observations while doing the testing.
|
rsundriyal
left a comment
There was a problem hiding this comment.
We need to add the change to debug logs for --hash-hint use. Apart from it, looks great.
2538726 to
3f1cdca
Compare
Add test to ensure CVD loading fails with --fips-limits flag unless you provide the .cvd.sign file. Also provide .sign files in freshclam tests because they'll be needed to pass if building in a fips-enabled environment.
Scripted inputs may be used for automated tests. Added automated tests for the example program to verify correct behavior using different callback return codes and also using the new scan layer and fmap API's. Fixed a bug in ClamAV's evidence module (recording strong, PUA, and weak indicators for each layer). Rust HashMaps are unordered so the feature to get the last alert would return a random alert and not specifically the last one. Switching to IndexMap resolves this, and allows us to maintain insertion-order for iterating keys even when removing a key.
Add string message to `cl_strerror()` and in example program for cl_error_t::CL_BREAK enum variant. Also fix uninitialized memory use issue in example program.
It is a shortcoming of existing scan APIs that it is not possible to return an error without masking a verdict. We presently work around this limitation by counting up detections at the end and then overriding the error code with `CL_VIRUS`, if necessary. The `cl_scanfile_ex()`, `cl_scandesc_ex()`, and `cl_scanmap_ex()` functions should provide the scan verdict separately from the error code. This introduces a new enum for recording and reporting a verdict: `cl_verdict_t` with options: - `CL_VERDICT_NOTHING_FOUND` - `CL_VERDICT_TRUSTED` - `CL_VERDICT_STRONG_INDICATOR` - `CL_VERDICT_POTENTIALLY_UNWANTED` Notably, the newer scan APIs may set the verdict to `CL_VERDICT_TRUSTED` if there is a (hash-based) FP signature for a file, or in the cause where Authenticode or similar certificate-based verification was performed, or in the case where an application scan callback returned `CL_VERIFIED`. CLAM-763 CLAM-865
Improvements to the ex_scan_callbacks.c program: - Print the verdict enum variant names to be more explicit. - Add the file_props callback (aka metadata JSON) with --gen-json option. - Add a --debug option. - Use '-' in option names instead of '_' to be consistent with other programs. - Add option to disable allmatch, which I named --one-match. :) Tests: Add ex_scan_callbacks test where --allmatch is disabled. Verify that CL_VIRUS is returned when a match occurs. I found a few bugs and inconsistencies from this test and went and fixed them, and improved the clamav.h function comments as well. Largely this resulted in cleanup in `cli_magic_scan()` to make sure we don't accidentally overwrite the return code. But it also meant making sure that callback functions which are supposed to trust a file actually clear the evidence/verdict and don't return CL_VIRUS.
Both enum variants are 0, so it's a no-op. `cl_error_t` should be used to determine if we keep going or stop, whether that's because there was a detection and we're not in allmatch mode, or if because of an error. `cl_verdict_t` should be used to determine or say the verdict (clean is a verdict, though I feel 'nothing found' is more accurate).
Also minor fixes to sys.rs and clamav.h formatting
If the outermost layer is trusted (e.g. using an FP signature), the verdict passed back by the `cl_scan*_ex()` functions should be CL_VERDICT_TRUSTED. To ensure this, and other correct verdicts, I moved the logic setting the verdict to occur when adding indicators, or trusting a given layer. Then at the end of a scan, it will set the output verdict parameter to the top level verdict. This commit also: * Fixes a bug in the `ex_scan_callbacks` program where a crash would happen when a hash was retrieved for an inner layer, but isn't for the container. * Added debug logs whenever a hash is calculated or set, printing the hash type and hash string. * When a layer is trusted, in addition to removing evidence for that layer, it will also index the metadata JSON (if that feature is enabled) and will rename any "Viruses" to "IgnoredAlerts", and rename "ContainedIndicators" to "IgnoredContainedIndicators". * Fixed an issue where setting the hash algorithm with extra characters, like setting to "sha256789" would ignore the extra characters, and report the hash type as the same. It will now fail if the string length differs from the known hash algorithm.
The current name "Viruses" is incorrect both because not all malware are viruses, but also because ClamAV may be used to classify other data, not just to search for malware indicators. Renaming to "Alerts" is more consistent with other language such as the options "--alert-exceeds-max", etc. It will match the new "IgnoredAlerts", "ContainedAlerts", and "IgnoredContainedAlerts" JSON key names.
We presently record Alerts as an array of signature names. Instead, it should be an object with properties of its own. We should record alerting indicators and weak indicators in a single "Indicators", likely with the same structure as the "Alerts" objects. When an alerting indicator is ignored (e.g. ignored by callback or if the file is trusted by an FP signature), we can remove it from the "Alerts" array, and for the "Indicators" array, add a "Ignored" key with a string value that explains why it was ignored. This eliminates the need to track and propagate the additional "WeakIndicators" and "IgnoredAlerts" arrays.
6c6b77c to
b34ea5e
Compare
TheRaynMan
left a comment
There was a problem hiding this comment.
Still okay after rebase.
For OpenSSL 1, `EVP_get_digestbyname()` will fail with "sha2-*" algorithm names. Must use "sha256", etc. I made a shim that does the conversion, and I made an improvement to ignore case when converting alg names to our hash type enumeration. Other fixes for a few warnings.
b11cbf1 to
5314973
Compare
|
Added openssl 1.x compatibility fix for debian:11 when using distro-provided dependencies. Force-push is to fix commit signing because apparently my macbook's signing isn't properly configured. |
This PR migrates ClamAV away from dependence on MD5 so that we can properly operate in FIPS environment. But it includes numerous other improvements.
This PR touches on a lot of modules and interfaces. Please review each commit messages. They are a little bit more detailed than the following. You may also find it easier to review each commit diff one at a time. They build on each other but are largely independent.
The key changes:
Migrate the clean cache from MD5 to SHA2-256.
This task had major ripple effects because the MD5 we calculate to use for the clean cache is also made available to callbacks. The consequence is that I had to make sure we only calcualte MD5 (or any hash) when required. And then offer new callbacks that let the caller get whichever hash they want. This tied into other necessary improvements to the callbacks, and the
cl_scan*()interfaces.This snowballed - but I am very happy with the end product.
Disable the legacy MD5-based CVD digital signature verification method when OpenSSL has FIPS-mode enabled, or when ClamAV is configured to enforce FIPS limits. This was rather simple.
Disable loading MD5 and SHA1 false positive (FP) signatures or using those hashes to check for FPs. This was also simple.
At a high level, this PR adds the following:
Add new option to calculate and record additional hash types when the "generate metadata JSON" feature is enabled:
CL_SCAN_GENERAL_STORE_EXTRA_HASHES--json-store-extra-hashes(default off)JsonStoreExtraHashes(default 'no')Renamed the sigtool option
--sha256to--sha2-256. The original option is still functional, but is deprecated.Record unique object-id for each layer scanned. This can now be found in the scan metadata JSON and may be queried for each scan layer from the new scan callbacks functions (below).
For the "generate metadata JSON" feature:
Signature matches with names that start with "Weak." will no longer alert. They will instead be tracked internally and will be present in scan metadata JSON. This is a step towards enabling alerting signatures (Strong indicators or PUA indicators) to depend on prior Weak indicator matches in the current layer or in child layers.
libclamav: Added a new engine option to toggle temp directory recursion.
Temp directory recursion is the idea that each object scanned in ClamAV's recursive extract/scan process would get a new temp subdirectory, mimicing the nesting structure of the file. It was introduced in ClamAV 0.103 and is enabled whenever
--leave-temps/LeaveTemporaryFilesis enabled. This PR makes it that an application can separately enable it if they wish. For ClamScan and ClamD, it will remain tied to--leave-temps/LeaveTemporaryFiles. The new engine option can be enabled with:ClamBC: fix crashes on startup related to recursion stack initialization and cleanup.
Resolves: Execute clambc -f 3986187.cbc return Segmentation fault (core dumped) #1484
libclamav: Add
cl_scanfile_ex(),cl_scanmap_ex(), andcl_scandesc_ex()functions that provide the following additional parameters:hash_hint: (Optional) A NULL terminated string of the file hash so that libclamav does not need to calculate it.[out] hash_out: (Optional) A NULL terminated string of the file hash. The caller is responsible for freeing the string.hash_alg: The hashing algorithm used for eitherhash_hintorhash_out. Supported algorithms are "md5", "sha1", "sha2-256". If not specified, the default is "sha2-256".file_type_hint: (Optional) A NULL terminated string of the file type hint. E.g. "pe", "elf", "zip", etc. You may also use ClamAV type names such as "CL_TYPE_PE". ClamAV will ignore the hint if it is not familiar with the specified type. See also: https://docs.clamav.net/appendix/FileTypes.html#file-typesfile_type_out: (Optional) A NULL terminated string of the file type of the top layer as determined by ClamAV. Will take the form of the standard ClamAV file type format. E.g. "CL_TYPE_PE". See also: https://docs.clamav.net/appendix/FileTypes.html#file-typesClamScan: Add hash & file-type in/out CLI options. We will NOT be adding this for ClamDScan, as we don't have a mechanism
in the ClamD socket API to receive scan options or a way for ClamD to include scan metadata in the response:
--hash-hint: The file hash so that libclamav does not need to calculate it. The type of hash must match the--hash-alg.--log-hash: Print the file hash after each file scanned. The type of hash printed will match the--hash-alg.--hash-alg: The hashing algorithm used for either--hash-hintor--log-hash. Supported algorithms are 'md5', 'sha1', 'sha2-256'. If not specified, the default is 'sha2-256'.--file-type-hint: The file type hint so that libclamav can optimize scanning. E.g. 'pe', 'elf', 'zip', etc. You may also use ClamAV type names such as 'CL_TYPE_PE'. ClamAV will ignore the hint if it is not familiar with the specified type. See also: https://docs.clamav.net/appendix/FileTypes.html#file-types--log-file-type: Print the file type after each file scanned.Fix double-extraction of OOXML-based office documents.
Always run callbacks for embedded files. That is, files found within other files through signature matches instead of by parsing the files will now be processed the same way and so they will trigger application callbacks (e.g. "pre-scan", "post-scan", etc.).
libclamav: Added new scan-layer callback API functions:
Each callback may alter scan behavior using the following return codes:
CL_BREAK: Scan aborted by callback (the rest of the scan is skipped). This does not mark the file as clean or infected, it just skips the rest of the scan.CL_SUCCESS/CL_CLEAN: File scan will continue. This is different than CL_VERIFIED because it does not affect prior or future alerts. Return CL_VERIFIED instead if you want to remove prior alerts for this layer and skip the rest of the scan for this layer.CL_VIRUS: This means you don't trust the file. A new alert will be added. For CL_SCAN_CALLBACK_ALERT: Means you agree with the alert (no extra alert needed).CL_VERIFIED: Layer explicitly trusted by the callback and previous alerts removed FOR THIS layer. You might want to do this if you trust the hash or verified a digital signature. The rest of the scan will be skipped FOR THIS layer. For contained files, this does NOT mean that the parent or adjacent layers are trusted.Each callback is given a pointer to the current scan layer from which they can get previous layers, can get the the layer's fmap, and then various attributes of the layer and of the fmap. To make this possible, this I introduced a handful of new APIs to
query scan-layer details and fmap details:
This deprecates, but does not immediately remove, the existing scan callbacks:
I have also added an interactive test program to demonstrate the callbacks. See:
examples/ex_scan_callbacks.cClamScan & libclamav: Improved the precision of tje bytes-scanned and bytes-read counters. The ClamScan scan summary will now report exact counts in "GiB", "MiB", "KiB", or "B" as appropriate. Previously, it always report "MB".
libclamav: Added extended hashing functions with a "flags" parameter that allows the caller to choose if they want to bypass FIPS:
Freshclam, ClamD, ClamScan, and Sigtool: Added an option to enable FIPS-like limits disabling MD5 and SHA1 from being used for verifying digital sigantures or for being used for false-positive (FP) checks.
For
freshclam.confandclamd.confset this config option:For
clamscanandsigtooluse this commandline option:For libclamav: Enable FIPS-limits for a ClamAV engine like this:
CLAM-255
CLAM-1412
CLAM-1433
CLAM-1583
CLAM-1858
CLAM-1859
CLAM-1860
CLAM-2485
CLAM-2626
CLAM-2796
Resolves: #564