@knqyf263 knqyf263 commented Sep 4, 2025

Description

This PR adds the ability to reuse existing BOM structure when scanning SBOMs, preserving the original SBOM components, properties, and relationships while only updating vulnerability information.

Key Features

  • BOM Structure Preservation: Maintains original SBOM components, custom properties, and relationships
  • External Property Preservation: Custom properties from third-party SBOMs retain their original names (no unwanted namespace prefixes)
  • Vulnerability-Only Updates: Only refreshes vulnerability data without altering the original structure
  • Dependency Preservation: Maintains empty dependsOn arrays and component relationships
  • Library Integration: Enables SBOM customization when using Trivy as a library - add custom components and properties that will be preserved across scans

Implementation Details

  • Add reuseExistingBOM() function to detect and preserve existing BOM structure
  • Add BOM.Clone() and Component.Clone() methods for safe BOM manipulation
  • Add External field to core.Property to distinguish Trivy vs external properties
  • Enhance CycloneDX unmarshal/marshal logic to preserve external property names
  • Use BOM-Ref for component lookup during vulnerability updates
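
To illustrate why the Clone() methods matter for safe BOM manipulation, here is a minimal sketch of a deep copy. The struct layouts are simplified stand-ins chosen for this example (the real core.BOM and core.Component carry more fields), so treat it as an assumption about the shape of the idea rather than the code in this PR.

```go
package main

import (
	"fmt"
	"slices"
)

// Property, Component and BOM are simplified, hypothetical stand-ins,
// not Trivy's real types.
type Property struct {
	Name     string
	Value    string
	External bool
}

type Component struct {
	BOMRef     string
	Name       string
	Properties []Property
}

// Clone returns a copy whose Properties slice does not alias the original.
func (c *Component) Clone() *Component {
	if c == nil {
		return nil
	}
	cp := *c
	cp.Properties = slices.Clone(c.Properties)
	return &cp
}

type BOM struct {
	Components map[string]*Component // keyed by BOM-Ref in this sketch
}

// Clone copies the map and every component it points to.
func (b *BOM) Clone() *BOM {
	if b == nil {
		return nil
	}
	cp := &BOM{Components: make(map[string]*Component, len(b.Components))}
	for ref, c := range b.Components {
		cp.Components[ref] = c.Clone()
	}
	return cp
}

func main() {
	orig := &BOM{Components: map[string]*Component{
		"custom-app": {
			BOMRef:     "custom-app",
			Name:       "custom-app",
			Properties: []Property{{Name: "custom:app-type", Value: "test-application", External: true}},
		},
	}}

	// Mutating the clone must leave the original BOM untouched.
	scanned := orig.Clone()
	scanned.Components["custom-app"].Properties[0].Value = "mutated"
	fmt.Println(orig.Components["custom-app"].Properties[0].Value)
}
```

Copying the nested slice and map, rather than only the top-level struct, is what prevents a scan from leaking changes back into the caller's BOM.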

Before/After Examples

Before (main branch):

# Step 1: Generate base SBOM
./trivy image alpine:3.18 --format cyclonedx --output base.json

# Step 2: Add custom properties and application component to SBOM
jq '
.metadata.component.properties += [{"name": "custom:environment", "value": "test"}] |
.components += [{
  "type": "application",
  "bom-ref": "[email protected]",
  "name": "custom-app", 
  "version": "1.0.0",
  "properties": [{"name": "custom:app-type", "value": "test-application"}]
}]
' base.json > custom.json

# Step 3: Scan the modified SBOM
./trivy sbom custom.json --format cyclonedx --output result.json

# Step 4: Check if custom properties and components are preserved
jq '.metadata.component.properties[] | select(.name | contains("custom"))' result.json
# Result: Wrong namespaces
{
  "name": "aquasecurity:trivy:custom:environment",
  "value": "test"
}

jq '.components[] | select(.name == "custom-app")' result.json  
# Result: Empty (custom components lost)

After (this PR):

# Step 1: Generate base SBOM  
./trivy image alpine:3.18 --format cyclonedx --output base.json

# Step 2: Add custom properties and components
jq '
.metadata.component.properties += [{"name": "custom:environment", "value": "test"}] |
.components += [{
  "type": "application",
  "bom-ref": "[email protected]", 
  "name": "custom-app",
  "version": "1.0.0",
  "properties": [{"name": "custom:app-type", "value": "test-application"}]
}]
' base.json > custom.json

# Step 3: Scan the modified SBOM
./trivy sbom custom.json --format cyclonedx --output result.json

# Step 4: Verify custom properties are preserved
jq '.metadata.component.properties[] | select(.name | contains("custom"))' result.json
# Result: {"name": "custom:environment", "value": "test"}

# Step 5: Verify custom components are preserved
jq '.components[] | select(.name == "custom-app")' result.json
# Result: Complete custom component with original properties

# Step 6: Verify Trivy properties still get namespace prefixes
jq '.metadata.component.properties[] | select(.name | startswith("aquasecurity:trivy:"))' result.json
# Result: Trivy properties with proper namespace prefixes

Related issues

Checklist

  • I've read the guidelines for contributing to this repository.
  • I've followed the conventions in the PR title.
  • I've added tests that prove my fix is effective or that my feature works.
  • I've updated the documentation with the relevant information (if needed).
  • I've added usage information (if the PR introduces new options)
  • I've included a "before" and "after" example to the description (if the PR is a user interface change).

- Add reuseExistingBOM() to preserve original SBOM structure
- Only update vulnerabilities when BOM is already present
- Add BOM.Clone() and Component.Clone() methods to prevent side effects
- Fix CycloneDX unmarshal to preserve empty dependsOn arrays
- Use BOM-Ref for component lookup in vulnerability updates

This change enables proper SBOM rescanning where the original structure
is maintained and only vulnerability information is refreshed.
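
The "only update vulnerabilities" idea from the commit message above can be sketched as follows. The types and the refreshVulnerabilities helper are hypothetical and exist only for this illustration; the sketch assumes detected vulnerabilities arrive keyed by BOM-Ref.

```go
package main

import "fmt"

// Vulnerability, Component and BOM are hypothetical, simplified types.
type Vulnerability struct {
	ID string
}

type Component struct {
	BOMRef          string
	Name            string
	Properties      map[string]string
	Vulnerabilities []Vulnerability
}

type BOM struct {
	Components []*Component
}

// refreshVulnerabilities overwrites only the Vulnerabilities field of each
// component, looking results up by BOM-Ref. Components, properties and their
// order are left exactly as they were in the scanned SBOM.
func refreshVulnerabilities(bom *BOM, detected map[string][]Vulnerability) {
	for _, c := range bom.Components {
		c.Vulnerabilities = detected[c.BOMRef] // nil when nothing was detected
	}
}

func main() {
	bom := &BOM{Components: []*Component{
		{BOMRef: "pkg:apk/alpine/openssl", Name: "openssl"},
		{BOMRef: "custom-app", Name: "custom-app", Properties: map[string]string{"custom:app-type": "test-application"}},
	}}
	refreshVulnerabilities(bom, map[string][]Vulnerability{
		"pkg:apk/alpine/openssl": {{ID: "CVE-0000-0000"}}, // placeholder ID
	})
	for _, c := range bom.Components {
		fmt.Println(c.Name, len(c.Vulnerabilities), c.Properties)
	}
}
```

Because the loop never adds or removes components, a custom component with no findings simply keeps its place in the output.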
Add External field to Property struct to distinguish between Trivy-generated
and external properties. External properties maintain their original names
while Trivy properties receive namespace prefixes during marshaling.

- Add External bool field to core.Property struct
- Update unmarshalProperties to detect and mark external properties
- Modify marshal logic to preserve external property names
- Use strings.CutPrefix for efficient prefix detection
- Use cmp.Or and lo.Ternary for clean conditional logic

This ensures SPDX compatibility and prevents corruption of third-party
SBOM property names during processing.
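
A minimal sketch of the External flag round trip described above. The Property struct is reduced to three fields and the helper names (unmarshalProperty, marshalName) are made up for this example; only the aquasecurity:trivy: prefix and the use of strings.CutPrefix come from the PR text.

```go
package main

import (
	"fmt"
	"strings"
)

// trivyNamespace is the prefix shown in the before/after examples.
const trivyNamespace = "aquasecurity:trivy:"

// Property is a simplified stand-in for core.Property.
type Property struct {
	Name     string
	Value    string
	External bool // true when the name did not carry the Trivy namespace
}

// unmarshalProperty (hypothetical helper) strips the Trivy namespace when it
// is present and marks everything else as external.
func unmarshalProperty(name, value string) Property {
	if trimmed, ok := strings.CutPrefix(name, trivyNamespace); ok {
		return Property{Name: trimmed, Value: value}
	}
	return Property{Name: name, Value: value, External: true}
}

// marshalName (hypothetical helper) re-adds the namespace only for
// Trivy-owned properties, so external names pass through unchanged.
func marshalName(p Property) string {
	if p.External {
		return p.Name
	}
	return trivyNamespace + p.Name
}

func main() {
	for _, name := range []string{"custom:environment", "aquasecurity:trivy:PkgType"} {
		p := unmarshalProperty(name, "test")
		fmt.Printf("%-30s external=%-5t marshaled=%s\n", name, p.External, marshalName(p))
	}
}
```

Without the flag, the marshaler cannot tell the two kinds apart and prefixes everything, which is the "Wrong namespaces" result shown in the Before example.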
@knqyf263 knqyf263 self-assigned this Sep 4, 2025
The addition of the External field to core.Property struct changes the
hash calculation in calcSPDXID, resulting in different SPDX IDs being
generated. This is expected behavior and does not affect functionality.
@knqyf263 knqyf263 marked this pull request as ready for review September 9, 2025 13:37
@DmitriyLewen
Contributor

We already tried to implement similar logic: #7340 (review)

IIRC there are two big problems:

  • bom-ref is still not a required field - https://cyclonedx.org/docs/1.6/json/#components_items_bom-ref
  • we can't use filter flags for packages (e.g. --pkg-types, --pkg-relationships):
     ➜   ./trivy -q image aquasec/trivy -f cyclonedx -o report.cdx.json
     ➜   cat report.cdx.json | grep "pkg:apk" | wc -l
          151
     ➜  trivy -q sbom --pkg-types library report.cdx.json -f cyclonedx | grep "pkg:apk" | wc -l
            0
     ➜  ./trivy -q sbom --pkg-types library report.cdx.json -f cyclonedx | grep "pkg:apk" | wc -l
          151

- Enable GenerateBOMRef option for CycloneDX JSON and attestation formats
- Keep GenerateBOMRef disabled for SPDX formats (no BOM-Ref concept)
- Add test case for components with missing BOM-Ref fields
- Verify that missing BOM-Refs are auto-generated from PURL
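
As a rough illustration of the "auto-generated from PURL" behaviour mentioned in this commit message, the sketch below fills in missing BOM-Refs. The ensureBOMRefs helper and the Component fields are assumptions made for the example; how GenerateBOMRef handles components that have neither a BOM-Ref nor a PURL is not stated here, so the sketch just reports them.

```go
package main

import "fmt"

// Component is a hypothetical, simplified component record.
type Component struct {
	Name   string
	BOMRef string
	PURL   string
}

// ensureBOMRefs fills an empty BOMRef from the PURL and returns the
// components it could not resolve (no BOM-Ref and no PURL).
func ensureBOMRefs(components []*Component) (unresolved []*Component) {
	for _, c := range components {
		switch {
		case c.BOMRef != "":
			// Keep whatever the original SBOM declared.
		case c.PURL != "":
			c.BOMRef = c.PURL
		default:
			unresolved = append(unresolved, c)
		}
	}
	return unresolved
}

func main() {
	components := []*Component{
		{Name: "musl", PURL: "pkg:apk/alpine/musl@1.2.4"},
		{Name: "custom-app", BOMRef: "custom-app-ref"},
		{Name: "mystery"},
	}
	unresolved := ensureBOMRefs(components)
	for _, c := range components {
		fmt.Printf("%s -> %q\n", c.Name, c.BOMRef)
	}
	fmt.Println("unresolved:", len(unresolved))
}
```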
@knqyf263
Collaborator Author

Thank you for reminding me of the issues. As for the generation of BOM-Refs, it should not be a problem since BOM-Refs are automatically generated within the intermediate representation of our SBOM. I fixed a bug that I found.
d309c4f

On the other hand, filtering is a difficult problem. Perhaps one option would be not to handle it at all. I'll consider it.

@knqyf263
Collaborator Author

@DmitriyLewen Implementing filtering itself is possible, but it is not a quick task. Have you actually seen users who want to scan CycloneDX, filter packages and export CycloneDX again?

@DmitriyLewen
Contributor

I don’t think I’ve come across such cases from users.
I think that scanning an SBOM file and also using an SBOM as the output format (e.g. trivy sbom rep.cdx.json -f cyclonedx) is a very questionable use case.

@knqyf263
Collaborator Author

I think that scanning an SBOM file and also using an SBOM as the output format (e.g. trivy sbom rep.cdx.json -f cyclonedx) is a very questionable use case.

I think so too. I believe that scanning an SBOM and then outputting another SBOM is quite a rare use case (and only possible with CycloneDX). Therefore, for now, if a CycloneDX SBOM is scanned and then output as an SBOM, I intend to make the filtering CLI flags unavailable. Of course, a warning will be displayed.

And if there are requests from users, I would like to carefully listen to that use case and then decide at that time whether or not to implement it.

@DmitriyLewen
Contributor

That sounds logical.

Therefore, for now, if a CycloneDX SBOM is scanned and then output as an SBOM, I intend to make the filtering CLI flags unavailable. Of course, a warning will be displayed.

Let’s do it this way.

@DmitriyLewen DmitriyLewen left a comment

LGTM

{
"name": "openssl",
- "SPDXID": "SPDXRef-Package-22a178da112ac20a",
+ "SPDXID": "SPDXRef-Package-cb268df467bc826c",
Contributor

We calc the SPDX-ID as a hash of core.component:

func (m *Marshaler) spdxPackage(c *core.Component, timeNow, pkgDownloadLocation string) (spdx.Package, error) {
	pkgID, err := calcSPDXID(m.hasher, c)

So after adding core.Property.External field, the hash was changed.
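
To see why adding a field changes every ID even when the new field holds its zero value, consider this sketch. It hashes a JSON encoding with SHA-256, which is an assumption made to keep the example in the standard library; it is not how calcSPDXID is implemented, and the propertyBefore/propertyAfter types are invented for the comparison.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// propertyBefore and propertyAfter model the struct before and after the
// External field was added (hypothetical, reduced to the relevant fields).
type propertyBefore struct {
	Name  string
	Value string
}

type propertyAfter struct {
	Name     string
	Value    string
	External bool
}

// structID derives an ID from the serialized struct. Field names are part of
// the serialized form, so a new field alters the hash input.
func structID(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	sum := sha256.Sum256(b)
	return fmt.Sprintf("SPDXRef-Package-%x", sum[:8])
}

func main() {
	before := structID(propertyBefore{Name: "PkgType", Value: "apk"})
	after := structID(propertyAfter{Name: "PkgType", Value: "apk"}) // External is false
	fmt.Println(before)
	fmt.Println(after)
	fmt.Println("changed:", before != after)
}
```

The IDs stay deterministic for a given Trivy version; they only differ across the version boundary where the struct changed, which is why the golden files in this PR were regenerated.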

Contributor

The scanned file (testdata/sbom/fluentd-multiple-lockfiles-cyclonedx.json) doesn’t contain these fields.
That is why we remove them.

Comment on lines 1219 to 1225
// For SBOM-to-SBOM scanning (for example, to add vulnerabilities to the SBOM file), we should not modify the scanned file.
// cf. https://github.com/aquasecurity/trivy/pull/9439#issuecomment-3295533665
if slices.Contains(types.SupportedSBOMFormats, options.Format) &&
	(!slices.Equal(options.PkgTypes, types.PkgTypes) || !slices.Equal(options.PkgRelationships, ftypes.Relationships)) {
	log.Warnf("Trivy doesn't support '--pkg-types' and '--pkg-relationships' options for SBOM to SBOM scan. These options will be ignored.")
}
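
The condition in the snippet above can be exercised in isolation with a small sketch. The format list and default values below are placeholders assumed for this example, not the actual contents of types.SupportedSBOMFormats, types.PkgTypes, or ftypes.Relationships, and filtersIgnored is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"slices"
)

// Placeholder values standing in for Trivy's real defaults.
var (
	sbomFormats          = []string{"cyclonedx", "spdx", "spdx-json"}
	defaultPkgTypes      = []string{"os", "library"}
	defaultRelationships = []string{"unknown", "root", "direct", "indirect"}
)

// filtersIgnored reports whether the user changed a package filter while also
// requesting an SBOM output format, i.e. the case where a warning is logged.
func filtersIgnored(format string, pkgTypes, relationships []string) bool {
	if !slices.Contains(sbomFormats, format) {
		return false
	}
	return !slices.Equal(pkgTypes, defaultPkgTypes) ||
		!slices.Equal(relationships, defaultRelationships)
}

func main() {
	fmt.Println(filtersIgnored("cyclonedx", []string{"library"}, defaultRelationships))
	fmt.Println(filtersIgnored("table", []string{"library"}, defaultRelationships))
	fmt.Println(filtersIgnored("cyclonedx", defaultPkgTypes, defaultRelationships))
}
```

Comparing against the defaults with slices.Equal is order-sensitive, so a user passing the default values in a different order would also trigger the warning under this approach.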

Contributor

@DmitriyLewen DmitriyLewen Sep 19, 2025

I added a warning about the --pkg-types and --pkg-relationships flags for SBOM-to-SBOM scanning (as we discussed here).

I am not sure this is the right place, but in other places (e.g. in the Align function) we would need to pass targetKind through.
However, as we said, this is most likely a very rare case, so I settled on this solution.

Collaborator Author

Yes, exactly because I was struggling with where to display the warning, I hadn’t committed it yet. Since I don’t want to put too many implementation details in app.go, I moved it to run.go.
7a59f28

However, I’m thinking of eventually refactoring it and implementing it inside Options.

Collaborator Author

Will fix the linter issues

@DmitriyLewen DmitriyLewen force-pushed the fix/sbom-reuse-bom-structure branch from 489e1d7 to b62ab48 Compare September 19, 2025 10:10
- Move SBOM package filtering validation from app.go to run.go
- Consolidate validation logic into checkOptions function
- Include client/server mode warning alongside package filtering warning
- Improve warning message clarity for SBOM to SBOM scanning
- Add TODO comment for future refactoring
@knqyf263 knqyf263 force-pushed the fix/sbom-reuse-bom-structure branch from 5b25223 to 9a39f27 Compare September 19, 2025 12:25
Signed-off-by: knqyf263 <[email protected]>
@knqyf263 knqyf263 added this pull request to the merge queue Sep 20, 2025
Merged via the queue into aquasecurity:main with commit aff03eb Sep 20, 2025
13 checks passed
@knqyf263 knqyf263 deleted the fix/sbom-reuse-bom-structure branch September 20, 2025 14:45
alexlebens pushed a commit to alexlebens/infrastructure that referenced this pull request Sep 30, 2025
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [mirror.gcr.io/aquasec/trivy](https://www.aquasec.com/products/trivy/) ([source](https://github.com/aquasecurity/trivy)) | minor | `0.66.0` -> `0.67.0` |

---

### Release Notes

<details>
<summary>aquasecurity/trivy (mirror.gcr.io/aquasec/trivy)</summary>

### [`v0.67.0`](https://github.com/aquasecurity/trivy/blob/HEAD/CHANGELOG.md#0670-2025-09-30)

[Compare Source](aquasecurity/trivy@v0.66.0...v0.67.0)

##### Features

- add documentation URL for database lock errors ([#9531](aquasecurity/trivy#9531)) ([eba48af](aquasecurity/trivy@eba48af))
- **cli:** change --list-all-pkgs default to true ([#9510](aquasecurity/trivy#9510)) ([7b663d8](aquasecurity/trivy@7b663d8))
- **cloudformation:** support default values and list results in Fn::FindInMap ([#9515](aquasecurity/trivy#9515)) ([42b3bf3](aquasecurity/trivy@42b3bf3))
- **cyclonedx:** preserve SBOM structure when scanning SBOM files with vulnerability updates ([#9439](aquasecurity/trivy#9439)) ([aff03eb](aquasecurity/trivy@aff03eb))
- **redhat:** add os-release detection for RHEL-based images ([#9458](aquasecurity/trivy#9458)) ([cb25a07](aquasecurity/trivy@cb25a07))
- **sbom:** added support for CoreOS ([#9448](aquasecurity/trivy#9448)) ([6d562a3](aquasecurity/trivy@6d562a3))
- **seal:** add seal support ([#9370](aquasecurity/trivy#9370)) ([e4af279](aquasecurity/trivy@e4af279))

##### Bug Fixes

- **aws:** use `BuildableClient` insead of `xhttp.Client` ([#9436](aquasecurity/trivy#9436)) ([fa6f1bf](aquasecurity/trivy@fa6f1bf))
- close file descriptors and pipes on error paths ([#9536](aquasecurity/trivy#9536)) ([a4cbd6a](aquasecurity/trivy@a4cbd6a))
- **db:** Dowload database when missing but metadata still exists ([#9393](aquasecurity/trivy#9393)) ([92ebc7e](aquasecurity/trivy@92ebc7e))
- **k8s:** disable parallel traversal with fs cache for k8s images ([#9534](aquasecurity/trivy#9534)) ([c0c7a6b](aquasecurity/trivy@c0c7a6b))
- **misconf:** handle tofu files in module detection ([#9486](aquasecurity/trivy#9486)) ([bfd2f6b](aquasecurity/trivy@bfd2f6b))
- **misconf:** strip build metadata suffixes from image history ([#9498](aquasecurity/trivy#9498)) ([c938806](aquasecurity/trivy@c938806))
- **misconf:** unmark cty values before access ([#9495](aquasecurity/trivy#9495)) ([8e40d27](aquasecurity/trivy@8e40d27))
- **misconf:** wrap legacy ENV values in quotes to preserve spaces ([#9497](aquasecurity/trivy#9497)) ([267a970](aquasecurity/trivy@267a970))
- **nodejs:** parse workspaces as objects for package-lock.json files ([#9518](aquasecurity/trivy#9518)) ([404abb3](aquasecurity/trivy@404abb3))
- **nodejs:** use snapshot string as `Package.ID` for pnpm packages ([#9330](aquasecurity/trivy#9330)) ([4517e8c](aquasecurity/trivy@4517e8c))
- **vex:** don't  suppress vulns for packages with infinity loop ([#9465](aquasecurity/trivy#9465)) ([78f0d4a](aquasecurity/trivy@78f0d4a))
- **vuln:** compare `nuget` package names in lower case ([#9456](aquasecurity/trivy#9456)) ([1ff9ac7](aquasecurity/trivy@1ff9ac7))

</details>
