-
Notifications
You must be signed in to change notification settings - Fork 2.8k
fix(cyclonedx): handle multiple license types #9378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(cyclonedx): handle multiple license types #9378
Conversation
- Use Name/ID or Expression type for Licenses - Use Name field for Expression if both types exist
|
It’s unclear when exactly we should use an expression in licenses rather than ids/names, but the current logic feels unnecessarily complicated. Wouldn’t it be sufficient to have a simple implementation where everything is consolidated into an SPDX expression whenever possible? diff --git a/pkg/sbom/cyclonedx/marshal.go b/pkg/sbom/cyclonedx/marshal.go
index f887692d2..d42085f86 100644
--- a/pkg/sbom/cyclonedx/marshal.go
+++ b/pkg/sbom/cyclonedx/marshal.go
@@ -295,20 +295,42 @@ func (m *Marshaler) Licenses(licenses []string) *cdx.Licenses {
if len(licenses) == 0 {
return nil
}
- choices := lo.Map(licenses, func(license string, _ int) cdx.LicenseChoice {
+ return m.normalizeLicenses(licenses)
+}
+
+func (m *Marshaler) normalizeLicenses(licenses []string) *cdx.Licenses {
+ expressions := lo.Map(licenses, func(license string, _ int) expression.Expression {
return m.normalizeLicense(license)
})
- return lo.ToPtr(NormalizeLicenses(choices))
+ // Check if all licenses are valid SPDX expressions (both SimpleExpr and CompoundExpr)
+ if lo.EveryBy(expressions, func(expr expression.Expression) bool {
+ return expr.IsSPDXExpression()
+ }) {
+ // When all licenses are valid SPDX expressions, combine them into a single expression field
+ exprStrs := lo.Map(expressions, func(expr expression.Expression, _ int) string {
+ return expr.String()
+ })
+ return &cdx.Licenses{{Expression: strings.Join(exprStrs, " AND ")}}
+ }
+
+ // If not all licenses are SPDX expressions, use individual LicenseChoice entries with license.id or license.name
+ choices := lo.Map(expressions, func(expr expression.Expression, _ int) cdx.LicenseChoice {
+ if s, ok := expr.(expression.SimpleExpr); ok && s.IsSPDXExpression() {
+ // Use license.id for valid SPDX ID (e.g., "MIT", "Apache-2.0")
+ return cdx.LicenseChoice{License: &cdx.License{ID: s.String()}}
+ }
+ // Use license.name for everything else (invalid SPDX ID, SPDX expression, etc.)
+ return cdx.LicenseChoice{License: &cdx.License{Name: expr.String()}}
+ })
+ return lo.ToPtr(cdx.Licenses(choices))
}
-func (m *Marshaler) normalizeLicense(license string) cdx.LicenseChoice {
+func (m *Marshaler) normalizeLicense(license string) expression.Expression {
// Save text license as licenseChoice.license.name
if after, ok := strings.CutPrefix(license, licensing.LicenseTextPrefix); ok {
- return cdx.LicenseChoice{
- License: &cdx.License{
- Name: after,
- },
+ return expression.SimpleExpr{
+ License: after,
}
}
@@ -320,81 +342,9 @@ func (m *Marshaler) normalizeLicense(license string) cdx.LicenseChoice {
if err != nil {
// Not fail on the invalid license
m.logger.Warn("Unable to marshal SPDX licenses", log.String("license", license))
- return cdx.LicenseChoice{
- License: &cdx.License{Name: license},
- }
- }
-
- // The license is not a valid SPDX ID or SPDX expression
- if !normalizedLicenses.IsSPDXExpression() {
- // Use LicenseChoice.License.Name for invalid SPDX ID / SPDX expression
- return cdx.LicenseChoice{
- License: &cdx.License{Name: normalizedLicenses.String()},
- }
- }
-
- // The license is a valid SPDX ID or SPDX expression
- var licenseChoice cdx.LicenseChoice
- switch normalizedLicenses.(type) {
- case expression.SimpleExpr:
- // Use LicenseChoice.License.ID for valid SPDX ID
- licenseChoice.License = &cdx.License{ID: normalizedLicenses.String()}
- case expression.CompoundExpr:
- // Use LicenseChoice.Expression for valid SPDX expression (with any conjunction)
- // e.g. "GPL-2.0 WITH Classpath-exception-2.0" or "GPL-2.0 AND MIT"
- licenseChoice.Expression = normalizedLicenses.String()
- }
-
- return licenseChoice
-}
-
-// NormalizeLicenses fixes invalid CycloneDX schema cases:
-// - using both LicenseChoice.License and LicenseChoice.Expression in Licenses (convert Expression to License.Name)
-// - using more than one LicenseChoice with Expression (combine them with AND into one Expression)
-func NormalizeLicenses(choices []cdx.LicenseChoice) cdx.Licenses {
- var licenses, expressions []cdx.LicenseChoice
-
- for _, choice := range choices {
- if choice.Expression != "" {
- expressions = append(expressions, choice)
- } else if choice.License != nil {
- licenses = append(licenses, choice)
- }
- }
-
- if len(licenses) == 0 && len(expressions) == 0 {
- return nil
- }
-
- // If both License and Expression are used, convert Expression to License.Name
- if len(expressions) > 0 && len(licenses) > 0 {
- for _, c := range expressions {
- licenses = append(licenses, cdx.LicenseChoice{
- License: &cdx.License{Name: c.Expression},
- })
- }
- return licenses
- }
-
- // Names/IDs only
- if len(licenses) > 0 {
- return licenses
- }
-
- // Merge multiple expressions into one expression with AND
- if len(expressions) > 1 {
- var b strings.Builder
- for i, c := range expressions {
- if i > 0 {
- b.WriteString(" AND ")
- }
- b.WriteString(c.Expression)
- }
- return cdx.Licenses{{Expression: b.String()}}
+ return expression.SimpleExpr{License: license}
}
-
- // len(expressions) == 1
- return expressions
+ return normalizedLicenses
}
func (*Marshaler) Properties(properties []core.Property) *[]cdx.Property {
diff --git a/pkg/sbom/cyclonedx/marshal_test.go b/pkg/sbom/cyclonedx/marshal_test.go
index fb0277c0c..a78cfd82a 100644
--- a/pkg/sbom/cyclonedx/marshal_test.go
+++ b/pkg/sbom/cyclonedx/marshal_test.go
@@ -511,9 +511,7 @@ func TestMarshaler_MarshalReport(t *testing.T) {
Version: "2.30-93.el8",
Licenses: &cdx.Licenses{
cdx.LicenseChoice{
- License: &cdx.License{
- ID: "GPL-3.0-or-later",
- },
+ Expression: "GPL-3.0-or-later",
},
},
PackageURL: "pkg:rpm/centos/[email protected]?arch=aarch64&distro=centos-8.3.2011",
@@ -1031,9 +1029,7 @@ func TestMarshaler_MarshalReport(t *testing.T) {
Version: "1:2.2.53-1.el8",
Licenses: &cdx.Licenses{
cdx.LicenseChoice{
- License: &cdx.License{
- ID: "GPL-2.0-or-later",
- },
+ Expression: "GPL-2.0-or-later",
},
},
PackageURL: "pkg:rpm/centos/[email protected]?arch=aarch64&distro=centos-8.3.2011&epoch=1",
@@ -1077,9 +1073,7 @@ func TestMarshaler_MarshalReport(t *testing.T) {
Version: "2.28-151.el8",
Licenses: &cdx.Licenses{
cdx.LicenseChoice{
- License: &cdx.License{
- ID: "GPL-2.0-or-later",
- },
+ Expression: "GPL-2.0-or-later",
},
},
PackageURL: "pkg:rpm/centos/[email protected]?arch=aarch64&distro=centos-8.3.2011",
@@ -2020,9 +2014,7 @@ func TestMarshaler_MarshalReport(t *testing.T) {
PackageURL: "pkg:npm/[email protected]",
Licenses: &cdx.Licenses{
cdx.LicenseChoice{
- License: &cdx.License{
- ID: "MIT",
- },
+ Expression: "MIT",
},
},
Properties: &[]cdx.Property{
@@ -2140,9 +2132,7 @@ func TestMarshaler_Licenses(t *testing.T) {
},
want: &cdx.Licenses{
cdx.LicenseChoice{
- License: &cdx.License{
- ID: "MIT",
- },
+ Expression: "MIT",
},
},
},
@@ -2240,14 +2230,7 @@ func TestMarshaler_Licenses(t *testing.T) {
},
want: &cdx.Licenses{
cdx.LicenseChoice{
- License: &cdx.License{
- ID: "MIT",
- },
- },
- cdx.LicenseChoice{
- License: &cdx.License{
- ID: "AFL-2.0",
- },
+ Expression: "MIT AND AFL-2.0",
},
},
},
@@ -2278,14 +2261,7 @@ func TestMarshaler_Licenses(t *testing.T) {
},
want: &cdx.Licenses{
cdx.LicenseChoice{
- License: &cdx.License{
- ID: "MIT",
- },
- },
- cdx.LicenseChoice{
- License: &cdx.License{
- Name: "AFL-2.0 WITH Linux-syscall-note",
- },
+ Expression: "MIT AND AFL-2.0 WITH Linux-syscall-note",
},
},
},
If using license.id is preferable in the case of a single SPDX ID, then we could also handle it as shown below. Diff
diff --git a/pkg/sbom/cyclonedx/marshal.go b/pkg/sbom/cyclonedx/marshal.go
index f887692d2..8bbb6a0b6 100644
--- a/pkg/sbom/cyclonedx/marshal.go
+++ b/pkg/sbom/cyclonedx/marshal.go
@@ -295,20 +295,50 @@ func (m *Marshaler) Licenses(licenses []string) *cdx.Licenses {
if len(licenses) == 0 {
return nil
}
- choices := lo.Map(licenses, func(license string, _ int) cdx.LicenseChoice {
+ return m.normalizeLicenses(licenses)
+}
+
+func (m *Marshaler) normalizeLicenses(licenses []string) *cdx.Licenses {
+ expressions := lo.Map(licenses, func(license string, _ int) expression.Expression {
return m.normalizeLicense(license)
})
- return lo.ToPtr(NormalizeLicenses(choices))
+ // Check if all licenses are valid SPDX expressions
+ allValidSPDX := lo.EveryBy(expressions, func(expr expression.Expression) bool {
+ return expr.IsSPDXExpression()
+ })
+
+ // Check if at least one is a CompoundExpr
+ hasCompoundExpr := lo.ContainsBy(expressions, func(expr expression.Expression) bool {
+ _, isCompound := expr.(expression.CompoundExpr)
+ return isCompound
+ })
+
+ // If all are valid SPDX AND at least one contains CompoundExpr, combine into single Expression
+ if allValidSPDX && hasCompoundExpr {
+ exprStrs := lo.Map(expressions, func(expr expression.Expression, _ int) string {
+ return expr.String()
+ })
+ return &cdx.Licenses{{Expression: strings.Join(exprStrs, " AND ")}}
+ }
+
+ // Otherwise use individual LicenseChoice entries with license.id or license.name
+ choices := lo.Map(expressions, func(expr expression.Expression, _ int) cdx.LicenseChoice {
+ if s, ok := expr.(expression.SimpleExpr); ok && s.IsSPDXExpression() {
+ // Use license.id for valid SPDX ID (e.g., "MIT", "Apache-2.0")
+ return cdx.LicenseChoice{License: &cdx.License{ID: s.String()}}
+ }
+ // Use license.name for everything else (invalid SPDX ID, SPDX expression, etc.)
+ return cdx.LicenseChoice{License: &cdx.License{Name: expr.String()}}
+ })
+ return lo.ToPtr(cdx.Licenses(choices))
}
-func (m *Marshaler) normalizeLicense(license string) cdx.LicenseChoice {
+func (m *Marshaler) normalizeLicense(license string) expression.Expression {
// Save text license as licenseChoice.license.name
if after, ok := strings.CutPrefix(license, licensing.LicenseTextPrefix); ok {
- return cdx.LicenseChoice{
- License: &cdx.License{
- Name: after,
- },
+ return expression.SimpleExpr{
+ License: after,
}
}
@@ -320,81 +350,9 @@ func (m *Marshaler) normalizeLicense(license string) cdx.LicenseChoice {
if err != nil {
// Not fail on the invalid license
m.logger.Warn("Unable to marshal SPDX licenses", log.String("license", license))
- return cdx.LicenseChoice{
- License: &cdx.License{Name: license},
- }
- }
-
- // The license is not a valid SPDX ID or SPDX expression
- if !normalizedLicenses.IsSPDXExpression() {
- // Use LicenseChoice.License.Name for invalid SPDX ID / SPDX expression
- return cdx.LicenseChoice{
- License: &cdx.License{Name: normalizedLicenses.String()},
- }
- }
-
- // The license is a valid SPDX ID or SPDX expression
- var licenseChoice cdx.LicenseChoice
- switch normalizedLicenses.(type) {
- case expression.SimpleExpr:
- // Use LicenseChoice.License.ID for valid SPDX ID
- licenseChoice.License = &cdx.License{ID: normalizedLicenses.String()}
- case expression.CompoundExpr:
- // Use LicenseChoice.Expression for valid SPDX expression (with any conjunction)
- // e.g. "GPL-2.0 WITH Classpath-exception-2.0" or "GPL-2.0 AND MIT"
- licenseChoice.Expression = normalizedLicenses.String()
+ return expression.SimpleExpr{License: license}
}
-
- return licenseChoice
-}
-
-// NormalizeLicenses fixes invalid CycloneDX schema cases:
-// - using both LicenseChoice.License and LicenseChoice.Expression in Licenses (convert Expression to License.Name)
-// - using more than one LicenseChoice with Expression (combine them with AND into one Expression)
-func NormalizeLicenses(choices []cdx.LicenseChoice) cdx.Licenses {
- var licenses, expressions []cdx.LicenseChoice
-
- for _, choice := range choices {
- if choice.Expression != "" {
- expressions = append(expressions, choice)
- } else if choice.License != nil {
- licenses = append(licenses, choice)
- }
- }
-
- if len(licenses) == 0 && len(expressions) == 0 {
- return nil
- }
-
- // If both License and Expression are used, convert Expression to License.Name
- if len(expressions) > 0 && len(licenses) > 0 {
- for _, c := range expressions {
- licenses = append(licenses, cdx.LicenseChoice{
- License: &cdx.License{Name: c.Expression},
- })
- }
- return licenses
- }
-
- // Names/IDs only
- if len(licenses) > 0 {
- return licenses
- }
-
- // Merge multiple expressions into one expression with AND
- if len(expressions) > 1 {
- var b strings.Builder
- for i, c := range expressions {
- if i > 0 {
- b.WriteString(" AND ")
- }
- b.WriteString(c.Expression)
- }
- return cdx.Licenses{{Expression: b.String()}}
- }
-
- // len(expressions) == 1
- return expressions
+ return normalizedLicenses
}
func (*Marshaler) Properties(properties []core.Property) *[]cdx.Property { |
I think this is more correct behavior. Thanks for simplifying and organizing my logic. |
SBOM is always too flexible :) We need to make a decision. |
This PR contains the following updates: | Package | Update | Change | |---|---|---| | [mirror.gcr.io/aquasec/trivy](https://www.aquasec.com/products/trivy/) ([source](https://github.com/aquasecurity/trivy)) | minor | `0.65.0` -> `0.66.0` | --- ### Release Notes <details> <summary>aquasecurity/trivy (mirror.gcr.io/aquasec/trivy)</summary> ### [`v0.66.0`](https://github.com/aquasecurity/trivy/blob/HEAD/CHANGELOG.md#0660-2025-09-02) [Compare Source](aquasecurity/trivy@v0.65.0...v0.66.0) ##### Features - add timeout handling for cache database operations ([#​9307](aquasecurity/trivy#9307)) ([235c24e](aquasecurity/trivy@235c24e)) - **misconf:** added audit config attribute ([#​9249](aquasecurity/trivy#9249)) ([4d4a244](aquasecurity/trivy@4d4a244)) - **secret:** implement streaming secret scanner with byte offset tracking ([#​9264](aquasecurity/trivy#9264)) ([5a5e097](aquasecurity/trivy@5a5e097)) - **terraform:** use .terraform cache for remote modules in plan scanning ([#​9277](aquasecurity/trivy#9277)) ([298a994](aquasecurity/trivy@298a994)) ##### Bug Fixes - **conda:** memory leak by adding closure method for `package.json` file ([#​9349](aquasecurity/trivy#9349)) ([03d039f](aquasecurity/trivy@03d039f)) - create temp file under composite fs dir ([#​9387](aquasecurity/trivy#9387)) ([ce22f54](aquasecurity/trivy@ce22f54)) - **cyclonedx:** handle multiple license types ([#​9378](aquasecurity/trivy#9378)) ([46ab76a](aquasecurity/trivy@46ab76a)) - **fs:** avoid shadowing errors in file.glob ([#​9286](aquasecurity/trivy#9286)) ([b51c789](aquasecurity/trivy@b51c789)) - **image:** use standardized HTTP client for ECR authentication ([#​9322](aquasecurity/trivy#9322)) ([84fbf86](aquasecurity/trivy@84fbf86)) - **misconf:** ensure ignore rules respect subdirectory chart paths ([#​9324](aquasecurity/trivy#9324)) ([d3cd101](aquasecurity/trivy@d3cd101)) - **misconf:** ensure module source is known ([#​9404](aquasecurity/trivy#9404)) ([81d9425](aquasecurity/trivy@81d9425)) - **misconf:** preserve original paths of remote submodules from .terraform ([#​9294](aquasecurity/trivy#9294)) ([1319d8d](aquasecurity/trivy@1319d8d)) - **misconf:** use correct field log\_bucket instead of target\_bucket in gcp bucket ([#​9296](aquasecurity/trivy#9296)) ([04ad0c4](aquasecurity/trivy@04ad0c4)) - persistent flag option typo ([#​9374](aquasecurity/trivy#9374)) ([6e99dd3](aquasecurity/trivy@6e99dd3)) - **plugin:** don't remove plugins when updating index.yaml file ([#​9358](aquasecurity/trivy#9358)) ([5f067ac](aquasecurity/trivy@5f067ac)) - **python:** impove package name normalization ([#​9290](aquasecurity/trivy#9290)) ([1473e88](aquasecurity/trivy@1473e88)) - **repo:** preserve RepoMetadata on FS cache hit ([#​9389](aquasecurity/trivy#9389)) ([4f2a44e](aquasecurity/trivy@4f2a44e)) - **repo:** sanitize git repo URL before inserting into report metadata ([#​9391](aquasecurity/trivy#9391)) ([1ac9b1f](aquasecurity/trivy@1ac9b1f)) - **sbom:** add support for `file` component type of `CycloneDX` ([#​9372](aquasecurity/trivy#9372)) ([aa7cf43](aquasecurity/trivy@aa7cf43)) - suppress debug log for context cancellation errors ([#​9298](aquasecurity/trivy#9298)) ([2458d5e](aquasecurity/trivy@2458d5e)) </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0MS4zNS4xIiwidXBkYXRlZEluVmVyIjoiNDEuMzUuMSIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOlsiaW1hZ2UiXX0=--> Reviewed-on: https://gitea.alexlebens.dev/alexlebens/infrastructure/pulls/1367 Co-authored-by: Renovate Bot <[email protected]> Co-committed-by: Renovate Bot <[email protected]>
Description
This PR fixes license handling in CycloneDX SBOM marshaling to properly support mixed license types and resolve normalization errors. The changes include:
Examples
before:
after:
Related issues
Related PRs
Checklist