@DmitriyLewen DmitriyLewen commented Jul 31, 2025

Description

This PR normalizes Python package names according to PEP-0503 specification instead of PEP-0426, and fixes dist-info directory name generation to properly handle package names with special characters.

Reason

The existing implementation followed PEP-0426 for package name normalization, but Python packaging has since moved to PEP-0503 for normalized names. In addition, dist-info directory name generation did not properly handle package names containing -, _, or . characters, which caused license lookups to fail for packages with these characters in their names.

Improvements

  • Updated normalization standard: Switched from PEP-0426 to PEP-0503 for package name normalization
  • Improved regex-based replacement: Replaced multiple string replacements with a single regex that handles consecutive runs of -, _, or . characters more efficiently
  • Fixed dist-info directory lookup: Added proper normalization and conversion for dist-info directory names (normalize then replace - with _)
  • Enhanced test coverage: Added test case for packages with consecutive special characters (foo--bar__baz)
  • Better license detection: Fixed license information retrieval for packages like annotated-types that use underscores in their dist-info directory names
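
The two rules above can be sketched in Go roughly as follows. This is a minimal illustration, not the code from this PR: the function names `normalizePkgName` and `distInfoDirName` are hypothetical, and the version string is assumed to need no normalization of its own.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// separatorRuns matches one or more consecutive '-', '_' or '.' characters,
// mirroring the pattern given in PEP 503.
var separatorRuns = regexp.MustCompile(`[-_.]+`)

// normalizePkgName (hypothetical name) applies PEP 503 normalization:
// collapse each run of separators into a single '-' and lowercase the result.
func normalizePkgName(name string) string {
	return strings.ToLower(separatorRuns.ReplaceAllString(name, "-"))
}

// distInfoDirName (hypothetical name) builds the expected dist-info directory
// name: normalize the package name, then replace '-' with '_'.
// Assumption: the version is used as-is.
func distInfoDirName(name, version string) string {
	dirName := strings.ReplaceAll(normalizePkgName(name), "-", "_")
	return fmt.Sprintf("%s-%s.dist-info", dirName, version)
}

func main() {
	fmt.Println(normalizePkgName("foo--bar__baz"))             // foo-bar-baz
	fmt.Println(distInfoDirName("typing-inspection", "0.4.1")) // typing_inspection-0.4.1.dist-info
}
```

A single regex replacement handles runs such as `--` or `__` in one pass, which chained single-character replacements would turn into repeated dashes.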

Example: license detection with the fix

before (Trivy uses typing-inspection-0.4.1.dist-info dir)

➜ VIRTUAL_ENV=./.venv trivy -q fs ./requirements.txt -f json --list-all-pkgs | jq '.Results[].Packages[] | select (.Name=="typing-inspection") | .Licenses' 
null

after (Trivy uses typing_inspection-0.4.1.dist-info dir)

➜ VIRTUAL_ENV=./.venv ./trivy -q fs ./requirements.txt -f json --list-all-pkgs | jq '.Results[].Packages[] | select (.Name=="typing-inspection") | .Licenses'
[
  "MIT"
]

Related issues

Checklist

  • I've read the guidelines for contributing to this repository.
  • I've followed the conventions in the PR title.
  • I've added tests that prove my fix is effective or that my feature works.
  • I've updated the documentation with the relevant information (if needed).
  • I've added usage information (if the PR introduces new options)
  • I've included a "before" and "after" example to the description (if the PR is a user interface change).

@DmitriyLewen DmitriyLewen self-assigned this Jul 31, 2025
@DmitriyLewen DmitriyLewen added the autoready Automatically mark PR as ready for review when all checks pass label Jul 31, 2025
@github-actions github-actions bot marked this pull request as ready for review July 31, 2025 09:54
@github-actions github-actions bot removed the autoready Automatically mark PR as ready for review when all checks pass label Jul 31, 2025
@github-actions github-actions bot requested a review from knqyf263 as a code owner July 31, 2025 09:54
if os.IsNotExist(err) {
a.logger.Debug("No package metadata found", log.String("site-packages", pkgDir),
log.String("name", pkgName), log.String("version", pkgVer))
metadataFile := a.metadataFile(pkgName, pkgVer, spDir)
Collaborator

Is this file closed somewhere?
metadataFile.Close().

Contributor Author

Nice catch!
added in 4ce96c7

@DmitriyLewen DmitriyLewen added this pull request to the merge queue Aug 1, 2025
Merged via the queue into aquasecurity:main with commit 1473e88 Aug 1, 2025
13 checks passed
@DmitriyLewen DmitriyLewen deleted the fix/python/normilize-pkg-name-by-pep0503 branch August 1, 2025 08:25
@aqua-bot aqua-bot mentioned this pull request Aug 1, 2025
yutatokoi pushed a commit to yutatokoi/trivy that referenced this pull request Aug 12, 2025
alexlebens pushed a commit to alexlebens/infrastructure that referenced this pull request Sep 3, 2025
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [mirror.gcr.io/aquasec/trivy](https://www.aquasec.com/products/trivy/) ([source](https://github.com/aquasecurity/trivy)) | minor | `0.65.0` -> `0.66.0` |

---

### Release Notes

<details>
<summary>aquasecurity/trivy (mirror.gcr.io/aquasec/trivy)</summary>

### [`v0.66.0`](https://github.com/aquasecurity/trivy/blob/HEAD/CHANGELOG.md#0660-2025-09-02)

[Compare Source](aquasecurity/trivy@v0.65.0...v0.66.0)

##### Features

- add timeout handling for cache database operations ([#9307](aquasecurity/trivy#9307)) ([235c24e](aquasecurity/trivy@235c24e))
- **misconf:** added audit config attribute ([#9249](aquasecurity/trivy#9249)) ([4d4a244](aquasecurity/trivy@4d4a244))
- **secret:** implement streaming secret scanner with byte offset tracking ([#9264](aquasecurity/trivy#9264)) ([5a5e097](aquasecurity/trivy@5a5e097))
- **terraform:** use .terraform cache for remote modules in plan scanning ([#9277](aquasecurity/trivy#9277)) ([298a994](aquasecurity/trivy@298a994))

##### Bug Fixes

- **conda:** memory leak by adding closure method for `package.json` file ([#9349](aquasecurity/trivy#9349)) ([03d039f](aquasecurity/trivy@03d039f))
- create temp file under composite fs dir ([#9387](aquasecurity/trivy#9387)) ([ce22f54](aquasecurity/trivy@ce22f54))
- **cyclonedx:** handle multiple license types ([#9378](aquasecurity/trivy#9378)) ([46ab76a](aquasecurity/trivy@46ab76a))
- **fs:** avoid shadowing errors in file.glob ([#9286](aquasecurity/trivy#9286)) ([b51c789](aquasecurity/trivy@b51c789))
- **image:** use standardized HTTP client for ECR authentication ([#9322](aquasecurity/trivy#9322)) ([84fbf86](aquasecurity/trivy@84fbf86))
- **misconf:** ensure ignore rules respect subdirectory chart paths ([#9324](aquasecurity/trivy#9324)) ([d3cd101](aquasecurity/trivy@d3cd101))
- **misconf:** ensure module source is known ([#9404](aquasecurity/trivy#9404)) ([81d9425](aquasecurity/trivy@81d9425))
- **misconf:** preserve original paths of remote submodules from .terraform ([#9294](aquasecurity/trivy#9294)) ([1319d8d](aquasecurity/trivy@1319d8d))
- **misconf:** use correct field log\_bucket instead of target\_bucket in gcp bucket ([#9296](aquasecurity/trivy#9296)) ([04ad0c4](aquasecurity/trivy@04ad0c4))
- persistent flag option typo ([#9374](aquasecurity/trivy#9374)) ([6e99dd3](aquasecurity/trivy@6e99dd3))
- **plugin:** don't remove plugins when updating index.yaml file ([#9358](aquasecurity/trivy#9358)) ([5f067ac](aquasecurity/trivy@5f067ac))
- **python:** impove package name normalization ([#9290](aquasecurity/trivy#9290)) ([1473e88](aquasecurity/trivy@1473e88))
- **repo:** preserve RepoMetadata on FS cache hit ([#9389](aquasecurity/trivy#9389)) ([4f2a44e](aquasecurity/trivy@4f2a44e))
- **repo:** sanitize git repo URL before inserting into report metadata ([#9391](aquasecurity/trivy#9391)) ([1ac9b1f](aquasecurity/trivy@1ac9b1f))
- **sbom:** add support for `file` component type of `CycloneDX` ([#9372](aquasecurity/trivy#9372)) ([aa7cf43](aquasecurity/trivy@aa7cf43))
- suppress debug log for context cancellation errors ([#9298](aquasecurity/trivy#9298)) ([2458d5e](aquasecurity/trivy@2458d5e))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://gitea.alexlebens.dev/alexlebens/infrastructure/pulls/1367
Co-authored-by: Renovate Bot <[email protected]>
Co-committed-by: Renovate Bot <[email protected]>

Development

Successfully merging this pull request may close these issues.

bug(python): Trivy uses non-normalized pkg name in *.dist-info dir name
