pip provenance PEP #2
774c143
Revisit PEP for Python provenance
c335a06
Incorporate suggestions based on Donald Stufft's review
5addbfc
Address linter issues
5baebca
Minor adjustments to the text
0c4ef9f
Apply suggestions from code review by Brett
ec5f757
The hashes key MUST be present
3c9b67f
Drop hashes and tighten requirements on hashes
4643a71
Provide link to hashlib's canonical name docs
3433011
State hash key in the archive_info as a rejected idea
6bc3c29
s/a canonical/the canonical/
414d71e
Add Paul Moore as a PEP delegate
5ac0f2b
State cached and built wheels from source distributions
2fa69dd
Fix code block example
9099604
Link Direct URL Data Structure
117ad02
State Stéphane Bidoul in the Acknowledgements section
9e72479
Add open issues section
b1bc04e
Explicitly state which files are generated in the examples section
f07f63c
Minor cosmetic changes
fd99e56
Last changes before submitting as a PEP
3c61f4d
Rename to PEP-710
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,283 @@ | ||
| PEP: 9999 | ||
| Title: Recording provenance of installed packages | ||
| Author: Fridolín Pokorný <fridolin.pokorny at gmail.com> | ||
| Sponsor: Donald Stufft <[email protected]> | ||
| PEP-Delegate: | ||
| Discussions-To: https://discuss.python.org/t/pep-705-recording-provenance-of-installed-packages/23340 | ||
| Status: Draft | ||
| Type: Process | ||
| Content-Type: text/x-rst | ||
| Created: 09-Mar-2023 | ||
| Post-History: | ||
|
|
||
| Abstract | ||
| ======== | ||
|
|
||
| This PEP describes a way to record the provenance of installed Python | ||
| distributions. The record is created by an installer and is available to users | ||
| in the form of a JSON file, ``provenance_url.json``, in the ``.dist-info`` | ||
| directory. The JSON file captures additional metadata: the URL of the Python | ||
| distribution together with the hash of the installed distribution. This | ||
| proposal is built on top of :pep:`610` following `its corresponding canonical | ||
| PyPA spec | ||
| <https://packaging.python.org/en/latest/specifications/direct-url/>`__ and | ||
| complements ``direct_url.json`` with a ``provenance_url.json`` file for packages | ||
| that are identified by a name, and optionally a version. | ||
|
|
||
| Motivation | ||
| ========== | ||
|
|
||
| Installing a Python package involves downloading a distribution from an index | ||
| and extracting its content to an appropriate place. After the installation | ||
| process is done, information about the distribution used as well as its source | ||
| is generally lost. Nevertheless, there are use cases for keeping records of | ||
| distributions used for installing packages and their provenance. | ||
|
|
||
| Python wheels can be built with different compiler flags or can support | ||
| different wheel tags. In both cases, installers might consider multiple wheels | ||
| (possibly from different package indexes), and being able to find out | ||
| immediately which wheel file was actually used during the installation can be | ||
| helpful. Developers can then use information about wheels to debug issues and | ||
| make sure the desired wheel was actually installed. Another use case is tools | ||
| that report installed software, such as SBOM (Software Bill of Materials) | ||
| generators, which could give more accurate reports. | ||
|
|
||
| The motivation described in this PEP is an extension to :pep:`610`. Besides | ||
| stating information about packages installed using a direct URL, installers SHOULD | ||
| record information also for packages installed from Python package indexes when | ||
| identified by their name, and optionally their version. | ||
|
|
||
| Specification | ||
| ============= | ||
|
|
||
| The ``provenance_url.json`` file SHOULD be created in the ``*.dist-info`` | ||
| directory by installers when installing a distribution identified by its | ||
| name, and optionally its version specifier. | ||
|
|
||
| This file MUST NOT be created when installing a distribution from a requirement | ||
| specifying a direct URL reference (including a VCS URL). | ||
|
|
||
| Only one of the files ``provenance_url.json`` and ``direct_url.json`` (from | ||
| :pep:`610`) MAY be present in the ``*.dist-info`` directory. | ||
|
|
||
| The ``provenance_url.json`` JSON file MUST be a dictionary, compliant with | ||
| :rfc:`8259` and UTF-8 encoded. | ||
|
|
||
| If present, it MUST contain exactly two keys. The first one is ``url``, with | ||
| type ``string``. The second key MUST be ``archive_info`` with a value defined | ||
| below. | ||
|
|
||
| Following :pep:`610`, the ``url`` field MUST be stripped of any sensitive | ||
| authentication information, for security reasons. | ||
|
|
||
| The user:password section of the URL MAY however be composed of environment | ||
| variables, matching the following regular expression:: | ||
|
|
||
| \$\{[A-Za-z0-9-_]+\}(:\$\{[A-Za-z0-9-_]+\})? | ||
|
|
||
| Additionally, the user:password section of the URL MAY be a well-known, | ||
| non-security sensitive string. A typical example is ``git`` in the case of a | ||
| URL such as ``ssh://git@gitlab.com``. | ||
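The URL rules above can be sketched as a small helper. This is only an illustration, not the behaviour of any particular installer: the function name ``redact_url`` and the ``WELL_KNOWN_USERS`` allow-list are hypothetical, and the regular expression is copied from the text above.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Pattern quoted from the specification text above.
ENV_VAR_CREDENTIALS = re.compile(r"\$\{[A-Za-z0-9-_]+\}(:\$\{[A-Za-z0-9-_]+\})?")

# Hypothetical allow-list of well-known, non-sensitive user names.
WELL_KNOWN_USERS = {"git"}


def redact_url(url: str) -> str:
    """Return ``url`` with sensitive credentials removed from its netloc."""
    parts = urlsplit(url)
    credentials, sep, host = parts.netloc.rpartition("@")
    if not sep:
        return url  # no user:password section at all
    if ENV_VAR_CREDENTIALS.fullmatch(credentials) or credentials in WELL_KNOWN_USERS:
        return url  # environment variables or a well-known user are allowed
    # Anything else is treated as sensitive and dropped.
    return urlunsplit(parts._replace(netloc=host))
```

With this sketch, ``https://user:secret@example.com/pkg.whl`` loses its credentials, while ``https://${USER}:${PASS}@example.com/pkg.whl`` is kept unchanged.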
|
|
||
| The value of ``archive_info`` MUST be a dictionary with a single key | ||
| ``hashes``. The ``hashes`` key is a dictionary mapping a hash name to a | ||
| hex-encoded digest of the file. Multiple hashes can be included, and it is up | ||
| to the consumer to decide what to do with multiple hashes (it may validate all | ||
| of them or a subset of them, or nothing at all). | ||
|
|
||
| Each hash MUST be one of the single argument hashes provided by | ||
| ``hashlib.algorithms_guaranteed``, except for the ``sha1`` and ``md5`` hashes. | ||
| The multi-argument hashes ``shake_128`` and ``shake_256`` are excluded as well. | ||
| At the time of writing this PEP, the resulting listing is: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| >>> import hashlib | ||
| >>> sorted(hashlib.algorithms_guaranteed - {"shake_128", "shake_256", "sha1", "md5"}) | ||
| ['blake2b', 'blake2s', 'sha224', 'sha256', 'sha384', 'sha3_224', 'sha3_256', 'sha3_384', 'sha3_512', 'sha512'] | ||
|
|
||
| Each hash MUST be referenced by the canonical name of the hash, always lower case. | ||
|
|
||
| Hashes ``sha1`` and ``md5`` MUST NOT be present, respecting security | ||
| limitations of these hash algorithms. On the other hand, hash ``sha256`` SHOULD | ||
| be included. | ||
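As a rough sketch of the rules above, an installer could compute the ``hashes`` dictionary and serialize the record as follows. The function name ``build_provenance`` and its parameters are hypothetical and chosen for illustration only; ``sha256`` is the default because the text says it SHOULD be included.

```python
import hashlib
import json

# Allowed names per the rules above: guaranteed algorithms minus the
# multi-argument ones and the disallowed sha1 and md5.
ALLOWED_HASHES = hashlib.algorithms_guaranteed - {"shake_128", "shake_256", "sha1", "md5"}


def build_provenance(url: str, archive: bytes, algorithms=("sha256",)) -> str:
    """Serialize a provenance record for ``archive`` downloaded from ``url``."""
    hashes = {}
    for name in algorithms:
        if name not in ALLOWED_HASHES:
            raise ValueError(f"hash {name!r} is not allowed")
        # Hex-encoded digest keyed by the canonical lower-case name.
        hashes[name] = hashlib.new(name, archive).hexdigest()
    record = {"archive_info": {"hashes": hashes}, "url": url}
    return json.dumps(record, sort_keys=True)
```

The ``url`` passed in is assumed to have been stripped of sensitive credentials already.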
|
|
||
| Examples | ||
| ======== | ||
|
|
||
| Examples of a valid provenance_url.json | ||
| --------------------------------------- | ||
|
|
||
| A valid ``provenance_url.json`` stating multiple hashes: | ||
|
|
||
| .. code:: json | ||
|
|
||
| { | ||
| "archive_info": { | ||
| "hashes": { | ||
| "blake2s": "fffeaf3d0bd71dc960ca2113af890a2f2198f2466f8cd58ce4b77c1fc54601ff", | ||
| "sha256": "236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f", | ||
| "sha3_256": "c856930e0f707266d30e5b48c667a843d45e79bb30473c464e92dfa158285eab", | ||
| "sha512": "6bad5536c30a0b2d5905318a1592948929fbac9baf3bcf2e7faeaf90f445f82bc2b656d0a89070d8a6a9395761f4793c83187bd640c64b2656a112b5be41f73d" | ||
| } | ||
| }, | ||
| "url": "https://files.pythonhosted.org/packages/07/51/2c0959c5adf988c44d9e1e0d940f5b074516ecc87e96b1af25f59de9ba38/pip-23.0.1-py3-none-any.whl" | ||
| } | ||
|
|
||
| A valid ``provenance_url.json`` stating a single hash entry: | ||
|
|
||
| .. code:: json | ||
|
|
||
| { | ||
| "archive_info": { | ||
| "hashes": { | ||
| "sha256": "236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f" | ||
| } | ||
| }, | ||
| "url": "https://files.pythonhosted.org/packages/07/51/2c0959c5adf988c44d9e1e0d940f5b074516ecc87e96b1af25f59de9ba38/pip-23.0.1-py3-none-any.whl" | ||
| } | ||
|
|
||
| Examples of an invalid provenance_url.json | ||
| ------------------------------------------ | ||
|
|
||
| The following example includes the ``hash`` key in the ``archive_info`` | ||
| dictionary as originally designed in :pep:`610`. The ``hash`` key MUST NOT be | ||
| present, to prevent any possible confusion with ``hashes`` and to avoid the | ||
| additional checks that would be required to keep hash values in sync. | ||
|
|
||
| .. code:: json | ||
|
|
||
| { | ||
| "archive_info": { | ||
| "hash": "sha256=236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f", | ||
| "hashes": { | ||
| "sha256": "236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f" | ||
| } | ||
| }, | ||
| "url": "https://files.pythonhosted.org/packages/07/51/2c0959c5adf988c44d9e1e0d940f5b074516ecc87e96b1af25f59de9ba38/pip-23.0.1-py3-none-any.whl" | ||
| } | ||
|
|
||
| Another example demonstrates an invalid hash name. The referenced hash does not | ||
| correspond to the canonical hash name described in this PEP and in the `Python docs | ||
| <https://docs.python.org/3/library/hashlib.html#hashlib.hash.name>`__. | ||
|
|
||
| .. code:: json | ||
|
|
||
| { | ||
| "archive_info": { | ||
| "hashes": { | ||
| "SHA-256": "236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f" | ||
| } | ||
| }, | ||
| "url": "https://files.pythonhosted.org/packages/07/51/2c0959c5adf988c44d9e1e0d940f5b074516ecc87e96b1af25f59de9ba38/pip-23.0.1-py3-none-any.whl" | ||
| } | ||
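A consumer could check the valid and invalid examples above with a validator along these lines. This is a minimal sketch under the rules stated in the Specification section; the name ``validation_errors`` is hypothetical, and the sketch does not check the digest values themselves.

```python
import hashlib

ALLOWED_HASHES = hashlib.algorithms_guaranteed - {"shake_128", "shake_256", "sha1", "md5"}


def validation_errors(record) -> list:
    """Return a list of problems found in a parsed provenance_url.json."""
    if not isinstance(record, dict) or set(record) != {"url", "archive_info"}:
        return ["top level must be a dictionary with exactly 'url' and 'archive_info'"]
    errors = []
    if not isinstance(record["url"], str):
        errors.append("'url' must be a string")
    info = record["archive_info"]
    if not isinstance(info, dict) or set(info) != {"hashes"}:
        # Covers the example that carries the extra 'hash' key.
        errors.append("'archive_info' must contain the single key 'hashes'")
        return errors
    hashes = info["hashes"]
    if not isinstance(hashes, dict):
        errors.append("'hashes' must be a dictionary")
        return errors
    for name in hashes:
        if name not in ALLOWED_HASHES:
            # Covers non-canonical names such as 'SHA-256' as well as sha1 and md5.
            errors.append(f"hash name {name!r} is not allowed")
    return errors
```

An empty list means that no problem was found by these particular checks.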
|
|
||
|
|
||
| Example pip commands and their effect on provenance_url.json and direct_url.json | ||
| -------------------------------------------------------------------------------- | ||
|
|
||
| Commands that generate a ``direct_url.json`` file, following :pep:`610`: | ||
|
|
||
| * ``pip install https://example.com/app-1.0.tgz`` | ||
| * ``pip install https://example.com/app-1.0.whl`` | ||
| * ``pip install "git+https://example.com/repo/app.git#egg=app&subdirectory=setup"`` | ||
| * ``pip install ./app`` | ||
| * ``pip install file:///home/user/app`` | ||
| * ``pip install --editable "git+https://example.com/repo/app.git#egg=app&subdirectory=setup"`` (in which case, ``url`` will be the local directory where the git repository has been cloned to, and ``dir_info`` will be present with ``"editable": true`` and no ``vcs_info`` will be set) | ||
| * ``pip install -e ./app`` | ||
|
|
||
| Commands that generate a ``provenance_url.json`` file: | ||
|
|
||
| * ``pip install app`` | ||
| * ``pip install app~=2.2.0`` | ||
| * ``pip install app --no-index --find-links "https://example.com/"`` | ||
|
|
||
| This behaviour can be tested using changes to pip introduced in [1]_. | ||
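On the consumer side, a tool inspecting a ``*.dist-info`` directory could tell the two cases apart as sketched below. The function name ``read_origin`` is hypothetical; the sketch assumes the rule from the Specification section that only one of the two files may be present.

```python
import json
from pathlib import Path


def read_origin(dist_info: Path):
    """Return ``(filename, data)`` for the origin record, or ``None`` if absent."""
    found = [
        name
        for name in ("provenance_url.json", "direct_url.json")
        if (dist_info / name).is_file()
    ]
    if len(found) > 1:
        raise ValueError("both provenance_url.json and direct_url.json are present")
    if not found:
        return None  # e.g. installed by a tool that records neither file
    data = json.loads((dist_info / found[0]).read_text(encoding="utf-8"))
    return found[0], data
```

The returned file name tells the caller whether the distribution came from a direct URL reference or was resolved by name from an index.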
|
|
||
| Rejected Ideas | ||
| ============== | ||
|
|
||
| Naming the file direct_url.json instead of provenance_url.json | ||
| -------------------------------------------------------------- | ||
|
|
||
| To preserve backwards compatibility with :pep:`610`, the file cannot be named | ||
| ``direct_url.json`` (from :pep:`610`): | ||
|
|
||
| This file MUST NOT be created when installing a distribution from an other | ||
| type of requirement (i.e. name plus version specifier). | ||
|
|
||
| The change might introduce backwards compatibility issues for consumers of | ||
| ``direct_url.json`` who rely on its presence only when distributions are | ||
| installed using a direct URL reference. | ||
|
|
||
| Deprecate direct_url.json and use only provenance_url.json | ||
| ---------------------------------------------------------- | ||
|
|
||
| The ``direct_url.json`` file is already well established by :pep:`610` and is | ||
| already used by installers. For example, ``pip`` uses ``direct_url.json`` to | ||
| report a direct URL reference on ``pip freeze``. Deprecating | ||
| ``direct_url.json`` would require additional changes to the ``pip freeze`` | ||
| implementation in pip (see [2]_) and could introduce backwards compatibility | ||
| issues for already existing ``direct_url.json`` consumers. | ||
|
|
||
| Keeping hash key in the archive_info dictionary | ||
| ----------------------------------------------- | ||
|
|
||
| :pep:`610` and `its corresponding canonical PyPA spec | ||
| <https://packaging.python.org/en/latest/specifications/direct-url/>`__ discuss | ||
| the possibility of including the ``hash`` key alongside the ``hashes`` key in | ||
| the ``archive_info`` dictionary. This PEP explicitly discards the ``hash`` key | ||
| in the ``provenance_url.json`` file and expects only the ``hashes`` key to be | ||
| present. Doing so eliminates possible redundancy in the file, possible | ||
| confusion, and any additional checks that would be needed to make sure the | ||
| hashes are in sync. | ||
|
|
||
| Backwards Compatibility | ||
| ======================= | ||
|
|
||
| Since this PEP specifies a new file in the ``*.dist-info`` directory, there are | ||
| no backwards compatibility implications to consider for the ``provenance_url.json`` | ||
| file itself. Also, this proposal does not make any changes to the | ||
| ``direct_url.json`` described in :pep:`610` and `its corresponding canonical | ||
| PyPA spec | ||
| <https://packaging.python.org/en/latest/specifications/direct-url/>`__. | ||
|
|
||
| The content of the ``provenance_url.json`` file was designed to eventually | ||
| allow installers to reuse some of the logic supporting :pep:`610` when a | ||
| direct URL refers to a source archive or a wheel. | ||
|
|
||
| References | ||
| ========== | ||
|
|
||
| The following changes were made to pip to support this PEP: | ||
|
|
||
| .. [1] `A patch to pip introducing provenance_url.json as discussed in this PEP | ||
| <https://github.com/fridex/pip/pull/1/>`__ | ||
|
|
||
| .. [2] `Changes to pip to support the decision for creating | ||
| provenance_url.json instead of stating provenance in already existing | ||
| direct_url.json <https://github.com/fridex/pip/pull/2/>`__ | ||
|
|
||
| Acknowledgements | ||
| ================ | ||
|
|
||
| Thanks to Dustin Ingram, Brett Cannon, and Paul Moore for the initial | ||
| discussion in which this idea originated. | ||
|
|
||
| Thanks to Donald Stufft, Ofek Lev, and Trishank Kuppusamy for early feedback | ||
| and support to work on this PEP. | ||
|
|
||
| Thanks to Gregory P. Smith for reviewing this PEP and providing valuable | ||
| suggestions. | ||
|
|
||
| Thanks to Stéphane Bidoul and Chris Jerdonek for :pep:`610`. | ||
|
|
||
| Last, but not least, thanks to Donald Stufft for sponsoring this PEP. | ||
|
|
||
| Copyright | ||
| ========= | ||
|
|
||
| This document is placed in the public domain or under the CC0-1.0-Universal | ||
| license, whichever is more permissive. | ||
|
|
||