Merged
Changes from 4 commits
29 commits
ea78881
PEP 9999: Recording provenance of installed packages
Mar 27, 2023
81a9dd7
Rename to PEP-710
Mar 27, 2023
29b86f8
Add PEP-710 to CODEOWNERS
Mar 28, 2023
ac86eda
Apply suggestions from code review
Mar 28, 2023
51ccbed
Apply suggestions from code review
Mar 28, 2023
1d394c4
Apply suggestions from code review
Mar 28, 2023
8a86906
Remove duplicate topic
Mar 28, 2023
3f0478b
Add Christopher A. M. Gerlach to the Acknowledgements section
Mar 28, 2023
c99e676
Fix name in the Acknowledgements section
Mar 28, 2023
d2cb745
Move Backwards Compatibility after Specification
Mar 29, 2023
a4334fb
Add How to Teach This section
Mar 29, 2023
e1b3106
Add Security Implications section
Mar 29, 2023
28d93a0
Add Reference Implementation section
Mar 29, 2023
8f2e4e4
Fix reference to pip-preserve
Mar 29, 2023
96f0a5e
Apply suggestions from code review
Mar 30, 2023
9eb94f8
s/*.dist-info/.dist-info/
Mar 30, 2023
2356439
Add Rationale section
Mar 30, 2023
ca729f8
Fix reference to a term
Mar 30, 2023
00ec0ea
Use a reference to the pip installation report thraed
Mar 30, 2023
bc55397
Apply suggestions from code review
Mar 30, 2023
de7cf45
Adjust Backwards Compatibility section
Mar 31, 2023
2a29627
State main difference between direct_url.json and provenance_url.json
Mar 31, 2023
3b09caf
State Conda's conda-meta directory created by Conda
Mar 31, 2023
8cb9ce9
Mention compatibility considerations with direct_url.json
Mar 31, 2023
7939192
Remove a leftover from review
Mar 31, 2023
b400b39
Fix links to project sites
Mar 31, 2023
eb3efa9
Apply suggestions from code review
Mar 31, 2023
6c9e95c
Create appendix for the tools survey
Mar 31, 2023
dfb21eb
Apply suggestions from code review
Apr 2, 2023
71 changes: 59 additions & 12 deletions pep-0710.rst
@@ -48,16 +48,56 @@ give more accurate reports. Yet another use case could be reconstruction of the
Python environment by pinning each installed package to a specific distribution
artifact consumed from a Python package index.

The motivation described in this PEP is an extension of those in :pep:`610`. Besides
stating information about packages installed using a direct URL, installers SHOULD
record information also for packages installed from Python package indexes when
identified by their name, and optionally their version.

Rationale
=========

[describing the reasoning behind the approach chosen in this PEP, and/or a brief history of the discussions that led to this PEP being proposed]

The motivation described in this PEP is an extension of that in :pep:`610`.
Besides stating information about packages installed using a direct URL,
installers should also record information for packages installed from Python
package indexes when identified by their name, and optionally their version.

The idea described in this PEP originated in a tool called `micropipenv
<https://github.com/thoth-station/micropipenv>`__ that is used to install
:term:`distribution packages <Distribution Package>` in containerized
environments (see the reported issue `thoth-station/micropipenv#206`_). When
installing a :term:`Distribution Package` during a containerized application
build, the assembled containerized application does not implicitly carry
information about the provenance of installed :term:`distribution packages
<Distribution Package>` when installed using their name, and optionally their
version. This creates a requirement for container image suppliers to link
container images with the corresponding build process, its configuration and
the application source code for checking requirements files in cases when
software present in containerized environments needs to be audited.

The `subsequent discussion in the Discourse thread
<pip_installation_report_>`_ also mentioned
a new ``--report`` option in pip that can generate a detailed JSON report about
the installation process. This option could help with the provenance problem
this PEP addresses. Nevertheless, this option needs to be *explicitly* passed
to pip to obtain the provenance information, and the report carries additional
metadata that might not be necessary for checking the provenance (such as
Python version requirements of each :term:`Distribution Package`). Also, this
option is specific to pip as of today.

Note that the current :ref:`spec for recording installed packages
<packaging:recording-installed-packages>` defines a ``RECORD`` file that
records installed files, but not the installed artifact that brought these
files into the Python environment. Installed artifacts could be audited by
matching the entries stated in the ``RECORD`` file. However, this
technique requires either a pre-computed database of the files each artifact
provides or a comparison with the actual artifact content. Both approaches are
relatively expensive and time-consuming operations that could be eliminated
with the proposed ``provenance_url.json`` file.
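
The ``RECORD``-matching technique described above can be sketched as follows.
This is only an illustration under stated assumptions: ``artifact_db`` is a
hypothetical pre-computed mapping from artifact file name to the set of
``(path, hash)`` pairs that artifact provides, and the function names are
invented for this example. It shows why the approach depends on such a
database existing in the first place.

```python
import csv
import io


def parse_record(record_text):
    """Return the set of (path, hash) pairs that carry a hash in a RECORD file."""
    rows = csv.reader(io.StringIO(record_text))
    # Each RECORD row is "path,hash,size"; entries without a hash
    # (such as the RECORD file itself) are skipped.
    return {(row[0], row[1]) for row in rows if len(row) >= 2 and row[1]}


def match_artifacts(record_text, artifact_db):
    """Return artifact names whose known files all appear in the RECORD.

    ``artifact_db`` is a hypothetical mapping (an assumption of this sketch):
    artifact file name -> set of (path, hash) pairs.
    """
    installed = parse_record(record_text)
    return sorted(
        name for name, files in artifact_db.items() if files <= installed
    )
```

Every candidate artifact has to be known and hashed in advance, which is the
cost the paragraph above refers to.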

Having information about the provenance of :term:`distribution packages
<Distribution Package>` stored in metadata files, both when distribution
packages are installed by their direct URL and when they are installed by
their name and optionally their version from an index, can simplify auditing
Python environments in general. This auditing is not limited to the
containerized applications mentioned earlier; it generalizes to any Python
environment. The community project `pip-audit
<https://github.com/pypa/pip-audit>`__ noted its possible interest in
`pypa/pip-audit#170`_.
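
As a rough illustration of the kind of audit this enables, the sketch below
walks the installed distributions and reports which provenance file, if any,
each one carries. It uses only ``importlib.metadata`` from the standard
library; the helper names ``provenance_kind`` and ``summarize_environment``
are invented for this example and are not part of any tool mentioned here.

```python
from importlib.metadata import distributions


def provenance_kind(dist):
    """Classify one installed distribution by the provenance file it carries."""
    # Distribution.read_text() returns None when the file is absent.
    if dist.read_text("direct_url.json") is not None:
        return "direct_url"
    if dist.read_text("provenance_url.json") is not None:
        return "provenance_url"
    return "unknown"


def summarize_environment():
    """Map each installed distribution name to its provenance kind."""
    return {
        dist.metadata["Name"]: provenance_kind(dist)
        for dist in distributions()
    }
```

Distributions reported as ``unknown`` are the ones for which an auditor would
still have to fall back to build logs or requirements files.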

Specification
=============
@@ -66,15 +106,15 @@ The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHOULD”,
“SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL”
in this document are to be interpreted as described in :rfc:`2119`.

The ``provenance_url.json`` file SHOULD be created in the ``*.dist-info``
The ``provenance_url.json`` file SHOULD be created in the ``.dist-info``
directory by installers when installing a :term:`Distribution Package`
specified by name (and optionally by :term:`Version Specifier`).

This file MUST NOT be created when installing a distribution package from a requirement
specifying a direct URL reference (including a VCS URL).

Only one of the files ``provenance_url.json`` and ``direct_url.json`` (from :pep:`610`),
may be present in a given ``*.dist-info`` directory; installers MUST NOT add both.
may be present in a given ``.dist-info`` directory; installers MUST NOT add both.

The ``provenance_url.json`` JSON file MUST be a dictionary, compliant with
:rfc:`8259` and UTF-8 encoded.
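
The requirements stated so far (at most one of the two files, and a UTF-8
encoded JSON dictionary) could be checked with a sketch like the one below.
The function name ``check_provenance_files`` is hypothetical, and the sketch
deliberately does not validate any keys inside the dictionary.

```python
import json
from pathlib import Path


def check_provenance_files(dist_info_dir):
    """Return a list of problems found in one .dist-info directory."""
    dist_info = Path(dist_info_dir)
    provenance = dist_info / "provenance_url.json"
    direct = dist_info / "direct_url.json"
    problems = []

    # Installers MUST NOT add both files to the same directory.
    if provenance.exists() and direct.exists():
        problems.append("both provenance_url.json and direct_url.json are present")

    if provenance.exists():
        try:
            # Decode explicitly so that a non-UTF-8 file is reported as a problem.
            data = json.loads(provenance.read_bytes().decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError) as exc:
            problems.append(f"provenance_url.json is not valid UTF-8 JSON: {exc}")
        else:
            if not isinstance(data, dict):
                problems.append("provenance_url.json is not a JSON dictionary")
    return problems
```

An empty list means the directory satisfies these particular checks; it says
nothing about whether the recorded provenance is itself trustworthy.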
@@ -133,7 +173,7 @@ from the installer's cache.
Backwards Compatibility
=======================

Since this PEP specifies a new file in the ``*.dist-info`` directory, there are
Since this PEP specifies a new file in the ``.dist-info`` directory, there are
no backwards compatibility implications to consider in the ``provenance_url.json``
file itself. Also, this proposal does not make any changes to the
``direct_url.json`` described in :pep:`610` and
@@ -148,8 +188,9 @@ Security Implications

One of the main security features of the ``provenance_url.json`` file is the
ability to audit installed artifacts in Python environments. Tools can check
which Python package indexes were used to install Python :term:`Distribution Packages`
as well as the hash digests of their release artifacts.
which Python package indexes were used to install Python :term:`distribution
packages <Distribution Package>` as well as the hash digests of their release
artifacts.

As an example, we can take the recent compromised dependency chain in `the
PyTorch incident <https://pytorch.org/blog/compromised-nightly-dependency/>`__.
@@ -393,6 +434,12 @@ References

.. _pip_preserve: https://pypi.org/project/pip-preserve/

.. _thoth-station/micropipenv#206: https://github.com/thoth-station/micropipenv/issues/206

.. _pypa/pip-audit#170: https://github.com/pypa/pip-audit/issues/170

.. _pip_installation_report: https://pip.pypa.io/en/stable/reference/installation-report/

Acknowledgements
================
