Merged
29 commits
ea78881
PEP 9999: Recording provenance of installed packages
Mar 27, 2023
81a9dd7
Rename to PEP-710
Mar 27, 2023
29b86f8
Add PEP-710 to CODEOWNERS
Mar 28, 2023
ac86eda
Apply suggestions from code review
Mar 28, 2023
51ccbed
Apply suggestions from code review
Mar 28, 2023
1d394c4
Apply suggestions from code review
Mar 28, 2023
8a86906
Remove duplicate topic
Mar 28, 2023
3f0478b
Add Christopher A. M. Gerlach to the Acknowledgements section
Mar 28, 2023
c99e676
Fix name in the Acknowledgements section
Mar 28, 2023
d2cb745
Move Backwards Compatibility after Specification
Mar 29, 2023
a4334fb
Add How to Teach This section
Mar 29, 2023
e1b3106
Add Security Implications section
Mar 29, 2023
28d93a0
Add Reference Implementation section
Mar 29, 2023
8f2e4e4
Fix reference to pip-preserve
Mar 29, 2023
96f0a5e
Apply suggestions from code review
Mar 30, 2023
9eb94f8
s/*.dist-info/.dist-info/
Mar 30, 2023
2356439
Add Rationale section
Mar 30, 2023
ca729f8
Fix reference to a term
Mar 30, 2023
00ec0ea
Use a reference to the pip installation report thraed
Mar 30, 2023
bc55397
Apply suggestions from code review
Mar 30, 2023
de7cf45
Adjust Backwards Compatibility section
Mar 31, 2023
2a29627
State main difference between direct_url.json and provenance_url.json
Mar 31, 2023
3b09caf
State Conda's conda-meta directory created by Conda
Mar 31, 2023
8cb9ce9
Mention compatibility considerations with direct_url.json
Mar 31, 2023
7939192
Remove a leftover from review
Mar 31, 2023
b400b39
Fix links to project sites
Mar 31, 2023
eb3efa9
Apply suggestions from code review
Mar 31, 2023
6c9e95c
Create appendix for the tools survey
Mar 31, 2023
dfb21eb
Apply suggestions from code review
Apr 2, 2023
170 changes: 160 additions & 10 deletions pep-0710.rst
@@ -56,8 +56,8 @@ In addition to recording provenance information for packages installed using a d
installers should also do so for packages installed by name
(and optionally version) from Python package indexes.

The idea described in this PEP originated in a tool called `micropipenv
<https://github.com/thoth-station/micropipenv>`__ that is used to install
The idea described in this PEP originated in a tool called `micropipenv`_
that is used to install
:term:`distribution packages <Distribution Package>` in containerized
environments (see the reported issue `thoth-station/micropipenv#206`_).
Currently, the assembled containerized application does not implicitly carry
@@ -93,7 +93,7 @@ Recording provenance information for installed distribution packages,
both those obtained from direct URLs and by name/version from an index,
can simplify auditing Python environments in general, beyond just
the specific use case for containerized applications mentioned earlier.
environments. A community project `pip-audit
A community project `pip-audit
<https://github.com/pypa/pip-audit>`__ raised their possible interest in
`pypa/pip-audit#170`_.

@@ -171,16 +171,146 @@ from the installer's cache.
Backwards Compatibility
=======================

Since this PEP specifies a new file in the ``.dist-info`` directory, there are
no backwards compatibility implications to consider in the ``provenance_url.json``
file itself. Also, this proposal does not make any changes to the
``direct_url.json`` described in :pep:`610` and
:ref:`its corresponding canonical PyPA spec <direct-url>`.
Following the :ref:`packaging:recording-installed-packages` specification,
installers may keep additional installer-specific files in the ``.dist-info``
directory. To make sure this PEP does not cause any backwards compatibility
issues, a survey of existing tools was conducted to check whether introducing
the ``provenance_url.json`` file in the ``.dist-info`` directory is feasible
in the Python packaging ecosystem.

The :ref:`binary-distribution-format` specification lists files that can be
present in the ``.dist-info`` directory. None of these file names collide with
the ``provenance_url.json`` file proposed in this PEP.

Presence of provenance_url.json in installers and libraries
-----------------------------------------------------------

A survey of existing installers, libraries, and dependency managers in the
Python ecosystem, summarized below, found no backwards compatibility issues:
introducing the ``provenance_url.json`` file does not clash with any of the
surveyed tools at the time of writing this PEP.
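
The same kind of check can be run against a concrete environment. The
following is a minimal sketch, not part of the PEP: it uses the standard
library's ``importlib.metadata`` to list installed distributions whose
``.dist-info`` directory already contains a ``provenance_url.json`` file. The
helper name ``find_provenance_files`` is hypothetical.

```python
from importlib.metadata import distributions
from typing import Dict, List, Optional

PROVENANCE_FILE = "provenance_url.json"


def find_provenance_files(search_path: Optional[List[str]] = None) -> Dict[str, str]:
    """Map distribution names to the raw text of their provenance_url.json.

    ``search_path`` is a list of directories to scan (for example a
    site-packages directory); when omitted, ``sys.path`` is used.
    """
    kwargs = {} if search_path is None else {"path": search_path}
    found: Dict[str, str] = {}
    for dist in distributions(**kwargs):
        # read_text() returns None when the file is not present.
        content = dist.read_text(PROVENANCE_FILE)
        if content is not None:
            found[dist.metadata["Name"]] = content
    return found
```

Calling ``find_provenance_files()`` without arguments inspects the running
interpreter's environment; an empty result means no installed distribution
carries the file yet.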

pip
~~~

The function from pip's internal API responsible for installing wheels, named
`_install_wheel
<https://github.com/pypa/pip/blob/10d9cbc601e5cadc45163452b1bc463d8ad2c1f7/src/pip/_internal/operations/install/wheel.py#L432>`__,
does not store any ``provenance_url.json`` file in the ``.dist-info``
directory. Additionally, a prototype introducing the mentioned file to pip in
`pypa/pip#11865`_ demonstrates incorporating logic for handling the
``provenance_url.json`` file in pip's source code.

As pip is used by some of the tools mentioned below to install Python package
distributions, findings for pip apply to these tools as well, because pip does
not allow parametrizing the creation of files in the ``.dist-info`` directory
in its internal API. Most of the tools mentioned below that use pip invoke it
as a subprocess, which has no effect on the presence of the
``provenance_url.json`` file in the ``.dist-info`` directory.

distlib
~~~~~~~

`distlib <distlib_homepage_>`_ implements low-level functionality that
manipulates the ``.dist-info`` directory. The database of installed
distributions does not use any file named ``provenance_url.json``, according
to `distlib's source code
<https://github.com/pypa/distlib/blob/05375908c1b2d6b0e74bdeb574569d3609db9f56/distlib/database.py#L39-L40>`__.

Pipenv
~~~~~~

`Pipenv <pipenv_homepage_>`_ uses pip `to install Python package distributions
<https://github.com/pypa/pipenv/blob/babd428d8ee3c5caeb818d746f715c02f338839b/pipenv/routines/install.py#L262>`__.
No additional logic was identified that would cause backwards
compatibility issues when introducing the ``provenance_url.json`` file in the
``.dist-info`` directory.

installer
~~~~~~~~~

installer does not create a ``provenance_url.json`` file explicitly.
Nevertheless, as per the :ref:`packaging:recording-installed-packages`
specification, installer allows passing the ``additional_metadata`` argument to
create a file in the ``.dist-info`` directory (see `the source code
<https://github.com/pypa/installer/blob/f89b5d93a643ef5e9858a6e3f450c83a57bbe1f1/src/installer/_core.py#L67>`__).
To avoid any backwards compatibility issues, any library or tool using
installer must not request creating the ``provenance_url.json`` file using the
mentioned ``additional_metadata`` argument.
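
To illustrate this constraint, a caller of installer could guard the mapping it
passes as ``additional_metadata`` before invoking the library. This is only a
sketch: the helper ``check_additional_metadata`` is hypothetical and not part
of installer, and it assumes that ``additional_metadata`` is a mapping from
file names to file content as bytes.

```python
from typing import Mapping

# File name that callers must not request through additional_metadata.
RESERVED_NAME = "provenance_url.json"


def check_additional_metadata(
    additional_metadata: Mapping[str, bytes],
) -> Mapping[str, bytes]:
    """Return the mapping unchanged, or raise if it uses the reserved name."""
    if RESERVED_NAME in additional_metadata:
        raise ValueError(
            f"{RESERVED_NAME} must not be passed as additional metadata"
        )
    return additional_metadata
```

A tool would call ``check_additional_metadata(...)`` on its mapping and pass the
returned value on to installer, so that a conflicting entry fails early instead
of silently producing the file.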

Poetry
~~~~~~

The installation logic in `Poetry <poetry_homepage_>`_ depends on the
``installer.modern-installation`` configuration option (`see docs
<https://python-poetry.org/docs/configuration#installermodern-installation>`__).

When the ``installer.modern-installation`` configuration option is set
to ``false``, Poetry uses `pip for installing Python package distributions
<https://github.com/python-poetry/poetry/blob/2b15ce10f02b0c6347fe2f12ae902488edeaaf7c/src/poetry/installation/executor.py#L543-L544>`__.

When the ``installer.modern-installation`` configuration option is
set to ``true``, Poetry uses `installer to install Python package distributions
<https://github.com/python-poetry/poetry/blob/2b15ce10f02b0c6347fe2f12ae902488edeaaf7c/src/poetry/installation/wheel_installer.py#L99-L109>`__.
As can be seen from the linked sources, no additional metadata file named
``provenance_url.json`` is passed that would cause compatibility
issues with this PEP.

Conda
~~~~~

`Conda <conda_homepage_>`_ does not create any ``provenance_url.json`` file
`when Python package distributions are installed
<https://github.com/conda/conda/blob/86e83925e17c68233ac659633bdc4d76b05a245a/conda/common/pkg_formats/python.py#L370-L390>`__.

Hatch
~~~~~

`Hatch <hatch_homepage_>`_ uses pip `to install project dependencies
<https://github.com/pypa/hatch/blob/dd6e9545a355a0b5b58e065b489c1ef087e3bcaf/src/hatch/env/system.py#L28-L29>`__.

micropipenv
~~~~~~~~~~~

As `micropipenv <micropipenv_homepage_>`_ is a wrapper on top of pip, it uses
pip to install Python distributions, both for `lock files
<https://github.com/thoth-station/micropipenv/blob/8176862ec96df23e152938659d6f45645246e398/micropipenv.py#L393>`__
and for `requirements files
<https://github.com/thoth-station/micropipenv/blob/8176862ec96df23e152938659d6f45645246e398/micropipenv.py#L977>`__.

Thamos
~~~~~~

`Thamos <thamos_homepage_>`_ uses micropipenv `to install Python package
distributions
<https://github.com/thoth-station/thamos/blob/234351025c77cfe28b0df07f7ee017469b57d3f4/thamos/lib.py#L1290>`__,
hence any findings for micropipenv apply to Thamos as well.

PDM
~~~

`Project PDM <pdm_homepage_>`_ uses installer `to install binary distributions
<https://github.com/pdm-project/pdm/blob/d39a8e5b36c37093ea31e666d0e55fe21b38c16b/src/pdm/installers/installers.py#L241>`__.
The only additional metadata file it may create in the ``.dist-info``
directory is `the REFER_TO file
<https://github.com/pdm-project/pdm/blob/d39a8e5b36c37093ea31e666d0e55fe21b38c16b/src/pdm/installers/installers.py#L197>`__.

Compatibility with direct_url.json
----------------------------------

This proposal does not make any changes to the ``direct_url.json`` file
described in :pep:`610` and :ref:`its corresponding canonical PyPA spec
<direct-url>`.

The content of the ``provenance_url.json`` file was designed to allow
installers to reuse some of the logic supporting ``direct_url.json`` when a
direct URL refers to a source archive or a wheel.

The main difference between the ``provenance_url.json`` and ``direct_url.json``
files lies in the mandatory keys and their values in the
``provenance_url.json`` file. This helps ensure that consumers of the
``provenance_url.json`` file can rely on its content whenever the file is
present in the ``.dist-info`` directory.
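
A consumer relying on mandatory keys might validate the file along the
following lines. This is a hedged sketch rather than the PEP's reference
implementation: the key names ``url``, ``archive_info``, and ``hashes`` are
assumptions modeled on the ``direct_url.json`` layout, the PEP's Specification
section remains authoritative, and ``load_provenance`` is a hypothetical helper.

```python
import json
from typing import Any, Dict


def load_provenance(text: str) -> Dict[str, Any]:
    """Parse provenance_url.json content and check the assumed mandatory keys."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("provenance_url.json must contain a JSON object")
    # Assumed mandatory key: the URL the distribution was obtained from.
    if not isinstance(data.get("url"), str):
        raise ValueError("missing or invalid 'url'")
    # Assumed mandatory key: archive_info with a non-empty 'hashes' mapping.
    archive_info = data.get("archive_info")
    if not isinstance(archive_info, dict):
        raise ValueError("missing or invalid 'archive_info'")
    hashes = archive_info.get("hashes")
    if not isinstance(hashes, dict) or not hashes:
        raise ValueError("'archive_info.hashes' must be a non-empty object")
    return data
```

Failing loudly on a missing key reflects the intent described above: when the
file is present, its content should be dependable without further guessing.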

Security Implications
=====================

@@ -408,8 +538,12 @@ Availability of the provenance_url.json file in Conda
-----------------------------------------------------

We would like to get feedback on the ``provenance_url.json`` file from Conda
maintainers. It is not clear whether Conda would like to adopt
the ``provenance_url.json`` file.
maintainers. It is not clear whether Conda would like to adopt the
``provenance_url.json`` file. Conda already stores provenance-related
information (similar to the provenance information proposed in this PEP) in
JSON files located in the ``conda-meta`` directory `following its actions
during installation
<https://conda.io/projects/conda/en/latest/dev-guide/deep-dives/install.html>`__.

Using provenance_url.json in downstream installers
--------------------------------------------------
@@ -438,6 +572,22 @@ References

.. _pip_installation_report: https://pip.pypa.io/en/stable/reference/installation-report/

.. _pdm_homepage: https://pdm.fming.dev/

.. _poetry_homepage: https://python-poetry.org/

.. _pipenv_homepage: https://pipenv.pypa.io/

.. _conda_homepage: https://docs.conda.io/

.. _distlib_homepage: https://distlib.readthedocs.io/

.. _micropipenv_homepage: https://github.com/thoth-station/micropipenv

.. _thamos_homepage: https://github.com/thoth-station/thamos/

.. _hatch_homepage: https://hatch.pypa.io/

Acknowledgements
================
