
Consider publishing dask/distributed nightlies to our nightly pip index #85

@vyasr

Currently we use rapids-dask-dependency (RDD) to manage our dask pinnings across RAPIDS, both during the development cycle and at release time. Because dask publishes nightly conda packages but no nightly wheels, during the development cycle we point directly to git URLs in the pip metadata (for more details, see the RDD README). This approach has generally worked for us, but it has some serious drawbacks:

  • In general, pip's dependency resolver handles URLs far less intelligently than versioned wheels and will often reclone and rebuild a package unnecessarily.
  • DLFW's build pipeline builds pinned versions once and then installs them in a later step. It runs into the issue above because it simultaneously sees a built dask wheel and the git dependency from rapids-dask-dependency, and the former does not satisfy the latter. Because the later step is not allowed to download new wheels at all, this results in an outright failure.
  • The official spec (and its original source, PEP 440) says that public index servers should not accept direct URL references in uploaded distributions. PyPI rejects any package containing such dependencies, which in turn means that we would be blocked from publishing our nightlies on PyPI. This does not affect our release builds, since at that point we pin to a specific version instead.
  • uv does not support transitive URL dependencies, and this is documented as intentional behavior. This was discussed in a recent issue and seems unlikely to change any time soon. Since I expect uv usage to keep growing, we can reasonably expect users of our nightlies (perhaps only internal users to start, but still) to run into this limitation. The issue is easy to reproduce by attempting to install a nightly RAPIDS package that depends on rapids-dask-dependency:
(rapids) coder ➜ ~ $ uv pip install --extra-index-url https://pypi.anaconda.org/rapidsai-wheels-nightly/simple 'dask-cudf-cu12>=24.10.00a0' --dry-run --prerelease=allow
error: Package `dask` attempted to resolve via URL: git+https://github.com/dask/dask.git@main. URL dependencies must be expressed as direct requirements or constraints. Consider adding `dask @ git+https://github.com/dask/dask.git@main` to your dependencies or constraints file.
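
To make the URL-dependency problem concrete, here is a minimal sketch of how one might flag direct URL references in a list of requirement strings. It is not part of RDD or uv; the helper names and the versioned pin `dask==2024.7.1` are hypothetical, and the regex only approximates the `name @ url` form. A real check would more likely parse requirements with the `packaging` library and inspect `Requirement.url`.

```python
import re

# Approximate match for a direct reference: "name[extras] @ url".
# Assumption: this is a simplification of the full requirement grammar.
_DIRECT_REF = re.compile(
    r"^\s*[A-Za-z0-9][A-Za-z0-9._-]*\s*(\[[^\]]*\])?\s*@\s*\S+"
)


def is_direct_url_requirement(requirement: str) -> bool:
    """Return True if the requirement string uses the 'name @ url' form."""
    return _DIRECT_REF.match(requirement) is not None


def find_direct_url_requirements(requirements):
    """Return the subset of requirement strings that are direct URL references."""
    return [r for r in requirements if is_direct_url_requirement(r)]


if __name__ == "__main__":
    # The first entry is the dependency reported in the uv error above;
    # the second is a hypothetical versioned pin for comparison.
    example_metadata = [
        "dask @ git+https://github.com/dask/dask.git@main",
        "dask==2024.7.1",
    ]
    for req in find_direct_url_requirements(example_metadata):
        print("direct URL dependency:", req)
```

Only the first entry is flagged. That is the kind of dependency uv refuses to resolve transitively, whereas a versioned pin like the second can be satisfied by any matching wheel on an index.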

Based on the above concerns, I believe it is time for us to consider publishing dask nightly wheels to our nightly pip index. We have previously discussed having the dask project build these themselves, but the response has generally been that they would want us to maintain them, since they don't see much interest in such nightlies. We can restart that discussion if we think it's beneficial, but realistically I don't anticipate anything changing. Therefore, if we are going to build these wheels, I suggest that we build them in our own standalone repo and publish them to our own nightly index, so that it's clear they are only for use in our nightlies and not for general use. We should never upload them to our release index (or pypi.org). We now have precedent for building a wheel for an external project with the ucx-wheels repo. dask/distributed should be far easier to handle in this respect because they're pure Python, so there is little that is tricky about actually building the wheels.
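
One detail such a repo would need to settle is how nightly wheels are versioned so that they resolve like ordinary prereleases instead of URLs. The following is a hypothetical sketch, not a decided scheme: it assumes an alpha pre-release suffix derived from the build date, similar in spirit to the `a0` lower bound used in the install command above. The function name and the version values are illustrative only.

```python
from datetime import date


def nightly_version(next_release: str, build_date: date) -> str:
    """Build a nightly version string as an alpha pre-release.

    Assumption (hypothetical scheme): nightlies are published as alpha
    pre-releases of the upcoming release, numbered by build date (YYMMDD),
    so they sort before the final release and in date order.
    """
    return f"{next_release}a{build_date:%y%m%d}"


if __name__ == "__main__":
    # Illustrative values only; not actual dask release numbers.
    print(nightly_version("2024.8.1", date(2024, 8, 15)))
```

With versions of this shape, rapids-dask-dependency could depend on a versioned prerelease specifier during development, and installers would only pick up the nightlies when prereleases are allowed.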
