build wheels with CUDA 13.0.x, test wheels against mix of CTK versions #2971

Merged
rapids-bot[bot] merged 19 commits into rapidsai:release/26.04 from jameslamb:test-older-ctk
Mar 18, 2026

@jameslamb
Member

@jameslamb jameslamb commented Mar 3, 2026

Contributes to rapidsai/build-planning#257

  • builds CUDA 13 wheels with the 13.0 CTK
  • ensures wheels ship with a runtime dependency of nvidia-nvjitlink>={whatever-minor-version-they-were-built-against}
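A minimal sketch of how such a floor could be derived from the CTK version used at build time. The function name `nvjitlink_floor` is hypothetical, and the `>=major.minor,<next-major` shape is an assumption based on the pins discussed in review; this is an illustration, not how the repository produces its pins.

```python
def nvjitlink_floor(ctk_version: str) -> str:
    """Return a pip requirement floored at the CTK minor version built against.

    Hypothetical helper for illustration only.
    """
    parts = ctk_version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    # floor at the build-time minor version, ceiling at the next major version
    return f"nvidia-nvjitlink>={major}.{minor},<{major + 1}"


print(nvjitlink_floor("13.0.2"))  # nvidia-nvjitlink>=13.0,<14
```

Building against 13.0 instead of 13.1 lowers the floor, which is what lets the same wheel be installed alongside older 13.x toolkits.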

Contributes to rapidsai/build-planning#256

  • updates wheel tests to cover a range of CTK versions (we previously, accidentally, were only testing the latest 12.x and 13.x)

Other changes

  • ensures conda packages also take on floors of libnvjitlink>={whatever-version-they-were-built-against}

Notes for Reviewers

How I tested this

This uses wheels from similar PRs from RAPIDS dependencies, at build and test time:

@jameslamb jameslamb added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Mar 3, 2026
@copy-pr-bot

copy-pr-bot bot commented Mar 3, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

rapids-bot bot pushed a commit to rapidsai/gha-tools that referenced this pull request Mar 5, 2026
Contributes to rapidsai/build-planning#256

`rapids-generate-pip-constraints` currently special-cases `RAPIDS_DEPENDENCIES="latest"` and skips generating constraints in that case.

This will be helpful in rapidsai/build-planning#256, where we want to start constraining `cuda-toolkit` in wheels CI based on the CTK version in the CI image being used.

## Notes for Reviewers

### How I tested this

Looked for projects using this ([GitHub search](https://github.com/search?q=org%3Arapidsai+language%3AShell+%22rapids-generate-pip-constraints%22+AND+NOT+is%3Aarchived+&type=code)) and tested in them.

It's just a few:

* [ ] cudf (rapidsai/cudf#21639)
* [ ] cuml (rapidsai/cuml#7853)
* [ ] dask-cuda (rapidsai/dask-cuda#1632)
* [ ] nvforest (rapidsai/nvforest#62)
* [ ] raft (rapidsai/raft#2971)
* [ ] rmm (rapidsai/rmm#2270)

On all of those, wheels CI jobs worked exactly as expected and without needing any code changes or `dependencies.yaml` updates... so this PR is safe to merge any time.

### Is this safe?

It should be (see "How I tested this").

This is only used to add **constraints** (not requirements), so it shouldn't change our ability to catch problems like "forgot to declare a dependency" in CI.

It WILL increase the risk of `[test]` extras being underspecified. For example, if `cuml[test]` has `scikit-learn>=1.3` and the constraints have `scikit-learn>=1.5`, we might never end up testing `scikit-learn>=1.3,<1.5` (unless it's explicitly accounted for in a `dependencies: "oldest"` block).
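A rough sketch of that gap, using only the standard library. The list of available versions is made up for illustration, and the simple tuple comparison stands in for real version-specifier handling.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn '1.5.0' into (1, 5, 0) for simple comparisons."""
    return tuple(int(p) for p in version.split("."))


def at_least(floor: str):
    """Build a predicate equivalent to '>=floor' (simplified)."""
    floor_t = parse(floor)
    return lambda v: parse(v) >= floor_t


# hypothetical set of published versions, for illustration only
available = ["1.3.2", "1.4.2", "1.5.0", "1.6.1"]

requirement = at_least("1.3")  # floor declared in cuml[test]
constraint = at_least("1.5")   # floor imposed by the constraints file

allowed_by_requirement = [v for v in available if requirement(v)]
installable = [v for v in allowed_by_requirement if constraint(v)]
never_tested = [v for v in allowed_by_requirement if v not in installable]

print(never_tested)  # ['1.3.2', '1.4.2']
```

The versions in `never_tested` satisfy the declared requirement but can never be selected while the constraint is active.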

The other risk here is that this creates friction because constraints passed to `--constraint` cannot contain extras. So e.g. if you want to depend on `xgboost[dask]`, that cannot be in any of the lists generated by `rapids-generate-pip-constraints`. I think we can work around that though when we hit those cases.
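One possible workaround, sketched here under assumptions: strip the extras before a line goes into the constraints file, and keep requesting the extra in the requirements themselves. The helper name `to_constraint` and the `>=2.0` floor are hypothetical; nothing above says which workaround would actually be used.

```python
import re

# matches an extras group such as "[dask]" directly after the project name
_EXTRAS = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*\[[^\]]*\]")


def to_constraint(requirement: str) -> str:
    """Drop extras so the line is acceptable in a constraints file.

    Hypothetical helper for illustration only.
    """
    return _EXTRAS.sub(r"\1", requirement, count=1)


print(to_constraint("xgboost[dask]>=2.0"))  # xgboost>=2.0
print(to_constraint("scikit-learn>=1.5"))   # scikit-learn>=1.5
```

The version bound still applies to the base package; the `[dask]` extra would have to be requested wherever the package is listed as a requirement.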

Overall, I think these are acceptable tradeoffs.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #247
@jameslamb
Member Author

/ok to test

@jameslamb jameslamb added the DO NOT MERGE (Hold off on merging; see PR for details) label Mar 18, 2026
@jameslamb jameslamb changed the title from "WIP: build wheels with CUDA 13.0.x, test wheels against mix of CTK versions" to "build wheels with CUDA 13.0.x, test wheels against mix of CTK versions" Mar 18, 2026
@jameslamb jameslamb marked this pull request as ready for review March 18, 2026 05:10
@jameslamb jameslamb requested review from a team as code owners March 18, 2026 05:10
@jameslamb jameslamb requested a review from bdice March 18, 2026 05:10
Comment on lines +210 to +214
- if: cuda_major == "13"
  then:
    # always want libnvJitLink >= whatever was built against
    # ref: https://docs.nvidia.com/cuda/nvjitlink/index.html#compatibility
    - ${{ pin_compatible("libnvjitlink", lower_bound="x.x.x", upper_bound="x") }}
Contributor

We added nvjitlink to RAFT to work around a bug in CUDA WHEEL packaging. #2948

There is no direct usage of nvjitlink in RAFT.

There is no conda dependency needed here at all, conda-forge packaging is already correct. Everything touching conda should be reverted.

Member Author

@jameslamb jameslamb Mar 18, 2026

Ah ok, I misunderstood that.

libnvjitlink is in the libraft host environment

...
 │ │ │ libnvjitlink                ┆ 13.2.51     ┆ hecca717_0                        ┆ conda-forge      ┆   30.22 MiB │
...

(recent conda-cpp-build CUDA 13.1 build link)

But I get that there's no direct usage here, and it's not showing up in the runtime dependencies.

I'll revert this.

Member Author

did this in 651edef

    - *cuda_toolkit_any_cu13
    - &nvjitlink_cu13 nvidia-nvjitlink>=13.1,<14
- matrix:
    cuda: "13.*"
Contributor

Why do we need three matrices for 13.0, 13.1, and 13.*?

I think it should be fine to have one 13.* matrix that requires nvidia-nvjitlink>=13.0,<14 if we're building with 13.0.

Member Author

These groups make the coupling between the CTK version we're building against and the nvidia-nvjitlink floor explicit.

  • having a catch-all 13.* because we happen to be building against 13.0 for now increases the risk that we'll accidentally build wheels in the future with a too-low nvidia-nvjitlink pin
  • not having a 13.1 means that if we switch back to building against CTK 13.1, we'll need new PRs to all the repos to change these floors

Although I guess we already end up with a fallback matrix anyway to populate pyproject.toml (for documentation purposes) so we're already in that position of needing to remember to update this.

I'll take this suggestion and make it a single 13.* (and do that for all the other PRs). Maybe we can find a better and stricter mix as a follow-up.
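The tradeoff in this thread can be sketched as follows. This is a simplified, first-match model with hypothetical names (`THREE_MATRICES`, `SINGLE_MATRIX`, `select_pin`), not the actual logic used to resolve `dependencies.yaml`.

```python
from fnmatch import fnmatch

# hypothetical matrix entries, ordered most-specific first
THREE_MATRICES = [
    ("13.0", "nvidia-nvjitlink>=13.0,<14"),
    ("13.1", "nvidia-nvjitlink>=13.1,<14"),
    ("13.*", "nvidia-nvjitlink>=13.0,<14"),
]

# the single catch-all suggested in review
SINGLE_MATRIX = [
    ("13.*", "nvidia-nvjitlink>=13.0,<14"),
]


def select_pin(cuda: str, matrices) -> str:
    """Return the pin of the first matrix whose pattern matches `cuda`."""
    for pattern, pin in matrices:
        if fnmatch(cuda, pattern):
            return pin
    raise LookupError(f"no matrix matches CUDA {cuda}")


# building against 13.1 with explicit matrices picks up the 13.1 floor
print(select_pin("13.1", THREE_MATRICES))  # nvidia-nvjitlink>=13.1,<14
# with only the catch-all, the same build keeps the 13.0 floor
print(select_pin("13.1", SINGLE_MATRIX))   # nvidia-nvjitlink>=13.0,<14
```

With the single catch-all, switching the build back to CTK 13.1 would require remembering to raise the floor by hand, which is the risk raised above.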

Member Author

did this in 28a72c5

Contributor

> having a catch-all 13.* because we happen to be building against 13.0 for now increases the risk that we'll accidentally build wheels in the future with a too-low nvidia-nvjitlink pin

Hopefully this is not an issue for very long if we can get cuda-toolkit pinnings loosened.

I think this outcome is clearer -- though it is a bit more "hardcoded", we don't know exactly what future we're pointing towards until we know whether cuda-toolkit pinnings will be loosened.

Member Author

yep fair enough, I'll apply this in the other PRs in this series, thanks

@jameslamb jameslamb removed the DO NOT MERGE (Hold off on merging; see PR for details) label Mar 18, 2026
@jameslamb
Member Author

/merge

@rapids-bot rapids-bot bot merged commit 048aa19 into rapidsai:release/26.04 Mar 18, 2026
152 of 154 checks passed
rapids-bot bot pushed a commit to rapidsai/cuvs that referenced this pull request Mar 19, 2026
…wheels against mix of CTK versions (#1862)

The changes from #1405 introduced linking against nvJitLink. nvJitLink has versioned symbols that are added in each new CTK release, and some of those are exposed in `libcuvs.so`.

`libcuvs` wheels are built against the latest CTK supported in RAPIDS (CUDA 13.1.1 as of this writing), so when those wheels are used in environments with older nvJitLink, runtime errors like this can happen:

> libcugraph.so: undefined symbol: __nvJitLinkGetErrorLog_13_1, version libnvJitLink.so.13

For more details, see rapidsai/cugraph#5443
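A simplified sketch of why the error above appears. It assumes that the numeric suffix of a versioned symbol such as `__nvJitLinkGetErrorLog_13_1` names the first CTK minor release that provides it; the helper names are hypothetical, and real symbol resolution is done by the dynamic linker, not by code like this.

```python
import re


def required_version(symbol: str) -> tuple[int, int]:
    """Extract (major, minor) from a symbol like __nvJitLinkGetErrorLog_13_1."""
    match = re.search(r"_(\d+)_(\d+)$", symbol)
    if match is None:
        raise ValueError(f"no version suffix in {symbol!r}")
    return int(match.group(1)), int(match.group(2))


def can_resolve(symbol: str, installed: str) -> bool:
    """Assume a symbol resolves only in the same major, at or above its minor."""
    major, minor = (int(p) for p in installed.split(".")[:2])
    req_major, req_minor = required_version(symbol)
    return major == req_major and minor >= req_minor


symbol = "__nvJitLinkGetErrorLog_13_1"
print(can_resolve(symbol, "13.0"))     # False
print(can_resolve(symbol, "13.2.51"))  # True
```

A wheel built against 13.1 references the `_13_1` symbol, so an environment with nvJitLink 13.0 cannot load it; building against 13.0 and declaring a matching floor avoids that mismatch.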

This tries to fix that.

Contributes to rapidsai/build-planning#257

* builds CUDA 13 wheels with the 13.0 CTK
* ensures CUDA 13 wheels ship with a runtime dependency of `nvidia-nvjitlink>={whatever-minor-version-they-were-built-against}`

Contributes to rapidsai/build-planning#256

* updates wheel tests to cover a range of CTK versions (we previously, accidentally, were only testing the latest 12.x and 13.x)

Other changes

* ensures conda packages also take on floors of `libnvjitlink>={whatever-minor-version-they-were-built-against}`

## Notes for Reviewers

### How I tested this

This uses wheels from similar PRs from RAPIDS dependencies, at build and test time:

* rapidsai/raft#2971
* rapidsai/rmm#2270
* rapidsai/ucxx#604

### Other Options

1. avoiding those versioned symbols with a build-time shim (#1855 does this, but hasn't been successful yet)
2. statically linking libnvJitLink (hasn't been successful yet)

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Gil Forsyth (https://github.com/gforsyth)

URL: #1862
rapids-bot bot pushed a commit to rapidsai/cugraph that referenced this pull request Mar 19, 2026
…wheels against mix of CTK versions (#5457)

Fixes #5443

Contributes to rapidsai/build-planning#257

* builds CUDA 13 wheels with the 13.0 CTK
* ensures wheels ship with a runtime dependency of `nvidia-nvjitlink>={whatever-minor-version-they-were-built-against}`

Contributes to rapidsai/build-planning#256

* updates wheel tests to cover a range of CTK versions (we previously, accidentally, were only testing the latest 12.x and 13.x)

Other changes

* ensures conda packages also take on floors of `libnvjitlink>={whatever-version-they-were-built-against}`

## Notes for Reviewers

### How I tested this

This uses wheels from similar PRs from RAPIDS dependencies, at build and test time:

* rapidsai/cudf#21671
* rapidsai/kvikio#942
* rapidsai/raft#2971
* rapidsai/rmm#2270
* rapidsai/ucxx#604

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Gil Forsyth (https://github.com/gforsyth)

URL: #5457
Labels

improvement Improvement / enhancement to an existing function non-breaking Non-breaking change

2 participants