Description
In mid-26.04, RAPIDS was building its wheels with v13.1.1 of the CUDA toolkit (including libnvJitLink 13.1.1) and directly linking against libnvJitLink for JIT-LTO (example: rapidsai/cuvs#1405).
This resulted in runtime issues in environments with v13.0.x of the CTK, like this:
libcugraph.so: undefined symbol: __nvJitLinkGetErrorLog_13_1, version libnvJitLink.so.13
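One way to reproduce this class of failure outside of RAPIDS is to probe the installed library for the versioned symbol directly. The following is a minimal sketch using `ctypes`; the helper name `has_symbol` is hypothetical, and the library name `libnvJitLink.so.13` is assumed from the error message above (the actual file name and location depend on how the CTK was installed).

```python
import ctypes


def has_symbol(library: str, symbol: str) -> bool:
    """Return True if `library` can be loaded and exports `symbol`.

    Hypothetical helper for illustration only; it is not part of RAPIDS.
    """
    try:
        lib = ctypes.CDLL(library)
    except OSError:
        # The library is not installed or not on the loader search path.
        return False
    try:
        # Indexing a CDLL looks the symbol up and raises AttributeError
        # when the shared object does not export it.
        lib[symbol]
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    # Symbol name taken from the error message above; the library name
    # is an assumption about the runtime environment.
    found = has_symbol("libnvJitLink.so.13", "__nvJitLinkGetErrorLog_13_1")
    print(f"__nvJitLinkGetErrorLog_13_1 available: {found}")
```

In an environment with a 13.0.x libnvJitLink, this check would be expected to report the symbol as unavailable, which matches the `undefined symbol` error at import time.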
Requiring nvidia-nvjitlink>=13.1 at runtime would solve those issues, but it'd also make RAPIDS wheels incompatible with cuda-toolkit[nvjitlink]<13.1, which torch 2.10 (the latest release) pins to:
- WIP: wheels CI: stricter torch index selection, test oldest versions of dependencies cugraph-gnn#413
- (pytorch/pytorch - .github/scripts/generate_binary_build_matrix.py)
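The conflict between the two constraints can be sketched with a small stand-alone check. This is a simplified illustration, not how `pip` resolves dependencies: versions are compared as integer tuples rather than with full PEP 440 semantics, the candidate version numbers are hypothetical, and the `<13.1` ceiling stands in for the effect of the `torch` pin described above.

```python
import operator

# Simplified comparison operators; real resolvers follow PEP 440.
OPS = {">=": operator.ge, "<": operator.lt}


def parse(version: str) -> tuple:
    """Turn '13.0.88' into (13, 0, 88) for tuple comparison."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, constraints: list) -> bool:
    """Return True if `version` meets every (operator, bound) pair."""
    return all(OPS[op](parse(version), parse(bound)) for op, bound in constraints)


# Hypothetical nvidia-nvjitlink versions, for illustration only.
candidates = ["13.0.39", "13.0.88", "13.1.80"]

torch_ceiling = [("<", "13.1")]       # effect of the torch pin (assumed form)
floor_built_on_13_1 = [(">=", "13.1")]  # floor needed if wheels link against 13.1
floor_built_on_13_0 = [(">=", "13.0")]  # floor needed if wheels link against 13.0

# No candidate satisfies both the 13.1 floor and the torch ceiling.
conflicting = [v for v in candidates if satisfies(v, floor_built_on_13_1 + torch_ceiling)]

# Building against 13.0 lowers the floor, so 13.0.x versions satisfy both.
relaxed = [v for v in candidates if satisfies(v, floor_built_on_13_0 + torch_ceiling)]

print("floor 13.1 + torch ceiling:", conflicting)
print("floor 13.0 + torch ceiling:", relaxed)
```

Under these assumptions, the first list is empty and the second contains only the 13.0.x candidates, which is the motivation for lowering the build-time CTK version rather than raising the runtime floor.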
In an offline discussion with @bdice @vyasr and @divyegala we discussed and tried several options (example: rapidsai/cuvs#1855), and decided to try building RAPIDS wheels against CTK 13.0.x for RAPIDS 26.04, to avoid losing compatibility with projects tightly pinned to earlier nvJitLink versions.
This tracks that work.
Benefits of this work
- allows RAPIDS to continue adopting JIT-LTO while also staying compatible with `torch` and other projects tightly pinning to earlier `nvidia-nvjitlink` versions
Acceptance Criteria
- all RAPIDS libraries build wheels against CTK 13.0
- RAPIDS conda builds continue to build against the latest CUDA 13 CTK RAPIDS supports (as of this writing, 13.1.1)
- RAPIDS devcontainers continue to support the latest CUDA 13 CTK RAPIDS supports
- RAPIDS CUDA 12 wheels continue to build against the latest CUDA 12 CTK RAPIDS supports (as of this writing, 12.9.1)
- `cugraph-gnn` wheels CI is successfully testing against CUDA 12 and CUDA 13 `torch` wheels
Approach
- `ci-imgs` changes
  - start building CTK 13.0.2 images again (ci-wheel: restore CUDA 13.0.2 and 12.2.2 images ci-imgs#373)
  - set `ci-wheel:{rapids-version}-latest` to 13.0.2 (ci-wheel: drop CUDA 12.2.2 images, move latest back to 13.0.2 ci-imgs#386)
- `shared-workflows` changes (wheels-build: build on CUDA 13.0 shared-workflows#510)
- library changes, in RAPIDS dependency order (paired with wheels CI: test mix of `cuda-toolkit` version in CI #256)
  - rmm (build wheels with CUDA 13.0.x, test wheels against mix of CTK versions rmm#2270)
  - kvikio (build wheels with CUDA 13.0.x, test wheels against mix of CTK versions kvikio#942)
  - `dask-cuda` not necessary: pure Python
  - ucxx (build wheels with CUDA 13.0.x, test wheels against mix of CTK versions ucxx#604)
  - raft (build wheels with CUDA 13.0.x, test wheels against mix of CTK versions raft#2971)
  - cuvs (enforce a floor on libnvjitlink, build wheels with CUDA 13.0.x, test wheels against mix of CTK versions cuvs#1862)
  - nvforest (build wheels with CUDA 13.0.x, test wheels against mix of CTK versions, drop CUDA math libraries dependencies nvforest#87)
  - cudf (enforce a floor on libnvjitlink, build wheels with CUDA 13.0.x, test wheels against mix of CTK versions cudf#21671)
  - cuopt (build wheels with CUDA 13.0.x, test wheels against mix of CTK versions NVIDIA/cuopt#973)
  - cucim (build wheels with CUDA 13.0.x, test wheels against mix of CTK versions cucim#1054)
  - cuxfilter (build wheels with CUDA 13.0.x, test wheels against mix of CTK versions cuxfilter#777)
  - rapidsmpf (build wheels with CUDA 13.0.x, test wheels against mix of CTK versions rapidsmpf#919)
  - cugraph
    - enforce a floor on libnvjitlink, build wheels with CUDA 13.0.x, test wheels against mix of CTK versions cugraph#5457
    - removed hacked-in `pip install nvidia-nvjitlink` (ref: ensure 'torch' CUDA wheels are installed in CI, remove unused dependencies cugraph#5453 (comment))
  - cuml (build wheels with CUDA 13.0.x, test wheels against mix of CTK versions cuml#7907)
  - cugraph-gnn
    - WIP: wheels CI: stricter torch index selection, test oldest versions of dependencies cugraph-gnn#413
    - fix `torch` dependency handling (ensure 'torch' CUDA wheels are installed in CI, test that 'torch' is an optional dependency cugraph-gnn#425)
    - test against a mix of versions, remove hacked-in `pip install nvidia-nvjitlink` (wheels: build with CUDA 13.0, test against mix of CTK versions, make 'torch-geometric' fully optional for 'cugraph-pyg' cugraph-gnn#434)
  - `nx-cugraph` not necessary: pure Python
- revert `ci-imgs` CUDA 12.2.2 images if we end up not needing them (ci-wheel: drop CUDA 12.2.2 images, move latest back to 13.0.2 ci-imgs#386)
- fix any forward-merge conflicts from `release/26.04 -> main`
Notes
N/A