use CUDA 13.0.1 CI images #1353

Merged
rapids-bot[bot] merged 3 commits into rapidsai:branch-25.10 from jameslamb:cuda13.0.1 on Sep 24, 2025

@jameslamb
Member

RAPIDS recently started building and testing against CUDA 13.0.1:

This updates hard-coded 13.0.0 references in some CI jobs to 13.0.1.

@jameslamb added the improvement (Improves an existing functionality) and non-breaking (Introduces a non-breaking change) labels on Sep 22, 2025
@copy-pr-bot

copy-pr-bot Bot commented Sep 22, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@jameslamb
Member Author

/ok to test

- '12.9.1'
- '13.0.0'
- *latest_cuda12
- *latest_cuda13
Member Author

GitHub just recently announced support for YAML anchors: https://github.blog/changelog/2025-09-18-actions-yaml-anchors-and-non-public-workflow-templates/

This seems like a useful application for them.
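
As a rough illustration of that idea (a sketch, not the actual workflow changed in this PR): an anchor (`&name`) labels a value once, and an alias (`*name`) reuses it elsewhere in the same file. The workflow name, job names, and the `cuda_version` matrix key below are hypothetical; only the anchor names `latest_cuda12` / `latest_cuda13` and the version strings come from this thread.

```yaml
# Hypothetical workflow: job names and the 'cuda_version' matrix key
# are illustrative assumptions, not taken from the cuvs repository.
name: anchors-sketch
on: push
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        cuda_version:
          # Define each version once and attach an anchor to it.
          - &latest_cuda12 '12.9.1'
          - &latest_cuda13 '13.0.1'
    steps:
      - run: echo "building against CUDA ${{ matrix.cuda_version }}"
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        cuda_version:
          # Aliases expand to the anchored values defined above.
          - *latest_cuda12
          - *latest_cuda13
    steps:
      - run: echo "testing against CUDA ${{ matrix.cuda_version }}"
```

With a layout like this, a later patch-version bump would only need to touch the anchored lines, since every alias picks up the new value.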

@jameslamb changed the title from "WIP: use CUDA 13.0.1 CI images" to "use CUDA 13.0.1 CI images" on Sep 22, 2025
@jameslamb marked this pull request as ready for review on September 22, 2025 18:11
@jameslamb requested a review from a team as a code owner on September 22, 2025 18:11
@jameslamb
Member Author

All Python tests are failing, across all combinations (amd64 and arm64, CUDA 12 and 13, conda and wheels), with errors like this:

FAILED python/cuvs/cuvs/tests/test_brute_force.py::test_prefiltered_brute_force_knn[bitmap-float32-True-sqeuclidean-0.01-32-100-100-100] - AssertionError: 
Not equal to tolerance rtol=0.001, atol=0.001

nan location mismatch:
 ACTUAL: array([13.072728,       inf,       inf,       inf,       inf,       inf,
             inf,       inf,       inf,       inf,       inf,       inf,
             inf,       inf,       inf,       inf,       inf,       inf,...
 DESIRED: array([13.072731,       nan,       nan,       nan,       nan,       nan,
             nan,       nan,       nan,       nan,       nan,       nan,
             nan,       nan,       nan,       nan,       nan,       nan,...

(wheel-tests-cuvs link)

I don't believe this is related to the changes in this PR.

It looks like this was happening on other PRs before rapidsai/shared-workflows#423 was merged, so it is probably not related to a CUDA 13.0.0 vs. 13.0.1 difference. For example, from 3 days ago: https://github.com/rapidsai/cuvs/actions/runs/17864860319/job/50807276494

@jameslamb
Member Author

Saw some offline discussion with @cjnolet and @achirkin suggesting this may be caused by these changes in RAFT: rapidsai/raft#2807

@achirkin
Contributor

The likely fix is rapidsai/raft#2814; I am currently testing against it in #1356.

@jameslamb removed the request for review from KyleFromNVIDIA on September 24, 2025 14:46
@jameslamb
Member Author

/merge

@rapids-bot (bot) merged commit b6a09dd into rapidsai:branch-25.10 on Sep 24, 2025
161 of 164 checks passed
@jameslamb deleted the cuda13.0.1 branch on September 24, 2025 17:05