Use sccache-dist build cluster for conda and wheel builds #542
Conversation
conda/recipes/ucxx/recipe.yaml (outdated)
SCCACHE_S3_USE_SSL: ${{ env.get("SCCACHE_S3_USE_SSL") }}
SCCACHE_S3_NO_CREDENTIALS: ${{ env.get("SCCACHE_S3_NO_CREDENTIALS") }}
SCCACHE_S3_KEY_PREFIX: libucxx/${{ env.get("RAPIDS_CONDA_ARCH") }}/cuda${{ cuda_major }}
NVCC_APPEND_FLAGS: ${{ env.get("NVCC_APPEND_FLAGS", default="") }}
Why are we adding NVCC flags here? UCXX shouldn't need NVCC to compile.
I made the conda recipe envvar updates consistent across all the PRs, mostly because I wasn't sure which recipes use which compilers.
When using the build cluster, the rapids-configure-sccache script from gha-tools sets PARALLEL_LEVEL=<a_very_large_number> and NVCC_APPEND_FLAGS=-t=100 to maximize total parallelism.
Since it's just an envvar, it doesn't hurt to include it here. If UCXX ever does add any nvcc targets, you won't have to worry about needing to update the recipe in this way.
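To make the behavior described above concrete, here is a minimal illustrative sketch of exporting those two variables before a build. This is not the actual rapids-configure-sccache script from gha-tools; the `PARALLEL_LEVEL` value below is a placeholder (the real script uses a very large number), while `-t=100` is the flag mentioned above (nvcc's threads option, which parallelizes compilation across device architectures).

```shell
# Hypothetical sketch only -- the real rapids-configure-sccache script
# sets these (with a much larger PARALLEL_LEVEL) before invoking the build.
export PARALLEL_LEVEL=100          # placeholder; fan many jobs out to the cluster
export NVCC_APPEND_FLAGS="-t=100"  # nvcc: use up to 100 threads when compiling for multiple archs
echo "PARALLEL_LEVEL=${PARALLEL_LEVEL} NVCC_APPEND_FLAGS=${NVCC_APPEND_FLAGS}"
```

Because these are plain environment variables, a recipe that never invokes nvcc simply ignores `NVCC_APPEND_FLAGS`, which is why including it uniformly is harmless.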
Does anyone know what's going on here? I wasn't seeing this test fail a few days ago, but now it happens consistently, and AFAIK I didn't change anything that should cause this. Maybe a UCX update broke something?
This reverts commit ed0e835.
I think the issue is that:

(base) root@831ed948ef31:/# pip install --extra-index-url https://pypi.anaconda.org/rapidsai-wheels-nightly/simple 'cudf-cu13==25.12.*,>=0.0.0a0'
(base) root@831ed948ef31:/# python
Python 3.13.9 | packaged by conda-forge | (main, Oct 22 2025, 23:33:35) [GCC 14.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cudf
>>> cudf.datasets.timeseries()
id name x y
timestamp
2000-01-01 00:00:00 <NA> <NA> <NA> <NA>
2000-01-01 00:00:01 <NA> <NA> <NA> <NA>
2000-01-01 00:00:02 <NA> <NA> <NA> <NA>
2000-01-01 00:00:03 <NA> <NA> <NA> <NA>
2000-01-01 00:00:04 <NA> <NA> <NA> <NA>
... ... ... ... ...
2000-01-30 23:59:56 <NA> <NA> <NA> <NA>
2000-01-30 23:59:57 <NA> <NA> <NA> <NA>
2000-01-30 23:59:58 <NA> <NA> <NA> <NA>
2000-01-30 23:59:59 <NA> <NA> <NA> <NA>
2000-01-31 00:00:00 <NA> <NA> <NA> <NA>
[2592001 rows x 4 columns]
Should be fixed by rapidsai/cudf@7d54b71
Merged after discussing with @vyasr, who said cudf is unlikely to have a fix today. For context, this is the exact backtrace of the failing tests:
rapidsai/cudf#20709 should fix the failing UCXX test |
Description
RAPIDS has deployed an autoscaling cloud build cluster that can be used to accelerate building large RAPIDS projects.
This PR updates the conda and wheel builds to use the build cluster.
This contributes to rapidsai/build-planning#228.