refactor: build wheels and conda packages using Python limited API#7758
refactor: build wheels and conda packages using Python limited API#7758rapids-bot[bot] merged 11 commits intorapidsai:mainfrom
Conversation
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdded matrix_filter to CI jobs and adjusted wheel publish lookup; introduced stable-ABI (abi3) wheel build support and related flags; switched artifact downloads to package-name-based resolution; exported RAPIDS_PACKAGE_NAME, RAPIDS_PY_API, and RAPIDS_PY_VERSION; updated conda recipe and install/publish logic for abi3 wheels. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@ci/build_docs.sh`:
- Line 8: The assignment to PYTHON_CHANNEL contains mismatched nested double
quotes around the call to rapids-package-name which breaks the shell syntax;
update the command substitution so nested quotes are either escaped or the inner
string uses single quotes (e.g., call rapids-package-name with 'conda_python' or
escape the inner quotes) so the invocation of rapids-download-from-github and
rapids-package-name executes correctly; adjust the line that sets PYTHON_CHANNEL
accordingly and keep the surrounding command substitution intact.
In `@ci/test_python_common.sh`:
- Line 14: The command assigning PYTHON_CHANNEL has broken quoting and
mismatched braces; update the call so rapids-package-name is invoked with single
quotes around conda_python and the subshell/parens are balanced, e.g. call
rapids-download-from-github with "$(rapids-package-name 'conda_python' cuml
--stable --cuda "$RAPIDS_CUDA_VERSION")" so the inner quotes no longer break the
outer quoted argument; modify the line that sets PYTHON_CHANNEL accordingly
(referencing PYTHON_CHANNEL, rapids-download-from-github, and
rapids-package-name).
🧹 Nitpick comments (2)
conda/recipes/cuml/recipe.yaml (2)
11-13: Consider parameterizing or documentingpy_runtime_latest.The
py_runtime_latest: "3.13"is hardcoded, which will require manual updates when new Python versions are released. Consider:
- Documenting this in a comment, or
- Fetching it from an environment variable with a fallback default
This is a minor maintenance consideration for the future.
121-126: Confirmpip_check: falseis intentional for abi3 packages.Disabling pip dependency checking (
pip_check: false) during tests may mask dependency issues. If this is intentional due to abi3/stable ABI constraints or known compatibility gaps, consider adding a brief comment explaining why.The use of
py_runtime_latestfor the test Python version is appropriate—testing on the latest Python validates abi3 forward compatibility.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@build.sh`:
- Around line 365-370: The pip install invocation is missing
CUML_EXTRA_PYTHON_ARGS so flags like --singlegpu and --codecov are dropped;
update the python -m pip install call to include the CUML_EXTRA_PYTHON_ARGS
variable alongside "${PYTHON_ARGS_FOR_INSTALL[@]}" and "${REPODIR}"/python/cuml,
ensuring SKBUILD_EXTRA_CMAKE_ARGS and SKBUILD_CMAKE_ARGS are unchanged and still
used for CMake arguments.
🧹 Nitpick comments (1)
build.sh (1)
359-363: Apply major.minor normalization toRAPIDS_PY_VERSIONbefore constructing the ABI tag.The current substitution
cp${RAPIDS_PY_VERSION//./}assumesRAPIDS_PY_VERSIONcontains only major.minor (e.g.,3.11). While the dependency matrix currently enforces this, the code is fragile—if a patch version were ever passed (e.g.,3.10.13), it would generate an invalid ABI tag (cp31013instead ofcp311). This same pattern appears in two files:build.sh(lines 359–363) andci/build_wheel_cuml.sh(line 39). Normalize both to explicitly extract major.minor before generating the ABI tag:Proposed fix
- RAPIDS_PY_API="cp${RAPIDS_PY_VERSION//./}" + RAPIDS_PY_VERSION_MM="$(cut -d. -f1,2 <<< "${RAPIDS_PY_VERSION}")" + RAPIDS_PY_API="cp${RAPIDS_PY_VERSION_MM//./}"
cfd76f3 to
9d7d93d
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
conda/recipes/cuml/recipe.yaml (1)
108-108:⚠️ Potential issue | 🟡 MinorAdd a lower bound on the Python run requirement for consistency with build requirements.
The
py_abi_minvariable is already defined in this recipe and used in the build requirements (lines 85-86), but the run requirements at line 108 lack this constraint. For abi3 packages, the runtime Python version should match the minimum ABI version established at build time:- - python + - python >=${{ py_abi_min }}This ensures conda respects the minimum Python version constraint and maintains consistency with the build-time requirements.
🤖 Fix all issues with AI agents
In `@ci/build_wheel.sh`:
- Around line 48-51: If the stable_abi flag is true but RAPIDS_PY_API is unset,
fail fast instead of silently skipping the setting: in the block that checks [[
"${stable_abi}" == "true" ]] (where RAPIDS_PIP_WHEEL_ARGS is appended), add a
guard that tests -z "${RAPIDS_PY_API:-}" and calls exit 1 after printing an
explanatory error to stderr (e.g., using >&2) so the build stops when --stable
is specified without RAPIDS_PY_API; otherwise continue to append
--config-settings="skbuild.wheel.py-api=${RAPIDS_PY_API}" to
RAPIDS_PIP_WHEEL_ARGS as before.
- Around line 72-73: The RAPIDS_PACKAGE_NAME is always computed with the literal
`--stable` flag; change the logic that sets RAPIDS_PACKAGE_NAME so it
conditionally includes `--stable` only when the stable ABI flag (e.g.,
`stable_abi` or `STABLE_ABI`) is true/yes; update the `rapids-package-name
wheel_python cuml` invocation to build the argument list dynamically (include
`--stable` when stable_abi is truthy, omit it otherwise) so the resulting
RAPIDS_PACKAGE_NAME correctly reflects the ABI mode.
In `@ci/test_wheel_integrations.sh`:
- Around line 9-12: Add a GPU availability check at the start of the integration
test script: before using RAPIDS_CUDA_VERSION and downloading CUDA wheels
(around the block with RAPIDS_PY_CUDA_SUFFIX, CUML_WHEELHOUSE,
LIBCUML_WHEELHOUSE), run a probe like checking nvidia-smi or /proc/driver/nvidia
to verify at least one GPU is present; if the probe fails, exit nonzero with a
clear message like "No GPUs detected; integration tests require a GPU" so the
script fails fast and avoids spurious test errors.
🧹 Nitpick comments (2)
conda/recipes/cuml/recipe.yaml (2)
11-13: Hardcodedpy_runtime_latestwill need manual updates.The hardcoded
"3.13"value will require a manual update when Python 3.14 becomes the latest stable release. Consider adding a comment noting this, or alternatively, sourcing this value from an environment variable if the CI infrastructure can provide it.This is a minor maintenance consideration and acceptable for now.
121-126: Testing on latest Python (3.13) while building onpy_abi_minis correct abi3 validation.The test strategy correctly validates that the abi3 extension built against the minimum version works on the latest runtime.
Clarify the reason for
pip_check: false.Disabling pip check can hide dependency metadata issues. If this is intentional (e.g., known conda-forge/pip interop issues), consider adding a comment explaining why. If not needed, enabling pip check provides an additional layer of dependency validation.
| # Add py-api setting for stable ABI builds | ||
| if [[ "${stable_abi}" == "true" ]] && [[ -n "${RAPIDS_PY_API:-}" ]]; then | ||
| RAPIDS_PIP_WHEEL_ARGS+=(--config-settings="skbuild.wheel.py-api=${RAPIDS_PY_API}") | ||
| fi |
There was a problem hiding this comment.
Fail fast if --stable is set but RAPIDS_PY_API is unset.
Right now this silently drops the stable-ABI config. Add an explicit error so the build doesn’t produce the wrong wheel.
🔧 Proposed fix
-# Add py-api setting for stable ABI builds
-if [[ "${stable_abi}" == "true" ]] && [[ -n "${RAPIDS_PY_API:-}" ]]; then
- RAPIDS_PIP_WHEEL_ARGS+=(--config-settings="skbuild.wheel.py-api=${RAPIDS_PY_API}")
-fi
+# Add py-api setting for stable ABI builds
+if [[ "${stable_abi}" == "true" ]]; then
+ if [[ -z "${RAPIDS_PY_API:-}" ]]; then
+ echo "RAPIDS_PY_API must be set when --stable is used." >&2
+ exit 1
+ fi
+ RAPIDS_PIP_WHEEL_ARGS+=(--config-settings="skbuild.wheel.py-api=${RAPIDS_PY_API}")
+fi📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Add py-api setting for stable ABI builds | |
| if [[ "${stable_abi}" == "true" ]] && [[ -n "${RAPIDS_PY_API:-}" ]]; then | |
| RAPIDS_PIP_WHEEL_ARGS+=(--config-settings="skbuild.wheel.py-api=${RAPIDS_PY_API}") | |
| fi | |
| # Add py-api setting for stable ABI builds | |
| if [[ "${stable_abi}" == "true" ]]; then | |
| if [[ -z "${RAPIDS_PY_API:-}" ]]; then | |
| echo "RAPIDS_PY_API must be set when --stable is used." >&2 | |
| exit 1 | |
| fi | |
| RAPIDS_PIP_WHEEL_ARGS+=(--config-settings="skbuild.wheel.py-api=${RAPIDS_PY_API}") | |
| fi |
🤖 Prompt for AI Agents
In `@ci/build_wheel.sh` around lines 48 - 51, If the stable_abi flag is true but
RAPIDS_PY_API is unset, fail fast instead of silently skipping the setting: in
the block that checks [[ "${stable_abi}" == "true" ]] (where
RAPIDS_PIP_WHEEL_ARGS is appended), add a guard that tests -z
"${RAPIDS_PY_API:-}" and calls exit 1 after printing an explanatory error to
stderr (e.g., using >&2) so the build stops when --stable is specified without
RAPIDS_PY_API; otherwise continue to append
--config-settings="skbuild.wheel.py-api=${RAPIDS_PY_API}" to
RAPIDS_PIP_WHEEL_ARGS as before.
| RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen "${RAPIDS_CUDA_VERSION}")" | ||
| CUML_WHEELHOUSE=$(RAPIDS_PY_WHEEL_NAME="cuml_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-github python) | ||
| CUML_WHEELHOUSE=$(rapids-download-from-github "$(rapids-package-name "wheel_python" cuml --stable --cuda "$RAPIDS_CUDA_VERSION")") | ||
| LIBCUML_WHEELHOUSE=$(RAPIDS_PY_WHEEL_NAME="libcuml_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-github cpp) | ||
| RAPIDS_TESTS_DIR=${RAPIDS_TESTS_DIR:-"${PWD}/test-results"} |
There was a problem hiding this comment.
Add a GPU availability check before running integration tests.
This script runs GPU-backed tests but doesn’t verify GPU presence. Fail early with a clear message to avoid misleading errors.
🔧 Proposed fix
LIBCUML_WHEELHOUSE=$(RAPIDS_PY_WHEEL_NAME="libcuml_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-github cpp)
RAPIDS_TESTS_DIR=${RAPIDS_TESTS_DIR:-"${PWD}/test-results"}
mkdir -p "${RAPIDS_TESTS_DIR}"
+
+rapids-logger "Checking GPU availability"
+if ! nvidia-smi; then
+ rapids-logger "GPU not available; aborting integration tests."
+ exit 1
+fi🤖 Prompt for AI Agents
In `@ci/test_wheel_integrations.sh` around lines 9 - 12, Add a GPU availability
check at the start of the integration test script: before using
RAPIDS_CUDA_VERSION and downloading CUDA wheels (around the block with
RAPIDS_PY_CUDA_SUFFIX, CUML_WHEELHOUSE, LIBCUML_WHEELHOUSE), run a probe like
checking nvidia-smi or /proc/driver/nvidia to verify at least one GPU is
present; if the probe fails, exit nonzero with a clear message like "No GPUs
detected; integration tests require a GPU" so the script fails fast and avoids
spurious test errors.
There was a problem hiding this comment.
We can safely assume that the scripts are executed on the right architecture.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@ci/build_wheel_cuml.sh`:
- Around line 37-43: The script formats RAPIDS_PY_API from RAPIDS_PY_VERSION but
will hard-fail under set -u if RAPIDS_PY_VERSION is unset; add an explicit
preflight guard that checks RAPIDS_PY_VERSION (use parameter expansion like
${RAPIDS_PY_VERSION:-} to avoid unbound errors), and if empty print a clear
stderr message and exit non-zero before computing RAPIDS_PY_API so the
subsequent use of RAPIDS_PY_API and export is safe; reference the
RAPIDS_PY_VERSION and RAPIDS_PY_API variables and the block that sets/exports
RAPIDS_PY_API.
- Around line 53-54: The call that sets RAPIDS_PACKAGE_NAME uses
rapids-package-name with --cuda but omits the CUDA version; update the
invocation that assigns RAPIDS_PACKAGE_NAME (the line using rapids-package-name
wheel_python cuml --stable --cuda) to pass the existing RAPIDS_CUDA_VERSION
variable (e.g., --cuda "$RAPIDS_CUDA_VERSION") so it matches other CI scripts
and the CUDA version used earlier in the script.
jameslamb
left a comment
There was a problem hiding this comment.
Just some tiny suggestions, otherwise all looks good to me. Exciting!
|
/merge |
…apidsai#7758) - **refactor(gha): build one python package per arch x cuda** - **feat(wheel): build wheels with limited API** - **feat(conda): build conda packages with limited API** General testing done in rapidsai#7659 Opening a new PR to not disrupt any ongoing downstream testing using artifacts from the original draft PRs. Resolves rapidsai#7659 xref rapidsai/build-planning#42 Ops-Bot-Merge-Barrier: true Authors: - Gil Forsyth (https://github.com/gforsyth) Approvers: - James Lamb (https://github.com/jameslamb) URL: rapidsai#7758
General testing done in #7659
Opening a new PR to not disrupt any ongoing downstream testing using artifacts from the original draft PRs.
Resolves #7659
xref rapidsai/build-planning#42
Ops-Bot-Merge-Barrier: true