Merged

49 commits
13825b5  ensure 'torch' CUDA wheels are installed in CI (jameslamb, Mar 9, 2026)
698f115  help git understand the diff (jameslamb, Mar 9, 2026)
1c457b8  use rapids-generate-pip-constraints, fix typo (jameslamb, Mar 9, 2026)
066d5c4  handle the fallback case better, other fixes (jameslamb, Mar 9, 2026)
6f73e44  echo wheel name (jameslamb, Mar 9, 2026)
271eb7e  more pin fiddling (jameslamb, Mar 9, 2026)
5a4064e  fix validation script (jameslamb, Mar 9, 2026)
0d7215e  just wheels changes (jameslamb, Mar 10, 2026)
fc30204  even fewer changes (jameslamb, Mar 10, 2026)
97e2c02  revert gitignore changes (jameslamb, Mar 10, 2026)
355d5aa  add 'torch_only' (jameslamb, Mar 10, 2026)
4aad5b4  testing (jameslamb, Mar 10, 2026)
426c5ff  more updates (jameslamb, Mar 10, 2026)
7ac88d3  make 'torch' optional everywhere (jameslamb, Mar 10, 2026)
104b8bf  more changes (jameslamb, Mar 10, 2026)
4b479f7  more torch fixes... unconditional references in argument defaults (jameslamb, Mar 10, 2026)
d055f9d  merge main (jameslamb, Mar 10, 2026)
7bbf218  handle more unconditional 'torch' references (this time in type hints) (jameslamb, Mar 10, 2026)
4cacebf  revert (jameslamb, Mar 10, 2026)
36843b6  check in debugging code temporarily (jameslamb, Mar 11, 2026)
4952952  merge main (jameslamb, Mar 11, 2026)
11ed00e  classes that inherit from 'torch' also need to handle the dependency … (jameslamb, Mar 11, 2026)
b1cb02c  remove debugging code (jameslamb, Mar 11, 2026)
ca6e314  fix typo with pytest.importorskip() (jameslamb, Mar 11, 2026)
2f3d4f8  more fixes (jameslamb, Mar 11, 2026)
22fb749  fix more imports (jameslamb, Mar 11, 2026)
79b7854  pytest params need to be lazy too (jameslamb, Mar 11, 2026)
2633d4f  pre-commit (jameslamb, Mar 12, 2026)
6039796  more testing fixes (jameslamb, Mar 12, 2026)
005a890  work around nvJitLink symbol issues, fix a few more test skips, other… (jameslamb, Mar 12, 2026)
22ded28  revert temporary testing stuff (jameslamb, Mar 12, 2026)
bbe4c97  remove comment (jameslamb, Mar 12, 2026)
72779bd  Merge branch 'release/26.04' into torch-testing (jameslamb, Mar 12, 2026)
2cf1e1c  Merge branch 'release/26.04' into torch-testing (jameslamb, Mar 12, 2026)
2192089  Apply suggestion from @jameslamb (jameslamb, Mar 13, 2026)
b827cc2  fix copy-paste mistakes (jameslamb, Mar 13, 2026)
6a958e6  standardize dependencies.yaml filters (jameslamb, Mar 13, 2026)
2c3d0d0  Update ci/validate_wheel.sh (jameslamb, Mar 13, 2026)
40cdfa8  Update pyproject.toml (jameslamb, Mar 13, 2026)
41c5277  Apply suggestion from @jameslamb (jameslamb, Mar 13, 2026)
eed447c  one more import (jameslamb, Mar 13, 2026)
79a6efe  fix (jameslamb, Mar 13, 2026)
a61a427  make optional imports lazy (alexbarghi-nv, Mar 16, 2026)
96201b6  fix module check - meant to change to use find_spec (alexbarghi-nv, Mar 16, 2026)
b8276e2  Merge branch 'release/26.04' of github.com:rapidsai/cugraph-gnn into … (jameslamb, Mar 17, 2026)
456857a  handle dotted imports, make ruff selections explicit (jameslamb, Mar 17, 2026)
0afcdd7  more import-time patching (jameslamb, Mar 17, 2026)
27f8fdd  remove unnecessary CUDA_MAJOR (jameslamb, Mar 17, 2026)
abf7313  Merge branch 'release/26.04' into torch-testing (jameslamb, Mar 17, 2026)
6 changes: 5 additions & 1 deletion .pre-commit-config.yaml
@@ -19,14 +19,18 @@ repos:
rev: v0.14.3
hooks:
- id: ruff-check
args: [--fix]
args: [--fix, --config, "pyproject.toml"]
- id: ruff-format
args: [--config, "pyproject.toml"]
- repo: https://github.com/asottile/yesqa
rev: v1.3.0
hooks:
- id: yesqa
additional_dependencies:
- flake8==7.1.1
exclude: |
(?x)
python/pylibwholegraph/pylibwholegraph/_doctor_check[.]py$
- repo: https://github.com/pre-commit/mirrors-clang-format
rev: v20.1.4
hooks:
52 changes: 52 additions & 0 deletions ci/download-torch-wheels.sh
@@ -0,0 +1,52 @@
#!/bin/bash
# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

# [description]
#
# Downloads a CUDA variant of 'torch' from the correct index, based on CUDA major version.
#
# This exists to avoid using 'pip --extra-index-url', which has these undesirable properties:
#
# - allows for CPU-only 'torch' to be downloaded from pypi.org
# - allows for other non-torch packages like 'numpy' to be downloaded from the PyTorch indices
# - increases solve complexity for 'pip'
#

set -e -u -o pipefail

TORCH_WHEEL_DIR="${1}"

# skip download attempt on CUDA versions where we know there isn't a 'torch' CUDA wheel.
CUDA_MAJOR="${RAPIDS_CUDA_VERSION%%.*}"
CUDA_MINOR=$(echo "${RAPIDS_CUDA_VERSION}" | cut -d'.' -f2)
if \
{ [ "${CUDA_MAJOR}" -eq 12 ] && [ "${CUDA_MINOR}" -lt 9 ]; } \
|| { [ "${CUDA_MAJOR}" -eq 13 ] && [ "${CUDA_MINOR}" -gt 0 ]; } \
|| [ "${CUDA_MAJOR}" -gt 13 ];
then
rapids-logger "Skipping 'torch' wheel download. (requires CUDA 12.9+ or 13.0, found ${RAPIDS_CUDA_VERSION})"
exit 0
fi

# Ensure CUDA-enabled 'torch' packages are always used.
#
# Downloading + passing the downloaded file as a requirement forces the use of this
# package and ensures 'pip' considers all of its requirements.
#
# Not appending this to PIP_CONSTRAINT, because we don't want the torch '--extra-index-url'
# to leak outside of this script into other 'pip {download,install}' calls.
rapids-dependency-file-generator \
--output requirements \
--file-key "torch_only" \
--matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION};dependencies=${RAPIDS_DEPENDENCIES};require_gpu=true" \
| tee ./torch-constraints.txt

rapids-pip-retry download \
--isolated \
--prefer-binary \
--no-deps \
-d "${TORCH_WHEEL_DIR}" \
--constraint "${PIP_CONSTRAINT}" \
--constraint ./torch-constraints.txt \
'torch'
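The shell gate at the top of this script skips the download on CUDA versions with no matching 'torch' CUDA wheel. For clarity, here is a hedged Python restatement of that gate (the function name is ours, not part of the PR; it only mirrors the skip conditions the script spells out):

```python
def torch_cuda_wheel_available(cuda_version: str) -> bool:
    """Mirror the shell gate: torch CUDA wheels exist for CUDA 12.9+ and 13.0."""
    major, minor = (int(p) for p in cuda_version.split(".")[:2])
    if major == 12 and minor < 9:
        return False
    if major == 13 and minor > 0:
        return False
    if major > 13:
        return False
    return True

# e.g. torch_cuda_wheel_available("12.9") -> True
#      torch_cuda_wheel_available("13.1") -> False
```

As in the script, any version not matched by a skip condition (including older majors) falls through to an attempted download.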
4 changes: 2 additions & 2 deletions ci/run_cugraph_pyg_pytests.sh
@@ -1,13 +1,13 @@
#!/bin/bash
# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION.
# SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION.
# SPDX-License-Identifier: Apache-2.0

set -euo pipefail

# Support invoking run_cugraph_pyg_pytests.sh outside the script directory
cd "$(dirname "$(realpath "${BASH_SOURCE[0]}")")"/../python/cugraph-pyg/cugraph_pyg

pytest --cache-clear --benchmark-disable "$@" .
pytest -rs --cache-clear --benchmark-disable "$@" .

# Used to skip certain examples in CI due to memory limitations
export CI=true
4 changes: 2 additions & 2 deletions ci/run_pylibwholegraph_pytests.sh
@@ -1,10 +1,10 @@
#!/bin/bash
# SPDX-FileCopyrightText: Copyright (c) 2024, NVIDIA CORPORATION.
# SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION.
# SPDX-License-Identifier: Apache-2.0

set -euo pipefail

# Support invoking run_pytests.sh outside the script directory
cd "$(dirname "$(realpath "${BASH_SOURCE[0]}")")"/../python/pylibwholegraph/pylibwholegraph/

pytest --cache-clear --forked --import-mode=append "$@" tests
pytest -rs --cache-clear --forked --import-mode=append "$@" tests
64 changes: 53 additions & 11 deletions ci/test_wheel_cugraph-pyg.sh
@@ -15,12 +15,30 @@ LIBWHOLEGRAPH_WHEELHOUSE=$(RAPIDS_PY_WHEEL_NAME="libwholegraph_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-github cpp)
PYLIBWHOLEGRAPH_WHEELHOUSE=$(rapids-download-from-github "$(rapids-package-name "wheel_python" pylibwholegraph --stable --cuda "$RAPIDS_CUDA_VERSION")")
CUGRAPH_PYG_WHEELHOUSE=$(RAPIDS_PY_WHEEL_NAME="${package_name}_${RAPIDS_PY_CUDA_SUFFIX}" RAPIDS_PY_WHEEL_PURE="1" rapids-download-wheels-from-github python)

CUDA_MAJOR="${RAPIDS_CUDA_VERSION%%.*}"
# generate constraints (possibly pinning to oldest supported versions of dependencies)
rapids-generate-pip-constraints test_cugraph_pyg "${PIP_CONSTRAINT}"

if [[ "${CUDA_MAJOR}" == "12" ]]; then
PYTORCH_INDEX="https://download.pytorch.org/whl/cu126"
PIP_INSTALL_ARGS=(
--prefer-binary
--constraint "${PIP_CONSTRAINT}"
--extra-index-url 'https://pypi.nvidia.com'
"${LIBWHOLEGRAPH_WHEELHOUSE}"/*.whl
"$(echo "${PYLIBWHOLEGRAPH_WHEELHOUSE}"/pylibwholegraph_"${RAPIDS_PY_CUDA_SUFFIX}"*.whl)"
"$(echo "${CUGRAPH_PYG_WHEELHOUSE}"/cugraph_pyg_"${RAPIDS_PY_CUDA_SUFFIX}"*.whl)[test]"
)

# ensure a CUDA variant of 'torch' is used (if one is available)
TORCH_WHEEL_DIR="$(mktemp -d)"
./ci/download-torch-wheels.sh "${TORCH_WHEEL_DIR}"

# 'cugraph-pyg' is still expected to be importable
# and testable in an environment where 'torch' isn't installed.
torch_downloaded=true
if [ -z "$(ls -A "${TORCH_WHEEL_DIR}" 2>/dev/null)" ]; then
rapids-echo-stderr "No 'torch' wheels downloaded."
torch_downloaded=false
else
PYTORCH_INDEX="https://download.pytorch.org/whl/cu130"
PIP_INSTALL_ARGS+=("${TORCH_WHEEL_DIR}"/torch-*.whl)
fi

# notes:
@@ -30,12 +48,7 @@ fi
# its dependencies are available from pypi.org
#
rapids-pip-retry install \
-v \
--extra-index-url "${PYTORCH_INDEX}" \
--extra-index-url 'https://pypi.nvidia.com' \
"${LIBWHOLEGRAPH_WHEELHOUSE}"/*.whl \
"$(echo "${PYLIBWHOLEGRAPH_WHEELHOUSE}"/pylibwholegraph_"${RAPIDS_PY_CUDA_SUFFIX}"*.whl)" \
"$(echo "${CUGRAPH_PYG_WHEELHOUSE}"/cugraph_pyg_"${RAPIDS_PY_CUDA_SUFFIX}"*.whl)[test]"
"${PIP_INSTALL_ARGS[@]}"

# RAPIDS_DATASET_ROOT_DIR is used by test scripts
export RAPIDS_DATASET_ROOT_DIR="$(realpath datasets)"
@@ -47,5 +60,34 @@ popd
# Enable legacy behavior of torch.load for examples relying on ogb
export TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD=1

rapids-logger "pytest cugraph-pyg (single GPU)"
if [[ "${torch_downloaded}" == "true" ]]; then
# TODO: remove this when RAPIDS wheels and 'torch' CUDA wheels have compatible package requirements
#
# * https://github.com/rapidsai/cugraph/issues/5443
# * https://github.com/rapidsai/build-planning/issues/257
# * https://github.com/rapidsai/build-planning/issues/255
#
CUDA_MAJOR="${RAPIDS_CUDA_VERSION%%.*}"
CUDA_MINOR=$(echo "${RAPIDS_CUDA_VERSION}" | cut -d'.' -f2)
if [[ "${CUDA_MAJOR}" == "13" ]]; then
pip install \
--upgrade \
"nvidia-nvjitlink>=${CUDA_MAJOR}.${CUDA_MINOR}"
fi

# 'torch' is an optional dependency of 'cugraph-pyg'... confirm that it's actually
# installed here and that we've installed a package with CUDA support.
rapids-logger "Confirming that PyTorch is installed"
python -c "import torch; assert torch.cuda.is_available()"

rapids-logger "pytest cugraph-pyg (single GPU, with 'torch')"
./ci/run_cugraph_pyg_pytests.sh
fi

rapids-logger "import cugraph-pyg (no 'torch')"
./ci/uninstall-torch-wheels.sh
Review comment (Member):

Is there any reason you couldn't do the no-torch tests before the torch tests? That would save you the trouble of uninstalling torch.

Reply (jameslamb, author):

Because to run the tests without torch, we still need to install cugraph-pyg and its dependencies, any of which might pull in torch as a requirement.

Imperative code that force-uninstalls torch right here afterwards is easier and less error-prone than trying to be careful about not letting torch into the environment at the beginning, in my opinion.

python -c "import cugraph_pyg; print(f'cugraph-pyg version: {cugraph_pyg.__version__}')"

rapids-logger "pytest cugraph-pyg (no 'torch')"
./ci/run_cugraph_pyg_pytests.sh
71 changes: 57 additions & 14 deletions ci/test_wheel_pylibwholegraph.sh
@@ -2,9 +2,7 @@
# SPDX-FileCopyrightText: Copyright (c) 2023-2026, NVIDIA CORPORATION.
# SPDX-License-Identifier: Apache-2.0

set -e # abort the script on error
set -o pipefail # piped commands propagate their error
set -E # ERR traps are inherited by subcommands
set -euo pipefail

# Delete system libnccl.so to ensure the wheel is used.
# (but only do this in CI, to avoid breaking local dev environments)
@@ -18,23 +16,68 @@ RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
LIBWHOLEGRAPH_WHEELHOUSE=$(RAPIDS_PY_WHEEL_NAME="libwholegraph_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-github cpp)
PYLIBWHOLEGRAPH_WHEELHOUSE=$(rapids-download-from-github "$(rapids-package-name "wheel_python" pylibwholegraph --stable --cuda "$RAPIDS_CUDA_VERSION")")

# determine pytorch source
if [[ "${CUDA_MAJOR}" == "12" ]]; then
PYTORCH_INDEX="https://download.pytorch.org/whl/cu126"
else
PYTORCH_INDEX="https://download.pytorch.org/whl/cu130"
fi
RAPIDS_TESTS_DIR=${RAPIDS_TESTS_DIR:-"${PWD}/test-results"}
RAPIDS_COVERAGE_DIR=${RAPIDS_COVERAGE_DIR:-"${PWD}/coverage-results"}
mkdir -p "${RAPIDS_TESTS_DIR}" "${RAPIDS_COVERAGE_DIR}"

# generate constraints (possibly pinning to oldest supported versions of dependencies)
rapids-generate-pip-constraints test_pylibwholegraph "${PIP_CONSTRAINT}"

PIP_INSTALL_ARGS=(
--prefer-binary
--constraint "${PIP_CONSTRAINT}"
"$(echo "${PYLIBWHOLEGRAPH_WHEELHOUSE}"/pylibwholegraph*.whl)[test]"
"${LIBWHOLEGRAPH_WHEELHOUSE}"/*.whl
)

# ensure a CUDA variant of 'torch' is used (if one is available)
TORCH_WHEEL_DIR="$(mktemp -d)"
./ci/download-torch-wheels.sh "${TORCH_WHEEL_DIR}"

# 'pylibwholegraph' is still expected to be importable
# and testable in an environment where 'torch' isn't installed.
torch_downloaded=true
if [ -z "$(ls -A "${TORCH_WHEEL_DIR}" 2>/dev/null)" ]; then
rapids-echo-stderr "No 'torch' wheels downloaded."
torch_downloaded=false
else
PIP_INSTALL_ARGS+=("${TORCH_WHEEL_DIR}"/torch-*.whl)
fi

# echo to expand wildcard before adding `[extra]` requires for pip
rapids-logger "Installing Packages"
rapids-pip-retry install \
--extra-index-url ${PYTORCH_INDEX} \
"$(echo "${PYLIBWHOLEGRAPH_WHEELHOUSE}"/pylibwholegraph*.whl)[test]" \
"${LIBWHOLEGRAPH_WHEELHOUSE}"/*.whl \
'torch>=2.3'
"${PIP_INSTALL_ARGS[@]}"


if [[ "${torch_downloaded}" == "true" ]]; then
# TODO: remove this when RAPIDS wheels and 'torch' CUDA wheels have compatible package requirements
#
# * https://github.com/rapidsai/cugraph/issues/5443
# * https://github.com/rapidsai/build-planning/issues/257
# * https://github.com/rapidsai/build-planning/issues/255
#
CUDA_MAJOR="${RAPIDS_CUDA_VERSION%%.*}"
CUDA_MINOR=$(echo "${RAPIDS_CUDA_VERSION}" | cut -d'.' -f2)
if [[ "${CUDA_MAJOR}" == "13" ]]; then
pip install \
--upgrade \
"nvidia-nvjitlink>=${CUDA_MAJOR}.${CUDA_MINOR}"
fi
Review comment (jameslamb, author):

torch CUDA 13 wheels pin to nvidia-nvjitlink==13.0.55. Because RAPIDS wheels were built against CTK 13.1, at runtime in an environment like that they raise loading errors like this:

OSError: /pyenv/versions/3.11.15/lib/python3.11/site-packages/libcugraph/lib64/libcugraph_mg.so: undefined symbol: __nvJitLinkGetErrorLog_13_1, version libnvJitLink.so.13

We know it's safe to use a newer nvidia-nvjitlink with an older CTK, and the permanent fix for this is a larger effort that's in progress (rapidsai/build-planning#257), so for now I'm proposing just hacking a newer nvidia-nvjitlink into the environment here to ensure we're testing these libraries against torch.


# 'torch' is an optional dependency of 'pylibwholegraph'... confirm that it's actually
# installed here and that we've installed a package with CUDA support.
rapids-logger "Confirming that PyTorch is installed"
python -c "import torch; assert torch.cuda.is_available()"

rapids-logger "pytest pylibwholegraph (with 'torch')"
./ci/run_pylibwholegraph_pytests.sh
fi

rapids-logger "import pylibwholegraph (no 'torch')"
./ci/uninstall-torch-wheels.sh

python -c "import pylibwholegraph; print(f'pylibwholegraph version: {pylibwholegraph.__version__}')"

rapids-logger "pytest pylibwholegraph"
rapids-logger "pytest pylibwholegraph (no 'torch')"
./ci/run_pylibwholegraph_pytests.sh
16 changes: 16 additions & 0 deletions ci/uninstall-torch-wheels.sh
@@ -0,0 +1,16 @@
#!/bin/bash
# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

set -euo pipefail

pip uninstall --yes 'torch'

# 'pytest' leaves behind some pycache files in site-packages/torch that make 'import torch'
# seem to "work" even though there's not really a package there, leading to errors like
# "module 'torch' has no attribute 'distributed'"
#
# For the sake of testing, just fully delete 'torch' from site-packages to simulate an environment
# where it was never installed.
SITE_PACKAGES=$(python -c "import site; print(site.getsitepackages()[0])")
rm -rf "${SITE_PACKAGES}/torch"
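The stale-`__pycache__` problem this script works around (leftover files that make `import torch` appear to succeed) is also why commit 96201b6 moved module checks to `find_spec`. A hedged sketch of a stricter presence check, assuming the failure mode described in the comment above (the helper name is ours, not from the PR):

```python
import importlib.util


def really_installed(name: str) -> bool:
    # A package whose only remnants are stray __pycache__ directories can
    # still resolve as a namespace package, so find_spec() returning a spec
    # is not enough on its own: also require a concrete origin file.
    spec = importlib.util.find_spec(name)
    return spec is not None and spec.origin is not None


# really_installed("json") -> True (stdlib package with a real source file)
# really_installed("no_such_pkg_xyz") -> False (find_spec returns None)
```

Deleting `site-packages/torch` outright, as the script does, sidesteps the question entirely; the check above is only useful when a test needs to decide at runtime whether the dependency is genuinely present.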
19 changes: 19 additions & 0 deletions ci/validate_wheel.sh
@@ -43,3 +43,22 @@ rapids-logger "validate packages with 'twine'"
twine check \
--strict \
"$(echo ${wheel_dir_relative_path}/*.whl)"

rapids-logger "validating that the wheel doesn't depend on 'torch' (even in an extra)"
WHEEL_FILE="$(echo ${wheel_dir_relative_path}/*.whl)"

# NOTE: group of specifiers after 'torch' to avoid a false positive like 'torch-geometric'
# Use '|| true' so grep not finding any matches (exit 1) does not kill the script under set -e
unzip -p "${WHEEL_FILE}" '*.dist-info/METADATA' \
| grep -E '^Requires-Dist:.*torch[><=!~ ]+.*' \
| tee matches.txt || true
Review comment (Member):

Why the tee? You could pipe the output into a variable and then echo it instead.

Reply (jameslamb, author):

tee specifically to avoid the complexity of storing the output in a variable just to echo it.

if [[ -s ./matches.txt ]]; then
echo -n "Wheel '${WHEEL_FILE}' appears to depend on 'torch'. Remove that dependency. "
echo -n "We prefer to not declare a 'torch' dependency and allow it to be managed separately, "
echo "to ensure tight control over the variants installed (including for DLFW builds)."
exit 1
else
echo "No dependency on 'torch' found"
exit 0
fi