cuml-cpu: fix import issues, enable conda import tests#6400
cuml-cpu: fix import issues, enable conda import tests#6400rapids-bot[bot] merged 32 commits intorapidsai:branch-25.04from jameslamb:ci/import-tests
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| # | ||
| from __future__ import annotations |
There was a problem hiding this comment.
Unfortunately, using from __future__ import annotations like this seems to not to be enough to avoid these errors for return type hints.
Traceback (most recent call last):
File "lib/python3.10/site-packages/cuml/internals/base_return_types.py", line 72, in _get_base_return_type
type_hints = typing.get_type_hints(attr)
File "lib/python3.10/typing.py", line 1871, in get_type_hints
value = _eval_type(value, globalns, localns)
File "lib/python3.10/typing.py", line 327, in _eval_type
return t._evaluate(globalns, localns, recursive_guard)
File "lib/python3.10/typing.py", line 694, in _evaluate
eval(self.__forward_code__, globalns, localns),
File "<string>", line 1, in <module>
File "lib/python3.10/site-packages/cuml/internals/safe_imports.py", line 86, in __getattr__
raise UnavailableError(cls._msg)
cuml.internals.safe_imports.UnavailableError: cudf is not installed in non GPU-enabled installations
During handling of the above exception, another exception occurred:
...
Traceback (most recent call last):
File "/opt/conda/conda-bld/test_tmp/run_test.py", line 2, in <module>
import cuml
File "lib/python3.10/site-packages/cuml/__init__.py", line 116, in <module>
from cuml.linear_model.linear_regression import LinearRegression
File "lib/python3.10/site-packages/cuml/linear_model/__init__.py", line 21, in <module>
from cuml.linear_model.logistic_regression import LogisticRegression
File "logistic_regression.pyx", line 27, in init cuml.linear_model.logistic_regression
File "lib/python3.10/site-packages/cuml/preprocessing/__init__.py", line 17, in <module>
from cuml.preprocessing.LabelEncoder import LabelEncoder
File "lib/python3.10/site-packages/cuml/preprocessing/LabelEncoder.py", line 56, in <module>
class LabelEncoder(Base):
File "lib/python3.10/site-packages/cuml/internals/base_helpers.py", line 118, in __new__
classDict[attributeName] = _wrap_attribute(
File "lib/python3.10/site-packages/cuml/internals/base_helpers.py", line 40, in _wrap_attribute
return_type = _get_base_return_type(class_name, attribute)
File "lib/python3.10/site-packages/cuml/internals/base_return_types.py", line 100, in _get_base_return_type
assert False, "Shouldn't get here"
AssertionError: Shouldn't get here
This is being reached:
cuml/python/cuml/cuml/internals/base_return_types.py
Lines 99 to 101 in c6a03b7
I think from processing this return type hint:
Which triggers this:
Which I think leads to it looking like the cudf import is getting "used", leading to:
cuml.internals.safe_imports.UnavailableError: cudf is not installed in non GPU-enabled installations
That return type of -> cudf.Series has been there for 6+ years, since it was added back in cfa7f8d ... so I'm not sure what's changed here.
There was a problem hiding this comment.
Same problem instead using string-literal type hints to defer evaluation like this:
def inverse_transform(self, y: "cudf.Series") -> "cudf.Series":I think what we have here is that something like this is being run:
import typing
typing.get_type_hints(cuml.preprocessing.LabelEncoder.LabelEncoder, "transform")And that's triggering an attribute access in GPU-only import cudf:
cuml/python/cuml/cuml/internals/safe_imports.py
Lines 85 to 86 in 29871eb
| from dask_cudf import Series as dcSeries | ||
| from dask.dataframe import Series as daskSeries | ||
| from dask.dataframe import DataFrame as daskDataFrame | ||
| from cudf import Series |
There was a problem hiding this comment.
cudf is not a run: dependency of cuml-cpu, so it can't be imported unconditionally here. This change and the one below uses the gpu_only_import() wrapped version instead.
| return None | ||
| raise AssertionError( | ||
| f"Failed to determine return type for {attr} (class = '${class_name}'). This is a bug in cuML, please report it." | ||
| ) |
There was a problem hiding this comment.
I hit this during testing, whenever the return type hint on a function was from a module wrapped in gpu_only_import().
Proposing here to make the error message more informative. I think RuntimeError is a more appropriate choice here, but I chose AssertionError to preserve the existing behavior that this raises an AssertionError.
| return "base" | ||
| except Exception: | ||
| assert False, "Shouldn't get here" | ||
| return None |
There was a problem hiding this comment.
Notice that this return None was unreachable because of the assert False before it.
| return y.cat.codes | ||
|
|
||
| def inverse_transform(self, y: cudf.Series) -> cudf.Series: | ||
| def inverse_transform(self, y: "cudf.Series"): |
There was a problem hiding this comment.
Using strings or from __future__ import annotations for is enough to avoid an import-time attribute access for argument type hints... but not for return types. See #6400 (comment)
|
This PR is further evidence why we need to revise and simplify the type reflection system and to not use type hints at runtime to inform behavior. This effort is captured in #5022 . |
gforsyth
left a comment
There was a problem hiding this comment.
Nice work tracking down all these nitty failure points, @jameslamb !
They're a big improvement, certainly. The I think that's out of scope for this PR but would be a good followup if cuml wants to continue supporting CPU-only workloads. |
Yes, we are pursuing a decision on this. |
|
/merge |
Fixes #6403
This project publishes a conda package,
cuml-cpu, which does what it sounds like... allows the use of cuML on systems without a GPU.This proposes some updates to packaging for
cuml-cpu:Notes for Reviewers
Why all these changes in Python code?
See some of the challenges I faced documented in #6400 (comment).
In short,
import cumlwhen it was installed viacuml-cpuwill break at import time whenever modules imported withcuml.internals.safe_imports.gpu_only_import()are used in any of the following ways:Like this:
How long has this been broken? What's the root cause?
It seems like something changed within 25.04... earlier versions of cuML are not affected by these issues: #6403 (comment)
I don't know what the root cause is. Maybe some changes to
cuml's top-level imports in 25.04 is now pulling in the modules with these problems at runtime, when previously it wasn't? I'm really not sure.Benefits of these Changes
This adds a bit of test coverage in CI, minimally verifying that
cuml-cpuis installable and thatimport cumlworks in an environment without a GPU.Inspired by:
cuvs: cuvs-bench-cpu: avoid 'mkl' dependency on aarch64 cuvs#750How I tested this
Saw stuff like this in
conda-python-buildjobs, confirming that the import tests were running and passing: