Skip to content

Remove safe_imports.py#6588

Merged
rapids-bot[bot] merged 45 commits intorapidsai:branch-25.06from
jcrist:remove-safe-imports
Apr 25, 2025
Merged

Remove safe_imports.py#6588
rapids-bot[bot] merged 45 commits intorapidsai:branch-25.06from
jcrist:remove-safe-imports

Conversation

@jcrist
Copy link
Copy Markdown
Member

@jcrist jcrist commented Apr 25, 2025

As part of removing cuml-cpu support, we no longer need to guard imports for CPU or GPU builds. I originally set out to remove only the gpu_only_import* calls, but as I was going I found lots of crud and ended up removing all the guarded imports in safe_imports.py (along with that entire file). This was mostly mechanical work, aided by a helper script, patience, and a background podcast.

Usage of these functions fell into a few camps:

  • Guarding GPU imports for non-gpu builds (gpu_only_import/gpu_only_import_from). These can be fully removed - all our builds are now GPU builds.
  • Guarding CPU-only imports for non-cpu builds (cpu_only_import/cpu_only_import_from). Afaict these never made much sense, all cpu dependencies are also dependencies of the GPU-only builds.
  • Guarding some optional dependencies (safe_import/safe_import_from). Usage of this was pretty sparse, in most cases the logic was clearer when I inlined the try/except call in the calling code.
  • Guarding scipy imports as if scipy was an optional dependency. scipy is not an optional dependency for cuml, and hasn't been for at least a while. There's still some guarded imports from has_scipy checks in a few places, but I removed all those that were handled by the functions removed in this PR. Since sklearn requires scipy (and we're moving towards requiring sklearn), there's no chance we'll ever want to make it optional, and there were plenty of unguarded imports of scipy already. Might as well lean all in. Same goes for guarded pandas imports - we require pandas, no need to guard them.
  • Guarding nvtx imports. I've made a new cuml.internals.nvtx module to handle that more simply and uniformly.

There are a few places where there were code changes made as well. These are mostly nits dealing with switching to using namespaced versions of the imports (cudf.DataFrame instead of CudfDataFrame), resulting in simpler and more uniform imports.

If in the future we decide we want to make some dependencies optional, I'd encourage us to do so in a non-magical way. Take what was done in cuml.internals.nvtx as an example of a decent pattern (isolating the existence check to one location, and providing clear consistent alternatives in that same spot). Littering the code with optional imports makes it easy to miss one, and also harder to reason about.

Part of #6523.

@jcrist jcrist requested a review from a team as a code owner April 25, 2025 18:12
@jcrist jcrist requested review from bdice and teju85 April 25, 2025 18:12
@github-actions github-actions Bot added the Cython / Python Cython or Python issue label Apr 25, 2025
@jcrist jcrist added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Apr 25, 2025
@jcrist jcrist self-assigned this Apr 25, 2025
Copy link
Copy Markdown
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of comments. Overall it looks great. This makes the cuML import system and library design make so much more sense to me.

Comment thread python/cuml/cuml/_thirdparty/sklearn/preprocessing/_imputation.py
Comment thread python/cuml/cuml/ensemble/randomforestclassifier.pyx Outdated
@jcrist
Copy link
Copy Markdown
Member Author

jcrist commented Apr 25, 2025

/merge

@rapids-bot rapids-bot Bot merged commit 972d126 into rapidsai:branch-25.06 Apr 25, 2025
74 checks passed
@jcrist jcrist deleted the remove-safe-imports branch April 25, 2025 21:22
Ofek-Haim pushed a commit to Ofek-Haim/cuml that referenced this pull request May 13, 2025
As part of removing `cuml-cpu` support, we no longer need to guard imports for CPU or GPU builds. I originally set out to remove only the `gpu_only_import*` calls, but as I was going I found lots of crud and ended up removing _all_ the guarded imports in `safe_imports.py` (along with that entire file). This was mostly mechanical work, aided by a helper script, patience, and a background podcast.

Usage of these functions fell into a few camps:

- Guarding GPU imports for non-gpu builds (`gpu_only_import`/`gpu_only_import_from`). These can be fully removed - all our builds are now GPU builds.
- Guarding CPU-only imports for non-cpu builds (`cpu_only_import`/`cpu_only_import_from`). Afaict these never made much sense, all cpu dependencies are also dependencies of the GPU-only builds.
- Guarding some optional dependencies (`safe_import`/`safe_import_from`). Usage of this was pretty sparse, in most cases the logic was clearer when I inlined the try/except call in the calling code.
- Guarding `scipy` imports as if `scipy` was an optional dependency. `scipy` is not an optional dependency for `cuml`, and hasn't been for at least a while. There's still some guarded imports from `has_scipy` checks in a few places, but I removed all those that were handled by the functions removed in this PR. Since sklearn requires scipy (and we're moving towards requiring sklearn), there's no chance we'll ever want to make it optional, and there were plenty of unguarded imports of scipy already. Might as well lean all in. Same goes for guarded `pandas` imports - we require pandas, no need to guard them.
- Guarding `nvtx` imports. I've made a new `cuml.internals.nvtx` module to handle that more simply and uniformly.

There are a few places where there were code changes made as well. These are mostly nits dealing with switching to using namespaced versions of the imports (`cudf.DataFrame` instead of `CudfDataFrame`), resulting in simpler and more uniform imports.

If in the future we decide we want to make some dependencies optional, I'd encourage us to do so in a non-magical way. Take what was done in `cuml.internals.nvtx` as an example of a decent pattern (isolating the existence check to one location, and providing clear consistent alternatives in that same spot). Littering the code with optional imports makes it easy to miss one, and also harder to reason about.

Part of rapidsai#6523.

Authors:
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#6588
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Cython / Python Cython or Python issue improvement Improvement / enhancement to an existing function non-breaking Non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants