Skip to content

Deprecate predict_model in cuml.ensemble/cuml.dask.ensemble#7155

Merged
rapids-bot[bot] merged 5 commits intorapidsai:branch-25.10from
jcrist:deprecate-predict-model
Aug 29, 2025
Merged

Deprecate predict_model in cuml.ensemble/cuml.dask.ensemble#7155
rapids-bot[bot] merged 5 commits intorapidsai:branch-25.10from
jcrist:deprecate-predict-model

Conversation

@jcrist
Copy link
Copy Markdown
Member

@jcrist jcrist commented Aug 28, 2025

The predict methods of our ensemble estimators (cuml.ensemble.RandomForestClassifier, cuml.ensemble.RandomForestRegressor, and the dask equivalents) include an old predict_model kwarg. This kwarg may be one of ("GPU", "CPU"), with "GPU" as the default.

On single node:

  • predict_model="GPU": used FIL on GPU to do inference
  • predict_model="CPU": used the internal decision tree representation used for fitting (so tied to the GPU), but does the predict on CPU. This is less performant and still doesn't let users do inference without a GPU.

On multi node (dask):

  • predict_model="GPU": used FIL on GPU to do inference (broadcasting the complete model to every worker node)
  • predict_model="CPU": used the internal decision tree repr from fitting, but with some very inefficient post processing (looping over the full length of the data! building up a python list of responses!). Using broadcast_data=True seems to better match the intent here (more efficient with small datasets), and is available with the default FIL codepath.

This PR:

  • Deprecates predict_model everywhere. The FIL codepath is always taken. A warning is raised if predict_model is explicitly set, but the old codepaths no longer exist.
  • Deletes the old codepaths. We could also delete some C++ used for inference, but I left that out for now.
  • Updates the tests accordingly (including tests for the deprecated behavior).

No existing code should be broken by this (at most users will see a warning telling them the kwargs are going away), and all use cases should still be supported without the predict_model kwarg.

This PR is part of cleaning up the memory management and internals of our cuml.ensemble estimators.

@jcrist jcrist self-assigned this Aug 28, 2025
@jcrist jcrist requested a review from a team as a code owner August 28, 2025 20:05
@jcrist jcrist added improvement Improvement / enhancement to an existing function breaking Breaking change labels Aug 28, 2025
@jcrist jcrist requested a review from dantegd August 28, 2025 20:05
@github-actions github-actions Bot added the Cython / Python Cython or Python issue label Aug 28, 2025
Copy link
Copy Markdown
Contributor

@csadorf csadorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but please address my comments and suggestions.

Comment thread python/cuml/cuml/dask/ensemble/base.py
Comment thread python/cuml/tests/dask/test_dask_random_forest.py
Comment thread python/cuml/tests/test_random_forest.py
@jcrist
Copy link
Copy Markdown
Member Author

jcrist commented Aug 29, 2025

/merge

@rapids-bot rapids-bot Bot merged commit 21d95a4 into rapidsai:branch-25.10 Aug 29, 2025
82 checks passed
@jcrist jcrist deleted the deprecate-predict-model branch August 29, 2025 15:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking Breaking change Cython / Python Cython or Python issue improvement Improvement / enhancement to an existing function

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants