Add Out-of-Bag (OOB) Score Support to RandomForest by csadorf · Pull Request #7401 · rapidsai/cuml

csadorf · 2025-10-28T21:39:35Z

Summary

Implements out-of-bag (OOB) scoring for RandomForestClassifier and RandomForestRegressor, enabling users to estimate model performance without requiring a separate validation set.

Closes #7395

Changes

C++ Layer

Modified fit() functions to accept optional bootstrap_masks parameter for storing per-tree bootstrap sample indicators
Updated RandomForest::fit() to capture and store bootstrap masks when oob_score=True
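
To illustrate what a per-tree bootstrap mask holds, here is a minimal plain-Python sketch. It is not the PR's C++ code: the function name `bootstrap_mask` and the list-of-booleans layout are assumptions made for illustration. Each tree draws rows with replacement, and the mask records which rows were drawn, so the remaining rows can later be treated as out-of-bag for that tree.

```python
import random


def bootstrap_mask(n_rows, n_samples, rng):
    """Draw n_samples row indices with replacement.

    Returns (indices, mask), where mask[i] is True if row i was drawn
    at least once. Hypothetical helper, for illustration only.
    """
    indices = [rng.randrange(n_rows) for _ in range(n_samples)]
    mask = [False] * n_rows
    for i in indices:
        # Scatter step: mark every sampled row as in-bag.
        mask[i] = True
    return indices, mask


rng = random.Random(0)
# One mask per tree; here 3 trees over 8 rows.
masks = [bootstrap_mask(8, 8, rng)[1] for _ in range(3)]
# Rows that tree 0 never saw during training are its out-of-bag rows.
oob_rows_tree0 = [i for i, in_bag in enumerate(masks[0]) if not in_bag]
```

Storing a boolean per row (rather than the sampled indices themselves) makes the later "was this row in-bag for tree t?" lookup a direct index.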

Python Layer

Added oob_score parameter (boolean only) to Random Forest estimators
Implemented _compute_oob_score() method that leverages FIL's predict_per_tree() for efficient OOB predictions
Added oob_score_ and oob_decision_function_ (or oob_prediction_) attributes
Validates that oob_score is boolean (custom scorer functions not supported)
Added proper attribute transfer for pickle and CPU interop
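
The OOB aggregation described above can be sketched in plain Python. This is an illustrative sketch, not the PR's `_compute_oob_score()` implementation: it takes per-tree class probabilities as a nested list instead of calling FIL's `predict_per_tree()`, and the function names and the use of `None` for rows that were never out-of-bag are assumptions.

```python
def oob_decision_function(per_tree_proba, masks):
    """Average class probabilities over the trees for which a row is OOB.

    per_tree_proba[t][i] is tree t's class-probability list for row i.
    masks[t][i] is True if row i was in tree t's bootstrap sample.
    Rows that are in-bag for every tree get None (a choice of this sketch).
    """
    n_trees = len(per_tree_proba)
    n_rows = len(masks[0])
    n_classes = len(per_tree_proba[0][0])
    result = []
    for i in range(n_rows):
        total = [0.0] * n_classes
        count = 0
        for t in range(n_trees):
            if not masks[t][i]:  # only trees that did not train on row i
                count += 1
                for c in range(n_classes):
                    total[c] += per_tree_proba[t][i][c]
        result.append(None if count == 0 else [v / count for v in total])
    return result


def oob_predictions(decision):
    """Pick the most probable class per row; None stays None."""
    return [
        None if p is None else max(range(len(p)), key=p.__getitem__)
        for p in decision
    ]


per_tree = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],  # tree 0
    [[0.0, 1.0], [0.0, 1.0], [0.5, 0.5]],  # tree 1
]
masks = [
    [True, False, False],  # tree 0 trained on row 0 only
    [False, False, True],  # tree 1 trained on row 2 only
]
decision = oob_decision_function(per_tree, masks)
labels = oob_predictions(decision)
```

In this toy input, row 0 is scored only by tree 1, row 2 only by tree 0, and row 1 by both trees.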

Metrics

Classifier: Uses accuracy score on OOB predictions
Regressor: Uses R² score on OOB predictions
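
For reference, the two metrics named above can be written out as follows. This is a minimal sketch using the standard definitions of accuracy and R²; the function names are illustrative, and the R² helper assumes the target is not constant (otherwise the denominator is zero).

```python
def accuracy(y_true, y_pred):
    """Fraction of rows whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant, so SS_tot is non-zero.
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


clf_score = accuracy([0, 1, 1, 0], [0, 1, 0, 0])     # 3 of 4 correct
reg_score = r2([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])     # predicting the mean
```

Predicting the mean of the targets for every row gives an R² of zero, and a perfect prediction gives one.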

Limitations

Custom scorer functions (callable oob_score) are not supported; only boolean values are accepted
Multi-output targets are not supported for OOB scoring
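
A validation step enforcing these limitations might look like the sketch below. The function name `check_oob_config`, its arguments, and the exception types are assumptions for illustration; the PR's commit history mentions raising `UnsupportedOnGPU` for a callable `oob_score`, whereas this sketch uses built-in exceptions. The `bootstrap` check mirrors scikit-learn's rule that OOB estimation requires bootstrap sampling and is likewise an assumption about this PR.

```python
def check_oob_config(oob_score, bootstrap, n_outputs):
    """Validate an OOB configuration; returns oob_score if it is acceptable.

    Hypothetical helper: names and exception types are illustrative only.
    """
    if callable(oob_score):
        # Custom scorer functions are not supported.
        raise TypeError("oob_score must be a bool; callables are not supported")
    if not isinstance(oob_score, bool):
        raise TypeError("oob_score must be a bool")
    if not oob_score:
        return False
    if not bootstrap:
        # Without bootstrap sampling no row is ever out-of-bag.
        raise ValueError("oob_score=True requires bootstrap=True")
    if n_outputs > 1:
        raise ValueError("OOB scoring does not support multi-output targets")
    return True
```

Checking for a callable before the generic type check lets the error message name the unsupported feature explicitly.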

Testing

Added comprehensive tests covering:

Binary and multi-class classification OOB scoring
Regression OOB scoring
Error handling for invalid configurations
Comparison with scikit-learn baseline

copy-pr-bot · 2025-10-28T21:39:38Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

csadorf · 2025-10-28T21:42:02Z

/ok to test ba0c33a

csadorf · 2025-10-30T01:53:31Z

/ok to test 25ab340

csadorf · 2025-10-30T02:23:20Z

/ok to test c349e02

csadorf · 2025-10-30T15:36:48Z

/ok to test 4dcfd8c

viclafargue

Thanks for working on this! LGTM, just some minor suggestions for the Python portion.

jcrist

LGTM, nice work! Left two small nits.

csadorf · 2025-11-03T23:41:40Z

/merge

Authors:
- Simon Adorf (https://github.com/csadorf)

Approvers:
- Victor Lafargue (https://github.com/viclafargue)
- Divye Gala (https://github.com/divyegala)
- Jim Crist-Harif (https://github.com/jcrist)

URL: rapidsai#7401

github-actions Bot added the Cython / Python and CUDA/C++ labels Oct 28, 2025

github-actions Bot assigned csadorf Oct 28, 2025

csadorf added the improvement and non-breaking labels Oct 28, 2025

csadorf force-pushed the fea/support-rf-oob-scores branch from 83224e2 to bb46ce7 October 30, 2025 01:15

csadorf added 17 commits October 30, 2025 09:27

initial draft

26b6150

Refactor to use single fit_treelite function

e3de44c

move cupy imports to top of module

21c4970

use a device array for the mask

6c78fa9

use thrust for mask creation

b1416bb

Deduplicate the oob_score computation

6ae6e2f

Remove properties for oob_score_ and oob_decision_function_

8d02729

improve sklearn compatibility

92d514f

update scikit-learn xfail list

4cbda15

Revise documented cuml.accel limitations

7a35527

Remove RF_params.oob_score

8d2a2ba

Not needed since we can trigger the bootstrap_mask storage by providing a non-null pointer.

Revert whitespace changes.

9009c5f

Fixup tests

9881ffe

Support sklearn roundtrip.

6ebb813

Move import to top of module.

15a0a1d

Do not use sklearn.utils.multiclass.type_of_target

a06c9a2

Use thrust::scatter approach

4dcfd8c

csadorf force-pushed the fea/support-rf-oob-scores branch from c349e02 to 4dcfd8c October 30, 2025 15:36

Improve handling of lacking multi-output support and update xfail list.

ba34098

viclafargue approved these changes Oct 31, 2025

Comment thread python/cuml/cuml/ensemble/randomforest_common.pyx Outdated

Comment thread python/cuml/tests/test_sklearn_import_export.py Outdated

csadorf added 18 commits October 31, 2025 09:34

Do not persist bootstrap_masks.

c11e4ce

Improve language in limitation docs.

409e8c9

Inline compute_oob_score_metric

44a1a0a

Remove obsolete bootstrap_masks vector attribute.

1e8646b

Use raft::resource::get_thrust_policy instead of thrust::cuda::par.on.

5dae6a5

Improve code comment on per_tree prediction shape

120ee3e

Inline _validate_target_array

b687757

Revert "Use raft::resource::get_thrust_policy instead of thrust::cuda::par.on."

335ee72

This reverts commit 5dae6a5.

Make persistent oob arrays CumlArrayDescriptor and order explicit.

240ed0d

Raise UnsupportedOnGPU for a callable oob_score.

2815ce2

Improve output_type handling in RandomForest cuml.accel.

e833904

Remove assert strings

14549d0

Parametrize sklearn-import-export tests on oob_score.

a6b03c9

Check oob_score_ equality.

e5f3dcd

Fixup doc-string documentation.

06f25b4

Merge remote-tracking branch 'origin/main' into fea/support-rf-oob-scores

09f687c

Use rmm::exec_policy(s) instead of thrust::cuda::par.on(s)

3be794e

Remove more xfails.

cb23443

divyegala approved these changes Oct 31, 2025

jcrist approved these changes Oct 31, 2025

Comment thread python/cuml/cuml/accel/_wrappers/sklearn/ensemble.py Outdated

Comment thread python/cuml/cuml/ensemble/randomforest_common.pyx Outdated

csadorf added 5 commits October 31, 2025 17:05

Do not broadly catch ValueError.

75e0e7a

Tiny update to error message.

d2b5a61

Document that cuml.accel will fallback RF for multi-output targets.

c874432

Remove passing test from xfail list.

1bc4f32

Merge branch 'main' into fea/support-rf-oob-scores

9f4a712

rapids-bot Bot merged commit d5cfd77 into rapidsai:main Nov 3, 2025
106 checks passed

csadorf deleted the fea/support-rf-oob-scores branch November 3, 2025 23:41

rapids-bot[bot] merged 41 commits into rapidsai:main from csadorf:fea/support-rf-oob-scores

5 participants
