Make `get_param_names` a class method on single GPU estimators to match Scikit-learn closer by dantegd · Pull Request #6101 · rapidsai/cuml

dantegd · 2024-10-07T22:43:15Z

A small difference between our estimators and Scikit-learn's is that `get_param_names` is a classmethod in sklearn but not in ours. This can make a few corner cases fail when our estimators are used where Scikit-learn-like estimators are expected. This PR fixes that.

Note: This will not include dask-based estimators for the time being since they depend on introspection at object creation time.
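To illustrate the kind of corner case described above, here is a minimal sketch using two hypothetical estimator classes (they are illustrative stand-ins, not cuML code). Code that expects Scikit-learn-like estimators may look up parameter names on the class itself, which only works when the method is a classmethod:

```python
class InstanceMethodEstimator:
    """Hypothetical estimator exposing parameter names via an instance method."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def get_param_names(self):
        return ["alpha"]


class ClassMethodEstimator:
    """Hypothetical estimator exposing parameter names via a classmethod."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    @classmethod
    def get_param_names(cls):
        return ["alpha"]


# Calling on the class, without an instance, works for the classmethod variant.
names = ClassMethodEstimator.get_param_names()

# The instance-method variant fails because no `self` is supplied.
try:
    InstanceMethodEstimator.get_param_names()
    instance_call_failed = False
except TypeError:
    instance_call_failed = True
```

The classmethod variant can still be called on an instance, so existing instance-level callers keep working.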

divyegala

Why were none of the Cython files changed?

dantegd · 2024-10-08T00:54:42Z

lol because I wrote a script to do this and only checked .py files instead of .pyx files too

betatim · 2024-10-14T09:18:04Z

Do you want to make them private at the same time? On the one hand, then we'd be 100% the same. On the other hand, the fact that they are private in scikit-learn makes me wonder if this matters (as they aren't part of the public API)?

dantegd · 2024-10-16T14:53:20Z

@betatim good point, I think matching as closely as possible (i.e. entirely) is a good idea

betatim · 2024-11-11T11:26:42Z

+### Implementing `_get_param_names()`

-To support cloning, estimators need to implement the function `get_param_names()`. The returned value should be a list of strings of all estimator attributes that are necessary to duplicate the estimator. This method is used in `Base.get_params()` which will collect the collect the estimator param values from this list and pass this dictionary to a new estimator constructor. Therefore, all strings returned by `get_param_names()` should be arguments in `__init__()` otherwise an invalid argument exception will be raised. Most estimators implement `get_param_names()` similar to:
+To support cloning, estimators need to implement the function `_get_param_names()`. The returned value should be a list of strings of all estimator attributes that are necessary to duplicate the estimator. This method is used in `Base.get_params()` which will collect the collect the estimator param values from this list and pass this dictionary to a new estimator constructor. Therefore, all strings returned by `_get_param_names()` should be arguments in `__init__()` otherwise an invalid argument exception will be raised. Most estimators implement `_get_param_names()` similar to:


Suggested change

-To support cloning, estimators need to implement the function `_get_param_names()`. The returned value should be a list of strings of all estimator attributes that are necessary to duplicate the estimator. This method is used in `Base.get_params()` which will collect the collect the estimator param values from this list and pass this dictionary to a new estimator constructor. Therefore, all strings returned by `_get_param_names()` should be arguments in `__init__()` otherwise an invalid argument exception will be raised. Most estimators implement `_get_param_names()` similar to:

+To support cloning, estimators need to implement the function `_get_param_names()`. The returned value should be a list of strings of all estimator attributes that are necessary to duplicate the estimator. This method is used in `Base.get_params()` which will collect the estimator param values from this list and pass this dictionary to a new estimator constructor. Therefore, all strings returned by `_get_param_names()` should be arguments in `__init__()` otherwise an invalid argument exception will be raised. Most estimators implement `_get_param_names()` similar to:
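The guide text quoted above ends with "similar to:", but the code it refers to is not part of this thread. As a stand-in, here is a minimal sketch of the pattern the paragraph describes. `SketchBase` and `SketchEstimator` are hypothetical names, and the simplified `get_params()` is an assumption about how the names are consumed, not cuML's actual `Base` implementation:

```python
class SketchBase:
    """Hypothetical stand-in for a base estimator class."""

    def __init__(self, *, verbose=False):
        self.verbose = verbose

    @classmethod
    def _get_param_names(cls):
        # Names of constructor arguments handled by the base class.
        return ["verbose"]

    def get_params(self, deep=True):
        # Collect the current value of every declared parameter.
        return {name: getattr(self, name) for name in self._get_param_names()}


class SketchEstimator(SketchBase):
    """Hypothetical estimator extending the base parameter list."""

    def __init__(self, *, eps=0.5, min_samples=5, verbose=False):
        super().__init__(verbose=verbose)
        self.eps = eps
        self.min_samples = min_samples

    @classmethod
    def _get_param_names(cls):
        # Every name returned here must also be an `__init__` argument.
        return super()._get_param_names() + ["eps", "min_samples"]


# Cloning: pass the collected params to a new constructor call.
original = SketchEstimator(eps=0.1)
duplicate = SketchEstimator(**original.get_params())
```

If `_get_param_names()` returned a name that `__init__()` does not accept, the constructor call in the last line would raise a `TypeError`, which is the failure the guide text warns about.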

betatim · 2024-11-11T11:28:06Z

It is a big rename diff :D

Looks good from a quick check. One thing I noticed: some are classmethods (the majority) but some aren't. Oversight? If on purpose it is maybe worth adding a comment to the ones that aren't to help people from the future understand why they are different.

dantegd · 2024-11-11T22:02:20Z

@betatim the ones that haven't changed to be class methods are the dask-based estimators. They currently depend on some runtime behavior, so I would suggest we do those in a follow-up.
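As a purely hypothetical illustration of what "runtime behavior" can mean here (this is an assumed pattern for the sake of the example, not a description of cuML's dask estimators): if the parameter names are derived from whatever keyword arguments an instance received at creation time, the class alone cannot answer the question, so the method cannot simply become a classmethod.

```python
class KwargsCapturingEstimator:
    """Hypothetical estimator whose parameter names depend on the instance."""

    def __init__(self, **kwargs):
        # The accepted parameters are only known once an object is created.
        self.kwargs = kwargs

    def get_param_names(self):
        # Requires `self`: the names come from per-instance state.
        return list(self.kwargs.keys())


first = KwargsCapturingEstimator(n_clusters=3, verbose=True)
second = KwargsCapturingEstimator(max_iter=100)

# Two instances of the same class report different parameter names.
first_names = first.get_param_names()
second_names = second.get_param_names()
```

Converting such a method to a classmethod would first require moving the parameter list to the class level, which fits the suggestion to handle it in a follow-up.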

betatim · 2024-11-12T13:59:52Z

Ok. Maybe they don't need a comment then.

Time to merge?

dantegd · 2024-11-12T16:12:12Z

/merge

ENH Make get_param_names a class method to match Scikit-learn

d63e222

dantegd requested a review from a team as a code owner October 7, 2024 22:43

dantegd requested review from cjnolet and divyegala October 7, 2024 22:43

github-actions bot added the Cython / Python label Oct 7, 2024

divyegala reviewed Oct 8, 2024

ENH Make get_param_names a class method in cython files too

4a1a7bc

dantegd changed the title ~~Make get_param_names a class method to match Scikit-learn closer~~ Make get_param_names a class method on single GPU estimators to match Scikit-learn closer Oct 8, 2024

dantegd added 3 commits October 7, 2024 20:18

FIX remove changes to dask estimators

9eb0255

Merge branch 'branch-24.12' into 2412-fix-classmethod

1086f75

Style fixes

abf81bd

dantegd added the improvement and non-breaking labels Oct 8, 2024

dantegd added 2 commits October 7, 2024 21:28

FIX self to cls in qn.pyx

f43e580

FIX final typo fix hopefully

c902164

singhmanas1 assigned dantegd Oct 8, 2024

divyegala added 4 commits November 6, 2024 13:46

Merge remote-tracking branch 'upstream/branch-24.12' into 2412-fix-cl…

00c14d6

…assmethod

passing tests

d46c31d

Merge branch 'branch-24.12' into 2412-fix-classmethod

24a8047

Update ESTIMATOR_GUIDE.md

009546c

betatim reviewed Nov 11, 2024

wphicks approved these changes Nov 12, 2024

rapids-bot Bot merged commit 8e195fb into rapidsai:branch-24.12 Nov 12, 2024

rapids-bot[bot] merged 11 commits into rapidsai:branch-24.12 from dantegd:2412-fix-classmethod
