Implement Ridge .solver_ estimated attribute #6415
rapids-bot[bot] merged 4 commits into rapidsai:branch-25.04
jcrist left a comment
2 general comments:
- Presumably this is fixing a bug/increasing compatibility within `cuml.accel`. Can you add a test for this behavior?
- From the comments in this PR alone I cannot tell what issue this PR resolves or why it's being made. In the future can you try and include comments about why a PR is being made, why the approach you took solves the issue, etc...? This helps immensely with reviewing PRs and disseminating knowledge about why things are the way they are across the team. It also helps when future us run `git blame` to find out why this code was written :)
I just implemented a few additional basic tests and it turns out that the [...] The [...] cuml/python/cuml/cuml/linear_model/ridge.pyx, line 254 in 3a8ea8c, which then triggers this error: lines 214 to 215 in 3a8ea8c.
@dantegd Do you have more context for this?
    """Test that the solver attribute is translated correctly."""
    model = Ridge(solver=solver, random_state=42)
    assert (
        model.solver == expected
The fact that the attribute isn't equal to the value the user passed in is a curiosity of the current proxying implementation. How can we note that in the test so that when we change the implementation we know to modify this test/delete it? Leave a comment for people from the future?
It is and it isn't. The skl implementation will also have a deviation in case that the default "auto" value is selected.
I further do not think that this is an artifact of our proxy implementation, but an artifact of the fundamental function of the acceleration mode. We can decide that the acceleration mode must faithfully obey all parameters 100%, including the solver selection, in which case we would have to fall back to CPU mode whenever a solver is selected that is not supported on the GPU. However, if we deem that selecting an equivalent suitable solver is acceptable, then I don't think we should pretend that a specific skl solver was used when, in fact, it wasn't.
To be clear, I am not arguing that this is the right approach, but I am arguing that the deviation is not just an artifact of the current proxy layer implementation. I don't think that we need any further comment to remind us here, because this test will fail if we make an engineering decision to change the behavior.
The `solver` attribute will be set to "auto" in scikit-learn. The `solver_` attribute will contain the name of the solver that was actually used.

    In [20]: from sklearn.linear_model import Ridge
    In [21]: r = Ridge()
    In [22]: from sklearn.datasets import make_classification, make_regression
    In [23]: X, y = make_regression()
    In [24]: r.fit(X, y)
    Out[24]: Ridge()
    In [25]: r.solver
    Out[25]: 'auto'
    In [26]: r.solver_
    Out[26]: 'cholesky'

I don't think we need to select the same solver, but I do think we should work towards passing the scikit-learn common tests, which require that all constructor arguments are stored in attributes and that those remain unchanged.
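The convention described here (constructor arguments stored unchanged, the solver actually used reported in a trailing-underscore attribute) can be illustrated with a minimal sketch. `SketchRidge` is a hypothetical stand-in written for illustration only; it is neither the scikit-learn nor the cuML class, it does no numerical work, and resolving "auto" to "cholesky" simply mirrors the session above rather than any real selection rule.

```python
class SketchRidge:
    """Hypothetical stand-in for an estimator (not scikit-learn or cuML code)."""

    def __init__(self, alpha=1.0, solver="auto"):
        # Constructor arguments are stored as given and never modified later.
        self.alpha = alpha
        self.solver = solver

    def get_params(self):
        # Report the hyperparameters exactly as the user passed them.
        return {"alpha": self.alpha, "solver": self.solver}

    def fit(self, X, y):
        # The solver that is actually used goes into a separate estimated
        # attribute. Assumption: "auto" resolves to "cholesky" here, only to
        # mirror the IPython session; no model is actually fitted.
        self.solver_ = "cholesky" if self.solver == "auto" else self.solver
        return self


model = SketchRidge().fit([[0.0], [1.0]], [0.0, 1.0])
print(model.solver, model.solver_)  # prints: auto cholesky
```

With this split, a check that compares `get_params()` before and after `fit` passes, while the resolved solver is still available through `solver_`.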
I don't understand your point, that is the exact behavior I was describing:
> The skl implementation will also have a deviation in case that the default "auto" value is selected.
I somehow missed the first line/mixed it with the second of the test parametrisation :-/ so I thought that you pass "auto" but then the attribute is set to "eig".
- Changed default solver from 'eig' to 'auto', allowing automatic selection of 'eig'.
- Updated documentation to reflect new solver options: 'auto', 'eig', 'svd', and 'cd'.
- Refactored solver selection logic into a new method `_select_solver` for better clarity and maintainability.
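As a rough illustration of what the commit describes, the selection step might look like the sketch below. This is an assumption-laden sketch, not the code from the PR: the free function `select_solver` and the constant `SUPPORTED_SOLVERS` are hypothetical names, and the actual `_select_solver` method may take other factors into account. Only the option names and the 'auto' to 'eig' resolution come from the commit message.

```python
# Solver options named in the commit message.
SUPPORTED_SOLVERS = ("auto", "eig", "svd", "cd")


def select_solver(solver="auto"):
    """Return the concrete solver name for a requested solver.

    Hypothetical helper: rejects unknown names and resolves "auto" to
    "eig", as described in the commit message.
    """
    if solver not in SUPPORTED_SOLVERS:
        raise ValueError(
            f"solver must be one of {SUPPORTED_SOLVERS}, got {solver!r}"
        )
    # "auto" keeps the previous default behavior by resolving to "eig".
    return "eig" if solver == "auto" else solver


print(select_solver())       # prints: eig
print(select_solver("svd"))  # prints: svd
```

Because "auto" resolves to the former default, code that never passed `solver` would keep using "eig" under this sketch.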
Adds the `.solver_` estimated attribute in addition to the `.solver` hyperparameter.
Switches the default cuml `solver` hyperparameter from "eig" to "auto" (backwards-compatible).