Forward merge release/25.12 into main #7531
Merged
gforsyth merged 8 commits into rapidsai:main on Nov 25, 2025
This:

- Cleans up `SGD`, `MBSGDClassifier`, and `MBSGDRegressor`, following all the guidelines in rapidsai#7317.
- Adds a new `fit_sgd` function to handle fitting a linear model using SGD. This was the last part of rapidsai#6938 sans deprecation/removal of the solver classes themselves.
- Removes the undocumented `solver_model` attribute in favor of storing the fitted attributes on the models themselves.
- Adds support for all label types to `MBSGDClassifier`, bringing it in line with our other classifiers.
- Adds a validation check to `MBSGDClassifier` to ensure it's fitting a binary classification problem, since multiclass is currently not supported.
- Removes the `SGD.predictClass` method. This method is now unused. It didn't validate the `SGD` represented a classification problem, didn't handle non [0, 1] classes, and didn't match any standard method name or interface. Our other solvers only support regression problems, with the caller required to convert the output to solve a classification problem when needed. I dropped it as a breaking change here since I doubt anyone is using it, but could back off to a deprecation if people feel strongly. Dropping it lets us rip out `target_dtype` sooner/easier.

Breaking Change Summary:

- Removal of `SGD.predictClass`
- `MBSGDClassifier.classes_` is now always a `numpy.ndarray` (mirroring the recent work on our other classifiers)

With this cleanup, `target_dtype` is no longer used. After this is in we can remove that bit from our api decorators/base class to simplify our internals further.

Part of rapidsai#7317. Fixes rapidsai#6938.

Authors:
- Jim Crist-Harif (https://github.com/jcrist)

Approvers:
- Simon Adorf (https://github.com/csadorf)

URL: rapidsai#7504
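As a rough illustration of the "caller converts the output" pattern described above, here is a minimal pure-Python sketch. It is not cuML's implementation: the helper names `encode_binary_labels` and `decode_predictions` are hypothetical, and the 0.5 decision threshold is an assumption made for the example. It shows arbitrary labels being checked for a binary problem, mapped to 0.0/1.0 for a regression-style solver, and mapped back afterwards.

```python
def encode_binary_labels(y):
    """Map arbitrary (sortable) labels to 0.0/1.0.

    Returns (classes, encoded), where `classes` is the sorted list of the
    two distinct labels. Hypothetical helper, for illustration only.
    """
    classes = sorted(set(y))
    if len(classes) != 2:
        # Mirrors the idea of a binary-only validation check.
        raise ValueError(f"expected exactly 2 classes, got {len(classes)}")
    index = {label: float(i) for i, label in enumerate(classes)}
    return classes, [index[label] for label in y]


def decode_predictions(scores, classes, threshold=0.5):
    """Turn regression-style scores back into the original labels.

    The threshold of 0.5 is an assumption, not something the PR specifies.
    """
    return [classes[1] if score > threshold else classes[0] for score in scores]


classes, encoded = encode_binary_labels(["spam", "ham", "spam"])
labels = decode_predictions([0.9, 0.2, 0.5], classes)
```

Keeping the encode/decode step outside the solver is what lets the solver itself stay regression-only, which is the design the description argues for.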
Follow up to rapidsai#7440, use S3 as the preprocessor cache location.

Contributes to rapidsai/build-planning#228

Authors:
- Paul Taylor (https://github.com/trxcllnt)

Approvers:
- Nate Rock (https://github.com/rockhowse)
- Bradley Dice (https://github.com/bdice)

URL: rapidsai#7510
This now provides exact consistency.

Fixes rapidsai#5147.

Authors:
- Jim Crist-Harif (https://github.com/jcrist)

Approvers:
- Anupam (https://github.com/aamijar)
- Divye Gala (https://github.com/divyegala)
- Simon Adorf (https://github.com/csadorf)

URL: rapidsai#7518
With all the work in rapidsai#7317, we're now at a point where `target_dtype` is no longer used. This PR removes `target_dtype` and all supporting infra. This simplifies our decorators and base class, and reduces the amount of state stored on an estimator.

Since this was all private implementation details, this is not a breaking change.

Authors:
- Jim Crist-Harif (https://github.com/jcrist)

Approvers:
- Anupam (https://github.com/aamijar)
- Simon Adorf (https://github.com/csadorf)

URL: rapidsai#7516
…dsai#7481)

Closes rapidsai#7143

This PR improves memory usage in UMAP when given a precomputed knn graph. Previously, a user-given knn graph occupied GPU memory throughout the full UMAP pipeline, even though it is not needed in the later steps of UMAP. With this PR, if the user-given knn graph is in host memory, we keep it there and copy it to the device at the cpp level, which allows better memory management.

### This PR with precomputed knn graph on CPU

<img width="808" height="313" alt="Screenshot 2025-11-12 at 7 00 33 PM" src="https://github.com/user-attachments/assets/6c752f62-a1b2-4fb1-a44d-d86ed468915b" />

### Before with precomputed knn graph on CPU

<img width="828" height="316" alt="Screenshot 2025-11-12 at 7 01 12 PM" src="https://github.com/user-attachments/assets/8237fdd4-e0bb-48f5-bc46-71878ce14b33" />

Authors:
- Jinsol Park (https://github.com/jinsolp)

Approvers:
- Philip Hyunsu Cho (https://github.com/hcho3)
- Simon Adorf (https://github.com/csadorf)
- Tarang Jain (https://github.com/tarang-jain)

URL: rapidsai#7481
Dropping xgboost from our CI for now while upstream builds are fixed.

Stopgap for rapidsai#7520. Supersedes rapidsai#7523.

Authors:
- Jim Crist-Harif (https://github.com/jcrist)

Approvers:
- https://github.com/jakirkham

URL: rapidsai#7526
## Summary

This PR improves documentation quality and consistency across cuML's Dask multi-GPU estimators, adds a comprehensive multi-GPU guide, and fixes two minor bugs in KNeighborsClassifier and RandomForestClassifier.

## Changes

### New Documentation

- Added `dask_multigpu_guide.ipynb` - comprehensive guide for multi-GPU usage with Dask

### Documentation Improvements

- Standardized terminology: "multi-node multi-GPU", "Dask cuDF DataFrame"
- Fixed docstring formatting (parameter underlines, spacing, capitalization)
- Removed "experimental" language from stable APIs
- Added known limitations:
  - PCA: `random_state` parameter not supported in MNMG
  - LogisticRegression: labels must be float32 dtype with code example
  - UMAP: clarified this is for distributed inference only, not training
- Improved class docstrings with clearer descriptions
- Fixed typos and improved grammar throughout

### Bug Fixes

- **KNeighborsClassifier**: Added CuPy array support for label handling
- **RandomForestClassifier**: Fixed `unique()` handling for Dask Arrays vs DataFrames

Closes rapidsai#7309. Fixes rapidsai#3663.

Authors:
- Simon Adorf (https://github.com/csadorf)

Approvers:
- Jim Crist-Harif (https://github.com/jcrist)

URL: rapidsai#7499
csadorf approved these changes Nov 25, 2025
/merge nosquash
Could not determine original ForwardMerger PR from branch name. The branch name should follow the pattern
I accidentally merged #7519 with `/merge`; this redoes it and should fix things. It should also resolve the conflicts in #7529.