-
Notifications
You must be signed in to change notification settings - Fork 72
[REVIEW]Add support and tests for cuML and XGBoost #330
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## main #330 +/- ##
==========================================
+ Coverage 95.69% 95.70% +0.01%
==========================================
Files 65 65
Lines 2854 2863 +9
Branches 534 536 +2
==========================================
+ Hits 2731 2740 +9
+ Misses 75 74 -1
- Partials 48 49 +1
Continue to review full report at Codecov.
|
|
CC: @charlesbluca , How do i go about adding |
|
You can open a PR adding these packages to the environment created in this dockerfile: https://github.com/rapidsai/dask-build-environment/blob/main/dask_sql.Dockerfile |
|
@charlesbluca , This is now ready for review. This PR is dependent on https://github.com/rapidsai/dask-build-environment/pull/16/files |
| X_d = X.repartition(npartitions=1).to_delayed() | ||
| if y is not None: | ||
| y_d = y.repartition(npartitions=1).to_delayed() | ||
| else: | ||
| y_d = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For more context around this see issue: rapidsai/cuml#4406 .
We were previously training on client which is:
a. Very inefficient and possibly problematic in multi-node clusters and heterogeneous setup.
b. Training non distributed xgboost models on dask collections is not supported.
charlesbluca
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work here @VibhuJawa 😄 a couple comments, I'm not the most knowledgeable on Dask ML stuff so would feel good getting a second review on this:
ChrisJar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
charlesbluca
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed this. but generally things look good here! Happy to merge this once cuML / XGBoost are successfully added to gpuCI and tests are passing
|
@GPUtester rerun tests . |
charlesbluca
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think we need to mark these as fixtures to expose them to tests?
|
rerun tests |
|
rerun tests |
Add support and tests for cuML and XGBoost
This PR addresses #309
TODO:
Ensure efficient path for non distributed models and allow single GPU
xgboostto succeedTest for singe GPU
cumlmodelTest for multi gpu
cumlmodelTest for singe gpu
xgboostmodelTest for multi gpu
xgboostmodelUpdate GPU-CI environment with the right libraries
See PR: https://github.com/rapidsai/dask-build-environment/pull/16/files
Follow Up work to enable predict with multi GPU cuML models:
cumlmodel failureSee issue: [BUG]Predict fails with multi GPU cuML models #332