Provide better documentation on running cuML estimators and algorithms in multi-node/multi-GPU contexts using Dask wrapper implementations.
Key areas to cover:
- Setup and configuration for distributed environments
- Available Dask-enabled estimators
- Code examples for common use cases
- Performance considerations and best practices
- Guidance on when it makes sense to run on a single GPU versus multiple GPUs or nodes
Proposed implementation
We should add a page to the User Guide. We can leverage some of the existing content (notebooks/kmeans_mnmg_demo.ipynb, notebooks/random_forest_mnmg_demo.ipynb). However, we need to keep in mind that our docs build jobs do not have multiple GPUs available, which may make it difficult to directly include executable notebook content.
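As a starting point for the "Code examples for common use cases" item, here is a rough sketch of the kind of snippet the page could show. It assumes a single node with one or more GPUs and that `dask-cuda` and `cuml` are installed. The function name `run_kmeans_mnmg` and its default sizes are made up for illustration, and this is not taken from the existing notebooks.

```python
def run_kmeans_mnmg(n_samples=10_000, n_features=8, n_clusters=4):
    """Fit cuML's Dask KMeans on synthetic blobs using the local GPUs.

    Hypothetical helper for illustration; the name and defaults are assumptions.
    """
    # Imported lazily so the module itself can be imported on machines
    # without GPUs (for example, a docs build job).
    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster
    from cuml.dask.cluster import KMeans
    from cuml.dask.datasets import make_blobs

    # LocalCUDACluster starts one Dask worker per visible GPU on this node.
    with LocalCUDACluster() as cluster, Client(cluster) as client:
        # Generate the data directly on the workers as a distributed array.
        X, _ = make_blobs(
            n_samples=n_samples,
            n_features=n_features,
            centers=n_clusters,
            random_state=0,
        )

        # The Dask wrapper takes the usual KMeans parameters plus the client.
        model = KMeans(n_clusters=n_clusters, random_state=0, client=client)
        model.fit(X)

        return model.cluster_centers_


# Example call (requires at least one GPU):
# centers = run_kmeans_mnmg()
```

For a multi-node setup, the `LocalCUDACluster` would presumably be replaced by a `Client` connected to an existing scheduler address, with the rest unchanged. Since the docs build has no multi-GPU hardware, a snippet like this would likely have to be shown as static code rather than executed.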