Scaling Machine Learning Workloads on GPU Clusters
- Configured an NVIDIA GPU cluster with CUDA, cuDNN, and jaxlib.
- Used Alpa on the Ray framework to scale inference workloads across GPUs, combining model parallelism with statistical multiplexing.
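
To make the statistical multiplexing point concrete, here is a minimal sketch in plain Python. It does not use Alpa or Ray; it is a toy queueing model. It assumes two GPUs, a fixed per-request service time, and zero communication overhead between pipeline stages. The function names and the request trace are hypothetical and chosen only for illustration. It compares a burst of requests for one model under a dedicated placement (one model per GPU) against a model-parallel placement where each model is split into pipeline stages across both GPUs.

```python
def dedicated_latencies(arrivals, service_time):
    """One GPU per model: a request only queues behind its own model's requests."""
    gpu_free_at = {}  # model name -> time its dedicated GPU becomes free
    latencies = []
    for t, model in sorted(arrivals):
        start = max(t, gpu_free_at.get(model, 0.0))
        finish = start + service_time
        gpu_free_at[model] = finish
        latencies.append(finish - t)
    return latencies


def pipelined_latencies(arrivals, service_time, num_stages):
    """Every model is split into num_stages stages, one stage per GPU.

    All requests flow through all GPUs in FIFO order, so a burst for one
    model can use GPUs that would otherwise sit idle.
    Assumption: stages are perfectly balanced and communication is free.
    """
    stage_time = service_time / num_stages
    stage_free_at = [0.0] * num_stages
    latencies = []
    for t, _model in sorted(arrivals):
        ready = t
        for s in range(num_stages):
            start = max(ready, stage_free_at[s])
            ready = start + stage_time
            stage_free_at[s] = ready
        latencies.append(ready - t)
    return latencies


def mean(values):
    return sum(values) / len(values)


# Hypothetical trace: four requests for model "A" arrive at once, none for "B".
burst = [(0.0, "A")] * 4

dedicated_mean = mean(dedicated_latencies(burst, service_time=1.0))
pipelined_mean = mean(pipelined_latencies(burst, service_time=1.0, num_stages=2))

print(f"dedicated placement mean latency: {dedicated_mean}")
print(f"pipelined placement mean latency: {pipelined_mean}")
```

In this toy trace the dedicated placement leaves the second GPU idle during the burst, while the pipelined placement keeps both GPUs busy and lowers mean latency. Real deployments also pay communication and stage-imbalance overhead, which this sketch ignores.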