Commit 9b5901e

ci: set LD_LIBRARY_PATH in Docker images for correct cuBLAS detection (#2468)
<!-- .github/pull_request_template.md -->

## 📌 Description

Summary

* Add `LD_LIBRARY_PATH` to Docker images to ensure pip-installed `nvidia-cublas` takes precedence over system libraries
* Fixes issues where incorrect cuBLAS versions could be loaded at runtime

Example of what happens without prepending the path to `LD_LIBRARY_PATH` in our cu130 containers:

```
$ docker run --gpus all -it flashinfer/flashinfer-ci-cu130:20260131-a52eff1
Unable to find image 'flashinfer/flashinfer-ci-cu130:20260131-a52eff1' locally
20260131-a52eff1: Pulling from flashinfer/flashinfer-ci-cu130
Digest: sha256:582aeb35289cf804735a31727abe8ff37ae722fe6c7bd7fb8ddf50654429ff7a
Status: Downloaded newer image for flashinfer/flashinfer-ci-cu130:20260131-a52eff1

==========
== CUDA ==
==========

CUDA Version 13.0.1

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

(py312) root@fdac9b9cd61e:/workspace# python -c "import torch; print(torch.matmul(torch.randn(128,128,device='cuda'), torch.randn(128,128,device='cuda')))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
(py312) root@fdac9b9cd61e:/workspace# export LD_LIBRARY_PATH=/opt/conda/envs/py312/lib/python3.12/site-packages/nvidia/cu13/lib/:$LD_LIBRARY_PATH
(py312) root@fdac9b9cd61e:/workspace# python -c "import torch; print(torch.matmul(torch.randn(128,128,device='cuda'), torch.randn(128,128,device='cuda')))"
tensor([[ 14.9044, 14.3420, 26.0861, ..., -10.4334, -4.5352, 4.2331],
        [ 1.9701, 13.6111, 1.0954, ..., 3.0715, -2.9266, 7.8847],
        [ 6.5089, -7.4811, -12.6226, ..., -5.3695, -4.4557, -22.4567],
        ...,
        [-12.0462, -2.0045, 15.7295, ..., -4.5688, 22.5680, -11.9852],
        [ -0.4228, 10.2761, 0.1951, ..., 16.5192, 12.7168, 0.9931],
        [ -0.2800, -5.7174, -2.9644, ..., 1.8484, -10.0042, -7.7290]], device='cuda:0')
```

<!-- What does this PR do? Briefly describe the changes and why they’re needed. -->

## 🔍 Related Issues

<!-- Link any related issues here -->

## 🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

### ✅ Pre-commit Checks

- [ ] I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
- [ ] I have installed the hooks with `pre-commit install`.
- [ ] I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues.

> If you are unsure about how to set up `pre-commit`, see [the pre-commit documentation](https://pre-commit.com/).

## 🧪 Tests

- [ ] Tests have been added or updated as needed.
- [ ] All tests are passing (`unittest`, etc.).

## Reviewer Notes

<!-- Optional: anything you'd like reviewers to focus on, concerns, etc. -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **Chores**
  * Updated Docker build configurations for CUDA 12.6, 12.8, 12.9, and 13.0 to set runtime library precedence so conda-installed NVIDIA cuBLAS libraries are favored over system libraries.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
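One way to confirm which cuBLAS build a process actually picked up is to inspect its memory mappings. The following is a minimal sketch, assuming a Linux container where `/proc/self/maps` is readable; `mapped_libraries` is a hypothetical helper written for this illustration, not part of FlashInfer or PyTorch.

```python
from pathlib import Path


def mapped_libraries(maps_text: str, needle: str = "libcublas") -> list:
    """Return unique file paths from /proc/<pid>/maps-style text whose
    file name contains `needle` (this also matches libcublasLt)."""
    paths = []
    for line in maps_text.splitlines():
        # File-backed mappings have six fields:
        # address perms offset dev inode pathname
        fields = line.split(maxsplit=5)
        if len(fields) == 6 and needle in Path(fields[5]).name:
            if fields[5] not in paths:
                paths.append(fields[5])
    return paths


def report_current_process(needle: str = "libcublas") -> list:
    """Read this process's own mappings (Linux only)."""
    return mapped_libraries(Path("/proc/self/maps").read_text(), needle)
```

Calling `report_current_process()` after the `torch.matmul` from the reproduction above should list either the system copy or the one under `site-packages/nvidia/`, which shows whether the `LD_LIBRARY_PATH` change took effect.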
1 parent c7761ad commit 9b5901e

4 files changed

Lines changed: 12 additions & 0 deletions

docker/Dockerfile.cu126

Lines changed: 3 additions & 0 deletions
@@ -19,6 +19,9 @@ RUN echo "source activate py312" >> ~/.bashrc
 ENV PATH="/opt/conda/bin:$PATH"
 ENV PATH="/opt/conda/envs/py312/bin:$PATH"

+# Ensure pip-installed nvidia-cublas takes precedence over system libraries
+ENV LD_LIBRARY_PATH="/opt/conda/envs/py312/lib/python3.12/site-packages/nvidia/cublas/lib/:$LD_LIBRARY_PATH"
+
 # Install torch and other python packages
 COPY requirements.txt /install/requirements.txt
 COPY docker/install/install_python_packages.sh /install/install_python_packages.sh
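Prepending works because directories in `LD_LIBRARY_PATH` are searched left to right and the first match wins. The sketch below models only that left-to-right rule; it is a simplification, since the real dynamic loader also consults RPATH/RUNPATH entries, the `ld.so` cache, and default directories. `first_match` and `demo` are hypothetical helpers, and `libcublas.so.13` is just an illustrative file name.

```python
import tempfile
from pathlib import Path
from typing import Optional


def first_match(search_path: str, soname: str) -> Optional[str]:
    """Return the first `directory/soname` that exists, scanning the
    colon-separated search path from left to right."""
    for directory in search_path.split(":"):
        if not directory:
            # Simplification: the real loader treats an empty entry as
            # the current working directory; this model skips it.
            continue
        candidate = Path(directory) / soname
        if candidate.is_file():
            return str(candidate)
    return None


def demo() -> tuple:
    """Create two fake library directories and compare lookup results
    before and after prepending the wheel directory."""
    with tempfile.TemporaryDirectory() as root:
        system_dir = Path(root) / "system"
        wheel_dir = Path(root) / "wheel"
        for directory in (system_dir, wheel_dir):
            directory.mkdir()
            (directory / "libcublas.so.13").touch()
        before = first_match(str(system_dir), "libcublas.so.13")
        after = first_match(f"{wheel_dir}:{system_dir}", "libcublas.so.13")
        return Path(before).parent.name, Path(after).parent.name
```

In this model, `demo()` resolves to the `system` copy before the prepend and to the `wheel` copy after it, which mirrors the intent of the `ENV LD_LIBRARY_PATH=...` line in the diff.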

docker/Dockerfile.cu128

Lines changed: 3 additions & 0 deletions
@@ -19,6 +19,9 @@ RUN echo "source activate py312" >> ~/.bashrc
 ENV PATH="/opt/conda/bin:$PATH"
 ENV PATH="/opt/conda/envs/py312/bin:$PATH"

+# Ensure pip-installed nvidia-cublas takes precedence over system libraries
+ENV LD_LIBRARY_PATH="/opt/conda/envs/py312/lib/python3.12/site-packages/nvidia/cublas/lib/:$LD_LIBRARY_PATH"
+
 # Install torch and other python packages
 COPY requirements.txt /install/requirements.txt
 COPY docker/install/install_python_packages.sh /install/install_python_packages.sh

docker/Dockerfile.cu129

Lines changed: 3 additions & 0 deletions
@@ -19,6 +19,9 @@ RUN echo "source activate py312" >> ~/.bashrc
 ENV PATH="/opt/conda/bin:$PATH"
 ENV PATH="/opt/conda/envs/py312/bin:$PATH"

+# Ensure pip-installed nvidia-cublas takes precedence over system libraries
+ENV LD_LIBRARY_PATH="/opt/conda/envs/py312/lib/python3.12/site-packages/nvidia/cublas/lib/:$LD_LIBRARY_PATH"
+
 # Triton
 ENV TRITON_PTXAS_PATH="/usr/local/cuda/bin/ptxas"

docker/Dockerfile.cu130

Lines changed: 3 additions & 0 deletions
@@ -19,6 +19,9 @@ RUN echo "source activate py312" >> ~/.bashrc
 ENV PATH="/opt/conda/bin:$PATH"
 ENV PATH="/opt/conda/envs/py312/bin:$PATH"

+# Set LD_LIBRARY_PATH to ensure pip-installed nvidia-cublas takes precedence over system libraries
+ENV LD_LIBRARY_PATH="/opt/conda/envs/py312/lib/python3.12/site-packages/nvidia/cu13/lib/:$LD_LIBRARY_PATH"
+
 # Triton
 ENV TRITON_PTXAS_PATH="/usr/local/cuda/bin/ptxas"
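The four hunks differ only in the wheel subdirectory: the cu126, cu128, and cu129 images use `nvidia/cublas/lib`, while cu130 uses `nvidia/cu13/lib`. A minimal sketch of that mapping follows. The table is copied from the diff; the helper names `cublas_lib_dir` and `prepend_path` are hypothetical, and the handling of an empty existing value is an assumption that goes beyond what the Dockerfiles do.

```python
from pathlib import PurePosixPath

# site-packages directory of the py312 conda env, as it appears in the diff.
SITE_PACKAGES = "/opt/conda/envs/py312/lib/python3.12/site-packages"

# Wheel library subdirectory per image tag, exactly as set in the four Dockerfiles.
CUBLAS_SUBDIR = {
    "cu126": "nvidia/cublas/lib",
    "cu128": "nvidia/cublas/lib",
    "cu129": "nvidia/cublas/lib",
    "cu130": "nvidia/cu13/lib",
}


def cublas_lib_dir(image_tag: str, site_packages: str = SITE_PACKAGES) -> str:
    """Directory holding the pip-installed cuBLAS libraries for an image tag."""
    return str(PurePosixPath(site_packages) / CUBLAS_SUBDIR[image_tag])


def prepend_path(new_dir: str, existing: str) -> str:
    """Put `new_dir` first. If `existing` is empty, return `new_dir` alone
    so the result has no trailing colon (an empty entry would mean the
    current working directory to the loader)."""
    return f"{new_dir}:{existing}" if existing else new_dir


print(prepend_path(cublas_lib_dir("cu130"), "/usr/local/cuda/lib64"))
```

The Dockerfiles append `:$LD_LIBRARY_PATH` unconditionally, which is fine as long as the base image already sets the variable; the empty-value guard here is only a defensive variation.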
