Commit 9b5901e
ci: set LD_LIBRARY_PATH in Docker images for correct cuBLAS detection (#2468)
<!-- .github/pull_request_template.md -->
## 📌 Description
Summary
* Set `LD_LIBRARY_PATH` in the Docker images so that the pip-installed
  `nvidia-cublas` takes precedence over the system libraries
* Fix cases where an incorrect cuBLAS version could be loaded at
  runtime
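
To illustrate why the order of entries matters, here is a minimal sketch of left-to-right directory lookup. It is a simplified model of how the dynamic linker treats `LD_LIBRARY_PATH`, not the loader's actual algorithm: the real loader also consults RPATH/RUNPATH entries and the `ld.so` cache, which this sketch ignores. The helper names `first_match` and `demo`, the directory layout, and the library file name are hypothetical.

```python
import os
import tempfile


def first_match(lib_name, search_path):
    """Return the first existing file named lib_name along a colon-separated path.

    Simplified model of LD_LIBRARY_PATH lookup: directories are tried
    left to right and the first hit wins. Empty entries are skipped here.
    """
    for directory in search_path.split(":"):
        if not directory:
            continue
        candidate = os.path.join(directory, lib_name)
        if os.path.isfile(candidate):
            return candidate
    return None


def demo():
    """Show that the same library name resolves differently depending on order."""
    # Hypothetical layout: a "system" copy and a "pip" copy of the same library.
    with tempfile.TemporaryDirectory() as root:
        system_dir = os.path.join(root, "system")
        pip_dir = os.path.join(root, "pip")
        for d in (system_dir, pip_dir):
            os.makedirs(d)
            open(os.path.join(d, "libcublas.so.13"), "w").close()

        appended = first_match("libcublas.so.13", system_dir + ":" + pip_dir)
        prepended = first_match("libcublas.so.13", pip_dir + ":" + system_dir)
        # Report only the parent directory name of each resolved file.
        return (os.path.basename(os.path.dirname(appended)),
                os.path.basename(os.path.dirname(prepended)))


if __name__ == "__main__":
    print(demo())
```

In this model the pip copy is found first only when its directory comes before the system directory, which is why the path is prepended rather than appended.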
Example of what happens without prepending the path to `LD_LIBRARY_PATH`
in our cu130 containers:
```
$ docker run --gpus all -it flashinfer/flashinfer-ci-cu130:20260131-a52eff1
Unable to find image 'flashinfer/flashinfer-ci-cu130:20260131-a52eff1' locally
20260131-a52eff1: Pulling from flashinfer/flashinfer-ci-cu130
Digest: sha256:582aeb35289cf804735a31727abe8ff37ae722fe6c7bd7fb8ddf50654429ff7a
Status: Downloaded newer image for flashinfer/flashinfer-ci-cu130:20260131-a52eff1
==========
== CUDA ==
==========
CUDA Version 13.0.1
Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
(py312) root@fdac9b9cd61e:/workspace# python -c "import torch; print(torch.matmul(torch.randn(128,128,device='cuda'), torch.randn(128,128,device='cuda')))"
Traceback (most recent call last):
File "<string>", line 1, in <module>
RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
(py312) root@fdac9b9cd61e:/workspace# export LD_LIBRARY_PATH=/opt/conda/envs/py312/lib/python3.12/site-packages/nvidia/cu13/lib/:$LD_LIBRARY_PATH
(py312) root@fdac9b9cd61e:/workspace# python -c "import torch; print(torch.matmul(torch.randn(128,128,device='cuda'), torch.randn(128,128,device='cuda')))"
tensor([[ 14.9044, 14.3420, 26.0861, ..., -10.4334, -4.5352, 4.2331],
[ 1.9701, 13.6111, 1.0954, ..., 3.0715, -2.9266, 7.8847],
[ 6.5089, -7.4811, -12.6226, ..., -5.3695, -4.4557, -22.4567],
...,
[-12.0462, -2.0045, 15.7295, ..., -4.5688, 22.5680, -11.9852],
[ -0.4228, 10.2761, 0.1951, ..., 16.5192, 12.7168, 0.9931],
[ -0.2800, -5.7174, -2.9644, ..., 1.8484, -10.0042, -7.7290]],
device='cuda:0')
```
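
To confirm which cuBLAS copy a process actually picked up, one option is to inspect its memory map. The following is a rough diagnostic sketch, assuming a Linux container where `/proc/self/maps` is readable; the helper names `mapped_libraries` and `loaded_cublas` are hypothetical and not part of FlashInfer or PyTorch.

```python
def mapped_libraries(maps_text, name_fragment):
    """Return sorted unique file paths from /proc/<pid>/maps text.

    Only paths whose file name contains name_fragment are kept.
    Each maps line has the form:
        address perms offset dev inode pathname
    Anonymous mappings have no pathname field and are skipped.
    """
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if name_fragment in path.rsplit("/", 1)[-1]:
                paths.add(path)
    return sorted(paths)


def loaded_cublas():
    """List cuBLAS shared objects mapped into the current process (Linux only)."""
    with open("/proc/self/maps") as f:
        return mapped_libraries(f.read(), "libcublas")
```

Calling `loaded_cublas()` in the same Python session after the `torch.matmul` attempt should show whether the mapped library lives under the system CUDA directory or under `site-packages/nvidia/`. Libraries that have not been loaded yet will not appear in the list.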
<!-- What does this PR do? Briefly describe the changes and why they’re
needed. -->
## 🔍 Related Issues
<!-- Link any related issues here -->
## 🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.
### ✅ Pre-commit Checks
- [ ] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [ ] I have installed the hooks with `pre-commit install`.
- [ ] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.
> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).
## 🧪 Tests
- [ ] Tests have been added or updated as needed.
- [ ] All tests are passing (`unittest`, etc.).
## Reviewer Notes
<!-- Optional: anything you'd like reviewers to focus on, concerns, etc.
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Updated Docker build configurations for CUDA 12.6, 12.8, 12.9, and
13.0 to set runtime library precedence so that the pip-installed NVIDIA
cuBLAS libraries in the conda environment are favored over system
libraries.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
1 parent c7761ad commit 9b5901e
4 files changed
Lines changed: 12 additions & 0 deletions
Each of the four changed files adds three lines after line 21 (new lines 22-24). File names and the content of the added lines are not preserved here.