Sharing my installation experience, which may apply to any error similar to "cuSolver internal error". I hope it helps. #33134
YukonKong asked this question in Show and tell
Replies: 1 comment
Thanks for sharing!
❌ Error similar to `jaxlib.xla_extension.XlaRuntimeError: INTERNAL: cuSolver internal error`
⚠ Core underlying reason: a version mismatch between `cuda`, `cudnn`, `jax`, and `jaxlib`!
☀ The most helpful information comes from the official documentation:
https://jax.readthedocs.io/en/latest/installation.html#nvidia-gpu
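
Since the root cause is a version mismatch, it can help to first see which versions are actually installed in the active Python environment. Below is a minimal sketch using only the standard library. The function name `installed_versions` and the package-name prefixes (`nvidia-`, `jax-cuda`) are my own assumptions for illustration, so adjust them if your wheels are named differently.

```python
from importlib import metadata

# Core packages named in the post; always reported, even when missing.
CORE_PACKAGES = ("jax", "jaxlib")


def installed_versions(prefixes=("nvidia-", "jax-cuda")):
    """Return {distribution name: version} for JAX and pip-installed CUDA wheels.

    The prefixes are an assumption about how the CUDA-related wheels are
    named; they are not taken from the JAX documentation.
    """
    found = {}
    for name in CORE_PACKAGES:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None  # not installed in this environment

    for dist in metadata.distributions():
        # Normalize the name so "jax_cuda12_plugin" and "jax-cuda12-plugin" match.
        name = (dist.metadata.get("Name") or "").lower().replace("_", "-")
        if name.startswith(tuple(prefixes)):
            found[name] = dist.version
    return found


if __name__ == "__main__":
    for name, version in sorted(installed_versions().items()):
        print(f"{name}: {version or 'not installed'}")
```

This only reports what `pip` installed into the current environment. It cannot see a system-wide CUDA toolkit, which is exactly the kind of installation this post recommends not relying on.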
👉 Forget about the globally installed CUDA and cuDNN you might have set up using traditional methods. Use the `pip` installation recommended in the documentation to save yourself the trouble.
Here's the specific approach:
0. Confirm that your GPU device and driver version meet the requirements. This is the only thing you need to check.
1. Remove all CUDA-related environment variables from your `~/.bashrc` file, such as `CUDA_HOME` and any CUDA entries added to `PATH` and `LD_LIBRARY_PATH`. A system-wide CUDA installation is no longer needed. This step is crucial. Often the installation command is correct, but scripts still pick up the wrong version of the system-level CUDA tools through these environment variables, which leads to errors.
2. Use the `pip` command provided in the documentation for installation. For example: `pip install --upgrade "jax[cuda12_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html`. You can modify the CUDA and JAX versions as needed. Copy the exact command from the documentation, since the extras name and index URL have differed between JAX releases.
That's it. This method will install CUDA and cuDNN at the environment level, just like any other library.
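
To double-check step 1, here is a small sketch that scans the current environment for leftover CUDA entries in the three variables mentioned above. The helper name `find_cuda_entries` and the rule "any entry containing the substring `cuda`" are assumptions I made for illustration; a custom install path without "cuda" in its name would not be caught.

```python
import os

# Variable names the post mentions explicitly.
CUDA_VARS = ("CUDA_HOME", "PATH", "LD_LIBRARY_PATH")


def find_cuda_entries(env):
    """Return {variable: [entries mentioning cuda]} for a mapping of env vars."""
    hits = {}
    for var in CUDA_VARS:
        value = env.get(var, "")
        # PATH-like variables are lists separated by os.pathsep (":" on Linux).
        entries = [e for e in value.split(os.pathsep) if "cuda" in e.lower()]
        if var == "CUDA_HOME" and value:
            entries = [value]  # any non-empty CUDA_HOME counts as a leftover
        if entries:
            hits[var] = entries
    return hits


if __name__ == "__main__":
    leftovers = find_cuda_entries(os.environ)
    if not leftovers:
        print("No CUDA-related entries found.")
    for var, entries in leftovers.items():
        for entry in entries:
            print(f"{var}: {entry}")
```

Run it in a fresh terminal after editing `~/.bashrc`, because an already-open shell keeps the old values.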
Chinese version (translated): the original Chinese text gives the same diagnosis and the same steps as above. It links the documentation at https://docs.jax.dev/en/latest/installation.html#nvidia-gpu, and its example command pins the JAX version: `pip install --upgrade "jax[cuda12_pip]==0.5.0" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html`. Change the CUDA and JAX versions to the ones you need.
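
After installing, a quick way to see whether the error is gone is to run a small dense linear algebra operation. The sketch below is a hedged example, not an official check. The function name `cusolver_smoke_test` is hypothetical, and it relies on my understanding that on a GPU backend a factorization such as Cholesky is handled by cuSOLVER. On a CPU-only install it still runs, but it does not exercise cuSOLVER at all.

```python
def cusolver_smoke_test(n=4):
    """Factor a small matrix with JAX and check the result.

    Returns None if JAX is not installed, otherwise True/False depending on
    whether the Cholesky factor reproduces the input matrix.
    """
    try:
        import jax
        import jax.numpy as jnp
    except ImportError:
        return None  # JAX is not installed in this environment

    # Show which backend and devices JAX picked up (e.g. "gpu" or "cpu").
    print("backend:", jax.default_backend())
    print("devices:", jax.devices())

    a = 2.0 * jnp.eye(n)               # symmetric positive definite input
    chol = jnp.linalg.cholesky(a)      # lower-triangular factor
    return bool(jnp.allclose(chol @ chol.T, a))


if __name__ == "__main__":
    print("result:", cusolver_smoke_test())
```

If the versions are still mismatched, I would expect this call to fail with the same kind of `XlaRuntimeError` shown at the top of the post, while a correct setup reports a GPU backend and a result of `True`.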