update PyTorch easyblock to avoid RPATH linking to CUDA stubs library in libcaffe2_nvrtc.so #2622
…syBuild's --rpath option is used. This is copied from what the generic cmakemake.py easyblock does
…used. It should ALWAYS be set, since without --rpath PyTorch will try to set a RUNPATH that includes the CUDA stubs directory - and this causes obvious problems for any build, regardless of whether --rpath is set
|
Ok, this causes problems during testing: |
|
Now trying to see if setting only |
|
Alternatively, as a targeted patch, I'm wondering if we can't just do |
|
A more generic patch could be (inspired by the one in pytorch/pytorch#35418): (probably put in This relies on the fact that CMake won't RPATH |
|
Ok, so the above patch doesn't resolve the issue. I'm not sure if the syntax is somehow wrong, or whether implicit link directories are RPATH-ed after all, but it ends up with the stubs in the RPATH again [EDIT] Correction, setting So, two options left:
|
…tion. It does still set it at compile time, which is needed for the tensorboard test from the test suite to pass
|
The test report was successful, but erroneously ended up in a different PR (the one from an unrelated EasyConfig). I'll try to rerun it and reference the right PR this time. But anyway, a gist of a successful build is here: https://gist.github.com/2ffa26ffd965c13934bdcc72661382f1 |
|
Ok, so I misunderstood what |
|
Ok, so:
Thus, a more targeted patch was required, and I implemented it as a patch for the EasyConfig in easybuilders/easybuild-easyconfigs#14382. It doesn't make sense to do this at the EasyBlock level after all. If I find the time, I will try to push the patch upstream and see if it also works for them (upstream PyTorch had some CI problems with an earlier patch that was proposed for this issue, pytorch/pytorch#37737). Closing this PR. |
Set CMAKE_SKIP_RPATH=ON for all PyTorch builds. This avoids the issue described at easybuilders/easybuild-easyconfigs#14359
Note that the cmakemake.py EasyBlock sets CMAKE_SKIP_RPATH=ON only when --rpath is used. We don't use the same condition here, but always set it, since regardless of whether --rpath is used, the PyTorch build will get an RPATH set due to the CMake configuration set here: https://github.com/pytorch/pytorch/blob/36449ea93134574c2a22b87baad3de0bf8d64d42/cmake/Dependencies.cmake#L16
This will result in libcaffe2_nvrtc.so picking up the CUDA stubs library, rather than the actual driver.
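One way to check whether a built library is affected is to inspect its dynamic section for RPATH or RUNPATH entries that point into a stubs directory. The following is a minimal sketch, not part of this PR: it assumes the text comes from `readelf -d <library>`, and both the helper name `stub_entries` and the paths in the sample are hypothetical.

```python
import re

# Matches the RPATH / RUNPATH lines printed by `readelf -d <library>`.
_DYN_PATH = re.compile(
    r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*?)\]"
)


def stub_entries(readelf_output):
    """Return (tag, directory) pairs whose directory has a 'stubs' path component.

    Hypothetical helper for illustration; it only parses text and does not
    run readelf itself.
    """
    hits = []
    for tag, paths in _DYN_PATH.findall(readelf_output):
        for directory in paths.split(":"):
            if "stubs" in directory.split("/"):
                hits.append((tag, directory))
    return hits


# Illustrative input only: the paths below are made up, not taken from a real build.
sample = """\
 0x0000000000000001 (NEEDED)             Shared library: [libcuda.so.1]
 0x000000000000001d (RUNPATH)            Library runpath: [/opt/cuda/lib64/stubs:/opt/cuda/lib64]
"""

print(stub_entries(sample))
```

An empty result for libcaffe2_nvrtc.so would suggest that the stubs directory is no longer baked into the library, so the loader can resolve the real driver library at run time.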