add custom easyblock for NVHPC (aka PGI) #2190
This exposes options suggested by NVHPC's install scripts and the created modulefiles
boegel
left a comment
Please also take care of the code style suggestions.
Don't hesitate to reach out if you need any help!
|
(Snapshot at end of work day today. Will continue after additional input. Note: Not tested on system.) |
Default is now None, to be overwritten either via the easyconfig or the command line (--try-amend=default_cuda_version='11.0'). If it is not set manually, the easyblock tries to guess it from the version of a (potential) CUDA module. If that also fails, EB errors out with a descriptive message.
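The fallback order described in this comment can be sketched as below. This is only an illustration under stated assumptions: `resolve_default_cuda_version` and `ConfigError` are hypothetical names (the latter stands in for `EasyBuildError`), and the CUDA module version is passed in as a plain argument rather than queried from EasyBuild.

```python
class ConfigError(Exception):
    """Hypothetical stand-in for EasyBuildError, to keep the sketch self-contained."""


def resolve_default_cuda_version(configured, cuda_module_version):
    """Pick the CUDA version: explicit setting first, then the CUDA module, else fail."""
    if configured is not None:
        # value set via the easyconfig or via --try-amend on the command line
        return configured
    if cuda_module_version:
        # value guessed from a CUDA module, if one is available
        return cuda_module_version
    # neither source gave a version: stop with a descriptive message
    raise ConfigError(
        "default_cuda_version is not set and no CUDA module was found; "
        "set it in the easyconfig or via --try-amend=default_cuda_version=<version>"
    )
```

For example, `resolve_default_cuda_version(None, '10.2')` falls through to the module version, while passing `None` for both arguments raises the error.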
|
Some more work done. @boegel can you have a look at the open questions? |
easybuild/easyblocks/n/nvhpc.py
Outdated
| after=default_compute_capability | ||
| ) | ||
| ) | ||
| default_compute_capability.replace(".", "") |
Assignment is missing here, this can't work? .replace returns a new string value...
Also, I hate to be pedantic, but you're now just assuming you have a string value in default_compute_capability, which may not be the case (and if it's not, the crash will be pretty ugly).
So I'd add a small check:
    if isinstance(default_compute_capability, str):
        default_compute_capability = default_compute_capability.replace('.', '')
    else:
        raise EasyBuildError("Unexpected non-string value encountered for compute capability: %s",
                             default_compute_capability)
Implemented in 2c91709
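The two points raised in this review thread (the discarded return value of `.replace` and the type check) can be reproduced with a small standalone sketch. The helper name `normalize_compute_capability` is hypothetical, and `TypeError` is used here only as a stand-in for `EasyBuildError`.

```python
def normalize_compute_capability(value):
    """Return the compute capability without the dot, e.g. '7.0' -> '70'."""
    if isinstance(value, str):
        # str.replace returns a new string, so the result must be kept
        return value.replace('.', '')
    # stand-in for EasyBuildError in the actual easyblock
    raise TypeError("Unexpected non-string value encountered for compute capability: %s" % value)


cc = "7.0"
cc.replace(".", "")  # result is discarded: cc is still "7.0"
unchanged = cc
cc = normalize_compute_capability(cc)  # result is assigned: cc is now "70"
```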
easybuild/easyblocks/n/nvhpc.py
Outdated
| 'dirs': [os.path.join(prefix, 'compilers', 'bin'), os.path.join(prefix, 'compilers', 'lib'), | ||
| os.path.join(prefix, 'compilers', 'include'), os.path.join(prefix, 'compilers', 'man')] | ||
| } | ||
| custom_commands = ["%s -v" % custom_paths["files"][0]] |
I would check all commands, and just use the command name (not the path):
    compiler_cmds = ['nvc', 'nvc++', 'nvfortran']
    custom_paths = {
        'files': [os.path.join(prefix, 'compilers', 'bin', x) for x in compiler_cmds + ['siterc']],
        ...
    custom_commands = ["%s -v" % x for x in compiler_cmds]
Implemented in c54b719.
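Put together, the suggestion above could look roughly like the following standalone sketch. The function name `sanity_check_spec` and the example prefix are hypothetical; the file and directory names are the ones quoted in this thread.

```python
import os


def sanity_check_spec(prefix):
    """Build the files/dirs to check and the commands to run for all compilers."""
    compiler_cmds = ['nvc', 'nvc++', 'nvfortran']
    compilers_dir = os.path.join(prefix, 'compilers')
    custom_paths = {
        # every compiler binary plus the siterc file must exist
        'files': [os.path.join(compilers_dir, 'bin', x) for x in compiler_cmds + ['siterc']],
        'dirs': [os.path.join(compilers_dir, d) for d in ['bin', 'lib', 'include', 'man']],
    }
    # use the bare command names, not full paths
    custom_commands = ["%s -v" % x for x in compiler_cmds]
    return custom_paths, custom_commands


# '/opt/nvhpc' is only an example prefix
paths, commands = sanity_check_spec('/opt/nvhpc')
```

Using bare command names means the check also confirms that the compilers are found through the environment, assuming the generated module puts them on the PATH.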
|
@AndiH Please also revisit easybuilders/easybuild-easyconfigs#11391, even though the tests can't pass there as long as this easyblock isn't merged, since that's the test case for this easyblock (we'll merge both in quick succession). We're getting there, I'm confident we can get this merged really soon now... |
|
@boegel Absolutely! I wanted to get the EasyBlock done first before I implement the easyconfig. I am working on that next! |
|
I think I've tackled all the remaining open issues! |
|
@AndiH Tested this using a tweaked version of easybuilders/easybuild-easyconfigs#11391 (only changed the |
|
@boegel Did you specify a Compute Capability (via CLI or via easyconfig)? (I test some combinations meanwhile!) |
Plus, proper quotes for hint for easyconfig
|
Ok, this is a bit weird, but fixed in e4c6c34. If |
|
@AndiH Getting Thanks for fixing that issue! I'm running some final tests now, but I think this PR and easybuilders/easybuild-easyconfigs#11391 are good to go (I'll deal with the failing tests in the latter once this PR is merged). |
|
Great! 🕺 I'll try to work on NVHPC 20.9 soon, which has been released in the meantime!
|
lgtm, tested with easybuilders/easybuild-easyconfigs#11391 Thanks a lot for the effort and patience @AndiH ! |
PGI beyond 20.4 is not called PGI any more but NVIDIA HPC SDK (NVHPC), with 20.7 being the first version; https://developer.nvidia.com/hpc-sdk.
This pull request is part of a set of three pull requests to introduce NVHPC to EasyBuild (1, 2, 3). The stack has been successfully deployed at JSC and is in use by our HPC users.
The EasyBuild files were created after discussions with the NVHPC product owners, taking into account HPC practices and EasyBuild conventions.
This request is for pulling the NVHPC EasyBlock. It is based on the PGI EasyBlock but cleaned up and extended for NVHPC.
Additional options are included; for example default_cuda_version, compute_capability and several module_* options. The first two are passed to NVHPC's install script, which configures NVHPC for usage with a certain CUDA version or device Compute Capability.
The module_* options refer to options being passed to the Lmod module, where additional directories beyond the default directories are included into the environment. As an example, nvhpc_nvshmem_basedir adds the directory of NVHPC in which NVSHMEM is included to CPATH and LD_LIBRARY_PATH with their corresponding sub-directories. The options exist so that HPC centres can choose to provide their own installations of the packaged libraries instead.
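As a rough illustration of what such an option does to the environment, the sketch below maps a base directory to the entries added to CPATH and LD_LIBRARY_PATH. The function name `nvshmem_env_paths`, the example path, and the `include` and `lib` sub-directory names are assumptions made for this example; they are not taken from the easyblock.

```python
import os


def nvshmem_env_paths(nvshmem_basedir):
    """Map environment variables to the NVSHMEM sub-directories to add."""
    return {
        # 'include' and 'lib' are assumed sub-directory names
        'CPATH': [os.path.join(nvshmem_basedir, 'include')],
        'LD_LIBRARY_PATH': [os.path.join(nvshmem_basedir, 'lib')],
    }


# example base directory, purely illustrative
env = nvshmem_env_paths('/opt/nvhpc/comm_libs/nvshmem')
```

A site that ships its own NVSHMEM installation would leave the option off, so none of these entries end up in the generated module.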