@Flamefire
Contributor

(created using eb --new-pr)

@Flamefire
Contributor Author

Test report by @Flamefire

Overview of tested easyconfigs (in order)

  • SUCCESS NCCL-2.18.3-GCCcore-12.3.0-CUDA-12.1.1.eb
  • SUCCESS NCCL-2.12.12-GCCcore-11.3.0-CUDA-11.7.0.eb
  • SUCCESS CUDA-12.8.0.eb
  • SUCCESS GDRCopy-2.4.4-GCCcore-14.2.0.eb
  • SUCCESS UCX-CUDA-1.18.0-GCCcore-14.2.0-CUDA-12.8.0.eb
  • SUCCESS NCCL-2.26.6-GCCcore-14.2.0-CUDA-12.8.0.eb

Build succeeded for 6 out of 6 (3 easyconfigs in total)
i7008 - Linux Rocky Linux 8.9 (Green Obsidian), x86_64, AMD EPYC 7702 64-Core Processor (zen2), Python 3.9.18
See https://gist.github.com/Flamefire/e42f268b96b4ac5ce22eb07dede5d9b8 for a full test report.

@boegel boegel changed the title from "Set NCCL_HOME in NCCL easyblock" to "Set $NCCL_HOME in NCCL easyblock" Jul 2, 2025
@boegel boegel added this to the "release after 5.1.1" milestone Jul 2, 2025
@boegel boegel changed the title from "Set $NCCL_HOME in NCCL easyblock" to "enhance custom easyblock for NCCL so it defines $NCCL_HOME in generated module file" Aug 13, 2025
@boegel
Member

boegel commented Aug 13, 2025

Test report by @boegel

Overview of tested easyconfigs (in order)

Build succeeded for 1 out of 2 (2 easyconfigs in total)
node3300.joltik.os - Linux RHEL 9.4, x86_64, Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz (cascadelake), 1 x NVIDIA Tesla V100-SXM2-32GB, 570.133.20, Python 3.9.18
See https://gist.github.com/boegel/d949e35b478862768dd31456469a22f6 for a full test report.

@Flamefire
Contributor Author

UCX-CUDA/1.14.1-GCCcore-12.3.0-CUDA-12.1.1 is missing?

BTW: LMod coupled with an LLM? Nice!

@boegel
Member

boegel commented Aug 13, 2025

> UCX-CUDA/1.14.1-GCCcore-12.3.0-CUDA-12.1.1 is missing?
>
> BTW: LMod coupled with an LLM? Nice!

It's a proof-of-concept thingie, see https://github.com/boegel/easybuild-llm

@boegel
Member

boegel commented Aug 13, 2025

Test report by @boegel

Overview of tested easyconfigs (in order)

  • SUCCESS NCCL-2.18.3-GCCcore-12.3.0-CUDA-12.1.1.eb
  • SUCCESS NCCL-2.20.5-GCCcore-13.2.0-CUDA-12.4.0.eb

Build succeeded for 2 out of 2 (2 easyconfigs in total)
node3300.joltik.os - Linux RHEL 9.4, x86_64, Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz (cascadelake), 1 x NVIDIA Tesla V100-SXM2-32GB, 570.133.20, Python 3.9.18
See https://gist.github.com/boegel/dca377dd44f8227eb9c7deb0e699c5b5 for a full test report.

Member

@boegel boegel left a comment

lgtm

@boegel boegel merged commit 7a91cfb into easybuilders:develop Aug 13, 2025
17 checks passed
@Flamefire Flamefire deleted the 20250617121442_new_pr_nccl branch August 13, 2025 12:25