{ai}[foss/2024a] PyTorch v2.9.1 w/ CUDA 12.6.0 #24365
Diff of new easyconfig(s) against existing ones is too long for a GitHub comment. Use |
9849937 to 15b85aa
|
Test report by @Flamefire |
|
Test report by @boegel |
postinstallpatches = [('triton_test.py', 'test/triton_test.py')]

checksums = [
    {'triton_test.py': '0d8b4556a76268b000d6023a1abaee801d179db3aed51e781c06854858490cc8'},
The checksum of easybuild/easyconfigs/t/Triton/triton_test.py in the develop branch is 02a3390a5dbe27385358ab319cf10972cd8b51aca599a6809efea612a90ecdba?
Oh, yes. The upload failed because --update-pr couldn't handle Python files (it tried to parse the file as an easyconfig to find the destination folder).
Will fix on Monday.
update added
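The checksum mismatch discussed in this thread can be checked locally. The following is a minimal sketch, not part of EasyBuild: the helper names `sha256_of_file` and `checksum_matches` are hypothetical, and it assumes the `checksums` entry is a plain SHA-256 hex digest of the file.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected):
    """Compare a file's SHA-256 digest with an expected hex string.

    Hypothetical helper for illustration; the comparison ignores case and
    surrounding whitespace in `expected`.
    """
    return sha256_of_file(path) == expected.strip().lower()
```

Running `checksum_matches` on a local copy of `triton_test.py` against the value in the easyconfig would show whether the file in the working tree is the one the checksum was computed from, or the older version from develop.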
|
I'm also seeing a crash with the |
|
That's why the checksum changed: there is a breaking change in Triton 3.5, and I updated the test script accordingly. It is in #24793 |
|
I had to use tlparse 0.4.0 (added in a separate PR, #24882) because the older version isn't compatible with PyTorch's output; see pytorch/pytorch@92c2dae
Not sure whether this causes conflicts in EB. The alternative is to drop this dependency, as it is optional. |
2b8bc42 to 64a4d67
|
Rebased to remove easyconfigs that are already present in develop from this branch. Also added 2 more patches to avoid the remaining failures. |
|
Test report by @Flamefire |
01180ef to 381c028
|
Test report by @Flamefire |
…es: PyTorch-2.9.0_fix-nccl-test-env.patch, PyTorch-2.9.0_readd-support-for-nvidia-cutlass-python-package.patch
9e1d5c8 to 867d1a2
252c2c3 to 4dc0cc6
07c1976 to 117a394
(created using eb --new-pr)
Early draft. Compiles but not tested.
Requires: