@davidberard98
Contributor

This PR fixes a compilation failure when targeting RTX 5090 GPUs, which have compute capability 120. Previously, getMMAVersionSafe triggered an assertion failure for compute capabilities beyond 110. With this change, compute capability 120 maps to a valid MMA version, so the assertion no longer fires.

Changes
Updated the compute capability check in getMMAVersionSafe to handle GPUs with compute capability up to 129.
Assigned MMA version {2} for these cases to maintain compatibility with these GPUs, which support only version 2.
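
The new branch described above can be sketched as follows. This is a minimal illustration, not the actual Triton code: the function name `newBranchMmaVersions` and the constants are hypothetical, the lower bound of 110 is an assumption based on the statement that capabilities beyond 110 previously hit the assertion, and the branches that already existed for older GPUs are omitted. The sketch returns `std::nullopt` where the real function would fall through to its other branches or assert.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical bounds for the range this PR adds. The upper bound (129)
// comes from the PR description; the lower bound (110) is an assumption.
constexpr int kNewRangeBegin = 110;
constexpr int kNewRangeEnd = 130;  // exclusive, so 129 is the last value covered

// Hypothetical stand-in for the branch added to getMMAVersionSafe.
// Returns the supported MMA versions in order of preference when the
// compute capability falls in the newly handled range, and std::nullopt
// otherwise (the real function has other branches that are not modeled here).
std::optional<std::vector<unsigned>> newBranchMmaVersions(int computeCapability) {
  assert(computeCapability >= 0 && "compute capability must be non-negative");
  if (computeCapability >= kNewRangeBegin && computeCapability < kNewRangeEnd) {
    // Per the PR description, these GPUs are assigned MMA version 2 only.
    return std::vector<unsigned>{2};
  }
  return std::nullopt;
}
```

Under these assumptions, calling `newBranchMmaVersions(120)` yields a single-element list containing 2, while a value such as 130 yields no result, which corresponds to the case the original code still rejects.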

Motivation
Certain GPUs, including models from the NVIDIA 50 series, have a compute capability of 120, which was previously unhandled, causing compilation failures. This fix ensures compatibility with such GPUs.

New contributor declaration

  • [x] I am not making a trivial change, such as fixing a typo in a comment.

  • [x] I have written a PR description following these rules.

  • [ ] I have run pre-commit run --from-ref origin/main --to-ref HEAD.

  • Select one of the following.

    • [ ] I have added tests.
      • /test for lit tests
      • /unittest for C++ tests
      • /python/test for end-to-end tests
    • [x] This PR does not need a test because FILL THIS IN.

  • Select one of the following.

    • [x] I have not added any lit tests.
    • [ ] The lit tests I have added follow these best practices, including the "tests should be minimal" section. (Usually running Python code and using the instructions it generates is not minimal.)
…triton-lang#6131)
@davidberard98
Contributor Author

cc @atalman

@davidberard98 davidberard98 marked this pull request as ready for review May 9, 2025 16:01
@davidberard98 davidberard98 requested a review from ptillet as a code owner May 9, 2025 16:01
Collaborator

@atalman atalman left a comment

lgtm

@atalman atalman merged commit b79de50 into triton-lang:release/3.3.x May 13, 2025
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request May 21, 2025
Triton is pointing to the latest Triton pin: https://github.com/triton-lang/triton/tree/release/3.3.x
XPU pointing to latest XPU pin: https://github.com/intel/intel-xpu-backend-for-triton/commits/release/3.3.x/

This version contains the fix for: Compilation Issue for RTX 5090 GPUs with Compute Capability = 120. triton-lang/triton#6771
Pull Request resolved: #153951
Approved by: https://github.com/davidberard98