[Bugfix] fix DeepSeek R1 with CUTLASS MLA Broken on B200 #33637

chaunceyjiang · 2026-02-03T03:21:52Z

Purpose

FIX #33627

Test Plan

Test Result

Signed-off-by: chaunceyjiang <[email protected]>

gemini-code-assist

Code Review

This pull request aims to fix an issue with DeepSeek R1 using CUTLASS MLA on B200 GPUs by providing a default value for q_pad_num_heads. However, the current implementation is not correct as the default value is set in a place where it will not be used by the padding logic. The fix needs to be applied in a different location to be effective. I've left a critical comment explaining the issue with the current approach.

vllm/v1/attention/backends/mla/cutlass_mla.py

mergify · 2026-02-03T03:43:14Z

Hi @chaunceyjiang, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?

mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

chaunceyjiang · 2026-02-03T09:41:38Z

/cc @MatthewBonanni @LucasWilkinson PTAL.

MatthewBonanni

Thanks for the fix! LGTM once comment is addressed

vllm/v1/attention/backends/mla/cutlass_mla.py

Co-authored-by: Matthew Bonanni <[email protected]> Signed-off-by: Chauncey <[email protected]>

MatthewBonanni · 2026-02-03T18:42:22Z

Ah actually, I think the current state of the PR will force q_pad_num_heads to be None rather than MAX_HEADS as it should. Can you verify correctness?

MatthewBonanni · 2026-02-03T18:50:27Z

I think the proper fix would be:

1. Remove q_pad_num_heads entirely from the argument list of MLAAttention.__init__().
2. When calling impl_cls() in MLAAttention.__init__(), don't pass q_pad_num_heads.
3. Assign self.q_pad_num_heads = getattr(self.impl, "q_pad_num_heads", None) after the impl has been constructed.
4. Leave cutlass_mla.py as is.
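
The suggestion above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual vLLM code: MLAAttentionSketch, PaddedImpl, and PlainImpl are hypothetical stand-ins, and the value given to MAX_HEADS is assumed for the example only. The point is that the layer stops accepting or forwarding q_pad_num_heads and instead reads it back from the constructed impl, so a backend that defines the attribute keeps its own value and any other backend yields None.

```python
MAX_HEADS = 128  # illustrative value only; assumed, not taken from this PR


class PaddedImpl:
    """Hypothetical stand-in for a backend impl that pads query heads."""

    def __init__(self, num_heads: int) -> None:
        self.num_heads = num_heads
        # The impl itself owns the padding requirement.
        self.q_pad_num_heads = MAX_HEADS


class PlainImpl:
    """Hypothetical stand-in for a backend impl with no padding requirement."""

    def __init__(self, num_heads: int) -> None:
        self.num_heads = num_heads


class MLAAttentionSketch:
    """Simplified stand-in for the attention layer (not the real MLAAttention)."""

    def __init__(self, impl_cls: type, num_heads: int) -> None:
        # q_pad_num_heads is neither a constructor argument here
        # nor forwarded to impl_cls().
        self.impl = impl_cls(num_heads=num_heads)
        # Read the value back from the impl after it has been constructed;
        # fall back to None when the impl does not define it.
        self.q_pad_num_heads = getattr(self.impl, "q_pad_num_heads", None)


padded = MLAAttentionSketch(PaddedImpl, num_heads=16)
plain = MLAAttentionSketch(PlainImpl, num_heads=16)
print(padded.q_pad_num_heads)  # 128
print(plain.q_pad_num_heads)   # None
```

With this shape, an outer default of None can no longer override the value the backend sets for itself, which is the concern raised in the preceding comment.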

[Bugfix] fix DeepSeek R1 with CUTLASS MLA Broken on B200

e1e4c3b

Signed-off-by: chaunceyjiang <[email protected]>

chaunceyjiang requested a review from pavanimajety as a code owner February 3, 2026 03:21

github-project-automation bot added this to DeepSeek V3/R1 Feb 3, 2026

github-project-automation bot moved this to Backlog in DeepSeek V3/R1 Feb 3, 2026

mergify bot added the deepseek (Related to DeepSeek models), nvidia, and v1 labels Feb 3, 2026

github-project-automation bot added this to NVIDIA Feb 3, 2026

mergify bot added the bug (Something isn't working) label Feb 3, 2026

gemini-code-assist bot reviewed Feb 3, 2026

[Bugfix] fix DeepSeek R1 with CUTLASS MLA Broken on B200

95465b2

Signed-off-by: chaunceyjiang <[email protected]>

[Bugfix] fix DeepSeek R1 with CUTLASS MLA Broken on B200

6ebaef6

Signed-off-by: chaunceyjiang <[email protected]>

MatthewBonanni approved these changes Feb 3, 2026

Apply suggestions from code review

fd36754

Co-authored-by: Matthew Bonanni <[email protected]> Signed-off-by: Chauncey <[email protected]>
