[Bug]: DeepSeek R1 with CUTLASS MLA Broken on B200

### Your current environment

<details>
<summary>The output of <code>python collect_env.py</code></summary>

```text
Your output of `python collect_env.py` here
```

</details>


### 🐛 Describe the bug

```bash
launch_mtp:
	chg run --gpus {{GPUS}} -- vllm serve {{MODEL}} -tp {{GPUS}} --speculative_config '{"num_speculative_tokens":1, "method":"deepseek_mtp"}' --port {{PORT}} --enforce-eager --attention-backend CUTLASS_MLA
```

Launching with the command above fails with:
```text
(Worker_TP2 pid=404489) ERROR 02-02 20:46:37 [multiproc_executor.py:772]     super().__init__(
(Worker_TP2 pid=404489) ERROR 02-02 20:46:37 [multiproc_executor.py:772] TypeError: vllm.model_executor.layers.attention.mla_attention.MLACommonImpl.__init__() got multiple values for keyword argument 'q_pad_num_heads'
[rank0]:[W202 20:46:37.582305111 ProcessGroupNCCL.cpp:1524] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
```
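For context, Python raises this exact wording ("multiple values for keyword argument") when a call passes a keyword explicitly and also receives the same key through `**kwargs` unpacking. The sketch below reproduces that class of error in isolation. The class names `Base` and `Child` and the values used are hypothetical and are not taken from the vLLM source. It only illustrates one plausible way the `super().__init__(` call in the traceback could fail, not a confirmed root cause.

```python
class Base:
    """Hypothetical stand-in for a common base implementation."""

    def __init__(self, num_heads, q_pad_num_heads=None, **kwargs):
        self.num_heads = num_heads
        self.q_pad_num_heads = q_pad_num_heads


class Child(Base):
    """Hypothetical backend that forces its own padding value."""

    def __init__(self, num_heads, **kwargs):
        # If the caller already put q_pad_num_heads into kwargs, the
        # explicit keyword below collides with the unpacked one.
        super().__init__(num_heads, q_pad_num_heads=128, **kwargs)


def try_build(**kwargs):
    """Return "ok" on success, or the TypeError message on failure."""
    try:
        Child(8, **kwargs)
        return "ok"
    except TypeError as exc:
        return str(exc)


if __name__ == "__main__":
    print(try_build())                    # no collision
    print(try_build(q_pad_num_heads=64))  # same key passed twice
```

If the failure does follow this pattern, one option is for the subclass to drop or override the key in `kwargs` before forwarding it, rather than passing it both ways.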


### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.
