Open
Labels
bug: Something isn't working
Description
Your current environment
The output of python collect_env.py
Your output of `python collect_env.py` here
🐛 Describe the bug
launch_mtp:
chg run --gpus {{GPUS}} -- vllm serve {{MODEL}} -tp {{GPUS}} --speculative_config '{"num_speculative_tokens":1, "method":"deepseek_mtp"}' --port {{PORT}} --enforce-eager --attention-backend CUTLASS_MLA
I get:
(Worker_TP2 pid=404489) ERROR 02-02 20:46:37 [multiproc_executor.py:772] super().__init__(
(Worker_TP2 pid=404489) ERROR 02-02 20:46:37 [multiproc_executor.py:772] TypeError: vllm.model_executor.layers.attention.mla_attention.MLACommonImpl.__init__() got multiple values for keyword argument 'q_pad_num_heads'
[rank0]:[W202 20:46:37.582305111 ProcessGroupNCCL.cpp:1524] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
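For anyone triaging the TypeError in the log above: Python raises "got multiple values for keyword argument" when a call receives the same keyword twice, typically once explicitly and once through forwarded `**kwargs`. The sketch below reproduces that error class in isolation. The classes `Base`, `BrokenChild`, and `FixedChild` and the helper `construct_error` are hypothetical stand-ins, not vLLM code, and I have not verified that this is the exact path taken in the CUTLASS_MLA backend.

```python
# Hypothetical stand-ins; these are NOT the vLLM classes from the traceback.
class Base:
    def __init__(self, num_heads, q_pad_num_heads=None):
        self.num_heads = num_heads
        self.q_pad_num_heads = q_pad_num_heads


class BrokenChild(Base):
    def __init__(self, num_heads, **kwargs):
        # Passes q_pad_num_heads explicitly AND forwards kwargs unchanged.
        # If the caller also supplied q_pad_num_heads, the keyword arrives twice.
        super().__init__(num_heads, q_pad_num_heads=128, **kwargs)


class FixedChild(Base):
    def __init__(self, num_heads, **kwargs):
        # One possible remedy: only fill in the default when the caller
        # did not already provide a value.
        kwargs.setdefault("q_pad_num_heads", 128)
        super().__init__(num_heads, **kwargs)


def construct_error(cls, **kwargs):
    """Return the TypeError message raised while constructing cls, or None."""
    try:
        cls(16, **kwargs)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # The first call reports the duplicate keyword; the second succeeds.
    print(construct_error(BrokenChild, q_pad_num_heads=64))
    print(construct_error(FixedChild, q_pad_num_heads=64))
```

If the real cause is similar, the duplicate would only show up on code paths where the caller starts passing `q_pad_num_heads` itself, which may explain why it appears with this particular flag combination.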