
vLLM does not support EAGLE Spec Decode when deploying EAGLE-Qwen2-7B-Instruct model #135

@crownz248

Description


I can successfully deploy llama3-8b-instruct with EAGLE, but there is a problem when deploying qwen2-7b-instruct with EAGLE.

I have converted the EAGLE-Qwen2-7B-Instruct model according to vllm/model_executor/models/eagle.py:L126.

I encountered another error, shown below:

AssertionError: Attempted to load weight (torch.Size([3584])) into parameter (torch.Size([3584, 7168]))

I looked at the code in vllm/model_executor/models/eagle.py:L139, shown below:

def load_weights(self, weights: Iterable[Tuple[str, torch.Tensor]]):
            ...
            elif name.startswith("fc."):
                weight_loader = getattr(self.fc.weight, "weight_loader",
                                        default_weight_loader)
                weight_loader(self.fc.weight, loaded_weight)
            ...

I think the code assumes that a name starting with 'fc.' can only be 'fc.weight', but the fc layer of EAGLE-Qwen2 has a bias attribute, which means the name variable can also be 'fc.bias'. When load_weights sees 'fc.bias', it passes the bias tensor to the loader for self.fc.weight, which produces the shape mismatch above.

Moreover, the qkv_proj layer of EAGLE-Qwen2-7B-Instruct also has a bias.
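
For illustration, here is a minimal, self-contained sketch of the kind of fix I mean, with a toy module standing in for the EAGLE head (EagleHeadStub and the simplified default_weight_loader below are my own stand-ins, not vLLM's actual classes): resolve the parameter from the name suffix instead of hard-coding self.fc.weight. The same idea would apply to a qkv_proj bias.

from typing import Iterable, Tuple

import torch
import torch.nn as nn


def default_weight_loader(param: torch.Tensor, loaded_weight: torch.Tensor) -> None:
    # Simplified stand-in for vLLM's default_weight_loader: check shapes,
    # then copy the checkpoint tensor into the parameter.
    assert param.size() == loaded_weight.size(), (
        f"Attempted to load weight ({loaded_weight.size()}) "
        f"into parameter ({param.size()})")
    param.data.copy_(loaded_weight)


class EagleHeadStub(nn.Module):
    # Stand-in for the EAGLE draft head: an fc layer *with* a bias,
    # as in EAGLE-Qwen2-7B-Instruct.
    def __init__(self) -> None:
        super().__init__()
        self.fc = nn.Linear(16, 8, bias=True)

    def load_weights(self, weights: Iterable[Tuple[str, torch.Tensor]]) -> None:
        for name, loaded_weight in weights:
            if name.startswith("fc."):
                # Resolve "fc.weight" *or* "fc.bias" from the name suffix
                # instead of hard-coding self.fc.weight, which is what
                # produced the AssertionError above.
                param = getattr(self.fc, name.split(".", 1)[1])
                weight_loader = getattr(param, "weight_loader",
                                        default_weight_loader)
                weight_loader(param, loaded_weight)


model = EagleHeadStub()
model.load_weights([
    ("fc.weight", torch.randn(8, 16)),
    ("fc.bias", torch.randn(8)),  # previously routed into fc.weight and crashed
])
print("fc.weight and fc.bias both loaded without a shape mismatch")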

I hope you can fix this in an upcoming release!
