
vLLM does not support EAGLE Spec Decode when deploying EAGLE-Qwen2-7B-Instruct model #135

@crownz248

Description


I can successfully deploy llama3-8b-instruct with EAGLE, but there is a problem when deploying qwen2-7b-instruct with EAGLE.

I have converted the EAGLE-Qwen2-7B-Instruct model according to vllm/model_executor/models/eagle.py:L126.

I encountered another error, shown below:

AssertionError: Attempted to load weight (torch.Size([3584])) into parameter (torch.Size([3584, 7168]))

I looked at the code in vllm/model_executor/models/eagle.py:L139, shown below:

def load_weights(self, weights: Iterable[Tuple[str, torch.Tensor]]):
            ...
            elif name.startswith("fc."):
                weight_loader = getattr(self.fc.weight, "weight_loader",
                                        default_weight_loader)
                weight_loader(self.fc.weight, loaded_weight)
            ...

I think the code assumes that a name starting with 'fc.' can only be 'fc.weight', but the fc layer of EAGLE-Qwen2 has a bias attribute, which means the name variable can also be 'fc.bias'. When load_weights sees 'fc.bias', it passes the bias tensor to the loader for self.fc.weight, which produces the shape mismatch above.

Moreover, the qkv_proj layer of EAGLE-Qwen2-7B-Instruct also has a bias.
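
For illustration, here is a minimal, self-contained sketch of the kind of fix I mean, with a toy module standing in for the EAGLE head (EagleHeadStub and the simplified default_weight_loader below are my own stand-ins, not vLLM's actual classes): resolve the parameter from the name suffix instead of hard-coding self.fc.weight. The same idea would apply to a qkv_proj bias.

from typing import Iterable, Tuple

import torch
import torch.nn as nn


def default_weight_loader(param: torch.Tensor, loaded_weight: torch.Tensor) -> None:
    # Simplified stand-in for vLLM's default_weight_loader: check shapes,
    # then copy the checkpoint tensor into the parameter.
    assert param.size() == loaded_weight.size(), (
        f"Attempted to load weight ({loaded_weight.size()}) "
        f"into parameter ({param.size()})")
    param.data.copy_(loaded_weight)


class EagleHeadStub(nn.Module):
    # Stand-in for the EAGLE draft head: an fc layer *with* a bias,
    # as in EAGLE-Qwen2-7B-Instruct.
    def __init__(self) -> None:
        super().__init__()
        self.fc = nn.Linear(16, 8, bias=True)

    def load_weights(self, weights: Iterable[Tuple[str, torch.Tensor]]) -> None:
        for name, loaded_weight in weights:
            if name.startswith("fc."):
                # Resolve "fc.weight" *or* "fc.bias" from the name suffix
                # instead of hard-coding self.fc.weight, which is what
                # produced the AssertionError above.
                param = getattr(self.fc, name.split(".", 1)[1])
                weight_loader = getattr(param, "weight_loader",
                                        default_weight_loader)
                weight_loader(param, loaded_weight)


model = EagleHeadStub()
model.load_weights([
    ("fc.weight", torch.randn(8, 16)),
    ("fc.bias", torch.randn(8)),  # previously routed into fc.weight and crashed
])
print("fc.weight and fc.bias both loaded without a shape mismatch")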

I hope you can fix this in an upcoming release!
