
Commit 48c2775

Merge pull request #137 from HabanaAI/private/kzawora/moe_constraints
Add constraints for HPU UnquantizedFusedMoEMethod
2 parents 2c3a95d + 030a2cb commit 48c2775

1 file changed: 4 additions, 0 deletions

vllm/model_executor/layers/fused_moe/layer.py

Lines changed: 4 additions & 0 deletions
@@ -108,6 +108,10 @@ def forward_hpu(self, x: torch.Tensor, w1: torch.Tensor, w2: torch.Tensor,
                         router_logits: torch.Tensor, top_k: int, renormalize: bool,
                         use_grouped_topk: bool, num_expert_group: Optional[int],
                         topk_group: Optional[int]):
+        assert not use_grouped_topk, 'use_grouped_topk must be False on HPU'
+        assert num_expert_group is None, ('num_expert_group is '
+                                          'not supported on HPU')
+        assert topk_group is None, 'topk_group is not supported on HPU'
         return static_fused_moe(x, w1, w2, router_logits, top_k)
 
     def forward_cpu(self, *args, **kwargs):
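The added assertions make forward_hpu fail fast when grouped top-k routing options are passed, since the HPU static fused-MoE path only handles plain top-k routing. Below is a minimal sketch of the same guard pattern as a standalone function; the name check_hpu_moe_routing and the example calls are illustrative and not part of vLLM.

from typing import Optional

def check_hpu_moe_routing(use_grouped_topk: bool,
                          num_expert_group: Optional[int],
                          topk_group: Optional[int]) -> None:
    # Mirrors the constraints added in this commit: grouped top-k routing
    # and expert-group arguments are rejected before any HPU kernel runs.
    assert not use_grouped_topk, 'use_grouped_topk must be False on HPU'
    assert num_expert_group is None, ('num_expert_group is '
                                      'not supported on HPU')
    assert topk_group is None, 'topk_group is not supported on HPU'

# A supported configuration passes silently:
check_hpu_moe_routing(use_grouped_topk=False, num_expert_group=None, topk_group=None)

# An unsupported option raises AssertionError up front:
try:
    check_hpu_moe_routing(use_grouped_topk=True, num_expert_group=None, topk_group=None)
except AssertionError as err:
    print(err)  # use_grouped_topk must be False on HPU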
