
Commit d5c1fcb

chaojun-zhang authored and xuebwang-amd committed
Fix Fused MoE LoRA Triton kernel bug (vllm-project#28450)
Signed-off-by: chaojun-zhang <[email protected]>
Signed-off-by: xuebwang-amd <[email protected]>
1 parent 9ac7938 commit d5c1fcb

File tree

1 file changed: +2 -1 lines changed

vllm/lora/ops/triton_ops/fused_moe_lora_op.py

Lines changed: 2 additions & 1 deletion
@@ -26,7 +26,7 @@ def _get_ptr(lora_weights: list[torch.Tensor], device: torch.device):
     tensor_ptrs = []
     for lora_weight in lora_weights:
         tensor_ptrs.append(lora_weight.data_ptr())
-    ptr_tensor = torch.tensor(tensor_ptrs, device=device)
+    ptr_tensor = torch.tensor(tensor_ptrs, device=device, dtype=torch.uint64)

     _LORA_PTR_DICT[key] = ptr_tensor
     return _LORA_PTR_DICT.get(key)
@@ -85,6 +85,7 @@ def _fused_moe_lora_kernel(
     GROUP_SIZE_M: tl.constexpr,
     SPLIT_K: tl.constexpr,
     USE_GDC: tl.constexpr,
+    launch_pdl: tl.constexpr,
     IS_PRIMARY: tl.constexpr,
 ):
     pid = tl.program_id(axis=0)
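
The first hunk makes _get_ptr build its pointer table with an explicit dtype=torch.uint64 instead of letting torch.tensor infer int64 from the Python ints returned by data_ptr(); int64 cannot represent addresses at or above 2^63, so an explicit unsigned dtype is the safer choice for raw pointers. A minimal standalone sketch of that pointer-table pattern follows (illustrative only; build_ptr_table and the toy shapes are invented here, not taken from vLLM, and torch.uint64 needs a recent PyTorch, roughly 2.3 or later):

import torch

def build_ptr_table(lora_weights: list[torch.Tensor], device: torch.device) -> torch.Tensor:
    # Collect the raw memory address of each weight tensor.
    tensor_ptrs = [w.data_ptr() for w in lora_weights]
    # Store the addresses as unsigned 64-bit integers; default int64 inference
    # can overflow once an address falls in the upper half of the 64-bit range.
    return torch.tensor(tensor_ptrs, device=device, dtype=torch.uint64)

# Toy usage on CPU, just to show the resulting table:
weights = [torch.randn(16, 16) for _ in range(4)]
ptr_table = build_ptr_table(weights, device=torch.device("cpu"))
print(ptr_table.dtype, ptr_table.shape)  # torch.uint64 torch.Size([4])

The second hunk only extends the kernel signature: launch_pdl is declared as tl.constexpr, so Triton treats it as a compile-time constant and specializes the compiled kernel for each value it is called with, just like the existing USE_GDC and IS_PRIMARY flags.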
