
Commit b532a5f

fix moe-ep accuracy issue for fp8 (#2489)
1 parent: a0592c0

1 file changed: 4 additions & 0 deletions

python/sglang/srt/layers/ep_moe/layer.py
@@ -644,6 +644,10 @@ def process_weights_after_loading(self, layer: Module) -> None:
                     "QuantConfig has static quantization, but found "
                     "activation scales are None."
                 )
+            layer.w13_weight_scale = torch.nn.Parameter(
+                torch.max(layer.w13_weight_scale, dim=1).values,
+                requires_grad=False,
+            )
             return
 
     def apply(
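A note on the fix (not part of the commit): the checkpoint appears to provide one fp8 weight scale per (expert, shard) pair for the fused w13 projection, i.e. a [num_experts, 2] tensor, while the added lines collapse the shard dimension with torch.max so each expert carries a single scale, presumably because the downstream fp8 grouped-GEMM path consumes one scale per expert. A minimal runnable sketch, with hypothetical shapes and standalone variable names:

import torch

# Hypothetical setup, for illustration only: one dequantization scale per
# (expert, shard) pair, where the two shards are the fused w1/w3 matrices.
num_experts = 8
w13_weight_scale = torch.nn.Parameter(
    torch.rand(num_experts, 2), requires_grad=False
)

# The commit's reduction: torch.max(..., dim=1) returns a (values, indices)
# namedtuple; keeping .values leaves a single per-expert scale of shape
# [num_experts]. Using the max of the two shard scales is the conservative
# choice, since it covers the dynamic range of both halves.
w13_weight_scale = torch.nn.Parameter(
    torch.max(w13_weight_scale, dim=1).values,
    requires_grad=False,
)
print(w13_weight_scale.shape)  # torch.Size([8])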

0 commit comments