Commit 7334abc

pass activation arg
Signed-off-by: Felix Marty <[email protected]>
1 parent 90a01bb commit 7334abc

1 file changed: +3 -0 lines changed

vllm/model_executor/layers/quantization/quark/quark_moe.py

Lines changed: 3 additions & 0 deletions
@@ -388,6 +388,8 @@ def apply(
             scoring_func=scoring_func,
             e_score_correction_bias=e_score_correction_bias)
 
+        # We pass `per_channel_quant=True` as OCP MXFP4 quantization is a
+        # per-token quantization scheme (with groups of `OCP_MX_BLOCK_SIZE`).
         out = fused_experts(
             x,
             layer.w13_weight,
@@ -405,5 +407,6 @@ def apply(
             a2_scale=None,
             block_shape=None,
             per_channel_quant=True,
+            activation=activation,
         )
         return out
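
The second hunk forwards `activation` to `fused_experts`. As a minimal sketch of why that matters, assume the callee has a default activation (here `"silu"`, an assumption about the real signature) so a caller that accepts `activation` but never passes it on silently gets the default. `toy_fused_experts` and both `apply_*` helpers are hypothetical stand-ins, not vLLM code:

```python
import math


def silu(v: float) -> float:
    # SiLU (swish): v * sigmoid(v)
    return v / (1.0 + math.exp(-v))


def gelu(v: float) -> float:
    # Exact (erf-based) GELU
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


ACTIVATIONS = {"silu": silu, "gelu": gelu}


def toy_fused_experts(x, activation="silu"):
    """Hypothetical stand-in for a kernel entry point with a default activation."""
    return [ACTIVATIONS[activation](v) for v in x]


def apply_without_forwarding(x, activation):
    # Pattern before the commit: `activation` is accepted but not passed on,
    # so the callee falls back to its default.
    return toy_fused_experts(x)


def apply_with_forwarding(x, activation):
    # Pattern after the commit: the caller's choice reaches the callee.
    return toy_fused_experts(x, activation=activation)


if __name__ == "__main__":
    x = [1.0, -2.0]
    print(apply_without_forwarding(x, "gelu"))  # computed with SiLU despite the request
    print(apply_with_forwarding(x, "gelu"))     # computed with GELU
```

In this toy setup no error is raised in the unforwarded case; the output is simply computed with the wrong activation, which is why this kind of omission can go unnoticed for models that do use the default.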
