support Qwen3-MoE-w4afp8 by zhilingjiang · Pull Request #9147 · sgl-project/sglang

zhilingjiang · 2025-08-13T06:54:30Z

Motivation

Follows #8118. Based on #7762.

Modifications

This PR primarily implements the adaptation of SGLang for Qwen3-MoE-w4afp8 quantized models. The key enhancements include:

Support for Qwen3-MoE’s w4afp8-block quantization format: SGLang can now load and run models that have been quantized using the w4afp8-block format.
Support for loading Qwen-MoE static quantization calibration parameters: SGLang can now load and apply the static quantization calibration parameters for Qwen-MoE models, so that inference remains correct after quantization.
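For readers unfamiliar with the format, the weight half of a "w4" scheme can be sketched as group-wise 4-bit quantization with one scale per group. The sketch below is a NumPy illustration only, not the code in this PR: the function names, the group size of 128, the symmetric scaling, and the low-nibble-first packing are all assumptions, and the FP8 activation path is not modeled.

```python
import numpy as np

# Illustrative only: names, group size, and packing order are assumptions,
# not the layout used by SGLang or sgl-kernel.

def quantize_int4_groupwise(w, group_size=128):
    """Symmetric int4 quantization with one scale per group along the input axis."""
    out_features, in_features = w.shape
    assert in_features % group_size == 0, "in_features must be a multiple of group_size"
    groups = w.reshape(out_features, in_features // group_size, group_size)
    max_abs = np.abs(groups).max(axis=-1, keepdims=True)
    # Map the largest magnitude in each group to 7; avoid a zero scale.
    scales = np.where(max_abs == 0, 1.0, max_abs / 7.0)
    q = np.clip(np.rint(groups / scales), -8, 7).astype(np.int8)
    return q.reshape(out_features, in_features), scales.squeeze(-1)

def dequantize_int4_groupwise(q, scales, group_size=128):
    """Invert quantize_int4_groupwise up to rounding error."""
    out_features, in_features = q.shape
    groups = q.reshape(out_features, in_features // group_size, group_size)
    deq = groups.astype(np.float32) * scales[..., None]
    return deq.reshape(out_features, in_features)

def pack_int4(q):
    """Pack pairs of int4 values into one byte, low nibble first."""
    u = (q.astype(np.int16) & 0xF).astype(np.uint8)
    return u[..., 0::2] | (u[..., 1::2] << 4)

def unpack_int4(packed):
    """Unpack bytes into signed int4 values stored as int8."""
    low = (packed & 0xF).astype(np.int8)
    high = (packed >> 4).astype(np.int8)
    # Sign-extend nibbles in the range 8..15 to -8..-1.
    low = np.where(low > 7, low - 16, low)
    high = np.where(high > 7, high - 16, high)
    out = np.empty(packed.shape[:-1] + (packed.shape[-1] * 2,), dtype=np.int8)
    out[..., 0::2] = low
    out[..., 1::2] = high
    return out

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 256)).astype(np.float32)
q, scales = quantize_int4_groupwise(w, group_size=128)
packed = pack_int4(q)
w_hat = dequantize_int4_groupwise(unpack_int4(packed), scales, group_size=128)
```

With symmetric scaling, the reconstruction error of each element is bounded by half of its group's scale, which is why a smaller group size generally trades extra scale storage for accuracy.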

Accuracy Tests

You can download the Qwen3-30B-A3B-w4afp8-block model here:
https://huggingface.co/zhilingjiang/Qwen3-30B-A3B-w4afp8-block-dynamic

Checklist

Format your code according to "Format code with pre-commit".
Add unit tests according to "Run and add unit tests".
Update documentation according to "Write documentations".
Provide accuracy and speed benchmark results according to "Test the accuracy" and "Benchmark the speed".

gemini-code-assist · 2025-08-13T07:15:51Z

Warning

Gemini encountered an error creating the summary. You can try again by commenting /gemini summary.

ZhuJiaqi9905 · 2025-08-16T03:27:01Z

Hi, this is nice work. However, there seem to be too many diffs in the git commit. Could you please clean it up to help us understand your code? I think that to support Qwen-235B-w4afp8 in TP mode, we need special handling for the "weight interleave scales", which should not be 4, and we need to modify sgl-kernel.

support Qwen3-MoE-w4afp8

7f1f5bc

merrymercy requested review from iforgetmyname and ping1jing2 as code owners November 29, 2025 07:06

merrymercy requested review from AniZpZ, DarkSharpness, ishandhanani, key4ng and yhyang201 as code owners November 29, 2025 07:06


zhilingjiang wants to merge 1 commit into sgl-project:main from zhilingjiang:feat/w4afp8-tp
