Add dynamic_quant_for_gaudi2.py script to convert model #387
base: aice/v1.22.0
example cmd:
python dynamic_quant_for_gaudi2.py -i /data/Qwen3-VL-30B-A3B-Instruct \
-o /data/Qwen3-VL-30B-A3B-Instruct-FP8-G2-Dynamic
Signed-off-by: Chen, Wenbin <[email protected]>
@czhu15 @Wei-Lin-Intel Please help review.
Wei-Lin-Intel
left a comment
I suggest renaming the script to include MLLM or multimodal to indicate its usage, because the script has visual exclusion logic.
Signed-off-by: Chen, Wenbin <[email protected]>
I changed the script name to dynamic_quant_multimodal_for_gaudi2.py.
ranzhejiang
left a comment
Have changed to "quant_scheme": "channel"
Signed-off-by: Chen, Wenbin <[email protected]>
Signed-off-by: Chen, Wenbin <[email protected]>
Wei-Lin-Intel
left a comment
LGTM
czhu15
left a comment
LGTM except for some comments to improve the description of this tool.
if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Convert tensors to float8 format."
It would be good to add a more detailed description of this tool, e.g. highlight that it converts a normal bf16 checkpoint to fp8 format that can run on Gaudi2.
PR is updated, thanks.
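To illustrate the kind of description requested above, here is a minimal sketch of an argparse setup. It is not the code from this PR: the `-i` and `-o` flags come from the example command, while the long option names (`--input_path`, `--output_path`), the helper name `build_parser`, and the exact wording of the description are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser with a more detailed tool description.

    Hypothetical helper: the real script may structure this differently.
    """
    parser = argparse.ArgumentParser(
        description=(
            "Convert a regular BF16 checkpoint of a multimodal model to an "
            "FP8 (e4m3) checkpoint that can run on Gaudi2. Weights are "
            "quantized channel-wise; activations use dynamic quantization."
        )
    )
    # Short flags match the example command; long names are assumptions.
    parser.add_argument(
        "-i", "--input_path", required=True,
        help="Directory of the source BF16 checkpoint.",
    )
    parser.add_argument(
        "-o", "--output_path", required=True,
        help="Directory where the FP8 checkpoint is written.",
    )
    return parser


# Usage example with an explicit argument list instead of sys.argv.
args = build_parser().parse_args(
    [
        "-i", "/data/Qwen3-VL-30B-A3B-Instruct",
        "-o", "/data/Qwen3-VL-30B-A3B-Instruct-FP8-G2-Dynamic",
    ]
)
print(args.input_path)
print(args.output_path)
```

Stating the source precision, target format, target hardware, and quantization scheme in the description lets `--help` answer the questions raised in this review without reading the code.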
quantization_config["activation_scheme"] = "dynamic"
quantization_config["fmt"] = "e4m3"
quantization_config["quant_method"] = "fp8"
quantization_config["quant_scheme"] = "channel"
Is only channel-wise quantization supported?
If yes, please also add that to the tool's description.
PR is updated, thanks.
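For readers unfamiliar with what "channel" means in the config discussed above, the following is a rough pure-Python sketch of per-channel weight scale computation alongside the four config keys from the diff. It is an illustration under stated assumptions, not the script's implementation: the helper names are hypothetical, weights are modeled as nested lists rather than tensors, and the FP8 maximum of 240.0 is an assumed value for the target e4m3 format that should be checked against the hardware documentation.

```python
from typing import Dict, List

# Assumption: largest finite magnitude of the target FP8 e4m3 format.
FP8_E4M3_MAX = 240.0


def per_channel_scales(
    weight: List[List[float]], fp8_max: float = FP8_E4M3_MAX
) -> List[float]:
    """Return one scale per output channel (one per row of the weight).

    Each scale maps the row's largest absolute value onto fp8_max.
    All-zero rows get a scale of 1.0 to avoid dividing by zero later.
    """
    scales = []
    for row in weight:
        amax = max((abs(v) for v in row), default=0.0)
        scales.append(amax / fp8_max if amax > 0 else 1.0)
    return scales


def build_quantization_config() -> Dict[str, str]:
    """Collect the four keys shown in the diff into one dict."""
    return {
        "activation_scheme": "dynamic",  # activation scales found at runtime
        "fmt": "e4m3",
        "quant_method": "fp8",
        "quant_scheme": "channel",  # one weight scale per output channel
    }


# Usage example on a tiny 3x3 "weight".
example_weight = [[0.5, -2.4, 1.0], [0.0, 0.0, 0.0], [12.0, -3.0, 6.0]]
print(per_channel_scales(example_weight))
print(build_quantization_config())
```

A per-channel scale keeps a single large row from forcing every other row into a coarse range, which is the usual motivation for channel-wise over per-tensor scaling.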
Signed-off-by: Chen, Wenbin <[email protected]>
This script dynamically quantizes Qwen3 dense and Qwen3 MoE models.
example cmd: