added support for fake_quantize_dequantize_abs_max op in quantization… #30896
Conversation
Thanks for your contribution!

✅ This PR's description meets the template requirements!
```cpp
        weight_scale[j] = std::max(weight_scale[j], abs_max);
      }
    }
  }
```
What about other values of quant_axis?
This follows the implementation of this op in fake_quantize_op.cc, which supports only these two quant_axis values.
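To illustrate the two supported cases, here is a minimal sketch (not the actual Paddle kernel) of per-channel abs-max scale computation over a 2-D, row-major weight tensor; the function name and layout are assumptions for the example:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch: compute per-channel abs-max scales for a weight tensor stored
// row-major as [rows, cols]. quant_axis == 0 gives one scale per row;
// quant_axis == 1 gives one scale per column. Other axes are unsupported,
// mirroring the two cases handled in fake_quantize_op.cc.
std::vector<float> ChannelWiseAbsMax(const std::vector<float>& w,
                                     std::size_t rows, std::size_t cols,
                                     int quant_axis) {
  std::size_t n = (quant_axis == 0) ? rows : cols;
  std::vector<float> scale(n, 0.f);
  for (std::size_t i = 0; i < rows; ++i) {
    for (std::size_t j = 0; j < cols; ++j) {
      std::size_t k = (quant_axis == 0) ? i : j;
      scale[k] = std::max(scale[k], std::fabs(w[i * cols + j]));
    }
  }
  return scale;
}
```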
Sorry to inform you that aafb48e's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
cryoco
left a comment
LGTM
shangzhizhou
left a comment
LGTM
cryoco
left a comment
LGTM
(PaddlePaddle#30896)

* added support for fake_quantize_dequantize_abs_max op in quantization inference pass
* remove const_cast to pass ci
* remove compare operator to pass ci-coverage
* added detailed error message for unregistered tensorrt_subgrah_pass
PR types
Bug fixes
PR changes
Others
Describe
Added support for the fake_quantize_dequantize_abs_max op in the quantization inference pass, and compute the channel-wise weight scale directly from the weights.
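For context, the abs-max fake quantize-dequantize transform that this op models can be sketched as follows. This is an illustration of the general abs-max scheme (function name and the `bin_cnt = 2^(bits-1) - 1` convention are assumptions for the example, not copied from the Paddle kernel):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of abs-max fake quantize-dequantize: take the tensor's abs-max
// as the scale, map values into [-bin_cnt, bin_cnt], round to integers,
// then map back to floats. bin_cnt = 127 corresponds to 8-bit
// quantization. The result approximates the input with quantization
// error bounded by scale / bin_cnt per element.
std::vector<float> FakeQuantDequantAbsMax(const std::vector<float>& x,
                                          int bin_cnt = 127) {
  float scale = 0.f;
  for (float v : x) scale = std::max(scale, std::fabs(v));
  std::vector<float> y(x.size(), 0.f);
  if (scale == 0.f) return y;  // all-zero input stays zero
  for (std::size_t i = 0; i < x.size(); ++i) {
    y[i] = std::round(x[i] / scale * bin_cnt) / bin_cnt * scale;
  }
  return y;
}
```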