[Sparse conv] Implement implicit gemm algo for SubmConv3D #62747
Conversation
Your PR was submitted successfully. Thank you for contributing to this open-source project!
Sorry to inform you that ee0d63d's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
064892d to 10eb558
7a5bec7 to dcc9b0b
counter->set_dims({1});
}

void Conv3dImplicitGemmInferMeta(const MetaTensor& x,
The CI code does not cover this op; please add a unit test for it.
Done
// std::vector<int>* spatial_range;

// destructor
~KmapCache() {
The unit tests never execute this destructor; please add a test that does.
I added a unit test. When I run the tests locally the destructor is executed, but CI still reports it as not covered.
Sorry to inform you that 4851677's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
qingqing01
left a comment
The Chinese documentation needs to be updated in a follow-up.
weight_attr=None,
bias_attr=None,
data_format="NDHWC",
backend=None,
Interfaces exposed to users need documentation comments.
weight_attr=None,
bias_attr=None,
data_format="NDHWC",
backend=None,
The newly introduced backend argument should be documented in the docstring.
jzhang533
left a comment
LGTM
…le#62747) * sparse conv: implement implicit gemm algo
PR Category
Performance Optimization
PR Types
Performance
Description
Support implicit GEMM algorithm for SubmConv3D.
Usage:
Perf:
GPU: 3080
Prec: FP16
case: single SubmConv
nnz=214202 dense_shape=[1, 1, 4608, 4608, 32] kernel_size=[1, 3, 3] stride=1 in_channel=out_channel=32
(These perf numbers do not include the overhead of hashmap/rulebook creation, which I assume has been cached.)
Note:
The code requires subm==True and stride==1 and dilation==1. The input must be 3D (NDHWC), and the kernel must have dims=3. For the 2D case, insert zeros into the D dimension of the indices and set the kernel sizes to (1, 3, 3).
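The 2D workaround in the note can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names `lift_2d_coo_to_3d` and `is_supported` are hypothetical and not part of this PR, and the indices are assumed to be COO-style rows of shape [sparse_dim, nnz] ordered as (batch, H, W) for 2D and (batch, D, H, W) for 3D.

```python
from typing import List, Sequence, Tuple


def lift_2d_coo_to_3d(
    indices: Sequence[Sequence[int]],
    dense_shape: Sequence[int],
) -> Tuple[List[List[int]], List[int]]:
    """Insert an all-zero D row so 2D sparse data looks like 3D NDHWC data.

    Assumption: `indices` has three rows (batch, H, W), each of length nnz,
    and `dense_shape` is [N, H, W, C].
    """
    if len(indices) != 3:
        raise ValueError("expected three index rows: batch, H, W")
    if len(dense_shape) != 4:
        raise ValueError("expected dense_shape of the form [N, H, W, C]")
    batch, h, w = (list(row) for row in indices)
    nnz = len(batch)
    if len(h) != nnz or len(w) != nnz:
        raise ValueError("all index rows must have the same length")
    d = [0] * nnz  # every point sits at depth 0
    n, height, width, channels = dense_shape
    return [batch, d, h, w], [n, 1, height, width, channels]


def is_supported(subm: bool, stride: int, dilation: int,
                 kernel_size: Sequence[int]) -> bool:
    """Mirror the restrictions listed in the note (hypothetical check)."""
    return subm and stride == 1 and dilation == 1 and len(kernel_size) == 3


# A 3x3 2D kernel is expressed as a 3D kernel with depth 1.
kernel_size = (1, 3, 3)
indices_3d, shape_3d = lift_2d_coo_to_3d(
    [[0, 0], [5, 7], [2, 9]],  # two points: (0, 5, 2) and (0, 7, 9)
    [1, 16, 16, 32],
)
```

The lifted shape has the same form as the benchmark case above, where dense_shape is [1, 1, 4608, 4608, 32] and kernel_size is [1, 3, 3].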