-
Notifications
You must be signed in to change notification settings - Fork 5.9k
Add reduce_merge files for reduce op #32697
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thanks for your contribution! |
paddle/fluid/operators/reduce_ops/reduce_merge/reduce_functor_op.h
Outdated
Show resolved
Hide resolved
|
建议文件放在reduce_ops下即可,后缀改为.cuh |
|
注意clang-format |
07f509e to
8378a5e
Compare
5a68611 to
1c44bae
Compare
…a malloc and writed more notes
1c44bae to
f83712c
Compare
zhangting2020
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
建议PR代码接入一个算子,跑下完整的CI,不然光看代码看不出是否有问题。#32804 正在给reduce_sum添加fp16 kernel,这个实现当前没有针对fp16进行优化,先不要改reduce_sum。
| #include "paddle/fluid/operators/amp/fp16_type_traits.h" | ||
| #include "paddle/fluid/platform/device_context.h" | ||
| #include "paddle/fluid/platform/hostdevice.h" | ||
| #include "paddle/fluid/platform/macros.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
极小化include的头文件,看起来以上头文件都没有实际依赖。
| } | ||
|
|
||
| template <typename T, size_t ElementCount, typename VectorLikeType> | ||
| static inline paddle::framework::Array<T, ElementCount> from( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个函数是从什么转到什么?单个的From函数一般是用作类的成员函数,如果是单独的函数,最好命名更清楚一些。
| vec.size(), ElementCount)); | ||
| size_t n = static_cast<size_t>(vec.size()); | ||
| paddle::framework::Array<T, ElementCount> ret; | ||
| for (size_t i = 0; i < n; ++i) ret[i] = vec[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不要写在同一行,加{}。
|
|
||
| template <typename Tx, typename Ty, int BlockDim, typename ReduceOp, | ||
| typename TransformOp, int kRank, int kReduceRank> | ||
| static void launchKernel(const Tx* x_data, Ty* y_data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
launchKernel -> LaunchKernel,首字母大写。
|
|
||
| template <typename Tx, typename Ty, int BlockDim, typename ReduceOp, | ||
| typename TransformOp> | ||
| static void launchReduceKernel(const Tx* x_data, Ty* y_data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
launchReduceKernel -> LaunchReduceKernel
Xreki
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. 先合入这个pr,下个pr尽快接入算子进行ci验证。
PR types
Performance optimization
PR changes
APIs
Describe
add reduce_op.h and functor.h in reduce_merge for reduce op
性能测试数据: