implementation of broadcast add backward by reduce #34143
Xreki merged 2 commits into PaddlePaddle:develop from
Conversation
Thanks for your contribution!
force-pushed from 75c6a5f to ceed224
Since LastDim and Any go through the same code path, could the LastDim option be removed directly? That would save one branch.
This branch is necessary, because the index computation differs between the two cases; it has little impact on performance anyway.
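To illustrate why the LastDim and Any cases need different index math even when they share one code path, here is a hypothetical sketch (these helper names are made up for illustration, not Paddle's kernel code). Reducing over the last dimension walks contiguous elements, while reducing over an arbitrary axis walks strided elements:

```python
def last_dim_index(row, k, reduce_num):
    # LastDim case: the elements being reduced are contiguous
    # in memory (stride 1), so the k-th element of output `row`
    # sits right after the previous one.
    return row * reduce_num + k


def any_dim_index(out_idx, k, stride):
    # Any case: the elements being reduced are `stride` apart,
    # e.g. reducing axis 0 of a row-major (R, C) array uses
    # stride == C.
    return out_idx + k * stride
```

For a row-major 2x3 array, element (1, 2) has flat index 5: `last_dim_index(1, 2, 3)` and `any_dim_index(2, 1, 3)` both reach it, but through different arithmetic.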
This seems to duplicate the function below. Can GetReduceDim be reused?
The two functions do different things: this one derives the reduce dims from the broadcast input/output dims, whereas GetReduceDim takes already-known reduce dims and computes the dims that are actually reduced.
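The first step described here (deriving reduce dims from the broadcast input/output dims) can be sketched as follows. This is an illustrative NumPy-style helper, not Paddle's actual implementation; the function name is hypothetical:

```python
def get_reduce_dims_from_broadcast(x_shape, out_shape):
    # Hypothetical sketch: given the shape of a broadcast input x
    # and of the broadcast output, return the axes that the output
    # gradient must be summed over to recover dx.
    # Align x_shape to out_shape by left-padding with 1s, following
    # NumPy-style broadcasting rules.
    ndim_diff = len(out_shape) - len(x_shape)
    padded = [1] * ndim_diff + list(x_shape)
    # An axis is reduced wherever x was size 1 but the output is not.
    return [i for i, (xd, od) in enumerate(zip(padded, out_shape))
            if xd == 1 and od != 1]
```

For example, broadcasting a `[3]` input against a `[2, 3]` output means the gradient must be reduced over axis 0.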
Other reduce-style ops won't use this function, right? Maybe put it in elementwise_op_function.h for now? Including all of reduce_op.h pulls in a lot of unused code; will that slow down compilation?
force-pushed from 2c253af to 3e5561a
force-pushed from 3e5561a to 505bf46
These branches and the TensorCopy logic largely duplicate the checks in ElementwiseAddGradKernel and elementwise_add_grad. Could ElementwiseAddGradKernel be specialized for CUDADeviceContext to simplify the logic?
I think this is a bug in our inplace logic... let's handle it this way for now...
force-pushed from cb9999c to a8c58e0
PR types
Performance optimization
PR changes
OPs
Describe
Implement the backward of broadcast add using reduce, and optimize reduce performance when the lowest dimension is small.
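The idea of implementing broadcast add backward via reduce can be sketched in NumPy. This is an illustrative model of the math, assuming NumPy-style broadcasting, not the CUDA kernel this PR adds; the function name is hypothetical:

```python
import numpy as np

def broadcast_add_backward(dout, x_shape, y_shape):
    # For out = x + y with broadcasting, dx is dout summed over
    # every axis along which x was broadcast; likewise for dy.
    def reduce_to(grad, shape):
        # Sum away the leading axes that broadcasting prepended.
        for _ in range(grad.ndim - len(shape)):
            grad = grad.sum(axis=0)
        # Sum (keeping dims) over axes where the input had size 1.
        for i, d in enumerate(shape):
            if d == 1:
                grad = grad.sum(axis=i, keepdims=True)
        return grad

    return reduce_to(dout, x_shape), reduce_to(dout, y_shape)
```

For example, with x of shape (2, 3) and y of shape (3,), a dout of ones gives dx = ones((2, 3)) and dy = [2, 2, 2], since y's gradient is reduced over the broadcast batch axis.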