use IndexList to improve performance of dot op #25133

Closed
zhangting2020 wants to merge 1 commit into PaddlePaddle:develop from zhangting2020:dot_perf

@zhangting2020 (Contributor) commented Jun 19, 2020

PR types

Performance optimization

PR changes

OPs

Describe

Using Eigen's IndexList instead of arrays of indices can improve performance on both CPU and GPU. This PR uses it to speed up the dot op. For more information, see #25132.

Performance

GPU: V100, CUDA 10

| op | input shape | before | after | speedup |
| --- | --- | --- | --- | --- |
| dot | [1000, 1000] | 0.210704 ms | 0.125058 ms | 1.7x |

Because of a mistake in the change, the data above are incorrect. There is actually no speedup, so this PR is closed for now.

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@zhangting2020 changed the title from "use IndexList to improve performance" to "use IndexList to improve performance of dot op" Jun 19, 2020
#else
Eigen::IndexList<Eigen::type2index<1>> axis;
#endif
out.device(dev) = (x * y).sum(axis);
Review comment from a contributor:

Could this functionality be wrapped in a shared function in eigen.h? I see similar logic in #25132.
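For readers unfamiliar with the idea behind `Eigen::IndexList<Eigen::type2index<1>>`, here is a minimal standard-library sketch of the difference between a reduction axis supplied at run time and one encoded in a type. It does not use Eigen and is not how Eigen or Paddle implement this. `StaticAxis`, `Matrix`, `dot_runtime_axis`, and `dot_static_axis` are hypothetical names introduced only for illustration. Both functions compute what `(x * y).sum(axis)` computes for 2-D inputs: an elementwise product summed along one axis.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for Eigen::type2index<N>: the axis value is part of the type.
template <int N>
using StaticAxis = std::integral_constant<int, N>;

// Minimal row-major matrix, used only for this illustration.
struct Matrix {
  std::size_t rows;
  std::size_t cols;
  std::vector<float> data;  // expected size: rows * cols
  float at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Sum x * y along `axis` (0 or 1); the axis is only known at run time.
std::vector<float> dot_runtime_axis(const Matrix& x, const Matrix& y, int axis) {
  assert(x.rows == y.rows && x.cols == y.cols);
  assert(axis == 0 || axis == 1);
  const bool reduce_cols = (axis == 1);
  const std::size_t out_size = reduce_cols ? x.rows : x.cols;
  const std::size_t red_size = reduce_cols ? x.cols : x.rows;
  std::vector<float> out(out_size, 0.0f);
  for (std::size_t i = 0; i < out_size; ++i) {
    for (std::size_t k = 0; k < red_size; ++k) {
      const std::size_t r = reduce_cols ? i : k;
      const std::size_t c = reduce_cols ? k : i;
      out[i] += x.at(r, c) * y.at(r, c);
    }
  }
  return out;
}

// Same reduction, but the axis is a compile-time constant carried by the tag
// type, so the compiler can resolve the axis-dependent choices statically.
template <int Axis>
std::vector<float> dot_static_axis(const Matrix& x, const Matrix& y,
                                   StaticAxis<Axis>) {
  static_assert(Axis == 0 || Axis == 1, "this sketch only handles 2-D inputs");
  assert(x.rows == y.rows && x.cols == y.cols);
  constexpr bool reduce_cols = (Axis == 1);
  const std::size_t out_size = reduce_cols ? x.rows : x.cols;
  const std::size_t red_size = reduce_cols ? x.cols : x.rows;
  std::vector<float> out(out_size, 0.0f);
  for (std::size_t i = 0; i < out_size; ++i) {
    for (std::size_t k = 0; k < red_size; ++k) {
      const std::size_t r = reduce_cols ? i : k;
      const std::size_t c = reduce_cols ? k : i;
      out[i] += x.at(r, c) * y.at(r, c);
    }
  }
  return out;
}
```

A call such as `dot_static_axis(x, y, StaticAxis<1>{})` returns the same values as `dot_runtime_axis(x, y, 1)`. Whether the compile-time form is actually faster depends on the compiler and the workload, so any speedup has to be measured rather than assumed.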
