use IndexList to improve performance of instance_norm op #25132

Merged
zhangting2020 merged 3 commits into PaddlePaddle:develop from zhangting2020:instance_norm_perf on Oct 12, 2020

Contributor

@zhangting2020 zhangting2020 commented Jun 19, 2020

PR types

Performance optimization

PR changes

OPs

Describe

In Eigen, an IndexList encodes a set of tensor dimensions or indices. Each index in the list can be known at compile time or supplied at runtime, and static and dynamic indices can be mixed if needed. The tensor code tries to exploit the indices that are known at compile time to optimize the code it generates, so using IndexList instead of arrays of indices can improve performance on both CPU and GPU.
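To illustrate the idea of mixing static and dynamic indices, here is a toy sketch that does not use Eigen at all. `StaticIndex`, `ToyIndexList`, `SlotValue`, and `MakeShape` are hypothetical names made up for this example; `StaticIndex<N>` only plays a role similar to Eigen's `type2index<N>`, and the real IndexList is considerably more elaborate.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

// Hypothetical stand-in for a compile-time index: the value lives in the type.
template <std::ptrdiff_t N>
struct StaticIndex {};

// Reading a runtime slot returns the stored value...
inline std::ptrdiff_t SlotValue(std::ptrdiff_t v) { return v; }
// ...while reading a static slot returns a constant the compiler can fold.
template <std::ptrdiff_t N>
constexpr std::ptrdiff_t SlotValue(StaticIndex<N>) { return N; }

template <typename T>
struct IsStatic : std::false_type {};
template <std::ptrdiff_t N>
struct IsStatic<StaticIndex<N>> : std::true_type {};

// Toy index list: every slot is either StaticIndex<N> or std::ptrdiff_t.
template <typename... Slots>
struct ToyIndexList {
  std::tuple<Slots...> slots;

  // Value of slot I, regardless of whether it is static or dynamic.
  template <std::size_t I>
  std::ptrdiff_t get() const { return SlotValue(std::get<I>(slots)); }

  // True when slot I is known at compile time.
  template <std::size_t I>
  static constexpr bool is_static() {
    return IsStatic<
        typename std::tuple_element<I, std::tuple<Slots...>>::type>::value;
  }
};

// A shape like {1, n}: slot 0 is fixed at compile time, slot 1 at runtime.
inline ToyIndexList<StaticIndex<1>, std::ptrdiff_t> MakeShape(std::ptrdiff_t n) {
  return {std::make_tuple(StaticIndex<1>{}, n)};
}
```

Calling `MakeShape(n).get<0>()` always yields 1 without reading any stored value, while `get<1>()` reads the runtime slot. An array of indices would treat both slots as runtime data, which is the difference the description refers to.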

Note:

  • Eigen on Windows has not been updated yet, so IndexList is not available there and arrays of indices must be used instead. That is why the code checks EIGEN_HAS_INDEX_LIST.
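The guard described in the note can be sketched roughly as follows. This is only an illustration of the pattern, not the code from this PR: `ReduceDims`, `MakeReduceDims`, and `kUsesIndexList` are hypothetical names, and `std::array` stands in for an Eigen index array so the fallback branch compiles without any Eigen headers. It assumes that the Eigen Tensor header defines `EIGEN_HAS_INDEX_LIST` when IndexList is available.

```cpp
#include <array>

// Assumption: a real build would include <unsupported/Eigen/CXX11/Tensor>
// here, which is expected to define EIGEN_HAS_INDEX_LIST when IndexList is
// usable. The include is left out so the sketch also builds without Eigen,
// in which case the #else branch below is taken.

#ifdef EIGEN_HAS_INDEX_LIST
// Reduce over dimension 1; the axis is encoded in the type itself.
using ReduceDims = Eigen::IndexList<Eigen::type2index<1>>;
inline ReduceDims MakeReduceDims() { return ReduceDims(); }
constexpr bool kUsesIndexList = true;
#else
// Fallback for older Eigen: the same axis stored as a runtime value.
using ReduceDims = std::array<int, 1>;
inline ReduceDims MakeReduceDims() { return ReduceDims{{1}}; }
constexpr bool kUsesIndexList = false;
#endif
```

Code that performs the reduction can then take `MakeReduceDims()` without caring which branch was compiled, so the guard stays in one place.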

Performance

CPU

| op | input shape | before | after | speed-up |
|---|---|---|---|---|
| instance_norm | [1, 64, 128, 128] | 41.2712 ms | 3.21222 ms | 13x |
| instance_norm_grad | [1, 64, 128, 128] | 149.14 ms | 11.949 ms | 12x |
| instance_norm | [1, 128, 64, 64] | 20.6193 ms | 1.61026 ms | 13x |
| instance_norm_grad | [1, 128, 64, 64] | 74.4767 ms | 5.19748 ms | 14x |
| instance_norm | [1, 256, 32, 32] | 10.308 ms | 0.821658 ms | 13x |
| instance_norm_grad | [1, 256, 32, 32] | 37.1751 ms | 2.60926 ms | 14x |

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@zhangting2020 zhangting2020 changed the title from "use IndexList to improve performance" to "use IndexList to improve performance of instance_norm op" on Jun 19, 2020
Contributor

If the compiler is older

How old a compiler? For example, would gcc 4.8 be supported?

Contributor Author

Yes, it is supported. The comment on this line has been removed.

Contributor

Can this be replaced with IndexList directly?

Contributor Author

At the moment the Eigen version on Windows has not been upgraded, so switching to IndexList directly would cause a compile error. Once Eigen is upgraded on Windows as well, IndexList can be used directly.

@zhangting2020 zhangting2020 force-pushed the instance_norm_perf branch 3 times, most recently from 2dddb65 to 662ee4a on September 13, 2020 14:07
Contributor

@luotao1 luotao1 left a comment

LGTM

@zhangting2020 zhangting2020 merged commit 16999ae into PaddlePaddle:develop Oct 12, 2020
chen-zhiyu pushed a commit to chen-zhiyu/Paddle that referenced this pull request Oct 15, 2020
…e#25132)

* use IndexList to improve performance, test=develop

* remove EIGEN_HAS_INDEX_LIST, test=develop

* use IndexList only when EIGEN_HAS_INDEX_LIST is true