Conversation

@FeixLiu FeixLiu commented Aug 23, 2021

PR types

Performance optimization

PR changes

Others

Describe

Sorting the grads by dtype before coalescing decreases the number of coalesce ops. Since each coalesce op produces one fused buffer that is all-reduced, reducing the number of coalesce ops also reduces the number of c_allreduce_sum ops.

All tests are based on Ernie 3.0 with pp=dp=mp=2, fp16_allreduce=True, and optimize_cast=True.
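The grouping idea can be sketched roughly as follows. This is a minimal illustration, not Paddle's actual fuse pass: grads are modeled as plain dicts and `coalesce_groups_by_dtype` is a hypothetical helper name.

```python
from collections import defaultdict

def coalesce_groups_by_dtype(grads):
    """Group gradients by dtype so each dtype yields a single fused
    buffer (one coalesce op and one c_allreduce_sum per group),
    instead of starting a new fusion group at every dtype change."""
    groups = defaultdict(list)
    for g in grads:
        groups[g["dtype"]].append(g)
    # Each value is one contiguous fusion group for a single dtype.
    return list(groups.values())
```

With an interleaved fp16/fp32 grad list, the unsorted pass would open a new group at every dtype boundary, while grouping by dtype first yields exactly one group per dtype.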

Decrease in the number of coalesce ops

|              | number before this opt | number after this opt | gain  |
| ------------ | ---------------------- | --------------------- | ----- |
| card 0/4     | 48                     | 6                     | -700% |
| card 1/5     | 48                     | 6                     | -700% |
| card 2/6     | 120                    | 12                    | -900% |
| card 3/7     | 120                    | 12                    | -900% |
| one dp group | 336                    | 36                    | -833% |

Loss diff

(Two screenshots of the loss curves: Screen Shot 2021-08-23 at 10 44 10 AM, Screen Shot 2021-08-23 at 10 44 24 AM.)

Performance

| throughput before this opt | throughput after this opt | gain  |
| -------------------------- | ------------------------- | ----- |
| 39146                      | 39232                     | +0.2% |

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI result first. See the Paddle CI Manual for details.


@wangxicoding wangxicoding left a comment


LGTM. Please also share the performance numbers.


@gongweibao gongweibao left a comment


See the comments.


@gongweibao gongweibao left a comment


LGTM

@gongweibao gongweibao merged commit fad4b3b into PaddlePaddle:develop Aug 23, 2021
@FeixLiu FeixLiu deleted the grad_merge_fuse_optim branch August 25, 2021 02:09
FeixLiu added a commit to FeixLiu/Paddle that referenced this pull request Aug 31, 2021
wangxicoding pushed a commit that referenced this pull request Aug 31, 2021
FeixLiu added a commit to FeixLiu/Paddle that referenced this pull request Sep 2, 2021
FeixLiu added a commit to FeixLiu/Paddle that referenced this pull request Sep 3, 2021
PaddlePaddle#35116) (PaddlePaddle#35301)"

This reverts commit 2931df5.

Revert "[cherry-pick][hybrid performance] optim npu coalesce set constant (PaddlePaddle#35105) (PaddlePaddle#35302)"

This reverts commit 12260bd.

Revert "[cherry-pick][hybrid performance] optim the grad fuse for pipeline mode by sorting the grad by dtype (PaddlePaddle#35070) (PaddlePaddle#35300)"

This reverts commit e69cc21.

Revert "[cherry-pick][hybrid performance] Grad fuse for gradient merge under pipeline mode (PaddlePaddle#35004) (PaddlePaddle#35299)"

This reverts commit e931cd1.

Revert "Add flags to control whether to check Nan value of hccl_allreduce_sum. (PaddlePaddle#35093) (PaddlePaddle#35298)"

This reverts commit d4948bc.

Revert "[hybrid] Fix row parallel linear bias (PaddlePaddle#35186) (PaddlePaddle#35297)"

This reverts commit b36fb03.

Revert "[hybrid][npu] fix npu clear float status in pipeline (PaddlePaddle#35165) (PaddlePaddle#35295)"

This reverts commit 167685e.

Revert "[hybrid npu] fix npu found_finite in hybrid (PaddlePaddle#35134) (PaddlePaddle#35291)"

This reverts commit e64105f.

Revert "[cherry-pick][Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16 param to the optimizer (PaddlePaddle#34965) (PaddlePaddle#35296)"

This reverts commit 6fb58ae.

Revert "[cherry-pick] NPU use squared_l2_norm in GradientClipByGlobalNorm (PaddlePaddle#34836) (PaddlePaddle#35289)"

This reverts commit 38c27d5.