[NPU] flatten params and grads, fuse grad_clip and optimizer op #33461
Merged
zhiqiu merged 10 commits into PaddlePaddle:develop on Jun 21, 2021
Conversation
Thanks for your contribution!
Contributor
Does paddle/optimzier/optimizer.py also need the same change?
python/paddle/fluid/optimizer.py
Outdated
"""
Args:
    flatten_param_grads (bool, optional): Whether to flatten all the parameters and grads.
        If true, the parameters and gradients will be coalesce to continue mempry,
Contributor
continue mempry -> contiguous memory
phlrain approved these changes on Jun 21, 2021
PR types
Performance optimization
PR changes
OPs
Describe
[NPU] flatten params and grads, fuse grad_clip and optimizer op
For example, the ernie-3.0 model has 300+ parameters, and thus 300+ parameter gradients. In each training step, the program has to clip each gradient and update each parameter, so there are 300+ grad_clip operators and 300+ optimizer operators.

This PR flattens all the parameters into one contiguous memory space, and likewise flattens the gradients. After that, the gradient clip and the optimizer update can each be applied once to the flattened parameter/gradient instead of once per tensor.

Currently, Adam + ClipByGlobalNorm is supported.
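A minimal usage sketch, assuming the flag added here is a `flatten_param_grads` argument on the fluid optimizer constructor (the exact signature may differ):

```python
import paddle.fluid as fluid

# Hedged sketch: per this PR, the supported combination is Adam +
# ClipByGlobalNorm; flatten_param_grads is assumed to be a constructor
# argument of the fluid optimizer.
clip = fluid.clip.GradientClipByGlobalNorm(clip_norm=1.0)
optimizer = fluid.optimizer.Adam(
    learning_rate=1e-3,
    grad_clip=clip,
    flatten_param_grads=True,  # coalesce params/grads into contiguous memory
)
```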
Performance
ernie-3.0, bs=20480, training speed: 20684 -> 23583 tokens/s, +13.8%