
realize flatten parameters and grads in optimizer, calls in Adam and Adamw optimizer #48810

Closed
pangyoki wants to merge 11 commits into PaddlePaddle:develop from pangyoki:optimization_support_flatten_param_grads

Conversation

@pangyoki
Contributor

@pangyoki pangyoki commented Dec 6, 2022

PR types

New features

PR changes

APIs

Describe

Background

In our optimization work on NPU devices, the optimizer stage of the ERNIE model is very time-consuming, mainly because it has to launch a large number of small kernels. PR #33461 implemented a flatten_param_grads method that calls the coalesce_tensor op to fuse all parameters and grads into a single parameter and a single grad, so the optimizer op only needs to be launched once, which greatly reduces kernel-launch time.

PR #33461 implemented the flatten_param_grads method under fluid/optimizer.py and mainly handled the static-graph case.
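The flattening idea can be sketched in plain NumPy (a conceptual sketch only; `flatten_params` and the variable names are hypothetical illustrations, not Paddle's coalesce_tensor API):

```python
import numpy as np

def flatten_params(params):
    """Fuse a list of parameter arrays into one contiguous buffer.

    Returns the flat buffer plus per-parameter views into it, so an
    update applied once to the flat buffer is visible through every
    view (mimicking an inplace coalesce_tensor).
    """
    total = sum(p.size for p in params)
    flat = np.empty(total, dtype=params[0].dtype)
    views, offset = [], 0
    for p in params:
        chunk = flat[offset:offset + p.size]
        chunk[:] = p.ravel()                  # copy original values in
        views.append(chunk.reshape(p.shape))  # view sharing flat's memory
        offset += p.size
    return flat, views

# One SGD-style update on the fused buffer updates every "parameter",
# instead of launching one small update kernel per parameter.
params = [np.ones((2, 2)), np.ones(3)]
flat, views = flatten_params(params)
flat -= 0.1 * np.ones_like(flat)  # single fused optimizer step
```

Because the per-parameter `views` alias the fused buffer, the single in-place update is enough; this aliasing is exactly what breaks if the coalesce step is not inplace, as discussed below for the final-state op.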

Work in this PR

This PR migrates the method to paddle/optimizer/optimizer.py and also handles dygraph mode.
However, the final-state _C_ops.coalesce_tensor currently does not appear to be inplace, so modifications to the output flatten_param and flatten_grads cannot propagate back to the original parameters and grads. Therefore the _legacy_C_ops.coalesce_tensor method is still used.
Also, unlike the static graph, the use_align attribute is temporarily set to False in dygraph mode, because the aligned padding cannot be initialized to zero, which may cause precision issues. In the current NPU scenario, the use_align=True case is not handled for now.

Follow-up TODOs

  • Call the final-state coalesce_tensor op.
  • Set the use_align attribute to True in dygraph mode.
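The use_align concern comes from padding between fused chunks. A minimal sketch of an aligned layout (the 256-byte alignment value and the helper name are assumptions for illustration, not necessarily what coalesce_tensor uses):

```python
def aligned_offsets(sizes_bytes, align=256):
    """Compute chunk offsets when each tensor's slot is padded up to a
    multiple of `align` bytes.

    The bytes between one tensor's end and the next tensor's offset are
    padding. If the fused buffer is not zero-initialized, that padding
    holds garbage values, which is why enabling alignment without
    zero-init can cause precision problems.
    """
    offsets, off = [], 0
    for n in sizes_bytes:
        offsets.append(off)
        off += -(-n // align) * align  # round n up to a multiple of align
    return offsets, off  # per-tensor offsets and total padded size

# e.g. three float32 tensors of 10, 300, and 7 elements (4 bytes each)
offs, total = aligned_offsets([40, 1200, 28])
```

With alignment off (align=1), chunks are packed back to back and no uninitialized padding exists, at the cost of possibly less efficient memory access.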

@paddle-bot

paddle-bot bot commented Dec 6, 2022

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@paddle-bot

paddle-bot bot commented Dec 6, 2022

❌ The PR is not created using PR's template. You can refer to this Demo.
Please use PR's template, it helps save our maintainers' time so that more developers get helped.

@pangyoki pangyoki closed this Jan 3, 2023
@pangyoki pangyoki reopened this Jan 3, 2023
@PaddlePaddle PaddlePaddle locked and limited conversation to collaborators Jan 3, 2023
@PaddlePaddle PaddlePaddle unlocked this conversation Jan 3, 2023
@pangyoki pangyoki changed the title from "[DO NOT MERGE] flatten optimization" to "realize flatten parameters and grads in optimizer, calls in Adam and Adamw optimizer" Jan 10, 2023
@paddle-bot paddle-bot bot closed this Jan 16, 2024
@paddle-bot

paddle-bot bot commented Jan 16, 2024

Since you haven't replied for more than a year, we have closed this issue/pr.
If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up.
