update optimizer doc for 2.0 #2424
Merged
MRXLT merged 22 commits into PaddlePaddle:develop from Sep 2, 2020
Conversation
TCChenlong
reviewed
Aug 22, 2020
The paper does not include an ``epsilon`` parameter, but it is added here to maintain numerical stability and avoid division-by-zero errors.

Parameters:
- **learning_rate** (float|Variable, optional) - The learning rate used to compute parameter updates. It can be a float or a Variable holding a float value. Default: 0.001.
Collaborator
Should this be float|LearningRateDecay or float|Tensor?
Contributor
Author
Changed to float|LearningRateDecay.
TCChenlong
reviewed
Aug 31, 2020
AdamW
-------------------------------

.. py:class:: paddle.optimizer.AdamW(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, parameters=None, weight_decay=0.01, grad_clip=None, name=None, lazy_mode=False)
Collaborator
The apply_decay_param_fun parameter is not documented here.
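For context, the decoupled weight decay that AdamW implements, with a name filter in the spirit of apply_decay_param_fun, can be sketched in plain Python. This is an illustrative toy for scalar parameters under assumed default hyperparameters, not the Paddle implementation:

```python
import math

def adamw_update(name, param, grad, m, v, t,
                 lr=0.001, beta1=0.9, beta2=0.999, epsilon=1e-8,
                 weight_decay=0.01, apply_decay_param_fun=None):
    """One AdamW step for a scalar parameter. The decay term is decoupled
    from the gradient-based update; apply_decay_param_fun (if given)
    selects, by parameter name, which parameters are decayed."""
    m = beta1 * m + (1 - beta1) * grad           # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad    # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                 # bias correction
    v_hat = v / (1 - beta2 ** t)
    step = lr * m_hat / (math.sqrt(v_hat) + epsilon)
    if apply_decay_param_fun is None or apply_decay_param_fun(name):
        step += lr * weight_decay * param        # decoupled weight decay
    return param - step, m, v

# Decay the weight but skip the bias, as a name filter would:
decay_weights_only = lambda n: n.endswith(".w")
w, mw, vw = 1.0, 0.0, 0.0
b, mb, vb = 1.0, 0.0, 0.0
for t in range(1, 4):
    w, mw, vw = adamw_update("fc.w", w, 0.5, mw, vw, t,
                             apply_decay_param_fun=decay_weights_only)
    b, mb, vb = adamw_update("fc.b", b, 0.5, mb, vb, t,
                             apply_decay_param_fun=decay_weights_only)
```

After a few steps the filtered-out bias ``b`` stays slightly larger than the decayed weight ``w``, which is the observable effect of the filter.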
Returns: the learning rate of the current step.
Return type: float
Collaborator
There is no need for a separate return-type line; fold the type into the return line, e.g.:
Returns: float, the learning rate of the current step.
Returns: tuple(optimize_ops, params_grads), where optimize_ops is the list of parameter-optimization OPs and params_grads is a list of (param, param_grad) pairs holding each parameter and its gradient. In static-graph mode, this return value can be added to the ``fetch_list`` argument of ``Executor.run()``; if it is, the ``use_prune`` argument is overridden to True and the program is pruned according to ``feed`` and ``fetch_list``. See the ``Executor`` documentation for details.
Return type: tuple
Polish the docs for the Adam, Adamax, Optimizer, and RMSProp ops
Add the AdamW op
Optimizer class:
The parameter_list argument becomes parameters
The regularization argument becomes weight_decay; a float value is treated as the L2Decay coefficient
The set_dict interface becomes set_state_dict
In dynamic-graph mode, a new step interface replaces minimize
AdamOptimizer becomes Adam, AdamaxOptimizer becomes Adamax, and RMSPropOptimizer becomes RMSProp; their other changes are the same as for the base Optimizer class.
New AdamW class:
Inherits from DecoupledWeightDecay and Adam
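The renames above can be illustrated with a toy stand-in class. This is a hypothetical sketch, not the Paddle API: it only mirrors the new names (parameters for parameter_list, weight_decay for regularization with a float acting as the L2Decay coefficient, set_state_dict for set_dict, and step replacing minimize in dynamic-graph style):

```python
class ToyOptimizer:
    """Hypothetical stand-in mirroring the 2.0 renames; not Paddle code."""

    def __init__(self, learning_rate=0.001, parameters=None, weight_decay=None):
        # 2.0 names: 'parameters' (was parameter_list),
        # 'weight_decay' (was regularization).
        self.parameters = parameters or []    # dicts: {"value": ..., "grad": ...}
        self.weight_decay = weight_decay      # float -> L2Decay coefficient
        self.learning_rate = learning_rate
        self._state = {}

    def set_state_dict(self, state_dict):     # 1.x name: set_dict
        self._state = dict(state_dict)

    def state_dict(self):
        return dict(self._state)

    def step(self):                           # dynamic graph: replaces minimize()
        for p in self.parameters:
            grad = p["grad"]
            if self.weight_decay is not None:
                # L2 decay folds into the gradient (unlike AdamW's decoupled decay)
                grad = grad + self.weight_decay * p["value"]
            p["value"] -= self.learning_rate * grad

params = [{"value": 1.0, "grad": 0.5}]
opt = ToyOptimizer(learning_rate=0.1, parameters=params, weight_decay=0.01)
opt.step()  # value: 1.0 - 0.1 * (0.5 + 0.01 * 1.0) = 0.949
```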
English doc PR: PaddlePaddle/Paddle#26288
English doc revision PR: PaddlePaddle/Paddle#26711