Refine the gradient calculation errors caused by renaming in while_grad #27814
Merged
gfwm2013 merged 1 commit into PaddlePaddle:develop on Oct 12, 2020
PR types
Bug fixes
PR changes
OPs
Describe
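Fix a bug that produces incorrect gradients in the backward computation of WhileOp.
Problem 1. The backward computation of WhileOp currently computes the wrong gradient for variables that are both an input and an output of the op, as in the following code:
With the previous logic, the value finally fetched for i.grad_name was 1 and the value for x.grad_name was 0, both of which are wrong.
With this PR, the fetched value of i.grad_name is 2 and the value of x.grad_name is 1, which is correct.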
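As a rough, framework-independent illustration of how a variable that is both a loop input and a loop output can end up with a wrong gradient, the sketch below uses plain Python rather than Paddle. It is not the code from this PR and does not reproduce the values quoted above. The function names, the `x_grad` key, and the loop body `x = 2 * x` are all assumptions made for the example. The point is only that the gradient of the output and the gradient of the input share one slot, so each backward step has to replace the slot rather than add to it.

```python
def forward(x0, steps):
    """Loop body x = 2 * x: x is read and then overwritten in every step."""
    x = x0
    for _ in range(steps):
        x = 2.0 * x
    return x


def backward_overwrite(steps):
    """Each step consumes the output gradient and replaces it with the input gradient."""
    grads = {"x_grad": 1.0}  # hypothetical gradient slot shared by input and output
    for _ in range(steps):
        grad_out = grads["x_grad"]
        grads["x_grad"] = 2.0 * grad_out  # d(2x)/dx = 2
    return grads["x_grad"]


def backward_accumulate(steps):
    """Faulty variant: the input gradient is added on top of the output gradient."""
    grads = {"x_grad": 1.0}
    for _ in range(steps):
        grad_out = grads["x_grad"]
        grads["x_grad"] += 2.0 * grad_out  # stale output gradient is counted again
    return grads["x_grad"]


if __name__ == "__main__":
    print(forward(1.0, 3))          # forward result for x0 = 1
    print(backward_overwrite(3))    # matches d(forward)/d(x0)
    print(backward_accumulate(3))   # differs from the true derivative
```

Since `forward` is linear in `x0`, its derivative equals `forward(1.0, steps)`, which makes it easy to compare the two backward variants against the true value.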
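Problem 2. When the backward computation uses variables from the forward computation, WhileOp currently always uses their value from the last forward iteration.
With the previous logic, the value finally fetched for i.grad_name was 6 and the value for x.grad_name was 0, both of which are wrong.
With this PR, the fetched value of i.grad_name is 3 and the value of x.grad_name is 2, which is correct.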
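The effect of reusing the last forward value can be sketched in plain Python. This is an illustration under assumptions, not Paddle code and not the program from this PR: the loop body `x = x * i; i = i + 1`, the function names, and the `saved_i` list are invented for the example, and the numbers it produces are unrelated to the values quoted above. The backward pass of a multiplication needs the multiplier that was used in the matching forward step, so the forward pass records one value per iteration.

```python
def forward_with_history(x0, steps):
    """Run x = x * i; i = i + 1 and record the i used in each iteration."""
    x, i = x0, 1.0
    saved_i = []  # hypothetical per-iteration record of forward values
    for _ in range(steps):
        saved_i.append(i)
        x = x * i
        i = i + 1.0
    return x, saved_i


def backward_per_step(saved_i):
    """Gradient of the final x w.r.t. x0, using the i saved for each step."""
    grad = 1.0
    for i_t in reversed(saved_i):
        grad *= i_t  # d(x * i_t)/dx = i_t
    return grad


def backward_last_value(saved_i):
    """Faulty variant: every backward step reuses the i of the last iteration."""
    grad = 1.0
    last_i = saved_i[-1]
    for _ in saved_i:
        grad *= last_i
    return grad


if __name__ == "__main__":
    x_final, saved_i = forward_with_history(1.0, 3)
    print(x_final, saved_i)
    print(backward_per_step(saved_i))    # equals x_final for x0 = 1
    print(backward_last_value(saved_i))  # wrong: uses i = 3 in every step
```

Because the final `x` is `x0 * 1 * 2 * 3`, the true derivative with respect to `x0` is 6; only the per-step variant recovers it.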
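Approach:
Problem 1:
Problem 2:
In-place operations behave differently from non-in-place operations, so variables involved in in-place operations are explicitly excluded before the variables are saved, to keep the gradient computation correct.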