Gradient check #4333
Conversation
pengli09
left a comment
The logic LGTM. Review of certain details (see comments) by someone else is needed.
def get_backward_op(cls, scope, op, no_grad_set):
    backward_op = core.Operator.backward(op, no_grad_set)
    for in_var_name in backward_op.input_vars():
        cls.create_variable(scope, in_var_name)
The original version invokes var.get_tensor(), but it's not invoked in cls.create_variable(). Is it OK?
Have you checked this?
I'm not sure about this; maybe we should wait for reviews from someone else. @jacquesqiao
Done. Always invoke get_tensor() in create_variable.
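To make the fix in this thread concrete, here is a minimal, self-contained sketch of "always invoke get_tensor() in create_variable". The Scope and Variable classes below are pure-Python stand-ins for the C++ bindings, and new_var/get_tensor are assumed names, not the real API.

```python
class _Variable(object):
    def __init__(self):
        self._tensor = None

    def get_tensor(self):
        # Calling get_tensor() lazily instantiates the tensor, so the
        # variable actually holds something the backward op can use.
        if self._tensor is None:
            self._tensor = []
        return self._tensor


class _Scope(object):
    def __init__(self):
        self._vars = {}

    def new_var(self, name):
        return self._vars.setdefault(name, _Variable())


def create_variable(scope, var_name):
    # Always call get_tensor() so the tensor is created along with the variable.
    var = scope.new_var(var_name)
    var.get_tensor()
    return var


scope = _Scope()
create_variable(scope, "X@GRAD")
```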
for in_var_name in backward_op.input_vars():
    cls.create_variable(scope, in_var_name)
for out_var_name in backward_op.output_vars():
    cls.create_variable(scope, out_var_name)
See above
    return backward_op

@classmethod
def _feed_var(cls, scope, var_name, value, place):
This function LGTM. But I'm not familiar with the underlying code. Review from someone else is required.
@classmethod
def feed_input(cls, scope, op, inputs, place):
    for in_var_name, is_dup in Operator.get_op_inputs(op.type()):
This function LGTM. But I'm not familiar with the underlying code. Review from someone else is required.
    output_vals = np.append(output_vals, output_val)
# get dimension info, allocate memory
jacobian_matrix.resize(
    (tensor_size, len(output_vals)), refcheck=False)
Why refcheck=False?
This matrix does not share memory with any other object, so we can safely set refcheck to False. Otherwise the reallocation would fail (Python would throw an exception). Please see https://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.ndarray.resize.html
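A small standalone numpy example of the point above (not the PR's code): with the default refcheck=True, ndarray.resize refuses to reallocate an array that other objects still reference, while refcheck=False skips that check.

```python
import numpy as np

jacobian_matrix = np.zeros((1, 1))
view = jacobian_matrix[:]  # another object now references the same buffer

try:
    jacobian_matrix.resize((3, 4))  # refcheck=True by default
except ValueError as err:
    print("resize refused:", err)

# Skipping the reference check lets the reallocation go through; this is
# only safe because nothing else is expected to use the old buffer.
jacobian_matrix.resize((3, 4), refcheck=False)
print(jacobian_matrix.shape)  # (3, 4)
```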
x_pos = origin_val + delta
tensor_to_check.set_float_element(i, x_pos)
concat_flatten_output(i, x_pos_jacobian)
if in_place:
I'm not sure about this line. Review by someone else is needed.
Me too; it is just kept consistent with the original logic.
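For context, the quoted code is the perturbation step of a numerical Jacobian: one element is bumped by delta and the flattened outputs are recorded. Below is a hedged, pure-numpy sketch of the same idea using a central difference; numeric_jacobian and f are illustrative names, not the PR's API.

```python
import numpy as np

def numeric_jacobian(f, x, delta=1e-3):
    """Central-difference Jacobian of f at x (illustrative helper).

    f maps a 1-D numpy array to a 1-D numpy array; each input element is
    perturbed in turn (x_pos = origin_val + delta) and the outputs are
    flattened into one row of the Jacobian.
    """
    y = f(x)
    jac = np.zeros((x.size, y.size), dtype=np.float64)
    for i in range(x.size):
        origin_val = x[i]
        x[i] = origin_val + delta   # forward perturbation
        y_pos = f(x)
        x[i] = origin_val - delta   # backward perturbation
        y_neg = f(x)
        x[i] = origin_val           # restore the original value
        jac[i, :] = (y_pos - y_neg) / (2.0 * delta)
    return jac

# Example: f(x) = x**2 elementwise, so the diagonal should be 2 * x_i.
x = np.array([1.0, 2.0, 3.0])
print(numeric_jacobian(lambda v: v ** 2, x))
```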
for sub_var_name, _ in sub_vars:
    out_var_tensor = cls.get_tensor_by_name(scope, sub_var_name)
    data = np.ones(out_var_tensor.shape(), dtype=np.float64)
Is np.float64 ok for op using float32?
Since the unit test works well, I think passing np.float64 to an operator that uses float32 is OK.
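A small illustration of the dtype question (not Paddle code): the gradient data here is built as float64, and an explicit cast would be one way to match a float32 operator if that ever became necessary.

```python
import numpy as np

shape = (2, 3)
data = np.ones(shape, dtype=np.float64)   # what the quoted line builds
data_f32 = data.astype(np.float32)        # explicit cast for a float32 op
print(data.dtype, data_f32.dtype)         # float64 float32
```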
        place=place)
else:
    out_var_tensor = cls.get_tensor_by_name(scope, out_var_name)
    data = np.ones(out_var_tensor.shape(), np.float64)
Is np.float64 ok for op using float32?
for i in xrange(len(out_var_shapes)):
    accum_size[i] = cls.dim_to_size(out_var_shapes[i][1]) + \
        (accum_size[i - 1] if i > 0 else 0)
    var_shape_idx[out_var_shapes[i][0]] = i
Change out_var_shapes[i][0] to out_var_shapes[i][0].encode('utf-8') to keep consistent with line 264.
I changed tensor_name.encode('utf-8') to tensor_name instead.
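To make the quoted loop easier to follow, here is a self-contained sketch of what it computes: a running sum of the flattened sizes of each output, so output i occupies the slice [accum_size[i-1], accum_size[i]) of the concatenated output vector. out_var_shapes and the dim_to_size behavior are mimicked with illustrative values.

```python
import numpy as np

# Assumed shape of out_var_shapes: a list of (name, shape) pairs.
out_var_shapes = [("out0", [2, 3]), ("out1", [4]), ("out2", [2, 2])]

accum_size = np.zeros(len(out_var_shapes), dtype=int)
var_shape_idx = {}
for i, (name, shape) in enumerate(out_var_shapes):
    size = int(np.prod(shape))                       # stand-in for cls.dim_to_size(shape)
    accum_size[i] = size + (accum_size[i - 1] if i > 0 else 0)
    var_shape_idx[name] = i

print(accum_size)     # [ 6 10 14]
print(var_shape_idx)  # {'out0': 0, 'out1': 1, 'out2': 2}
```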
out_grad_values = np.zeros(accum_size[-1], dtype=np.float64)
x_grad_jacobian = None

backward_op = cls.get_backward_op(scope, op, no_grad_set)
move it after line 301
I think reusing backward_op is more efficient.
Then move lines 300-301 before this line; no_grad_set should be prepared before cls.get_backward_op is called.
Thanks, you are right; I have followed the comments.
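A minimal, runnable sketch of the ordering agreed on in this thread: prepare no_grad_set first, then build the backward op once and reuse it for every input being checked. make_no_grad_set, build_backward_op, and check_one_input are illustrative stand-ins, not the PR's real helpers.

```python
def make_no_grad_set(inputs_to_skip):
    # Gradients of these inputs are excluded from the check.
    return {name + "@GRAD" for name in inputs_to_skip}

def build_backward_op(no_grad_set):
    # Placeholder for core.Operator.backward(op, no_grad_set).
    return {"type": "mul_grad", "no_grad": sorted(no_grad_set)}

def check_one_input(backward_op, input_name):
    print("checking", input_name, "with", backward_op["type"])

# Prepare no_grad_set before building the backward op, then reuse the op.
no_grad_set = make_no_grad_set(["Bias"])
backward_op = build_backward_op(no_grad_set)
for input_to_check in ["X", "W"]:
    check_one_input(backward_op, input_to_check)
```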
pkuyym
left a comment
Thanks; I have followed the comments.
pengli09
left a comment
All LGTM
Thank you for contributing code to PaddlePaddle. Since Paddle V1/V2 is no longer maintained and the related code has been removed from the develop branch, this PR is being closed. You are welcome to contribute code to the latest version of Paddle, Fluid.
Initial version; needs refinement.
fix #4282
fix #4134