Convolution operator #4042
Conversation
```python
from paddle.v2.framework.op import Operator
...
class TestConv2dOp(unittest.TestCase):
```
The operator Python test framework has been refactored; please merge the develop branch and update the unit test accordingly.
paddle/operators/gemm_conv_op.h
Outdated
```cpp
using Tensor = framework::Tensor;
...
template <typename Place, typename T>
class GemmConvKernel : public framework::OpKernel {
```
We'll write the 3D convolution later. Should we distinguish the names? GemmConvKernel -> GemmConv2DKernel, GemmConvGradKernel -> GemmConv2DGradKernel, gemm_conv_op.h -> gemm_conv2d_op.h, conv_op.cu -> conv2d_op.cu.
```cpp
namespace paddle {
namespace operators {
...
int outputSize(int input_size, int filter_size, int padding, int stride) {
```
This function is also used in conv3d, pooling2d, and pooling3d. Should it be defined in one shared place?
I think this can be fixed in the next PR. At present, it is not clear where this function would best live.
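For reference, `outputSize` presumably implements the standard convolution output-size formula. A minimal Python sketch of that formula (the name `output_size` is just for illustration, not the actual code):

```python
def output_size(input_size, filter_size, padding, stride):
    """Standard convolution output-size formula:
    out = (in + 2 * padding - filter) / stride + 1 (integer division)."""
    return (input_size + 2 * padding - filter_size) // stride + 1

# e.g. a 5-wide input with a 3-wide filter, no padding, stride 1 -> 3
```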
paddle/operators/conv_op.cc
Outdated
```cpp
REGISTER_OP(conv2d, ops::Conv2DOp, ops::Conv2DOpMaker, conv2d_grad,
            ops::Conv2DOpGrad);
...
REGISTER_OP_CPU_KERNEL(conv2d,
```
The current build system requires the filename to match the registered operator name; maybe rename both to conv or conv2d.
```cpp
  }
};
...
class Conv2DOpMaker : public framework::OpProtoAndCheckerMaker {
```
Can we put Conv2DOpMaker and the CPU implementation in a base class like ConvBase, so that the CUDA GEMM implementation and the cuDNN implementation can reuse the code?

I don't think we need to write a separate Conv2DOpMaker for CudnnConv; CudnnConv can also use the Conv2DOpMaker class.
paddle/operators/gemm_conv_op.h
Outdated
```cpp
auto t1 = framework::EigenVector<T>::Flatten(filter_grad);
t1.device(context.GetEigenDevice<Place>()) = t1.constant(static_cast<T>(0));
auto t2 = framework::EigenVector<T>::Flatten(*input_grad);
t2.device(context.GetEigenDevice<Place>()) = t2.constant(static_cast<T>(0));
```
Should the gradient really be cleared here? The weights may be shared between different ops.

> The weights may be shared between different ops.

If the weights are shared, the framework is responsible for merging the two parts of the gradients. The gradient tensor in other ops is also cleared, e.g.:
https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/lookup_table_op.h#L60
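To make the zero-then-accumulate point concrete, here is a hedged NumPy sketch (function name and shapes are hypothetical, not the kernel's actual code). The kernel above accumulates per-group GEMM results into `filter_grad` with beta = 1 (`C = A @ B + C`), so the buffer must be cleared first; starting from stale memory would corrupt the result:

```python
import numpy as np

def accumulate_filter_grad(out_grad_slices, col_slices, filter_grad):
    # Mirrors the Eigen code above: t1 = t1.constant(0) before accumulation.
    filter_grad[...] = 0.0
    for og, col in zip(out_grad_slices, col_slices):
        # beta = 1 accumulation, one GEMM per group / mini-batch step
        filter_grad += og @ col.T
    return filter_grad
```

If the output buffer were not zeroed, each call would add onto whatever values the freshly allocated tensor happened to contain.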
paddle/operators/conv_op.cc
Outdated
```cpp
int input_channels = in->dims()[1];
int output_channels = filter->dims()[0];
...
PADDLE_ENFORCE_EQ(in->dims().size(), 4, "Conv2DOp input should be 4-D.");
```
paddle/operators/conv_op.cc
Outdated
```cpp
void InferShape(const framework::InferShapeContext &ctx) const override {
  auto in = ctx.Input<Tensor>("Input");
  auto filter = ctx.Input<Tensor>("Filter");
  auto out = ctx.Output<Tensor>("Output");
```
Output<Tensor> -> Output<LoDTensor>
```cpp
)DOC");
AddAttr<std::vector<int>>("strides", "strides of convolution operator.");
AddAttr<std::vector<int>>("paddings", "paddings of convolution operator.");
AddAttr<int>(
```
Put the attr declarations before the doc.
paddle/operators/conv_op.cc
Outdated
```cpp
auto in = ctx.Input<Tensor>("Input");
auto filter = ctx.Input<Tensor>("Filter");
auto d_in = ctx.Output<Tensor>(framework::GradVarName("Input"));
auto d_filter = ctx.Output<Tensor>(framework::GradVarName("Filter"));
```
Output< framework::LoDTensor>
```cpp
"when group=2, the first half of the filters are only connected to the "
"first half of the input channels, and the second half only connected "
"to the second half.")
    .SetDefault(1);
```
Put the AddAttr calls before AddComment(R"DOC ... )DOC").
paddle/operators/gemm_conv2d_op.h
Outdated
```cpp
context.Input<Tensor>(framework::GradVarName("Output"));
Tensor* input_grad =
    context.Output<Tensor>(framework::GradVarName("Input"));
Tensor* filter_grad_ =
```
filter_grad_ -> filter_grad

That won't work; a filter_grad variable is defined later.
paddle/operators/gemm_conv2d_op.h
Outdated
```cpp
auto t1 = framework::EigenVector<T>::Flatten(filter_grad);
t1.device(context.GetEigenDevice<Place>()) = t1.constant(static_cast<T>(0));
auto t2 = framework::EigenVector<T>::Flatten(*input_grad);
t2.device(context.GetEigenDevice<Place>()) = t2.constant(static_cast<T>(0));
```
Why do we need to zero the memory for input_grad?

I remember discussing this problem with you; your suggestion was that the operator should perform the = operation instead of the += operation.
paddle/operators/gemm_conv2d_op.h
Outdated
```cpp
Tensor filter_grad_slice =
    filter_grad.Slice<T>(g * out_step, (g + 1) * out_step);
math::matmul<Place, T>(out_grad_slice, false, col_matrix, true, T(1.0),
                       &filter_grad_slice, T(1.0), device_context);
```
2. Should the computation of `gradient w.r.t. weight` and `gradient w.r.t. input data` be factored out into two functions?
3. We need to consider the case where `gradient w.r.t. weight` or `gradient w.r.t. input data` is not computed, similar to mul_op: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/mul_op.h#L76

> - The backward computation of `gradient w.r.t. input data` differs from the usual implementations (old Paddle, Caffe). Should this be explained?

What does this difference refer to?

> - Should the computation of `gradient w.r.t. weight` and `gradient w.r.t. input data` be factored out into two functions?

For now, factoring out two functions does not seem useful; Caffe2 does not factor them out either.

> What does this difference refer to?

I misread the comment above; please ignore it :)
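To illustrate the point about skipping unneeded gradients (item 3 above), here is a hedged NumPy sketch of a naive conv2d backward pass that, like mul_op, computes each gradient only when the caller asks for it. Names, layout (NCHW), and the omission of groups are illustrative assumptions, not the actual kernel:

```python
import numpy as np

def conv2d_backward_naive(input, filter, out_grad, stride, padding,
                          need_input_grad=True, need_filter_grad=True):
    # input: (N, C, H, W); filter: (M, C, KH, KW); out_grad: (N, M, OH, OW)
    N, C, H, W = input.shape
    M, _, KH, KW = filter.shape
    sh, sw = stride
    ph, pw = padding
    padded = np.pad(input, ((0, 0), (0, 0), (ph, ph), (pw, pw)))
    d_padded = np.zeros_like(padded) if need_input_grad else None
    d_filter = np.zeros_like(filter) if need_filter_grad else None
    _, _, out_h, out_w = out_grad.shape
    for n in range(N):
        for m in range(M):
            for i in range(out_h):
                for j in range(out_w):
                    g = out_grad[n, m, i, j]
                    hs, ws = i * sh, j * sw
                    if need_filter_grad:
                        # d(filter) accumulates input patches weighted by g
                        d_filter[m] += g * padded[n, :, hs:hs+KH, ws:ws+KW]
                    if need_input_grad:
                        # d(input) scatters the filter weighted by g
                        d_padded[n, :, hs:hs+KH, ws:ws+KW] += g * filter[m]
    d_input = None
    if need_input_grad:
        d_input = d_padded[:, :, ph:H + ph, pw:W + pw]  # strip the padding
    return d_input, d_filter
```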
```python
...frowid][fcolid]
output_value += input_value * filter_value
output[batchid][outchannelid][rowid][colid] = output_value
```
We can try that, but it doesn't seem to support groups.

np.convolve cannot be used: it 'Returns the discrete, linear convolution of two one-dimensional sequences.'
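Since np.convolve handles only 1-D sequences, the unit test needs a hand-written reference like the following NumPy sketch, which supports groups. The function name and NCHW layout are assumptions based on the pseudocode above, not the PR's actual test code:

```python
import numpy as np

def conv2d_naive(input, filter, stride, padding, groups=1):
    # input: (N, C, H, W); filter: (M, C // groups, KH, KW)
    N, C, H, W = input.shape
    M, _, KH, KW = filter.shape
    sh, sw = stride
    ph, pw = padding
    out_h = (H + 2 * ph - KH) // sh + 1
    out_w = (W + 2 * pw - KW) // sw + 1
    padded = np.pad(input, ((0, 0), (0, 0), (ph, ph), (pw, pw)))
    out = np.zeros((N, M, out_h, out_w), dtype=input.dtype)
    c_per_g = C // groups   # input channels per group
    m_per_g = M // groups   # output channels per group
    for n in range(N):
        for m in range(M):
            g = m // m_per_g  # which group this output channel belongs to
            for i in range(out_h):
                for j in range(out_w):
                    patch = padded[n, g * c_per_g:(g + 1) * c_per_g,
                                   i * sh:i * sh + KH, j * sw:j * sw + KW]
                    out[n, m, i, j] = np.sum(patch * filter[m])
    return out
```

This is slow but easy to audit, which is what matters for a reference implementation in a unit test.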
```python
        self.check_output()

    def test_check_grad(self):
        self.check_grad(set(['Input', 'Filter']), 'Output')
```
We need a check like mul_op's, covering the case where Input or Filter does not need a gradient: https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/framework/tests/test_mul_op.py#L49
Also, should the convolution operator consider adding a bias?

Paddle's ...

For now we won't support bias, multiple inputs, activation, etc. in GemmConv2DKernel. The CudnnConv2DKernel, like GemmConv2DKernel, will not support bias, multiple inputs, or activation either. Later we can consider a higher-level OpConvLayer to support bias and multiple inputs.
|
fix #3691 |