MKLDNN conv2d kernel added #8451


Merged

luotao1 merged 4 commits into PaddlePaddle:develop from pzelazko-intel:fluid-mkldnn

Mar 7, 2018

Contributor

pzelazko-intel commented Feb 15, 2018

MKLDNN conv2d and pool2d OP kernels can be enabled with the use_mkldnn OP flag, just like the currently present use_cudnn flag. It's set to True by default. The use_cudnn flag has higher priority.

Besides unit tests, we validated these kernels by running training and inference on the MNIST dataset and comparing the results with the Caffe library.
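The flag priority described above can be sketched roughly as follows. This is only an illustration, not the code from this PR: `LibraryType`, `ChooseLibrary`, and the `on_gpu` parameter are hypothetical names, and the sketch assumes cuDNN is only applicable on a GPU place while MKLDNN is only applicable on a CPU place.

```cpp
// Hypothetical enum for illustration; not Paddle's actual definition.
enum class LibraryType { kPlain, kMKLDNN, kCUDNN };

// Picks a kernel library from the two OP flags.
// Assumption: use_cudnn wins whenever it is applicable (GPU place),
// and use_mkldnn is only considered on a CPU place.
inline LibraryType ChooseLibrary(bool use_cudnn, bool use_mkldnn, bool on_gpu) {
  if (use_cudnn && on_gpu) return LibraryType::kCUDNN;
  if (use_mkldnn && !on_gpu) return LibraryType::kMKLDNN;
  return LibraryType::kPlain;  // fall back to the plain kernel
}
```

With both flags set, `ChooseLibrary(true, true, true)` would pick the cuDNN kernel, while the same flags on a CPU place would pick MKLDNN.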

CLAassistant commented Feb 15, 2018

All committers have signed the CLA.

pzelazko-intel force-pushed the fluid-mkldnn branch 5 times, most recently from 721cf36 to 6216666 Compare

February 20, 2018 17:23

pzelazko-intel requested a review from tensor-tang

February 22, 2018 09:56

luotao1 added the Intel label

Contributor

luotao1 commented Feb 26, 2018

Can you divide this PR into three small PRs?

typo fix (TransFromNeeded -> TransformNeeded) and MKLDNNDeviceContext changes: e0531da and 7a358fa
MKLDNN conv2d OP kernels and unit test added
MKLDNN pool2d OP kernels and unit test added

jacquesqiao reviewed

paddle/fluid/framework/op_registry.h Outdated

Member

jacquesqiao Feb 26, 2018

The macro should not add ; at the end. We want users to write the semicolon themselves when they use a macro, like

SOME_MACRO();
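A minimal sketch of the convention being asked for, under assumed names: `KernelRegistry`, `KernelRegistrar`, and the argument passed to `SOME_MACRO` are hypothetical and not taken from op_registry.h. The macro body ends without a semicolon, so the call site supplies it and reads like an ordinary statement.

```cpp
#include <string>
#include <vector>

// Hypothetical registry used only for this illustration.
inline std::vector<std::string>& KernelRegistry() {
  static std::vector<std::string> registry;
  return registry;
}

// Registers a name when a static instance is constructed.
struct KernelRegistrar {
  explicit KernelRegistrar(const std::string& name) {
    KernelRegistry().push_back(name);
  }
};

// The macro body deliberately ends WITHOUT a semicolon,
// so the user writes: SOME_MACRO(conv2d);
#define SOME_MACRO(name) \
  static KernelRegistrar registrar_##name(#name)

SOME_MACRO(conv2d);
SOME_MACRO(pool2d);
```

If the macro itself ended in a semicolon, `SOME_MACRO(conv2d);` would leave a stray empty declaration behind, which some compilers warn about in pedantic mode.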

luotao1 mentioned this pull request

pybind USE_OP_DEVICE_KERNEL(XXX, CUDNN) automatically #8590

Closed

luotao1 reviewed

paddle/fluid/operators/CMakeLists.txt Outdated

Contributor

luotao1 Feb 26, 2018

We will pybind USE_OP_DEVICE_KERNEL(XXX, CUDNN) automatically in #8590, in order to make operators/CMakeLists.txt much cleaner.

Then a single line, op_library(pool_op DEPS pooling), will pybind the kernels for all devices: CPU/CUDA/CUDNN/MKLDNN.

Thus:

How about using conv_mkldnn_op.cc, like conv_cudnn_op.cu.cc, instead of conv_op.mn.cc?
Could you refine this code after #8590 (pybind USE_OP_DEVICE_KERNEL(XXX, CUDNN) automatically) is finished?

Contributor Author

pzelazko-intel Feb 26, 2018

@luotao1 Do you want my changes to be merged after #8590 is finished?

Contributor

luotao1 Feb 26, 2018

If #8590 isn't finished before your small PRs, you can merge your changes first.

Contributor

luotao1 Feb 27, 2018

#8590 is finished and merged now.

tensor-tang reviewed

Contributor

tensor-tang left a comment

First of all, as synced with @pzelazko-intel, we will break this PR into some smaller ones.

As for the current code, we also had a discussion.

The most important point is that the current implementation may not be the most efficient one, since the format is fixed as nchw and the transform functions are still under development.

If anything is missing, @pzelazko-intel please point out.

paddle/fluid/operators/conv_op.cc Outdated

Contributor

tensor-tang Mar 1, 2018

Here we only enable the MKLDNN library.

As synced with @pzelazko-intel, we will enable the MKLDNN layout next time.
Then we will consider the transform function as well.

Contributor

luotao1 Mar 2, 2018

Could you add "TODO" comments in the code as reminders? Including:

enable MKLDNN layout
enable groups
something more.

Besides, could you avoid overwriting the previous commits next time? After your update we cannot see what changed.

pzelazko Mar 2, 2018

OK, I'm going to add TODOs.
I overwrote the previous commits because I wanted the commit history to be clean.
Will I have the opportunity to squash commits when merging?

Contributor

luotao1 Mar 2, 2018

Yes, we can squash commits when merging your PR.

paddle/fluid/operators/conv_op.mn.cc Outdated

Contributor

tensor-tang Mar 1, 2018

Will enable groups later.

Contributor Author

pzelazko-intel Mar 6, 2018

done

paddle/fluid/operators/conv_op.mn.cc Outdated

Contributor

tensor-tang Mar 1, 2018

The format is fixed as nchw, should support more. We will come back later.

paddle/fluid/operators/conv_op.mn.cc Outdated

Contributor

tensor-tang Mar 1, 2018

This Compute function is too long. We can think about breaking the code into smaller functions in mkldnn_helper, like cudnn_helper did.

Contributor Author

pzelazko-intel Mar 1, 2018

Please look at the answer below.

Contributor Author

pzelazko-intel Mar 6, 2018

done

paddle/fluid/operators/conv_op.mn.cc Outdated

Contributor

tensor-tang Mar 1, 2018

Only conv_pd is saved to context, I think we can save more, like engine, primitives, stream, etc.

Contributor Author

pzelazko-intel Mar 1, 2018

I'm going to refactor it as soon as we know how to handle data transferring between forward and backward in parallel mode (ParallelDo OP).

Contributor Author

pzelazko-intel Mar 6, 2018

refactoring is done
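The idea of saving objects such as conv_pd in the device context so that a later pass can reuse them can be sketched as a keyed cache. This is a simplified illustration under assumptions: `BlobCache`, `GetAs`, and `ConvPrimitiveDesc` are hypothetical stand-ins, the key is an arbitrary string, and the sketch ignores the thread-safety questions raised about ParallelDo.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a device context that caches objects by key.
class BlobCache {
 public:
  // Stores (or overwrites) a type-erased object under the given key.
  void SetBlob(const std::string& key, std::shared_ptr<void> blob) {
    blobs_[key] = std::move(blob);
  }

  // Returns the stored object, or nullptr when the key is unknown.
  std::shared_ptr<void> GetBlob(const std::string& key) const {
    auto it = blobs_.find(key);
    if (it == blobs_.end()) return nullptr;
    return it->second;
  }

 private:
  std::unordered_map<std::string, std::shared_ptr<void>> blobs_;
};

// Placeholder for whatever the forward kernel wants to hand to backward.
struct ConvPrimitiveDesc {
  int groups;
};

// Convenience accessor that restores the concrete type.
// The caller must request the same type that was stored.
template <typename T>
std::shared_ptr<T> GetAs(const BlobCache& cache, const std::string& key) {
  return std::static_pointer_cast<T>(cache.GetBlob(key));
}
```

A forward kernel would call `SetBlob` with a key it can rebuild later, and the backward kernel would call `GetAs<ConvPrimitiveDesc>` with the same key. The scheme only works if the key is unique per operator instance.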

paddle/fluid/operators/conv_op.mn.cc Outdated

Contributor

tensor-tang Mar 1, 2018

I am not sure whether this output name is unique, especially under the scope field.
I had a discussion with @QiJune before and did not get a formal conclusion at that time.
Maybe Baidu friends can give a better answer. This is just my concern and reminder.

Contributor

luotao1 Mar 2, 2018

I see that conv_cudnn_op.cu.cc also uses Output.

Paddle/paddle/fluid/operators/conv_cudnn_op.cu.cc

Lines 35 to 42 in fc37482

    
    class CUDNNConvOpKernel : public framework::OpKernel<T> {
     public:
      void Compute(const framework::ExecutionContext& ctx) const override {
        PADDLE_ENFORCE(platform::is_gpu_place(ctx.GetPlace()),
                       "It must use CUDAPlace.");
        auto* input = ctx.Input<Tensor>("Input");
        auto* filter = ctx.Input<Tensor>("Filter");
        auto* output = ctx.Output<Tensor>("Output");

So why shouldn't conv_mkldnn_op.cc use the same output name?

python/paddle/v2/fluid/tests/unittests/test_pool2d_op.py Outdated

Contributor

tensor-tang Mar 1, 2018

From my perspective, it's not appropriate for us to remove this TODO.
I suggest we keep it and let the owner fix it, since we may be missing some background information.

Contributor Author

pzelazko-intel Mar 1, 2018

I did not remove this line, it's been moved up.

pzelazko-intel force-pushed the fluid-mkldnn branch 2 times, most recently from 7882699 to edcf89a Compare

March 1, 2018 12:21

pzelazko-intel changed the title ~~MKLDNN conv2d and pool2d OP kernels added~~ MKLDNN conv2d kernel added

Contributor Author

pzelazko-intel commented Mar 1, 2018

Now in this PR I'm introducing only conv2d OP MKLDNN kernel.
After this PR is accepted, I'll create a new one for pool2d OP.

pzelazko-intel force-pushed the fluid-mkldnn branch from edcf89a to 0cd6662 Compare

March 1, 2018 13:38

Contributor

luotao1 commented Mar 2, 2018

LGTM on following files:

paddle/fluid/operators/CMakeLists.txt
python/paddle/fluid/layers/nn.py
python/paddle/fluid/nets.py
python/paddle/fluid/tests/unittests/test_conv2d_op.py

@jacquesqiao Can you help review following files:

paddle/fluid/framework/operator.cc
paddle/fluid/framework/operator.h

@QiJune Can you help review following files:

paddle/fluid/platform/device_context.cc
paddle/fluid/platform/device_context.h

@tensor-tang Can you help review following files:

paddle/fluid/operators/conv_mkldnn_op.cc
paddle/fluid/operators/conv_op.cc

pzelazko-intel force-pushed the fluid-mkldnn branch 2 times, most recently from 9a3ecb8 to a4ab82d Compare

March 4, 2018 10:23

tensor-tang reviewed

Contributor

tensor-tang left a comment

My ARs for conv_mkldnn_op.cc and conv_op.cc

Just a reminder, as discussed before, the Compute functions are too large. Please do not forget it, since it's pretty important.

paddle/fluid/operators/conv_mkldnn_op.cc Outdated

Contributor

tensor-tang Mar 5, 2018

Dilation is also supported with MKLDNN conv, you can add one more TODO later.

Contributor Author

pzelazko-intel Mar 6, 2018

Done

paddle/fluid/operators/conv_mkldnn_op.cc Outdated

Contributor

tensor-tang Mar 5, 2018

MKLDNN doesn't support dilation in convolution yet

I think this error message is not clear enough, since, as we know, MKL-DNN itself supports groups.
This message would mislead the Paddle team and users.
The point is that we have not enabled it in Paddle yet.

The same applies to the dilations message below.

Contributor Author

pzelazko-intel Mar 6, 2018

done
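A sketch of the kind of wording requested in this thread, with the limitation attributed to the kernel integration rather than to the MKL-DNN library. The function name `ValidateConvAttrs` and the exact messages are hypothetical, and the sketch assumes that "not enabled yet" means groups and every dilation must equal 1.

```cpp
#include <string>
#include <vector>

// Hypothetical validation helper: returns an empty string when the
// attributes are accepted, otherwise a message that makes clear the
// limit is in this kernel integration, not in the MKL-DNN library.
inline std::string ValidateConvAttrs(int groups,
                                     const std::vector<int>& dilations) {
  if (groups != 1) {
    return "groups is not yet supported by this MKLDNN conv2d kernel";
  }
  for (int d : dilations) {
    if (d != 1) {
      return "dilation is not yet supported by this MKLDNN conv2d kernel";
    }
  }
  return "";
}
```

Returning the message instead of throwing keeps the sketch independent of any particular enforcement macro.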

pzelazko-intel added 2 commits

March 5, 2018 18:24


          MKLDNN conv2 OP kernel added

d7d9b25


          TODOs added

06ca8d2

pzelazko-intel force-pushed the fluid-mkldnn branch from a4ab82d to 8090479 Compare

March 5, 2018 17:29


          mkldnn conv2d OP refactor

0e3f110

pzelazko-intel force-pushed the fluid-mkldnn branch from 8090479 to 0e3f110 Compare

March 5, 2018 20:56

jacquesqiao reviewed

paddle/fluid/framework/operator.cc Outdated

    
      return static_cast<proto::VarType::Type>(data_type);
    }

    bool OperatorWithKernel::CanCUDNNBeUsed(const ExecutionContext& ctx) const {

Member

jacquesqiao Mar 6, 2018

I think it's better to make CanCUDNNBeUsed and CanMKLDNNBeUsed two global functions.

Contributor Author

pzelazko-intel Mar 6, 2018

@jacquesqiao where would you propose to place these functions?

Member

jacquesqiao Mar 6, 2018

@pzelazko-intel according to this document https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/mkl/mkldnn_fluid.md#mkldnn_helper, we can put the interface there and add a mkldnn_helper.cc.

Contributor Author

pzelazko-intel Mar 6, 2018

done
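The suggestion to turn the two member functions into free functions might look roughly like this. It is a sketch under assumptions: `FakeExecutionContext` is a hypothetical stand-in for the real execution context, and the checks on the place and the flag are guesses at what such helpers would test, not the code that was merged.

```cpp
#include <map>
#include <string>

// Hypothetical, heavily simplified stand-in for an execution context.
struct FakeExecutionContext {
  std::map<std::string, bool> attrs;
  bool on_gpu = false;

  // Returns the boolean attribute, or false when it is not set.
  bool Attr(const std::string& name) const {
    auto it = attrs.find(name);
    return it != attrs.end() && it->second;
  }
};

// Free functions: they need only the context, not the operator object,
// so they can live in a helper file next to the library-specific code.
inline bool CanCUDNNBeUsed(const FakeExecutionContext& ctx) {
  return ctx.on_gpu && ctx.Attr("use_cudnn");
}

inline bool CanMKLDNNBeUsed(const FakeExecutionContext& ctx) {
  return !ctx.on_gpu && ctx.Attr("use_mkldnn");
}
```

Because neither helper touches operator state, any operator can call them without inheriting anything extra from OperatorWithKernel.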


          CanCUDNNBeUsed and CanMKLDNNBeUsed moved

dad7a02

pzelazko-intel force-pushed the fluid-mkldnn branch from fb95caa to dad7a02 Compare

March 6, 2018 11:50

Contributor

luotao1 commented Mar 6, 2018

Please answer reviewers’ every comment. If you are to follow the comment, please write “Done”; please give a reason otherwise. See code-review.

Contributor Author

pzelazko-intel commented Mar 6, 2018

@luotao1 refactoring has been completed.
Also, I've added "done" comments where it's appropriate.

Member

QiJune commented Mar 6, 2018

@luotao1 DeviceContext part looks good to me.

Member

jacquesqiao commented Mar 6, 2018

framework part looks good to me, thanks! @pzelazko-intel

pzelazko-intel mentioned this pull request

ParallelDo with MKLDNN #8806

Closed

tensor-tang reviewed

paddle/fluid/platform/device_context.cc

    
          device_contexts_.emplace(places[i],
                                   new platform::CPUDeviceContext(
                                       boost::get<platform::CPUPlace>(places[i])));
    #endif

Contributor

tensor-tang Mar 7, 2018

Honestly, I have a small question here.
When PADDLE_WITH_MKLDNN is enabled, we will not have platform::CPUDeviceContext at all.
Is that fine with you, @jacquesqiao @QiJune?

Member

QiJune Mar 7, 2018

@tensor-tang It seems that if PADDLE_WITH_GPU is enabled, there will be both CPUDeviceContext and CUDADeviceContext.
So I think MKLDNNDeviceContext and CPUDeviceContext should coexist.

Could you provide an example with two FC layers in the network, where one FC runs on CPU and the other on MKLDNN? Just an MNIST demo will be fine.

Then we can see whether "MKLDNN" is compatible with "CPU".

Contributor

tensor-tang Mar 7, 2018

Firstly, MKLDNNDeviceContext now inherits from CPUDeviceContext:
https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/platform/device_context.h#L113. So functionally, it can pass the CI on this version.

But from my perspective, CPUDeviceContext should always be available no matter which third-party library is added, not only MKLDNN, since we cannot guarantee that all the ops Paddle supports are also supported by the third-party library. If the third-party context did not inherit from the CPU context, it would be a problem.

So I just want to hear your opinion. I am not sure whether this is proper.

Member

QiJune Mar 7, 2018

Yes, I think so.
Since MKLDNNDeviceContext inherits from CPUDeviceContext, there will be no problem.
I have not thought of a better option, so we can move on for now.

Contributor

tensor-tang Mar 7, 2018

OK, Thx
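Why the inheritance settles the concern in this thread can be shown with a small sketch. The two classes below are simplified stand-ins that only borrow the names from the discussion, and `RunCpuKernel` and `Name` are hypothetical. Code written against the CPU context accepts the MKLDNN context unchanged, so ops without an MKLDNN kernel can still run.

```cpp
#include <string>

// Simplified stand-in for the plain CPU device context.
class CPUDeviceContext {
 public:
  virtual ~CPUDeviceContext() = default;
  virtual std::string Name() const { return "CPU"; }
};

// Simplified stand-in: the MKLDNN context IS-A CPU context.
class MKLDNNDeviceContext : public CPUDeviceContext {
 public:
  std::string Name() const override { return "MKLDNN"; }
};

// A kernel written only against the plain CPU context.
inline std::string RunCpuKernel(const CPUDeviceContext& ctx) {
  return "cpu kernel on " + ctx.Name();
}
```

Passing an `MKLDNNDeviceContext` to `RunCpuKernel` compiles and runs because of the base-class relationship. A third-party context that did not derive from the CPU context would need its own fallback path.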

tensor-tang reviewed

Contributor

tensor-tang left a comment

LGTM for MKLDNN part

luotao1 approved these changes

Contributor

luotao1 left a comment

Thanks to @pzelazko-intel for the work, and thanks to @jacquesqiao, @QiJune and @tensor-tang for the review.

luotao1 merged commit 8c71ada into PaddlePaddle:develop

pzelazko-intel deleted the fluid-mkldnn branch

March 8, 2018 09:43

luotao1 mentioned this pull request

MKLDNN Relu Tanh Sqrt Abs activations added #9081

Merged

luotao1 mentioned this pull request

implement conv2d mkldnn kernel #7365

Closed

Reviewers

jacquesqiao left review comments

tensor-tang left review comments

QiJune left review comments

luotao1 approved these changes

pzelazko left review comments