
Add LeakyRelu MKLDNN support#18656

Merged
luotao1 merged 1 commit into PaddlePaddle:develop from grygielski:grygielski/leaky_relu_mkldnn
Jul 19, 2019

Conversation

@grygielski (Contributor)

LeakyRelu MKLDNN support, along with unit tests, has been added.

@CLAassistant commented Jul 16, 2019

CLA assistant check
All committers have signed the CLA.

@jczaja jczaja requested review from Sand3r- and jczaja July 16, 2019 12:49
@luotao1 luotao1 added the Intel label Jul 16, 2019
@grygielski grygielski force-pushed the grygielski/leaky_relu_mkldnn branch 2 times, most recently from 123e958 to d5aad2a, on July 16, 2019 13:34
@bingyanghuang (Contributor)

This PR is to fix #17654

@grygielski grygielski force-pushed the grygielski/leaky_relu_mkldnn branch from d5aad2a to 2d47bd7 on July 17, 2019 08:02
@jczaja (Contributor) left a comment

LGTM

// input1: X
op->SetInput("X", Input("X"));
// input2: Out
op->SetInput("Out", Input("Out"));
Contributor

Why add an Input here?

@grygielski (Contributor, Author) commented Jul 18, 2019

It's because the MKLDNN activation grad functor expects an input called "Out" to create its hash key for caching. Without this change, the activation grad test crashed.
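For context, the caching being discussed builds a string key from tensor metadata so a previously created MKL-DNN primitive can be looked up and reused. A minimal sketch of the idea, assuming a hypothetical CreateActivationGradKey helper (the name and key layout are illustrative, not Paddle's actual scheme):

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch: build a cache key for an activation grad
// primitive from the op type and the dimensions of its "Out" input.
// If the grad op had no "Out" input, there would be no tensor dims
// here to key the cached primitive on, which is why the functor
// crashed without it.
std::string CreateActivationGradKey(const std::string& op_type,
                                    const std::vector<int>& out_dims) {
  std::ostringstream key;
  key << op_type << "_grad";
  for (int d : out_dims) key << "_" << d;
  return key.str();
}
```

For example, `CreateActivationGradKey("leaky_relu", {8, 3, 224, 224})` would produce `"leaky_relu_grad_8_3_224_224"`.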

Contributor

> create hash key for caching

Why should we add this input only for creating a hash key, since this op doesn't actually need it?

@grygielski (Contributor, Author)

Well, I did it because it felt like the fastest way to get around this error, but you are right: this Input is unnecessary, so I will change the way the MKLDNN key is generated.

Contributor

Thanks very much! If we change the input, then many previously trained models will fail after updating the develop branch.

Contributor

@luotao1 Could you please explain why a trained model would fail when we add an additional input to the grad op? This is just the leaky_relu grad op, so it does not hold any params; why would extending the inputs of a grad op cause failure? I'm asking because, as described in #17960, I would like to extend all MKL-DNN grad ops to take X as an input, since MKL-DNN backward implementations require X to compute backward/grad.

Contributor

@phlrain Could you help look into this problem?

Contributor

@jczaja @grygielski
Discussed with @phlrain: it's OK to add an additional input to the grad op, as this PR does. But we should not call AddInput() in OpMaker().

@grygielski (Contributor, Author) commented Jul 18, 2019

OK, so my solution is acceptable? Because during key generation we would otherwise have to add extra cases to check whether a specific operator's grad has only an X input, which causes additional problems when adding MKLDNN support for further activations.

Contributor

Yes, it is.

@luotao1 luotao1 requested a review from phlrain July 18, 2019 14:50
@luotao1 (Contributor) commented Jul 18, 2019

Do you have any benchmark results for this MKLDNN op on the face model in #17654?

@jczaja (Contributor) commented Jul 18, 2019

@luotao1 When we wanted to test performance, we noticed that the face model no longer works for us on develop (#18658). So until the demark model is fixed, we cannot test the performance of leaky_relu with MKL-DNN.

@luotao1 (Contributor) commented Jul 18, 2019

Got it. I will check it tomorrow.

@luotao1 (Contributor) commented Jul 19, 2019

Thanks very much for your contribution.
I tested it on the demark model: leaky_relu speeds up from 132ms to 38ms on 4 threads, a 3.47x speedup!

  • before:
Event                        Calls       Total       Min.        Max.        Ave.        Ratio.
thread0::conv2d_transpose    400         3794.74     2.86285     17.8366     9.48685     0.768815
thread0::conv2d              400         899.1       0.729642    3.6068      2.24775     0.182158
thread0::leaky_relu          400         132.44      0.085567    1.10562     0.3311      0.0268324
thread0::batch_norm          300         50.5053     0.1174      0.256459    0.168351    0.0102324
thread0::relu                300         39.7742     0.067462    0.242046    0.132581    0.00805825
thread0::fetch               100         11.2578     0.106524    0.123902    0.112578    0.00228083
thread0::tanh                100         7.4732      0.069471    0.090934    0.074732    0.00151407
thread0::feed                100         0.541868    0.004703    0.006246    0.00541868  0.000109783
  • after:
Event                        Calls       Total       Min.        Max.        Ave.        Ratio.
thread0::conv2d_transpose    400         3809.5      2.86721     17.8962     9.52376     0.786597
thread0::conv2d              400         882.729     0.732503    3.49819     2.20682     0.182269
thread0::batch_norm          300         50.5289     0.119364    0.270264    0.16843     0.0104334
thread0::relu                300         42.0752     0.074177    0.234958    0.140251    0.00868782
thread0::leaky_relu          400         38.5553     0.047176    0.194089    0.0963881   0.007961
thread0::fetch               100         10.8161     0.099839    0.117321    0.108161    0.00223335
thread0::tanh                100         8.28066     0.078006    0.108433    0.0828066   0.00170981
thread0::feed                100         0.526761    0.004705    0.006533    0.00526761  0.000108767

@luotao1 luotao1 merged commit d6b6a33 into PaddlePaddle:develop Jul 19, 2019
@luotao1 (Contributor) commented Jul 19, 2019

auto op_list = {"pool2d", "sigmoid", "logsigmoid",
"softshrink", "exp", "brelu",
"pow", "leaky_relu", "stanh",
"relu", "tanh", "tanh_shrink",
"sqrt", "abs", "ceil",
"elu", "floor", "cos",
"sin", "round", "reciprocal",
"hard_shrink", "hard_sigmoid", "relu6",
"soft_relu", "swish", "thresholded_relu",

Do you need to update the op_list in is_test_pass?

@grygielski (Contributor, Author)

I guess leaky_relu is already there.

@grygielski grygielski deleted the grygielski/leaky_relu_mkldnn branch July 19, 2019 07:55
@luotao1 (Contributor) commented Jul 19, 2019

> I guess leaky_relu is already there

But I don't see it. @grygielski @jczaja

@grygielski (Contributor, Author)

What about the 3rd row, 2nd column of the op_list you quoted in your last comment?

@luotao1 (Contributor) commented Jul 19, 2019

Sorry, I missed it.
