[MKL-DNN] Thread-Safety for MKL-DNN reusing Part 1 #17965

Merged
luotao1 merged 8 commits into PaddlePaddle:develop from jczaja:prv-mt-experiments
Jun 11, 2019

@jczaja jczaja commented Jun 10, 2019

This PR addresses the issue reported in #17611 by making the reuse concept thread safe. Currently only
some of the ops have been adapted. Once issue #17960 is sorted out, an additional PR for Relu, pooling and INT8 primitives will be raised.
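
To make the idea of thread-safe reuse concrete, here is a minimal sketch of a cache that hands out shared objects by string key and guards its map with a mutex. It is not the code from this PR: `FakePrimitive`, `ReuseCache`, `GetOrCreate` and `RunConcurrentLookups` are hypothetical names chosen for illustration, and the real MKL-DNN objects are replaced by a trivial struct.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a reusable MKL-DNN object (primitive, memory, ...).
struct FakePrimitive {
  explicit FakePrimitive(std::string k) : key(std::move(k)) {}
  std::string key;
};

// Hypothetical cache: string key -> shared object, guarded by a single mutex.
class ReuseCache {
 public:
  // Returns the cached object for `key`, creating it on first use.
  std::shared_ptr<FakePrimitive> GetOrCreate(const std::string& key) {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = blobs_.find(key);
    if (it != blobs_.end()) return it->second;
    auto created = std::make_shared<FakePrimitive>(key);
    blobs_.emplace(key, created);
    return created;
  }

  // Number of distinct objects currently cached.
  std::size_t Size() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return blobs_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::unordered_map<std::string, std::shared_ptr<FakePrimitive>> blobs_;
};

// Several threads request the same key at once; returns the final cache size.
std::size_t RunConcurrentLookups(ReuseCache& cache, int num_threads) {
  assert(num_threads > 0);
  std::vector<std::thread> workers;
  for (int i = 0; i < num_threads; ++i) {
    workers.emplace_back([&cache] {
      for (int j = 0; j < 1000; ++j) cache.GetOrCreate("conv@fwd");
    });
  }
  for (auto& w : workers) w.join();
  return cache.Size();
}
```

In this sketch the find-then-insert sequence runs under one lock, so concurrent callers asking for the same key all receive the same object and the cache ends up with a single entry.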

@luotao1 I updated this branch a couple of hours ago, so please double-check whether this PR works for you. If ContentDNN does not work, it means that Relu and the other missing ops have to be updated as well. Either way, please share any error messages.

@jczaja jczaja added the Intel label Jun 10, 2019
@jczaja jczaja added this to the v1.5 for Intel milestone Jun 10, 2019
@jczaja jczaja requested review from a user, kbinias, luotao1 and wojtuss June 10, 2019 16:51

ghost commented Jun 10, 2019

@LeoZhao-Intel Please help to review this PR. Thanks!

luotao1 commented Jun 11, 2019

I tested it on the face deployment service, and it works.

@LeoZhao-Intel LeoZhao-Intel left a comment

Basically LGTM.

A few comments for Part 2:

  1. With the thread ID included in the generated key, is the lock still needed?
  2. The current softmax grad op depends on the softmax op in the MKL-DNN implementation, unlike transpose2. Can we remove this tight coupling?
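
The first question can be illustrated with a small sketch. It assumes a key built by appending the calling thread's id to a base string and a single map shared by all threads; `ThreadKey`, `SharedStore` and `PutFromTwoThreads` are hypothetical names, not identifiers from the PR. Under that assumption the keys never collide across live threads, but concurrent inserts into one `std::unordered_map` are still a data race, so the sketch keeps the lock.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <unordered_map>

// Hypothetical helper: appends the calling thread's id to a base key.
std::string ThreadKey(const std::string& base) {
  assert(!base.empty());
  std::ostringstream out;
  out << base << "-t:" << std::this_thread::get_id();
  return out.str();
}

// Returns the key as computed on a freshly started thread.
std::string KeyFromNewThread(const std::string& base) {
  std::string key;
  std::thread worker([&key, &base] { key = ThreadKey(base); });
  worker.join();
  return key;
}

// Hypothetical store shared by all threads. Keys are per-thread, but the
// container is not, so modifications are still serialized with a mutex.
class SharedStore {
 public:
  void Put(const std::string& base, int value) {
    const std::string key = ThreadKey(base);
    std::lock_guard<std::mutex> guard(mutex_);
    values_[key] = value;
  }

  std::size_t Size() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return values_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::unordered_map<std::string, int> values_;
};

// The calling thread and one worker both store under the same base key.
std::size_t PutFromTwoThreads(SharedStore& store) {
  std::thread worker([&store] { store.Put("softmax", 1); });
  store.Put("softmax", 2);  // the calling thread stays alive for the whole call
  worker.join();
  return store.Size();
}
```

A design that gave each thread its own map (for example through `thread_local` storage) could drop the lock; whether that fits the actual implementation is a question for Part 2, not something this sketch settles.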

mkldnn::prop_kind fwd_prop_kind) {
const std::string key_conv_pd = key_ + "@conv_pd";
// Conv PD has to be passed to Grad op that
// may be exxecuted by diffrent thread, hence

typo, "executed"
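
The code comment under review is cut off in this excerpt, so the following is only an assumption about the situation it describes: a forward op publishes a descriptor under a key that any thread can look up, and a grad op running on a different thread retrieves it. `FakeConvPD`, `BlobMap` and `StrideSeenByGrad` are hypothetical names, and only the `"@conv_pd"` suffix is taken from the snippet.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a convolution primitive descriptor.
struct FakeConvPD {
  int stride;
};

// Hypothetical blob map shared between forward and grad ops.
class BlobMap {
 public:
  void Set(const std::string& key, std::shared_ptr<FakeConvPD> pd) {
    std::lock_guard<std::mutex> guard(mutex_);
    blobs_[key] = std::move(pd);
  }

  // Returns the stored descriptor, or nullptr if the key is unknown.
  std::shared_ptr<FakeConvPD> Get(const std::string& key) const {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = blobs_.find(key);
    if (it == blobs_.end()) return nullptr;
    return it->second;
  }

 private:
  mutable std::mutex mutex_;
  std::unordered_map<std::string, std::shared_ptr<FakeConvPD>> blobs_;
};

// The "forward" step runs on its own thread and stores the descriptor;
// the "grad" step runs on the calling thread and reads it back.
int StrideSeenByGrad(BlobMap& blobs, const std::string& key) {
  const std::string key_conv_pd = key + "@conv_pd";
  std::thread forward([&blobs, &key_conv_pd] {
    blobs.Set(key_conv_pd, std::make_shared<FakeConvPD>(FakeConvPD{2}));
  });
  forward.join();  // forward completes before grad starts
  std::shared_ptr<FakeConvPD> pd = blobs.Get(key_conv_pd);
  assert(pd != nullptr);
  return pd->stride;
}
```

The lookup succeeds here only because the key contains nothing specific to the thread that stored the descriptor.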

@luotao1 luotao1 left a comment

LGTM! Please address #17965 (review) in the next PR!

@luotao1 luotao1 merged commit 84bb45c into PaddlePaddle:develop Jun 11, 2019

luotao1 commented Jun 18, 2019

@jczaja @jianhang-liu @LeoZhao-Intel I tested this PR on content DNN. It runs successfully in multi-threaded mode with transpose_mkldnn_op.

@LeoZhao-Intel

I ran a similar test before; both transpose2 and transpose2_grad on MKL-DNN work well in ParallelExecutor.

jczaja commented Jun 18, 2019

@luotao1 Thanks for letting us know about your findings.
