
[MKL-DNN] Tensor modifications revert #16462

Merged
tensor-tang merged 4 commits into PaddlePaddle:develop from jczaja:prv-tensor-modifications-revert
Mar 28, 2019

Conversation

@jczaja
Contributor

@jczaja jczaja commented Mar 26, 2019

This PR removes all tensor modifications. While preparing it, there were conflicts in the Concat MKLDNN op, as that op had started to use the new API introduced by the tensor modifications. Hence @xiaolil1, could you please take a look at the relevant changes?

As for performance results: we tested (so far, only ResNet-50) on CLX, comparing this PR against this PR with an older MKL-DNN. The PR with the older MKL-DNN shows that FPS was restored to a level matching the one before the regressions. As for this PR as-is (including MKL-DNN 0.18), performance is poorer by ~3 FPS (120 -> 117, ResNet-50 FP32).

@xiaolil1 Please check this PR for performance and let us know your findings.

To make this revert, I used `git revert` on the three commits that introduced the tensor modifications.
Manually I had to:

  • resolve a conflict in the Concat MKLDNN op;
  • keep the enabling of the transformer unit test from the third commit (which fixed and enabled the transformer for MKL-DNN), while reverting the fix itself, as it was related to the tensor modifications;
  • update the Transpose MKLDNN op to label its output tensor with the kNCHW layout, to avoid ending up with an mkldnn_blocked format.
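The revert workflow described above can be sketched in a self-contained demo. This is a hedged illustration using a throwaway repository (the file names and commit messages here are hypothetical, not the actual PaddlePaddle commits); `git revert` adds a new commit that undoes a change while preserving history, which is why conflicts such as the Concat op one must be resolved by hand when later commits touched the same code.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo

# Base state, then a commit representing the tensor modifications.
echo "original code" > tensor.h
git add tensor.h && git commit -qm "base"
echo "tensor modifications" > tensor.h
git commit -qam "add tensor modifications"

# Revert the offending commit; a conflict would pause here for
# manual resolution if later commits had touched the same lines.
git revert --no-edit HEAD

cat tensor.h   # back to "original code"
```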

@jczaja jczaja force-pushed the prv-tensor-modifications-revert branch 2 times, most recently from ab775af to 4351e04 Compare March 26, 2019 13:02
@jczaja jczaja changed the title Prv tensor modifications revert [MKL-DNN] Tensor modifications revert Mar 26, 2019
@jczaja jczaja requested a review from tensor-tang March 26, 2019 13:40
@jczaja jczaja force-pushed the prv-tensor-modifications-revert branch from 4351e04 to b44cc19 Compare March 27, 2019 10:35
@jczaja jczaja requested review from kbinias and luotao1 March 27, 2019 12:57
@jczaja
Contributor Author

jczaja commented Mar 27, 2019

@luotao1, @tensor-tang One of the tests, test_fsp_op.py, fails on Mac OS (it does not fail on Linux). The rerun was done three times. The test does not use MKL-DNN or any functionality from this PR. Please advise how to proceed.

@luotao1
Contributor

luotao1 commented Mar 28, 2019

One of tests: test_fsp_op.py does fail on Mac OS

It is fixed in #16502

@ghost

ghost commented Mar 28, 2019

@tensor-tang @luotao1 This PR has failed in Mac CI several times but has no issue in PR CI (the current failure is just waiting for approval). The Mac CI failure is quite strange: sometimes it is a build failure, sometimes it fails in test_fsp_op.py, which is not related to this change. Could you help to have a look? Thanks!

@luotao1
Contributor

luotao1 commented Mar 28, 2019

@jianhang-liu Please have @xiaolil1 check this PR for performance first and let us know the findings.

@tensor-tang
Contributor

@jianhang-liu luotao has already commented: #16462 (comment)

@ghost

ghost commented Mar 28, 2019

@xiaolil1 Could you help here? Jacek did measurements on the FP32 side and found performance fully recovered (on ResNet-50). Could you check on your side as well, especially for INT8? Thanks!

tensor-tang
tensor-tang previously approved these changes Mar 28, 2019
Contributor

@tensor-tang tensor-tang left a comment


Approve for tensor.h changes.

jczaja added 4 commits March 28, 2019 05:55
…ed (PaddlePaddle#16233)"

This reverts commit 13816dd.
Apart from enabling transformer for MKL-DNN
This reverts commit c63f6b2.

Conflicts:
	paddle/fluid/operators/mkldnn/concat_mkldnn_op.cc
- lint

test=develop

- Lint fixes

test=develop

- Lint fixes

test=develop

- Fix Transpose MKLDNN op

test=develop
@jczaja jczaja force-pushed the prv-tensor-modifications-revert branch from b44cc19 to e239cb0 Compare March 28, 2019 04:56
@xiaolil1
Contributor

@luotao1 @jianhang-liu We are verifying #16462.

@jczaja
Contributor Author

jczaja commented Mar 28, 2019

@tensor-tang, @luotao1 The changes were rebased on a newer develop (the approval got dismissed).

@jczaja
Contributor Author

jczaja commented Mar 28, 2019

As mentioned at the beginning of this PR: this PR was tested for performance, as was a modified version of it (with an older MKL-DNN), to verify the restoration of performance. With the currently used MKL-DNN (0.18), performance was found to be lower.

@xiaolil1
Contributor

@jczaja @jianhang-liu @tensor-tang @luotao1
Here is the performance we tested:
"latest develop + #16462" vs "reference version 08c96d1".
FP32 latency still has a 6% regression, while we generally allow a 5% drop. @tensor-tang @luotao1 please decide whether to approve it.
@jczaja @jianhang-liu Due to the MKL-DNN update between the latest code and the reference code, we suggest you file a JIRA ticket for MKL-DNN to track the performance regression issue.

| INT8 (BS = 1, Cores = 28) | ResNet-50 (FPS) | MobileNet (FPS) |
| ------------------------- | --------------- | --------------- |
| patched                   | 286             | 691             |
| reference                 | 262             | 538             |
| patched / reference       | 1.09            | 1.28            |

| FP32 (BS = 1, Cores = 28) | ResNet-50 (FPS) | MobileNet (FPS) |
| ------------------------- | --------------- | --------------- |
| patched                   | 94              | 296             |
| reference                 | 100             | 307             |
| patched / reference       | 0.94            | 0.96            |

@jczaja
Contributor Author

jczaja commented Mar 28, 2019

@xiaolil1 Thanks for testing this PR. For the full picture, it would be good to modify this PR to test with the MKL-DNN from before the regression. I can prepare that today, once I'm back at work.

As for the MKL-DNN regression: without profiling the conv operator it is difficult to say, but my current guess is that the creation time of primitive descriptors increased. That does not matter for INT8 conv (there is reuse for that one), but it does matter for FP32. Next week I will make a PR with a fix improving that. MKL-DNN optimizes the execution of primitives rather than their creation time. During primitive descriptor creation, MKL-DNN iterates over the available implementations and picks one suitable for the work. MKL-DNN 0.18 added INT8 functionality, so perhaps choosing an implementation takes a bit longer than it used to (this is a guess).
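The reuse point above can be illustrated with a minimal sketch in plain Python. This is a hedged stand-in, not the real MKL-DNN API: `create_primitive_descriptor` is a hypothetical function whose body simulates the expensive implementation-selection step, and caching shows why the INT8 path (which reuses descriptors) is insensitive to creation cost while the FP32 path is not.

```python
import functools

creation_count = 0  # tracks how often the costly creation path runs

@functools.lru_cache(maxsize=None)
def create_primitive_descriptor(shape, dtype):
    """Hypothetical stand-in for MKL-DNN primitive descriptor creation:
    the real library iterates over available implementations and picks
    a suitable one, which is the expensive step discussed above."""
    global creation_count
    creation_count += 1
    return ("conv_pd", shape, dtype)

# First call pays the creation cost; the second reuses the cached result,
# mirroring the descriptor reuse on the INT8 conv path.
pd1 = create_primitive_descriptor((1, 3, 224, 224), "fp32")
pd2 = create_primitive_descriptor((1, 3, 224, 224), "fp32")
print(creation_count)  # → 1
print(pd1 is pd2)      # → True
```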

@xiaolil1
Contributor

We tested develop, develop + #16325 (fix PR), develop + #16462 (revert PR), and reference (08c96d1), with BS = 1 and Cores = 1 on CLX, for both INT8 and FP32.

| INT8 (BS = 1, Cores = 1)     | ResNet-50 (FPS) | MobileNet (FPS) |
| ---------------------------- | --------------- | --------------- |
| develop                      | 69              | 313             |
| develop + #16325             | 69              | 311             |
| develop + #16462             | 70              | 325             |
| reference                    | 64              | 296             |
| develop / reference          | 1.07            | 1.05            |
| develop + #16325 / reference | 1.07            | 1.05            |
| develop + #16462 / reference | 1.09            | 1.10            |

| FP32 (BS = 1, Cores = 1)     | ResNet-50 (FPS) | MobileNet (FPS) |
| ---------------------------- | --------------- | --------------- |
| develop                      | 18              | 105             |
| develop + #16325             | 18              | 104             |
| develop + #16462             | 18              | 106             |
| reference                    | 18              | 107             |
| develop / reference          | 1.00            | 0.98            |
| develop + #16325 / reference | 0.99            | 0.97            |
| develop + #16462 / reference | 0.99            | 0.99            |

@xiaolil1
Contributor

Here is the latency with 1000 iterations.

| INT8 (BS = 1, Cores = 1) | ResNet-50 (ms) | MobileNet (ms) |
| ------------------------ | -------------- | -------------- |
| develop                  | 14.51          | 3.20           |
| develop + #16325         | 14.47          | 3.22           |
| develop + #16462         | 14.20          | 3.08           |
| reference                | 15.57          | 3.38           |

| FP32 (BS = 1, Cores = 1) | ResNet-50 (ms) | MobileNet (ms) |
| ------------------------ | -------------- | -------------- |
| develop                  | 55.14          | 9.54           |
| develop + #16325         | 55.28          | 9.64           |
| develop + #16462         | 55.51          | 9.45           |
| reference                | 54.91          | 9.37           |

@tensor-tang tensor-tang merged commit 2632327 into PaddlePaddle:develop Mar 28, 2019
@tensor-tang
Contributor

@jczaja @xiaolil1
Thanks again for your efficient work.

