[MKL-DNN] Tensor modifications revert #16462
tensor-tang merged 4 commits into PaddlePaddle:develop from
Conversation
@luotao1, @tensor-tang One of the tests, test_fsp_op.py, fails on Mac OS (it does not fail on Linux). The rerun was done three times. It does not use MKL-DNN or any functionality from this PR. Please advise how to proceed.
It is fixed in #16502
@tensor-tang @luotao1 This PR has failed in the Mac CI several times but has no issue in the PR CI (the current failure is just waiting for approval). The Mac CI failures are quite strange: sometimes the build fails, sometimes test_fsp_op.py fails, which is not related to this change. Could you help to take a look? Thanks!
@jianhang-liu Please let @xiaolil1 check this PR for performance first and let us know your findings.
@jianhang-liu luotao has commented: #16462 (comment)
@xiaolil1 Could you help here? Jacek did measurements on the FP32 side and found that performance is fully recovered (on ResNet-50). Could you help to check on your side, especially for INT8? Thanks!
tensor-tang
left a comment
Approve for tensor.h changes.
- …ed (PaddlePaddle#16233)": reverts commit 13816dd. Apart from enabling transformer for MKL-DNN
- Reverts commit c63f6b2. Conflicts: paddle/fluid/operators/mkldnn/concat_mkldnn_op.cc
- lint test=develop
- Lint fixes test=develop
- Fix Transpose MKLDNN op test=develop
@tensor-tang, @luotao1 The changes were rebased on a newer develop (the approval got dismissed).
As mentioned at the beginning of this PR, this PR was tested for performance, as was a modified version of it (using an older MKL-DNN) to verify the restoring of performance. With the currently used MKL-DNN (0.18), performance was found to be lower.
@jczaja @jianhang-liu @tensor-tang @luotao1
@xiaolil1 Thanks for testing this PR. For the full picture it would be good to modify this PR to test with the MKL-DNN from before the regression; I can prepare that one today, once I'm back at work. As for the MKL-DNN regression testing: without profiling the conv operator it is difficult to say, but my current guess is that the creation time of primitive descriptors has increased. It does not matter for INT8 conv (there is reuse for that one), but it does matter for FP32. Next week I will make a PR with a fix improving that. MKL-DNN optimizes the execution of primitives rather than their creation time. During primitive descriptor creation, MKL-DNN iterates over the available implementations and selects the first one that matches.
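The reuse idea mentioned above (INT8 conv caches its primitive descriptors, so the increased creation cost is not paid on every call) can be sketched generically. This is a hypothetical Python illustration of that caching pattern, not PaddlePaddle or MKL-DNN API; `create_descriptor` is a stand-in for the expensive implementation search:

```python
import time

def create_descriptor(shape, dtype):
    """Stand-in for expensive primitive-descriptor creation
    (e.g. iterating over candidate implementations)."""
    time.sleep(0.001)  # simulate the costly search
    return (shape, dtype, "chosen_impl")

_cache = {}

def get_descriptor(shape, dtype):
    """Return a cached descriptor, creating it only on first use."""
    key = (shape, dtype)
    if key not in _cache:
        _cache[key] = create_descriptor(shape, dtype)
    return _cache[key]
```

With this pattern, repeated calls with the same shape and dtype pay the creation cost once; only FP32 paths that rebuild the descriptor on every call would see the regression.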
We tested develop, develop + #16325 (fix PR), develop + #16462 (revert PR), and the reference (08c96d1), with BS=1 and Cores=1 on CLX, for both INT8 and FP32.

Here is the latency with 1000 iterations.
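For reference, averaging latency over many iterations after a warm-up is the usual way to get numbers like those above; this is a minimal sketch where `run_once` is a placeholder callable, not a PaddlePaddle API:

```python
import time

def measure_latency(run_once, iterations=1000, warmup=10):
    """Return average per-iteration latency in milliseconds.

    run_once: zero-argument callable (placeholder for one inference pass).
    """
    for _ in range(warmup):  # warm-up runs exclude first-call overheads
        run_once()
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return elapsed / iterations * 1000.0
```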
This PR removes all tensor modifications. When preparing this PR there were conflicts in the Concat MKLDNN op, as this op had started to use the new API introduced by the tensor modifications. Hence, @xiaolil1, could you please take a look at the relevant changes?
As for performance results: we tested (so far, only ResNet-50) on CLX this PR vs. this PR with an older MKL-DNN. The PR with the older MKL-DNN shows that FPS was restored to the level before the regressions. As for this PR as-is (including MKL-DNN 0.18), performance is lower by ~3 FPS (120 -> 117, ResNet-50 FP32).
@xiaolil1 Please check this PR for performance and let us know your findings.
On making this revert: I used `git revert` on the three commits that introduced the tensor modifications.
Manually I had to change:
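The revert procedure described above can be sketched as follows; 13816dd and c63f6b2 are the hashes mentioned in this thread, the third commit's hash is not given here, and the exact commands used may have differed:

```shell
# Revert the commits that introduced the tensor modifications
# (only two of the three hashes appear in this thread).
git revert 13816dd c63f6b2
# git stops on a conflict, e.g. in the Concat MKLDNN op;
# resolve it manually, then continue:
git add paddle/fluid/operators/mkldnn/concat_mkldnn_op.cc
git revert --continue
```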