[MKL-DNN] Fully Connected #15226
Conversation
Could you tell us the difference or improvement over the current FC implementation? Is it only for inference? If it is a performance improvement, could you give some brief before-and-after data?
@Sand3r- any update on this?
@hshen14 As mentioned in an internal mail, Paddle's CI VM yields flawed output for some of the convolutions and for FC. This results in accuracy too low for test_analyzer_resnet50 to pass. This is not the case on any of our local machines, where everything works as intended.
The problem with the FC operator is that on CI the MKL library uses the AVX2 instruction set, while it should use AVX512. We reproduced the problem locally by forcing MKL onto its AVX2 code path.
All the CI machines have AVX512, see #15032 (comment) and #15032 (comment).
Do you mean Paddle/paddle/scripts/paddle_build.sh, line 55 in 41b8cf0?
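A quick way to check whether a given machine actually exposes AVX512 is to inspect the kernel's CPU flag list. The sketch below is a minimal, Linux-only illustration (it assumes `/proc/cpuinfo` exists and is not part of Paddle or MKL); the exact mechanism used to force AVX2 in MKL is not stated in this thread, though MKL does document a dispatch-control environment variable for restricting the instruction set.

```python
def cpu_has_flag(flag, cpuinfo_path="/proc/cpuinfo"):
    """Return True if an ISA flag (e.g. 'avx2', 'avx512f') appears in the
    kernel's CPU flag list. Linux-only illustration, not Paddle code."""
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                if line.startswith("flags"):
                    return flag in line.split()
    except OSError:
        pass  # not Linux, or /proc not mounted
    return False

print("avx2:", cpu_has_flag("avx2"))
print("avx512f:", cpu_has_flag("avx512f"))
```

If `avx512f` is reported but MKL still dispatches AVX2 kernels, the mismatch lies in the library's runtime dispatch rather than the hardware.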
@luotao1 I noticed differences between the downloaded versions of the MKLML libraries. |
Also, refine the pass to comply with the newest interface.
Done. |
Didn't expect the happiest day of my life would come today. |
Introduces the Fully Connected operator. Its performance is comparable to the reference version, but it needs to be merged so that the base for the int8 implementation and its performance gains is established.
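For context, a Fully Connected (inner product) operator computes a matrix multiply plus a per-output bias. The pure-Python sketch below is a naive reference of that semantics only, not the MKL-DNN kernel or Paddle's implementation:

```python
def fully_connected(x, w, b):
    """Naive reference FC: y[i][j] = sum_k x[i][k] * w[k][j] + b[j].

    x: batch x in_features, w: in_features x out_features, b: out_features.
    """
    batch, in_f = len(x), len(x[0])
    out_f = len(w[0])
    assert len(w) == in_f and len(b) == out_f
    return [[sum(x[i][k] * w[k][j] for k in range(in_f)) + b[j]
             for j in range(out_f)]
            for i in range(batch)]

# 2x3 input batch, 3x2 weights, 2 biases
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
w = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.5, -0.5]
print(fully_connected(x, w, b))  # [[4.5, 4.5], [10.5, 10.5]]
```

Optimized libraries implement the same contract with blocked, vectorized GEMM kernels, which is where the ISA dispatch (AVX2 vs. AVX512) discussed above matters.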
Update 05.04.2019
Compared to a reference FC, the MKL-DNN version provides: