[oneDNN] Fix to #33282 , added support of X input broadcasting to oneDNN elementwise ops #33549
Conversation
Thanks for your contribution!
@jczaja Hi, next time could you please give the PR a more specific title? For example, this PR could be named "extend broadcast elementwise_add op with X (1 dim) and Y (2 dims)".
- const auto dst_memory =
-     is_inplaced ? src_x_memory : handler.AcquireDstMemory(z);
+ const auto dst_memory = handler.AcquireDstMemory(z);
So we remove the inplace support for elementwise_add and elementwise_mul? Do you plan to remove it permanently, or will you add inplace back later?
Making inplace work with oneDNN caching is very hard, and it brings only a small performance gain. Among all ops, the elementwise ops are the most difficult to support inplace, so I removed inplace support for the elementwise oneDNN kernels.
paddle/fluid/platform/mkldnn_reuse.h
Outdated
+ // if output tensor(z) is nullptr then we are computing into oneDNN
+ // managed buffer
+ const auto dst_tz =
+     (z == nullptr) ? src_x_tz : framework::vectorize(z->dims());
If z is null, we set dst_tz to src_x_tz, but what if src_x_tz is one-dimensional? And why could z be nullptr at all — in which cases can z be nullptr?
Good observation.
-
z is null in the situation where we want the Binary primitive to write into an allocation managed by oneDNN, i.e. we do not want the output to be sent to a Tensor. This is only used in elementwise_mul_grad:77, as that operation consists of Binary + Reorder. The Binary output is temporary data that is the input to Reorder, and the output of Reorder goes into the output tensor.
-
Currently in elementwise_mul_grad, src_x_tz is always the gradient of the Out buffer, so I know it is never going to be broadcasted. But I agree the code would be better if it also worked for z == nullptr with x being broadcasted, so I will improve that. Thanks.
paddle/fluid/platform/mkldnn_reuse.h
Outdated
  src_y_tz, platform::MKLDNNGetDataType<T>(), y->format());
- if (rankdiff > 0) {
+ auto rankdiff = x->dims().size() - y->dims().size();
+ if (rankdiff > 0) {  // Second input is lesser than first
Sorry, but why is it implemented this way?
I thought broadcast copies values line by line until Y has the same dimensions as X.
If X is 5*12 and Y is 5, rankdiff is 1?
Or is it not like that?
If X is [5,12] and Y is [5], then Y's dims for oneDNN need to be [5,1]. Then oneDNN, given inputs [5,12] and [5,1], will recognize that the second input needs broadcasting applied. So the code you are referencing just changes the shape from [5] to [5,1].
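The rankdiff shape adjustment described here can be sketched as follows (a minimal illustration only, assuming trailing 1s are appended as in the [5] -> [5,1] example; `ExtendYDims` is a hypothetical helper name, not the actual function in mkldnn_reuse.h):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Pad Y's dims with trailing 1s until they match X's rank, so that
// oneDNN's Binary primitive can apply broadcasting, e.g. [5] -> [5, 1].
std::vector<int64_t> ExtendYDims(const std::vector<int64_t>& x_tz,
                                 std::vector<int64_t> y_tz) {
  int rankdiff = static_cast<int>(x_tz.size()) - static_cast<int>(y_tz.size());
  for (int i = 0; i < rankdiff; ++i) y_tz.push_back(1);  // rankdiff > 0 case
  return y_tz;
}
```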
lidanqing-vv
left a comment
LGTM
@juncaipeng PR-CI-APPROVAL is complaining about a wrong PADDLE_ENFORCE_EQ macro that I touched in this PR, and I do not understand what is wrong with it. Could you please advise?
The error message specification is in https://github.com/PaddlePaddle/Paddle/wiki/Paddle-Error-Message-Writing-Specification-(English-Verison). Maybe you can add more error messages ~
paddle/fluid/platform/mkldnn_reuse.h
Outdated
- if (rankdiff > 0) {  // Second input is lesser than first
+ if (rankdiff > 0) {  // Second input is smaller than first
paddle/fluid/platform/mkldnn_reuse.h
Outdated
Why aren't you just passing algo to the CreateKey function? It can convert the dnnl::algorithm datatype to a string.
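What the reviewer suggests could look roughly like this (a hypothetical sketch of encoding the algorithm kind into a primitive cache key; both the stand-in `algorithm` enum and this `CreateKey` signature are assumptions, and the real CreateKey in mkldnn_reuse.h may differ):

```cpp
#include <cassert>
#include <string>

// Stand-in for dnnl::algorithm; the real enum lives in dnnl.hpp.
enum class algorithm { binary_add = 0, binary_mul = 1 };

// Build a cache key that encodes the algorithm, so add and mul
// primitives with identical shapes do not collide in the cache.
std::string CreateKey(const std::string& base, algorithm algo) {
  return base + "-a:" + std::to_string(static_cast<int>(algo));
}
```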
@jczaja Please cherry-pick this PR into release/2.1 after it passes all CIs. Thanks.
@jczaja I saw you removed the whole set of inplace tests. Since the 22nd is tag day and Baidu expects to merge this PR today, is it possible to keep the change minimal and make it pass the CIs? If CI still fails for reasons unrelated to this PR, let me know.
@juncaipeng I would like you to advise me on the failing test_pass_builder. The reason for its failure is this error message: 2021-06-18 19:51:35 Error Message Summary:
arlesniak
left a comment
LGTM
lidanqing-vv
left a comment
LGTM
LGTM
…sting to oneDNN elementwise ops (PaddlePaddle#33549)
* fix to PaddlePaddle#33282
* Increased threshold for elementwise_mul_bf16 grad
* disabled faulty UT
* fix to approval
PR types
Bug fixes
PR changes
OPs
Describe
This is a fix to #33282. It implements broadcasting for the elementwise X input, as previously only the Y input had broadcasting support. Inplace support for the oneDNN elementwise kernels was disabled.
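As an illustration of the new behaviour, X-side broadcasting means shapes such as X=[5] with Y=[5,12] are now accepted, with X treated as [5,1]. A minimal reference computation (plain C++, not the oneDNN kernel; `AddBroadcastX` is a hypothetical name) could look like:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference elementwise add where X has fewer dims than Y:
// X of shape [rows] is treated as [rows, 1] and broadcast along
// the columns of Y, which has shape [rows, cols].
std::vector<float> AddBroadcastX(const std::vector<float>& x,  // [rows]
                                 const std::vector<float>& y,  // [rows*cols]
                                 size_t rows, size_t cols) {
  std::vector<float> z(rows * cols);
  for (size_t r = 0; r < rows; ++r)
    for (size_t c = 0; c < cols; ++c)
      z[r * cols + c] = x[r] + y[r * cols + c];
  return z;
}
```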