[MKL-DNN] Advice on runtime context related crash in Transformer #16841

When test_analyzer_transformer.profile_mkldnn is run with the "--test_all_data" option, it crashes (the matmul inputs have mismatched dims). When MKL-DNN is used (for any op), then starting from the third iteration (when the input data shapes differ from those of the warmup and the first iteration) the wrong Tensor is chosen for the op to work on.

The cause is that the runtime context holds Variables pointing to "scale_0.tmp_0" tensors that
come from previous iterations, when the shape of those Tensors was different. At the same time,
the scope does contain a "scale_0.tmp_0" with the proper dims.
As a test/workaround I forced Paddle to recreate the runtime context each time. With this hack the crash goes away (everything passes). The code I'm referring to (which I disabled to force recreation of the runtime context):
https://github.com/PaddlePaddle/Paddle/blob/4267a81afcab6ccc4d84eab8ffad0dff24fd8d65/paddle/fluid/framework/operator.cc#L897
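
To make the suspected mechanism concrete, here is a minimal standalone sketch. It is not Paddle code: `Tensor`, `Scope`, `RuntimeContext`, `Op`, and `DimsSeenOnSecondRun` are hypothetical stand-ins, and it assumes the scope entry is replaced by a new tensor object between iterations, which may not be exactly what happens in the real run.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only, not Paddle's real classes.
struct Tensor {
  std::vector<int> dims;
};

struct Scope {
  std::map<std::string, std::shared_ptr<Tensor>> vars;
};

// Holds the tensors an op resolved from the scope on an earlier run.
struct RuntimeContext {
  std::map<std::string, std::shared_ptr<Tensor>> inputs;
};

struct Op {
  std::string input_name;
  std::unique_ptr<RuntimeContext> cached_ctx;
  bool reuse_ctx = true;  // false mimics the "recreate every time" hack

  // Returns the dims of the input tensor the op would actually work on.
  std::vector<int> Run(const Scope& scope) {
    if (!reuse_ctx || !cached_ctx) {
      // Resolve the input from the scope and remember it.
      cached_ctx = std::make_unique<RuntimeContext>();
      cached_ctx->inputs[input_name] = scope.vars.at(input_name);
    }
    // With reuse_ctx == true this may be a tensor from an older iteration.
    return cached_ctx->inputs.at(input_name)->dims;
  }
};

// Runs the op twice; the scope variable changes shape in between
// (assumed here to be a brand new tensor object under the same name).
std::vector<int> DimsSeenOnSecondRun(bool reuse_ctx) {
  Scope scope;
  scope.vars["scale_0.tmp_0"] = std::make_shared<Tensor>(Tensor{{8, 16}});

  Op op;
  op.input_name = "scale_0.tmp_0";
  op.reuse_ctx = reuse_ctx;
  op.Run(scope);  // first iteration fills the cached context

  scope.vars["scale_0.tmp_0"] = std::make_shared<Tensor>(Tensor{{4, 32}});
  return op.Run(scope);  // second iteration, new input shape
}
```

In this sketch, `DimsSeenOnSecondRun(true)` returns the old dims even though the scope already holds the new tensor, while `DimsSeenOnSecondRun(false)` returns the new dims, which matches the behaviour I see with the hack.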

@luotao1 Could you please advise on how to fix this problem?
