@thisjiang
Contributor

PR types

Bug fixes

PR changes

OPs

Describe

Problem

decoupledsegnet and hardnet fail during single-GPU training with batch_size=1. The problem was traced to PR32266.

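Cause

Investigation with cuda-memcheck showed an illegal memory access in ElemwiseGradBroadcast2CUDAKernel, which caused a core dump. Analysis showed that the Tensor's actual allocated size did not match the required size. The fault lies in the need_pad_num == 0 check in SliceGradKernel: that branch skips the computation outright, so suitably sized memory is never allocated. After removing this check, training runs normally.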
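As a rough illustration of the failure mode described above, here is a minimal CPU-only sketch. It is not Paddle's SliceGradKernel: the helper names `CountPaddedDims` and `SliceGrad1D` are hypothetical, and the 1-D layout is a simplifying assumption. It shows the idea behind the fix: the gradient buffer is always allocated at the full input shape, with no early exit when no dimension needs padding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Paddle API): counts the dimensions in which the
// sliced output is smaller than the input, i.e. the dimensions whose gradient
// must be zero-padded back to the input extent.
std::size_t CountPaddedDims(const std::vector<std::size_t>& in_dims,
                            const std::vector<std::size_t>& out_dims) {
  assert(in_dims.size() == out_dims.size());
  std::size_t count = 0;
  for (std::size_t i = 0; i < in_dims.size(); ++i) {
    if (in_dims[i] != out_dims[i]) ++count;
  }
  return count;
}

// Hypothetical 1-D slice gradient: d_input has the input's length, is zero
// everywhere, and carries d_out in [start, start + d_out.size()).
std::vector<float> SliceGrad1D(const std::vector<float>& d_out,
                               std::size_t in_len, std::size_t start) {
  assert(start + d_out.size() <= in_len);
  // Always allocate the full input-shaped buffer first. There is deliberately
  // no "nothing to pad, so skip" shortcut here, so a later consumer never
  // sees a buffer smaller than the input shape.
  std::vector<float> d_input(in_len, 0.0f);
  for (std::size_t i = 0; i < d_out.size(); ++i) {
    d_input[start + i] = d_out[i];
  }
  return d_input;
}
```

For example, `SliceGrad1D({1.0f, 2.0f}, 4, 1)` yields a four-element gradient with zeros at both ends, and a slice covering the whole input (zero padded dimensions) still goes through the same allocation path.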

@paddle-bot-old

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

Contributor

@wzzju wzzju left a comment

LGTM.

d_input);
}
- } else if (need_pad_num == 1) {
+ if (need_pad_num == 1) {
Contributor

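When testing, please watch the impact on model performance and check whether there is any noticeable performance regression. Later we can look at whether the bug in the optimization itself can be fixed.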

Contributor Author

OK, got it.

@lanxianghit lanxianghit merged commit 2dffe32 into PaddlePaddle:release/2.1 Jul 29, 2021
@thisjiang thisjiang deleted the cherrypick-solve_slice_inplace_bug branch July 29, 2021 06:46