@zhanghonggeng
Contributor

PR Category

Execute Infrastructure

PR Types

Improvements

Description

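image

This PR switches the case where the index size is 1 from flatten + gather + reshape to index_elementwise_get. Forward performance drops somewhat as a result, with a speedup ratio below 1. In addition, for the backward pass with a single non-bool index, IndexPutWithSortKernel is introduced as a fast path in the backward computation of index_elementwise_get_grad to improve performance in that case. The timelines are as follows: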

GPUIndexElementwiseGetGrad performance:
image

IndexPutWithSortKernel performance:
image

torch performance:
image

pcard-67164
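
To make the forward change in the description concrete, here is a minimal pure-Python sketch of the two strategies for a single integer index tensor applied to the first axis. It is an illustration under stated assumptions, not Paddle's implementation: `get_via_gather` and `get_elementwise` are hypothetical names, tensors are modeled as flat row-major lists, and the offset arithmetic is only a guess at what an elementwise kernel would do with stride information.

```python
from math import prod


def get_via_gather(x_flat, x_shape, index_flat, index_shape):
    """Flatten the index, gather whole rows along axis 0, then reshape.

    x_flat is a row-major buffer with shape x_shape; the index is passed
    already flattened together with its original shape.
    Returns (out_flat, out_shape).
    """
    row = prod(x_shape[1:])  # number of elements in one slice along axis 0
    out = []
    for i in index_flat:
        i = i + x_shape[0] if i < 0 else i  # wrap negative indices
        out.extend(x_flat[i * row:(i + 1) * row])
    # "reshape": the result shape is index_shape + x_shape[1:]
    return out, list(index_shape) + list(x_shape[1:])


def get_elementwise(x_flat, x_shape, index_flat, index_shape):
    """Compute every output element directly from its own source offset."""
    row = prod(x_shape[1:])
    n_out = len(index_flat) * row
    out = [None] * n_out
    for k in range(n_out):  # conceptually one GPU thread per output element
        which, inner = divmod(k, row)
        i = index_flat[which]
        i = i + x_shape[0] if i < 0 else i
        out[k] = x_flat[i * row + inner]
    return out, list(index_shape) + list(x_shape[1:])


x = list(range(12))   # shape (3, 4): rows [0..3], [4..7], [8..11]
idx = [2, 0, 2, -1]   # index tensor of shape (2, 2), flattened
a = get_via_gather(x, (3, 4), idx, (2, 2))
b = get_elementwise(x, (3, 4), idx, (2, 2))
```

Both functions return the same values and the same output shape `[2, 2, 4]`. Only the organization of the work differs, which matches the description's point that the switch affects forward speed rather than results.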

@paddle-bot

paddle-bot bot commented Jul 31, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

Contributor

@xiaoguoguo626807 xiaoguoguo626807 left a comment

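LGTM. This PR improves backward performance for tensor indexing cases but reduces forward performance. The slice CI check can be waived for now; optimization of the forward cases will follow later.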

qingqing01 previously approved these changes Aug 4, 2025
accumulate);
}

const bool is_combined = (index_size == 1) ? false : true;
Contributor

What does is_combined mean? Please add a comment explaining it.

Contributor Author

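is_combined distinguishes plain indexing from combined indexing. When there is only a single plain index, the backward pass uses the faster IndexPutWithSortKernel. I have added a comment.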
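
The dispatch described in this reply can be sketched in pure Python. The backward of `y = x[index]` scatters `out_grad` back into `x_grad`, and duplicate indices must be accumulated. The sketch below contrasts a direct scatter-add with a sort-then-reduce variant. It assumes a 1-D `x` and non-negative indices, all function names are hypothetical, and it shows only the general sort-based technique, not the actual internals of IndexPutWithSortKernel.

```python
def grad_scatter_add(num_rows, index, out_grad):
    """Reference backward of y = x[index]: x_grad[index[k]] += out_grad[k]."""
    x_grad = [0.0] * num_rows
    for i, g in zip(index, out_grad):
        x_grad[i] += g  # duplicate indices collide here (atomics on a GPU)
    return x_grad


def grad_sort_then_reduce(num_rows, index, out_grad):
    """Same result: sort by destination, then sum each run of equal indices."""
    order = sorted(range(len(index)), key=lambda k: index[k])  # stable sort
    x_grad = [0.0] * num_rows
    pos = 0
    while pos < len(order):
        dest = index[order[pos]]
        total = 0.0
        # Accumulate the whole run that targets the same destination.
        while pos < len(order) and index[order[pos]] == dest:
            total += out_grad[order[pos]]
            pos += 1
        x_grad[dest] = total  # each destination is written exactly once
    return x_grad


def index_get_grad(num_rows, indices, out_grad):
    """Hypothetical dispatch mirroring the reviewed is_combined line."""
    is_combined = len(indices) != 1
    if not is_combined:
        # Single plain index: take the sort-based fast path.
        return grad_sort_then_reduce(num_rows, indices[0], out_grad)
    raise NotImplementedError("combined indexing is outside this sketch")


index = [3, 1, 3, 0, 1]
out_grad = [1.0, 2.0, 4.0, 8.0, 16.0]
reference = grad_scatter_add(5, index, out_grad)
fast = index_get_grad(5, [index], out_grad)
```

After sorting, every destination is written once, so a parallel implementation would not need colliding writes for duplicate indices. That is one plausible reason a sort-based path can be faster; the PR itself reports only the measured timelines.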

backward : index_elementwise_get_grad, index_elementwise_get_double_grad
inputs :
{x : x, index : index, input_dims : input_dims, input_strides : input_strides, index_dims : index_dims, index_stride : index_stride, slice_offset : slice_offset, accumulate : accumulate}
{x : x, index : index, input_dims : input_dims, input_strides : input_strides, index_dims : index_dims, index_stride : index_stride, slice_offset : slice_offset, accumulate : accumulate, is_combined : is_combined}
Contributor

Attrs and inputs are configured separately, aren't they?

Contributor Author

Fixed.

@zhanghonggeng
Contributor Author

/re-run all-failed

@xiaoguoguo626807 xiaoguoguo626807 merged commit 4cebc8c into PaddlePaddle:develop Aug 5, 2025
135 of 145 checks passed