@zhanghonggeng
Contributor

PR Category

Execute Infrastructure

PR Types

Improvements

Description

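image

This PR switches the case where the index size is 1 from flatten + gather + reshape to index_elementwise_get. Forward performance drops somewhat, with a speedup ratio below 1. In addition, for the backward pass with a single non-bool index (index size 1), IndexPutWithSortKernel is introduced as a fast path in the index_elementwise_get_grad backward computation to improve performance in that case. The timelines are as follows: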

GPUIndexElementwiseGetGrad performance:
image

IndexPutWithSortKernel performance:
image

torch performance:
image
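
For readers unfamiliar with the sort-based approach, the following is a minimal CPU sketch of the general sort-then-accumulate idea behind a gather backward with a single integer index. It is not Paddle's IndexPutWithSortKernel: the function name `GatherGradWithSort`, the 2-D row-major layout, and the restriction to non-negative indices are all assumptions made for illustration. Sorting groups duplicate indices together, so each destination row receives its contributions in one contiguous run; on a GPU this kind of grouping is typically used to reduce contention from atomic adds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper (not Paddle's API): backward of y = x[index] for one
// 1-D integer index over a row-major x with `rows` rows and `cols` columns.
// grad_out holds index.size() rows; the returned grad_x holds `rows` rows.
std::vector<float> GatherGradWithSort(const std::vector<int64_t>& index,
                                      const std::vector<float>& grad_out,
                                      int64_t rows, int64_t cols) {
  assert(grad_out.size() == index.size() * static_cast<std::size_t>(cols));

  // order[k] is the position in `index` of the k-th smallest index value.
  std::vector<int64_t> order(index.size());
  std::iota(order.begin(), order.end(), 0);
  std::stable_sort(order.begin(), order.end(),
                   [&](int64_t a, int64_t b) { return index[a] < index[b]; });

  std::vector<float> grad_x(static_cast<std::size_t>(rows * cols), 0.0f);
  std::size_t k = 0;
  while (k < order.size()) {
    const int64_t row = index[order[k]];
    assert(row >= 0 && row < rows);  // assumption: no negative indices
    float* dst = grad_x.data() + row * cols;
    // Accumulate the whole run of equal indices into one destination row.
    for (; k < order.size() && index[order[k]] == row; ++k) {
      const float* src = grad_out.data() + order[k] * cols;
      for (int64_t c = 0; c < cols; ++c) dst[c] += src[c];
    }
  }
  return grad_x;
}
```

With `index = {2, 0, 2}` and a 3 x 2 input, row 2 accumulates the first and third rows of `grad_out`, row 0 receives the second, and row 1 stays zero.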

pcard-67164

@paddle-bot

paddle-bot bot commented Jul 31, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

Contributor

@xiaoguoguo626807 xiaoguoguo626807 left a comment

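LGTM. This PR improves backward performance for tensor indexing cases but lowers forward performance. The slice CI check can be waived for now; optimizations for the forward cases will follow later.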

qingqing01 previously approved these changes Aug 4, 2025
accumulate);
}

const bool is_combined = (index_size == 1) ? false : true;
Contributor

What does is_combined mean? Please add a comment explaining it.

Contributor Author

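is_combined distinguishes plain indexing from combined indexing. When there is only a single plain index, the backward pass uses the faster IndexPutWithSortKernel. I have added a comment.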
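
The dispatch described in this reply can be sketched roughly as follows. This is an illustrative assumption, not the code from the PR: `GradPath`, `IsCombined`, and `ChooseGradPath` are hypothetical names, and the only facts taken from the conversation are that `is_combined` is false exactly when `index_size == 1` and that the sort-based fast path applies to a single non-bool index.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical labels for the two backward implementations named in the PR.
enum class GradPath { kIndexPutWithSort, kElementwiseGetGrad };

// Mirrors the flag in the diff: one index tensor is a plain index,
// more than one is a combined index.
inline bool IsCombined(std::size_t index_size) { return index_size != 1; }

// Hypothetical selector: take the sort-based fast path only for a single,
// non-bool index; otherwise fall back to the general elementwise grad path.
inline GradPath ChooseGradPath(std::size_t index_size, bool index_is_bool) {
  assert(index_size >= 1);
  if (!IsCombined(index_size) && !index_is_bool) {
    return GradPath::kIndexPutWithSort;
  }
  return GradPath::kElementwiseGetGrad;
}
```

Writing the flag as `index_size != 1` is equivalent to the ternary in the diff and arguably easier to read.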

backward : index_elementwise_get_grad, index_elementwise_get_double_grad
inputs :
{x : x, index : index, input_dims : input_dims, input_strides : input_strides, index_dims : index_dims, index_stride : index_stride, slice_offset : slice_offset, accumulate : accumulate}
{x : x, index : index, input_dims : input_dims, input_strides : input_strides, index_dims : index_dims, index_stride : index_stride, slice_offset : slice_offset, accumulate : accumulate, is_combined : is_combined}
Contributor

Aren't attrs and inputs configured separately?

Contributor Author

Fixed.

@zhanghonggeng
Contributor Author

/re-run all-failed

@xiaoguoguo626807 xiaoguoguo626807 merged commit 4cebc8c into PaddlePaddle:develop Aug 5, 2025
135 of 145 checks passed

5 participants