@xiaoguoguo626807 (Contributor) commented Jul 15, 2025

PR Category

Execute Infrastructure

PR Types

Improvements

Description

pcard-67164

stride_copy now supports copying whenever the input's dims match the trailing dims of the output.
This reduces the broadcast done in preprocessing. In the case x.shape = [108, 64, 12288], index = slice(0, 100, 2), value.shape = [64, 12288], performance doubles.
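
To illustrate the trailing-dims condition, here is a minimal pure-Python sketch. It is not the Paddle kernel: `matches_trailing_dims` and `strided_copy` are hypothetical names, buffers are flat lists, and the input is assumed to be contiguous. In the case above, the slice selects 50 rows, so the target view has shape [50, 64, 12288] and the value shape [64, 12288] equals its last two dims.

```python
from math import prod


def matches_trailing_dims(in_shape, out_shape):
    """True when in_shape equals the last len(in_shape) dims of out_shape."""
    n = len(in_shape)
    return n <= len(out_shape) and list(out_shape[len(out_shape) - n:]) == list(in_shape)


def strided_copy(dst, out_shape, out_strides, src, in_shape):
    """Write a contiguous src into a strided view of the flat buffer dst.

    The input is never materialized at the output shape: output element i
    simply reads src[i % in_numel], which is valid because the input shape
    matches the trailing dims of the output.
    """
    if not matches_trailing_dims(in_shape, out_shape):
        raise ValueError("input shape must equal the trailing dims of the output shape")
    in_numel = prod(in_shape)
    for i in range(prod(out_shape)):
        # Unravel the linear index i into a strided offset into dst.
        offset, rem = 0, i
        for dim, stride in zip(reversed(out_shape), reversed(out_strides)):
            offset += (rem % dim) * stride
            rem //= dim
        dst[offset] = src[i % in_numel]
    return dst


# Small analogue of x[0:4:2] = value with x.shape = [6, 2], value.shape = [2]:
# the view has shape [2, 2] and strides [4, 1] over the flat buffer of x.
x = [0] * 12
strided_copy(x, [2, 2], [4, 1], [7, 8], [2])
```

Only rows 0 and 2 of `x` are written, and `value` is reused for each row without an explicit broadcast step.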

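TODO: stride_copy is currently split internally into three paths: input.numel = 0; input.numel = output.numel; input can expand to output.numel. Each path further branches on Rank and vecsize into 4*9 branches. That is a lot of code, and it could be optimized.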
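
The three paths named in the TODO can be sketched as a dispatch function. This is an assumption-laden illustration, not the actual implementation: `select_copy_path` and the returned path labels are hypothetical, and the "expand" condition is assumed to be the trailing-dims match from the description.

```python
from math import prod


def select_copy_path(in_shape, out_shape):
    """Pick one of the three paths described in the TODO (labels are hypothetical)."""
    in_numel, out_numel = prod(in_shape), prod(out_shape)
    if in_numel == 0:
        return "empty"        # input.numel = 0
    if in_numel == out_numel:
        return "same_numel"   # input.numel = output.numel
    n = len(in_shape)
    # Assumed expand condition: input shape equals the output's trailing dims.
    if n <= len(out_shape) and list(out_shape[len(out_shape) - n:]) == list(in_shape):
        return "expand"       # input can expand to output.numel
    raise ValueError("input cannot be copied into an output of this shape")


path = select_copy_path([64, 12288], [50, 64, 12288])
```

With 4*9 = 36 Rank/vecsize branches under each of the three paths, a single dispatch point like this is one place where the duplication mentioned in the TODO could be reduced.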

@paddle-bot (bot) commented Jul 15, 2025

Your PR has been submitted. Thanks for your contribution to the open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

@xiaoguoguo626807 changed the title from "【】Stridecopy support diff input-dim and ouput_dim" to "【slice】Stridecopy support diff input-dim and ouput_dim" Jul 15, 2025
@luotao1 (Contributor) left a comment

LGTM for paddle_enforce

@xiaoguoguo626807 xiaoguoguo626807 merged commit 29b4bf4 into PaddlePaddle:develop Jul 17, 2025
85 of 94 checks passed
@xiaoguoguo626807 xiaoguoguo626807 deleted the stridecopy branch July 17, 2025 08:39