@xiaoguoguo626807
Contributor

PR Category

Execute Infrastructure

PR Types

Improvements

Description

pcard-67164
#74038 Todo:
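stride_copy is internally split into three paths: input.numel = 0; input.numel = output.numel; and input can expand to output. Each path is further split into 4*9 branches by Rank and vecsize, so there is a lot of code. This PR merges them into the single "input can expand to output" path. As a result, in the special-case paths the CUDA kernel performs one extra computation, input_idx = i % input_numel, which may affect performance.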
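
To make the trade-off concrete, here is a minimal host-side sketch in Python of what a single "expand" path could look like. It is not the Paddle kernel: the function names, the row-major index decomposition, and the early return for an empty input are all assumptions made for illustration. The point is only that when input.numel = output.numel the modulo is a no-op (i % input_numel == i), so one code path can serve both cases at the cost of one extra operation per element.

```python
from math import prod


def strided_offset(linear_idx, dims, strides):
    """Map a row-major linear index to a storage offset (hypothetical helper)."""
    offset = 0
    # Peel off coordinates from the innermost dimension outwards.
    for dim, stride in zip(reversed(dims), reversed(strides)):
        offset += (linear_idx % dim) * stride
        linear_idx //= dim
    return offset


def stride_copy_unified(src, src_dims, src_strides, dst, dst_dims, dst_strides):
    """Single-path strided copy: each loop iteration stands in for one CUDA thread."""
    input_numel = prod(src_dims)
    output_numel = prod(dst_dims)
    if input_numel == 0:
        # Assumption: nothing to copy from an empty input; the PR does not
        # say how this case is folded into the merged path.
        return dst
    for i in range(output_numel):
        # The extra modulo: redundant when input_numel == output_numel,
        # required when the input is repeated to fill the output.
        input_idx = i % input_numel
        src_off = strided_offset(input_idx, src_dims, src_strides)
        dst_off = strided_offset(i, dst_dims, dst_strides)
        dst[dst_off] = src[src_off]
    return dst


# Expand case: a 3-element input fills a contiguous 2x3 output.
expanded = stride_copy_unified([10, 20, 30], [3], [1], [0] * 6, [2, 3], [3, 1])

# Equal-numel case: contiguous 2x3 input written to a column-major 2x3 output.
transposed = stride_copy_unified(
    [1, 2, 3, 4, 5, 6], [2, 3], [3, 1], [0] * 6, [2, 3], [1, 2]
)
```

In the sketch, `expanded` repeats the input twice and `transposed` holds the same six values laid out column by column. Whether the extra modulo is measurable on a GPU depends on the kernel being memory-bound or not, which this sketch does not attempt to model.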

@paddle-bot

paddle-bot bot commented Jul 15, 2025

Your PR has been submitted. Thanks for your contribution to this open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

@paddle-ci-bot

paddle-ci-bot bot commented Jul 24, 2025

Sorry to inform you that the CI runs for 3f2d9f4 passed more than 7 days ago. To prevent PR conflicts, you need to re-run all CIs manually.

