@lshpku lshpku commented Jul 21, 2025

PR Category

Communication Library

PR Types

Performance

Description

This is a partial cherry-pick from deepseek-ai/DeepEP#283

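So that deep_ep uses the same SM allocation strategy as deep_gemm when the two are overlapped, and to avoid deep_gemm's kernels failing to launch because SMs were allocated non-contiguously, this PR also sets deep_ep's kernels to multicast=2, even though deep_ep does not use the multicast feature.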
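
For readers unfamiliar with the mechanism, here is a minimal sketch of launching a kernel with a cluster size of 2. It assumes that "multicast=2" corresponds to a thread block cluster dimension of 2, requested through the CUDA runtime's `cudaLaunchAttributeClusterDimension` launch attribute. It is not the code changed in this PR: `dummy_kernel`, the grid size, and the block size are hypothetical placeholders, and clusters require a GPU of compute capability 9.0 or newer.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical stand-in kernel. Like deep_ep in the description, it does not
// use any cluster or multicast feature itself.
__global__ void dummy_kernel(int* out) {
    if (threadIdx.x == 0) out[blockIdx.x] = static_cast<int>(blockIdx.x);
}

int main() {
    const int num_blocks = 8;     // assumed grid size; must be a multiple of the cluster size
    const int num_threads = 128;  // assumed block size

    int* out = nullptr;
    if (cudaMalloc(&out, num_blocks * sizeof(int)) != cudaSuccess) return 1;

    // Launch configuration for the extensible launch API.
    cudaLaunchConfig_t cfg = {};
    cfg.gridDim = dim3(num_blocks);
    cfg.blockDim = dim3(num_threads);
    cfg.dynamicSmemBytes = 0;
    cfg.stream = nullptr;  // default stream

    // Request clusters of 2 thread blocks along x, so that blocks are
    // scheduled in pairs even though the kernel never uses the cluster.
    cudaLaunchAttribute attr[1];
    attr[0].id = cudaLaunchAttributeClusterDimension;
    attr[0].val.clusterDim.x = 2;
    attr[0].val.clusterDim.y = 1;
    attr[0].val.clusterDim.z = 1;
    cfg.attrs = attr;
    cfg.numAttrs = 1;

    cudaError_t err = cudaLaunchKernelEx(&cfg, dummy_kernel, out);
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("launch failed: %s\n", cudaGetErrorString(err));
    }

    cudaFree(out);
    return err == cudaSuccess ? 0 : 1;
}
```

The idea in the sketch is that both libraries request the same cluster shape, so a kernel that never touches the cluster still has its blocks placed in the same pairwise fashion as the kernel it is overlapped with.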

Pcard-85711

@lshpku lshpku requested review from ForFishes and sneaxiy as code owners July 21, 2025 08:22

paddle-bot bot commented Jul 21, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@phlrain phlrain self-requested a review July 25, 2025 14:09
@phlrain phlrain merged commit ce1fc6a into PaddlePaddle:develop Jul 25, 2025
77 of 80 checks passed