@zhengshengning zhengshengning commented Jul 3, 2025

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

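Problems and fixes:
Problem 1:
image

Problem 2:
image

Problem 3:
Description: Following the usual approach of adding a 0-size check in the Kernel that returns after dev_ctx.template Alloc(out), an error is raised because out has a negative dimension, and the dimensions of Paddle's out differ from those of torch's out.
Fix: Reading the code shows that the original Kernel sets the size of out later in its body. So, in the 0-size case, we reset the size of out before the Kernel allocates space for it.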
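
The ordering in the Problem 3 fix (resolve the output shape first, allocate second) can be sketched in isolation. This is a minimal standalone sketch, not Paddle code: `Dims`, `ResolveOutDims`, and `NumElements` are hypothetical names, and the fallback to the shape of `x` for a non-positive `out_size` follows the branch quoted in the review thread of this PR. The empty-vector guard is an added assumption.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a tensor shape; this is not Paddle's DDim.
using Dims = std::vector<int64_t>;

// Chooses the output shape before any allocation takes place.
// A missing or non-positive out_size falls back to the shape of x,
// so a negative dimension never reaches the allocation step.
Dims ResolveOutDims(const Dims& x_dims, const Dims& out_size_data) {
  if (out_size_data.empty() || out_size_data[0] <= 0) {
    return x_dims;
  }
  return out_size_data;
}

// Number of elements implied by a shape; 0 for a 0-size tensor.
int64_t NumElements(const Dims& dims) {
  int64_t n = 1;
  for (int64_t d : dims) {
    n *= d;
  }
  return n;
}
```

With this ordering, a 0-size `x` combined with a negative `out_size` yields a valid shape with zero elements, so the allocation that follows has nothing invalid to act on.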

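Problem 4:
Description: When src_index.dims() and dst_index.dims() are not equal, torch does not throw an exception, but Paddle explicitly throws one.
image
Fix: In SendURecvInferMeta, SendUERecvInferMeta, and SendUVInferMeta, the exception thrown when src_index.dims() and dst_index.dims() differ is now guarded by a condition: the invalid-input check runs only when src_index_dims[0] != 0.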
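
The guarded check described in the Problem 4 fix can be illustrated with a small standalone sketch. It assumes plain `std::vector` shapes and a standard exception; `CheckIndexShapes` is a hypothetical helper, whereas the real InferMeta functions use `PADDLE_ENFORCE_EQ` with `common::errors::InvalidArgument`. The rank guard at the top is an added assumption.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper that mirrors the guarded check; not Paddle's API.
void CheckIndexShapes(const std::vector<int64_t>& src_index_dims,
                      const std::vector<int64_t>& dst_index_dims) {
  if (src_index_dims.empty() || dst_index_dims.empty()) {
    throw std::invalid_argument(
        "Src_index and Dst_index must have at least one dimension.");
  }
  // An empty Src_index (0-size input) skips the equality check.
  if (src_index_dims[0] != 0 && src_index_dims[0] != dst_index_dims[0]) {
    throw std::invalid_argument(
        "Src_index and Dst_index should have the same shape.");
  }
}
```

Under these assumptions, `CheckIndexShapes({0}, {5})` returns normally, while `CheckIndexShapes({3}, {4})` throws, so ordinary mismatched inputs are still rejected.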

Results:
Tested with --accuracy=True and --paddle_only=True: all paddle error and accuracy error issues are fixed; only numpy and torch errors remain.
image
image

Pcard-67164

paddle-bot bot commented Jul 3, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.


CLAassistant commented Jul 3, 2025

CLA assistant check
All committers have signed the CLA.


Comment on lines +160 to +164
if (out_size_data[0] <= 0) {
  out->Resize(x.dims());
} else {
  out->Resize(common::make_ddim(out_size_data));
}
Contributor

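Is the negative out_size caused by the input or by shape inference? The shape should not be reprocessed at the kernel level. It should be checked, or inferred correctly, in infermeta; negative values should not still appear in dims by the time the kernel runs.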

Contributor Author

The existing non-0-size logic also reprocesses the shape at the kernel level, so for consistency this case is handled the same way for now.


codecov-commenter commented Jul 4, 2025

Codecov Report

Attention: Patch coverage is 95.00000% with 3 lines in your changes missing coverage. Please review.

Please upload report for BASE (develop@aa1091e). Learn more about missing BASE report.

Files with missing lines                            Patch %   Lines
paddle/phi/kernels/cpu/send_u_recv_grad_kernel.cc   80.00%    1 Missing ⚠️
paddle/phi/kernels/cpu/send_u_recv_kernel.cc        92.30%    1 Missing ⚠️
paddle/phi/kernels/cpu/send_ue_recv_kernel.cc       94.11%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##             develop   #73806   +/-   ##
==========================================
  Coverage           ?   95.00%
==========================================
  Files              ?        8
  Lines              ?       60
  Branches           ?        0
==========================================
  Hits               ?       57
  Misses             ?        3
  Partials           ?        0

dst_index_dims.size()));
}

PADDLE_ENFORCE_EQ(src_index_dims[0],
Contributor

Suggest keeping the check when the value is non-zero:

if (src_index_dims[0] != 0) {
  PADDLE_ENFORCE_EQ(
      src_index_dims[0],
      dst_index_dims[0],
      common::errors::InvalidArgument(
          "Src_index and Dst_index should have the same shape."));
}
Contributor Author

OK

@zhengshengning
Contributor Author

/re-run all-failed


@DanielSun11 DanielSun11 left a comment

LGTM

@DanielSun11 DanielSun11 merged commit 86d658f into PaddlePaddle:develop Jul 17, 2025
72 of 73 checks passed