
Conversation

@lshpku lshpku commented Apr 27, 2025

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

Fix incorrect results in ReduceKernel caused by LimitGridDim.
Approach: move the gridDim-capping logic in front of where blocking_size is set, so that blocking_size compensates for the portion the capped gridDim no longer covers. Also stop using LimitGridDim altogether, since that function takes an int argument and is itself subject to overflow.
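A minimal sketch of the reordered logic in Python, not the actual PHI C++ code (the function and parameter names are illustrative; 65535 is CUDA's gridDim.y limit):

```python
def configure_reduce(num_rows: int, max_grid_dim_y: int = 65535):
    """Illustrative sketch: cap the grid dimension BEFORE picking blocking_size."""
    # Cap gridDim.y first; in the real kernel this must be done in 64-bit
    # arithmetic, which is why the int-typed LimitGridDim helper was dropped.
    grid_dim_y = min(num_rows, max_grid_dim_y)
    # Each block then reduces ceil(num_rows / grid_dim_y) rows, so blocking_size
    # makes up for whatever the capped grid no longer covers.
    blocking_size = -(-num_rows // grid_dim_y)  # ceiling division
    return grid_dim_y, blocking_size
```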

Example cases:

```python
paddle.sum(Tensor([442367, 1], "float16"), axis=0)  # correct before, correct now
paddle.sum(Tensor([442368, 1], "float16"), axis=0)  # wrong before, correct now
paddle.sum(Tensor([524280, 2], "float32"), axis=0)  # correct before, correct now
paddle.sum(Tensor([524281, 2], "float32"), axis=0)  # wrong before, correct now
```
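One runnable way to check a fixed case (a sketch assuming a GPU build of Paddle; the tolerance is illustrative) is to compare the reduce output against a float64 reference computed on the host:

```python
import numpy as np
import paddle

x_np = np.random.rand(524281, 2).astype("float32")
out = paddle.sum(paddle.to_tensor(x_np), axis=0)  # CUDA reduce on a GPU build
ref = x_np.astype("float64").sum(axis=0)          # float64 reference on the host
# Before the fix, part of the rows could be skipped once gridDim was capped;
# after the fix the two results should agree to float32 accuracy.
np.testing.assert_allclose(out.numpy().astype("float64"), ref, rtol=1e-4)
```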

TODO: LimitGridDim is also used in other files. I wrote the gather and scatter kernels and can confirm they are fine; the remaining files need to be checked for correctness one by one by their respective owners.

Pcard-85711


paddle-bot bot commented Apr 27, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@lshpku lshpku merged commit 68677e0 into PaddlePaddle:develop Apr 27, 2025
45 checks passed
YqGe585 pushed a commit to YqGe585/Paddle that referenced this pull request May 7, 2025
wanghuancoder pushed a commit to wanghuancoder/Paddle that referenced this pull request May 27, 2025
wanghuancoder added a commit that referenced this pull request Jun 3, 2025
* refine forrange (#72360)
* refine forrange
* refine forrange
* reduce support big tensor (#71970)
* reduce support big tensor
* [PHI] Fix gridDim limit for reduce kernel (#72507)
* [API] isclose support bigtensor (#72516)
* isclose support bigtensor
* refine
* [API] isnan isinf isfinite support bigtensor (#72517)
* isnan isinf isfinite support bigtensor
* refine
* [PHI] Fix cum kernel for big tensor (#72562)
* [PHI] Preliminary fix for elementwise broadcast int32 shape overflow (#72584)
* [PHI] Align linalg.solve kernel with torch (#72608)
* Update strided copy kernel (#72662)
* [PHI] Fix grid sample kernel for big tensor (#72628)
* [PHI] Fix argsort big tensor bug (#72712)
* [PHI] Fixed argsort big tensor bug
* [PHI] Fixed shape mismatch problem.
* [PHI] Fix contiguous kernel for big tensor (#72705)
* [PHI] Fix flatten and split kernel for big tensor (#72634)
* [PHI] Fix out-of-bound issue of paddle.take_along_axis (#72757)
* [PHI] fix paddle.diag with big tensor (#72638)
* [API] fix paddle.cross with big tensor (#72652)
* [PHI] Fix paddle.where api for big tensor (#72717)
* [PHI] Fix bincount kernel for big tensor (#72706)
* fix bincount kernel for big tensor
* use HostAlloc to alloc memory
* add cpu test case
* [PHI] Fix full_like kernel for big tensor (#72831)
* [API] Fix int overflow and float16 support for paddle.frac (#72815)
* [PHI] Align paddle.inner with torch in matmul logic (#72843)
* [PHI] Fix paddle.var & paddle.std float16 overflow (#72650)
* [PHI] Fix logsumexp precision problem (#72681)
* [PHI] Debug for logsumexp, bug source found
* [PHI] Removed GetNumBlocks func to get correct logsumexp
* [PHI] Removed redundant debug VLOG
* [PHI] Elegant grid bounded solution
* [Accuracy diff No.55-56、76-77] Fix accuracy diff for var&std API (#72879)
* [Accuracy diff No.21] Fix accuracy diff for heaviside API (#72894)

---------

Co-authored-by: Shuhao Liang <50269654+lshpku@users.noreply.github.com>
Co-authored-by: Qianyue He <46109954+Enigmatisms@users.noreply.github.com>
Co-authored-by: Lei Ding <69283446+Dmovic@users.noreply.github.com>
Co-authored-by: ggggxm <66855582+ggggxm@users.noreply.github.com>
Co-authored-by: xkkkkkk23 <xiekeke@baidu.com>
Co-authored-by: Zx <zhangxiao35@baidu.com>
Co-authored-by: huangjiyi <43315610+huangjiyi@users.noreply.github.com>
Co-authored-by: ooo oo <106524776+ooooo-create@users.noreply.github.com>
