
Conversation

@cangtianhuang (Contributor) commented Jul 16, 2025

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

Diagnosis:

  1. ThrustCumsumKernel itself has very large precision error
  2. Under fp32, large tensors accumulate rounding error (a quick reproduction follows this list)
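The fp32 accumulation error in item 2 is easy to reproduce outside the kernel. A minimal NumPy check (tensor size chosen for illustration, not taken from the failing cases):

    import numpy as np

    # Fill a large float32 tensor with 0.01 and compare the final
    # prefix sum against a float64 reference: the float32 running
    # sum drifts further from the true value as the length grows.
    n = 10_000_000
    x = np.full(n, 0.01, dtype=np.float32)
    fp32_tail = np.cumsum(x)[-1]                     # accumulated in fp32
    fp64_tail = np.cumsum(x.astype(np.float64))[-1]  # fp64 reference
    print(fp32_tail, fp64_tail, abs(fp32_tail - fp64_tail))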

Fix:

  1. Remove the ThrustCumsumKernel branch entirely, falling through to the subsequent CUDA cub computation
  2. Make BlockPrefixCallbackOp use the Kahan summation algorithm; reference: https://en.wikipedia.org/wiki/Kahan_summation_algorithm
  3. For the special case of the LogAddExp operator, BlockPrefixCallbackOp uses Kahan + online scaling for better numerical stability (both ideas are sketched below this list)
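The actual change lives in the CUDA C++ BlockPrefixCallbackOp; the following is only a minimal Python sketch of the two numerical ideas, with illustrative function names that do not appear in the kernel:

    import math

    def kahan_step(running_sum, compensation, value):
        # One step of Kahan (compensated) summation: `compensation`
        # carries the low-order bits lost when `value` was folded
        # into `running_sum`, and feeds them back into the next add.
        y = value - compensation
        t = running_sum + y
        compensation = (t - running_sum) - y
        return t, compensation

    def logaddexp_online(running_log, value_log):
        # Stable log(exp(a) + exp(b)): rescale by the running maximum
        # so the exponentials stay in range (the "online scale" idea
        # applied to the LogAddExp case).
        m = max(running_log, value_log)
        if m == float("-inf"):
            return float("-inf")
        return m + math.log(math.exp(running_log - m) +
                            math.exp(value_log - m))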

Other changes:

  1. Rework np_logcumsumexp_grad to avoid computing np.log(-dout) directly, which raises errors when dout > 0 (see the sketch after this item)
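A minimal sketch of the masking idea (the helper name is hypothetical): np.where pre-masks the argument of np.log so that non-positive values never reach it:

    import numpy as np

    def masked_log(v, mask):
        # Evaluate log only where `mask` holds; elsewhere return -inf.
        # Pre-masking the argument keeps non-positive values out of
        # np.log, which would otherwise raise invalid-value warnings.
        return np.where(mask, np.log(np.where(mask, v, 1.0)), -np.inf)

    dout = np.array([0.5, -0.3, 0.0, 2.0])
    log_pos = masked_log(dout, dout > 0)   # log of the positive part
    log_neg = masked_log(-dout, dout < 0)  # log of the negative part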

Testing:

After the fix, the test cases fall roughly into the following categories:

  1. Small accumulation dimension: passes directly
  2. Very large accumulation dimension: precision error exceeds 1e-2:
[screenshot]

Since error accumulation is inherently larger for very large tensors, changing atol and rtol to 1 makes these tests mostly pass:
[screenshot]

  3. fp16 dtype: torch computes incorrect results, while paddle uses MPType and matches the theoretical value:
[screenshot]

Added to torch_error_skip to skip the precision check.

  4. torch directly reports CUDA error 700:
[screenshot]

Added to torch_error_skip to skip the precision check.

Additional tests:

  1. All paddle_only tests pass:
[screenshot]
  2. Theoretical-value analysis of paddle vs. torch on fixed inputs, for the following test cases:

    paddle.cumsum(x=Tensor([4294967297], "float32"))
    paddle.logcumsumexp(x=Tensor([4294967297], "float32"))

The tensor is filled with 0.01 via full, so the expected final cumsum value is about 42949672.97 and the expected final logcumsumexp value is about 22.1907 (a quick closed-form check follows this item):
[screenshots]

Both frameworks' cumsum results differ slightly from the theoretical value but are numerically close and take the same time. For logcumsumexp, torch is closer to the theoretical value, while paddle still deviates and runs far slower than torch; a further algorithmic fix is pending.
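For reference, both expected values quoted above follow from a closed form; a quick check:

    import math

    n = 4294967297   # tensor length from the test cases above
    fill = 0.01      # value the tensor is filled with

    # cumsum: the last prefix sum is just n * fill.
    print(n * fill)            # ~42949672.97

    # logcumsumexp: the last value is log(n * exp(fill))
    #             = fill + log(n).
    print(fill + math.log(n))  # ~22.1907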

Pcard-85711

@paddle-bot bot commented Jul 16, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@cangtianhuang marked this pull request as ready for review July 17, 2025 07:07
@wanghuancoder (Contributor) left a comment

LGTM. The kernel changes are fairly large, so asking Shuhao to help double-check them.

@lshpku merged commit 0ca88c4 into PaddlePaddle:develop Jul 17, 2025
92 of 94 checks passed
@cangtianhuang deleted the fix-cumsum branch July 26, 2025 14:30
