@cangtianhuang cangtianhuang commented Jun 16, 2025

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

logcumsumexp and cumsum share ScanKernel. After PR #72562 was merged, cumsum passes the large-tensor tests, but some logcumsumexp cases still report Erroneous arithmetic operation.

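Instrumenting the code showed that the cause was the int type: after converting two int variables to int64_t, the error above no longer appears. However, running the float32 configuration then throws paddle::memory::allocation::BadAlloc. After changing all data types to float16, the paddleonly test passes:
[image]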
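
As a rough illustration of why the int to int64_t change matters, here is a minimal sketch. The helper names are hypothetical and are not taken from ScanKernel; the point is only that a flat offset into a large tensor can exceed the 32-bit range, so the arithmetic should be done in 64 bits.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not a Paddle function): flat offset of element
// (row, col) in a row-major [rows, cols] buffer, computed in 64 bits.
inline int64_t FlatOffset(int64_t row, int64_t col, int64_t cols) {
  return row * cols + col;
}

// True when a 64-bit value is still representable as a 32-bit int.
inline bool FitsInInt32(int64_t v) {
  return v >= std::numeric_limits<int32_t>::min() &&
         v <= std::numeric_limits<int32_t>::max();
}
```

For example, row 70000 of a buffer with 70000 columns starts at offset 4900000000, which no longer fits in a 32-bit int.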

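As for the paddle::memory::allocation::BadAlloc error: on inspection, ScanKernel allocates a tmp_data buffer for intermediate steps such as transposition and row reversal, and this buffer is the same size as the output. The kernel therefore needs roughly twice the GPU memory while it runs. torch instead reads the target data by direct indexing, which may be more general in some cases.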
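
To make the direct-indexing idea concrete, here is a minimal CPU sketch, assuming a contiguous array viewed as [outer, len, inner] with the scan along the middle axis. It is neither Paddle's ScanKernel nor torch's kernel, and the function name is hypothetical. It only shows that the transpose and the reversal can be folded into the index computation, so no temporary buffer of output size is needed.

```cpp
#include <cstdint>

// Inclusive cumulative sum along the middle axis of a contiguous
// [outer, len, inner] array. Elements are addressed through strided
// indices, so no transposed or reversed copy of the input is created.
template <typename T>
void CumsumStrided(const T* in, T* out, int64_t outer, int64_t len,
                   int64_t inner, bool reverse) {
  for (int64_t o = 0; o < outer; ++o) {
    for (int64_t i = 0; i < inner; ++i) {
      T acc = T(0);
      for (int64_t s = 0; s < len; ++s) {
        // Walk the scan axis backwards when a reversed scan is requested.
        const int64_t k = reverse ? len - 1 - s : s;
        const int64_t idx = (o * len + k) * inner + i;
        acc += in[idx];
        out[idx] = acc;
      }
    }
  }
}
```

All index arithmetic uses int64_t, in line with the overflow fix described above.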

LogAddExp was changed to use log1p, which makes it numerically more stable, but large tensors still show serious precision problems that need further fixes.
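
For reference, a common log1p-based formulation of log(exp(a) + exp(b)) looks like the sketch below. It is an assumption about the general technique, not a copy of the LogAddExp functor in this PR, and the function name is hypothetical. NaN inputs are not handled.

```cpp
#include <algorithm>
#include <cmath>

// log(exp(a) + exp(b)) computed as hi + log1p(exp(lo - hi)).
// exp(lo - hi) is in [0, 1], so it cannot overflow, and log1p keeps
// precision when that term is tiny.
template <typename T>
T LogAddExpSketch(T a, T b) {
  const T hi = std::max(a, b);
  const T lo = std::min(a, b);
  // Covers hi == +inf and a == b == -inf, where lo - hi could be NaN.
  if (std::isinf(hi)) return hi;
  return hi + std::log1p(std::exp(lo - hi));
}
```

With a = 0 and b = -50, this form returns a small positive value, whereas evaluating log(1 + exp(-50)) in double precision rounds to 0.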

paddle-bot bot commented Jun 16, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Jun 16, 2025
@cangtianhuang cangtianhuang marked this pull request as ready for review June 23, 2025 08:56
@wanghuancoder wanghuancoder left a comment

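LGTM. Until the precision problem is fully fixed, do not mark this task as complete in the task table. Precision problems exposed by large tensors are to be fixed as part of the large-tensor work, so this API still needs further attention.

书豪 (Shuhao), please double-check the change from std::log to std::log1p.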

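@cangtianhuang (Contributor, Author) replied:

> LGTM. Until the precision problem is fully fixed, do not mark this task as complete in the task table. Precision problems exposed by large tensors are to be fixed as part of the large-tensor work, so this API still needs further attention.
>
> 书豪 (Shuhao), please double-check the change from std::log to std::log1p.

The precision problem is already being fixed. So far it looks like an issue in the thrust library.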

@lshpku lshpku merged commit 6a260b7 into PaddlePaddle:develop Jun 26, 2025
47 of 51 checks passed