@hxzd5568
Contributor

PR Category

Operator Mechanism

PR Types

Improvements

Description

Pcard-67164

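  1. The change improves overall accuracy and aligns the results with torch.
  2. The change does not affect performance:
    It only touches a linear number of values. Paddle's float16 computation already raises precision internally; only a linear number of intermediate results were stored as float16, so saving those intermediates introduced additional error.
  3. Why not make the change at the C++ level:
    Paddle splits the computation very finely, into 5 intermediate steps, so raising precision only for the intermediate results would take a lot of work.
    (See Paddle/paddle/phi/kernels/gpu/margin_cross_entropy_kernel.cu:19)
  4. Speed experiment: float16 performance improves slightly after the fix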
Large tensor, after the fix: 7.63 s

 Time (%)  Total Time (ns)  Instances     Avg (ns)         Med (ns)        Min (ns)       Max (ns)     StdDev (ns)  Name
 --------  ---------------  ---------  ---------------  ---------------  -------------  -------------  -----------  ----
     82.6    6,313,407,732          1  6,313,407,732.0  6,313,407,732.0  6,313,407,732  6,313,407,732          0.0  void phi::HardLabelSoftmaxWithCrossEntropyKernel<float, long>(T1 *, T1 *, const T2 *, int, long, lo…

Before the fix: 8.08 s

 Time (%)  Total Time (ns)  Instances     Avg (ns)         Med (ns)        Min (ns)       Max (ns)     StdDev (ns)  Name
 --------  ---------------  ---------  ---------------  ---------------  -------------  -------------  -----------  ----
     84.1    6,873,639,751          1  6,873,639,751.0  6,873,639,751.0  6,873,639,751  6,873,639,751          0.0  void phi::HardLabelSoftmaxWithCrossEntropyKernel<phi::dtype::float16, long>(T1 *, T1 *, const T2 *,…
      8.0      649,658,963
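To illustrate the argument in points 1 to 3, here is a minimal NumPy sketch. It is not the code changed in this PR and it does not call Paddle. It compares a softmax cross entropy whose intermediates are rounded back to float16 after every step with one that upcasts the float16 input to float32 once and casts only the final loss back. All function names, the tensor shape, and the random inputs are assumptions made for illustration, and the margin term is left out to keep the sketch short.

```python
import numpy as np


def ce_fp16_intermediates(logits16, labels):
    """Softmax cross entropy where every intermediate is stored as float16."""
    rows = np.arange(len(labels))
    f32 = np.float32
    # Arithmetic runs in float32, but each result is rounded back to float16.
    row_max = logits16.max(axis=1, keepdims=True)
    shifted = (logits16.astype(f32) - row_max.astype(f32)).astype(np.float16)
    exp = np.exp(shifted.astype(f32)).astype(np.float16)
    denom = exp.astype(f32).sum(axis=1, keepdims=True).astype(np.float16)
    log_denom = np.log(denom.astype(f32)).astype(np.float16)
    picked = shifted[rows, labels][:, None]
    return (log_denom.astype(f32) - picked.astype(f32)).astype(np.float16)


def ce_upcast(logits16, labels):
    """Same loss: upcast the input once, cast only the final result back."""
    rows = np.arange(len(labels))
    x = logits16.astype(np.float32)
    shifted = x - x.max(axis=1, keepdims=True)
    log_denom = np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    picked = shifted[rows, labels][:, None]
    return (log_denom - picked).astype(np.float16)


def reference_fp64(logits16, labels):
    """float64 reference computed from the same float16 inputs."""
    rows = np.arange(len(labels))
    x = logits16.astype(np.float64)
    shifted = x - x.max(axis=1, keepdims=True)
    log_denom = np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return log_denom - shifted[rows, labels][:, None]


def mean_abs_error(out, ref):
    return float(np.abs(out.astype(np.float64) - ref).mean())


def make_inputs(batch=256, classes=1000, seed=0):
    """Hypothetical test data: random float16 logits and integer labels."""
    rng = np.random.default_rng(seed)
    logits16 = rng.normal(0.0, 4.0, size=(batch, classes)).astype(np.float16)
    labels = rng.integers(0, classes, size=batch)
    return logits16, labels


if __name__ == "__main__":
    logits16, labels = make_inputs()
    ref = reference_fp64(logits16, labels)
    print("fp16 intermediates:", mean_abs_error(ce_fp16_intermediates(logits16, labels), ref))
    print("upcast once       :", mean_abs_error(ce_upcast(logits16, labels), ref))
```

Both variants return a float16 loss, so the caller sees the same dtype. The upcast variant rounds only once at the end, which is why its error against the float64 reference should not exceed that of the variant that rounds after every step.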
@paddle-bot

paddle-bot bot commented Jul 26, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@b4a4977). Learn more about missing BASE report.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #74254   +/-   ##
===========================================
  Coverage           ?   100.00%
===========================================
  Files              ?         1
  Lines              ?         6
  Branches           ?         0
===========================================
  Hits               ?         6
  Misses             ?         0
  Partials           ?         0

@hxzd5568
Contributor Author

/re-run npu

Contributor

@wanghuancoder wanghuancoder left a comment


LGTM

@wanghuancoder wanghuancoder merged commit 8ad991d into PaddlePaddle:develop Jul 28, 2025
65 of 67 checks passed