
Conversation

Contributor
@xingmingyyj commented Jul 16, 2025

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

  • Fix an out-of-bounds memory access caused by int overflow (see the indexing sketch after the error log below)
  • With large batches there is still an accuracy diff; no obvious bug has been located:
[accuracy error] backward paddle.incubate.nn.functional.fused_bias_dropout_residual_layer_norm(x=Tensor([270000000, 2, 4],"float32"), residual=Tensor([270000000, 2, 4],"float32"), bias=None, ln_scale=Tensor([4],"float32"), ln_bias=None, dropout_rate=0.0, ln_epsilon=1e-05, training=True, mode="upscale_in_train", name=None, ) Not equal to tolerance rtol=0.01, atol=0.01 Tensor-likes are not close! Mismatched elements: 3 / 2160000000 (0.0%) Greatest absolute difference: 0.03557777404785156 at index (122354108, 0, 3) (up to 0.01 allowed) Greatest relative difference: 0.1643351912498474 at index (64014755, 1, 3) (up to 0.01 allowed) ACTUAL: (shape=torch.Size([270000000, 2, 4]), dtype=torch.float32) 
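A minimal sketch of the overflow pattern behind the first fix (hypothetical host-side code, not the actual kernel): for x = Tensor([270000000, 2, 4]) the element count is 270000000 × 2 × 4 = 2,160,000,000, which exceeds INT32_MAX (2,147,483,647), so any element index held in a 32-bit int wraps to a negative value and indexing goes out of bounds; computing indices in 64 bits avoids this.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical illustration of the bug class this PR fixes; not Paddle's kernel code.
int main() {
  const int64_t rows = 270000000;  // dim 0 of the failing case
  const int64_t cols = 2 * 4;      // dims 1 and 2 flattened

  const int64_t num_elements = rows * cols;  // 2,160,000,000 > INT32_MAX
  // If a kernel computes or stores such an index in 32 bits, it wraps
  // (modulo 2^32 on typical targets) to a negative value:
  const int32_t wrapped = static_cast<int32_t>(num_elements);

  std::printf("64-bit element count: %lld\n", (long long)num_elements);
  std::printf("same value as int32:  %d  (negative => out-of-bounds access)\n", wrapped);
  return 0;
}
```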

Some testing revealed large implementation differences between Paddle's and Torch's layer norm.
Paddle implementation: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/kernels/funcs/layer_norm_impl.cu.h#L445
Torch implementation: https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/cuda/layer_norm_kernel.cu#L196
Paddle computes the variance as $$D(x) = E(x^2) - (E(x))^2$$ whereas Torch uses Welford's algorithm, which is more numerically stable (see the sketch after this explanation). In addition, Paddle and Torch differ considerably in how they compute dx:
Paddle implementation:
https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/kernels/funcs/layer_norm_impl.cu.h#L1735
Torch implementation:
https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/cuda/layer_norm_kernel.cu#L348
With a small feature_size, these differences cause the float32 computation to produce the accuracy diff shown above.
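To illustrate the variance difference, here is a minimal host-side C++ sketch (plain CPU code, not the actual Paddle or Torch CUDA kernels) contrasting the $$D(x) = E(x^2) - (E(x))^2$$ formula with Welford's algorithm in float32:

```cpp
#include <cstdio>
#include <vector>

// One-pass E(x^2) - E(x)^2 variance: subtracting two large, nearly equal
// float32 quantities cancels catastrophically when the mean is large
// relative to the spread.
float naive_variance(const std::vector<float>& x) {
  float sum = 0.f, sum_sq = 0.f;
  for (float v : x) {
    sum += v;
    sum_sq += v * v;
  }
  const float mean = sum / x.size();
  return sum_sq / x.size() - mean * mean;
}

// Welford's online algorithm: updates the mean and the sum of squared
// deviations incrementally, never forming the large E(x^2) intermediate.
float welford_variance(const std::vector<float>& x) {
  float mean = 0.f, m2 = 0.f;
  int n = 0;
  for (float v : x) {
    ++n;
    const float delta = v - mean;
    mean += delta / n;
    m2 += delta * (v - mean);  // uses the updated mean
  }
  return m2 / n;
}

int main() {
  // Large mean, tiny spread: the worst case for the one-pass formula.
  const std::vector<float> x = {10000.0f, 10000.1f, 10000.2f, 10000.3f};
  std::printf("naive:   %f\n", naive_variance(x));    // badly off in float32
  std::printf("welford: %f\n", welford_variance(x));  // close to 0.0125
  return 0;
}
```

For reference on the dx side, both backward kernels target the same mathematical quantity: with the normalized input x̂_i = (x_i - μ)/σ and g_i = γ_i · dL/dy_i, the standard LayerNorm input gradient over a feature dimension of size N is $$dx_i = \frac{1}{\sigma}\Big(g_i - \frac{1}{N}\sum_j g_j - \frac{\hat{x}_i}{N}\sum_j g_j \hat{x}_j\Big)$$ so a plausible source of the remaining diff is simply how the two row-wise sums are accumulated in float32, consistent with the error appearing only at small feature_size.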

[accuracy error] backward paddle.incubate.nn.functional.fused_bias_dropout_residual_layer_norm(x=Tensor([200000000, 1, 4],"float32"), residual=Tensor([200000000, 1, 4],"float32"), bias=None, ln_scale=Tensor([4],"float32"), ln_bias=None, dropout_rate=0.0, ln_epsilon=1e-05, training=True, mode="upscale_in_train", name=None, ) Not equal to tolerance rtol=0.01, atol=0.01 Tensor-likes are not close! Mismatched elements: 1 / 800000000 (0.0%) Greatest absolute difference: 0.02027149498462677 at index (7029107, 0, 3) (up to 0.01 allowed) Greatest relative difference: 0.20506636798381805 at index (7029107, 0, 3) (up to 0.01 allowed) ACTUAL: (shape=torch.Size([200000000, 1, 4]), dtype=torch.float32) tensor([[[ 0.4957, -0.1273, -0.2506, -0.1178]], 

Under float64 the result is correct:

[Pass] paddle.incubate.nn.functional.fused_bias_dropout_residual_layer_norm(x=Tensor([200000000, 1, 4],"float64"), residual=Tensor([200000000, 1, 4],"float64"), bias=None, ln_scale=Tensor([4],"float64"), ln_bias=None, dropout_rate=0.0, ln_epsilon=1e-05, training=True, mode="upscale_in_train", name=None, ) 

Pcard-73263

@paddle-bot bot commented Jul 16, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI first. See the Paddle CI Manual for details.

@paddle-ci-bot bot commented Jul 24, 2025

Sorry to inform you that d9090b0's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

Contributor Author
@xingmingyyj commented

/re-run all-failed

Contributor
@wanghuancoder left a comment

LGTM

@lshpku merged commit 97f18d5 into PaddlePaddle:develop Jul 29, 2025
86 of 88 checks passed
@xingmingyyj deleted the fused_bias_dropout_residual_layer_norm branch July 30, 2025 02:26