Conversation

@xingmingyyj
Contributor

@xingmingyyj xingmingyyj commented Jul 30, 2025

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

Fixes

  1. Fix the out-of-bounds memory access caused by int overflow
  2. Fix the CUDA error(9) issue
  3. A precision diff appears when the batch is large
paddle.vision.ops.deform_conv2d(Tensor([5070448, 3, 5, 5],"float32"), Tensor([5070448, 18, 5, 5],"float32"), Tensor([6, 3, 3, 3],"float32"), None, list[1,1,], list[1,1,], list[1,1,], 1, 1, Tensor([5070448, 9, 5, 5],"float32"), )
Not equal to tolerance rtol=0.01, atol=0.01
Tensor-likes are not close!
Mismatched elements: 3 / 2281701600 (0.0%)
Greatest absolute difference: 0.04030437394976616 at index (1852051, 0, 0, 2) (up to 0.01 allowed)
Greatest relative difference: 1.0 at index (910808, 0, 0, 3) (up to 0.01 allowed)
ACTUAL: (shape=torch.Size([5070448, 18, 5, 5]), dtype=torch.float32)
tensor([[[[ 0.0000e+00, 0.0000e+00, -3.4506e-02, -8.3038e-04, 0.0000e+00],
[-8.7802e-04, 3.7523e-03, 1.5622e-02, -8.7787e-04, 5.6141e-03],
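The int overflow in item 1 can be reproduced with plain arithmetic. The sketch below is illustrative and not taken from the PR: it counts the elements of the offset tensor from the failing case and shows what a 32-bit signed index would hold if it wrapped. Two's-complement wrap is an assumption here (signed overflow is formally undefined behavior in C++), and the helper names are hypothetical.

```python
from math import prod

INT32_MAX = 2**31 - 1


def numel(shape):
    """Number of elements in a tensor of the given shape."""
    return prod(shape)


def wrap_int32(value):
    """Value a 32-bit signed integer would hold, assuming two's-complement wrap."""
    return (value + 2**31) % 2**32 - 2**31


offset_shape = [5070448, 18, 5, 5]  # offset tensor from the failing case
n = numel(offset_shape)
print(n, n > INT32_MAX)  # 2281701600 True
print(wrap_int32(n))     # -2013265696
```

The element count matches the total reported in the log (2281701600), and a linear index near the end of the tensor no longer fits in `int`, which is consistent with the out-of-bounds access described in item 1.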

The backward formula for offset is:
$$\frac{\partial L}{\partial \Delta p_n} = \sum_{p_0} \frac{\partial L}{\partial y(p_0)} \cdot w(p_n) \cdot \frac{\partial x (p_0 + p_n + \Delta p_n)}{\partial \Delta p_n}$$
where:
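The last factor, the derivative of the sampled input with respect to the sampling position, can be sketched as follows. This assumes that `x` at fractional positions is obtained by bilinear interpolation with zeros outside the image, as in the usual deformable convolution formulation; the PR text does not spell this out, and the function names are hypothetical.

```python
import math


def pixel(img, y, x):
    """Return img[y][x], or 0.0 when (y, x) lies outside the image."""
    if 0 <= y < len(img) and 0 <= x < len(img[0]):
        return img[y][x]
    return 0.0


def corners(img, h, w):
    """Fractional parts of (h, w) and the four neighbouring pixel values."""
    h0, w0 = math.floor(h), math.floor(w)
    lh, lw = h - h0, w - w0
    v00, v01 = pixel(img, h0, w0), pixel(img, h0, w0 + 1)
    v10, v11 = pixel(img, h0 + 1, w0), pixel(img, h0 + 1, w0 + 1)
    return lh, lw, v00, v01, v10, v11


def bilinear(img, h, w):
    """Bilinearly interpolated value of img at the fractional position (h, w)."""
    lh, lw, v00, v01, v10, v11 = corners(img, h, w)
    return ((1 - lh) * (1 - lw) * v00 + (1 - lh) * lw * v01
            + lh * (1 - lw) * v10 + lh * lw * v11)


def bilinear_grad(img, h, w):
    """Partial derivatives of bilinear(img, h, w) with respect to h and w."""
    lh, lw, v00, v01, v10, v11 = corners(img, h, w)
    dh = (1 - lw) * (v10 - v00) + lw * (v11 - v01)
    dw = (1 - lh) * (v01 - v00) + lh * (v11 - v10)
    return dh, dw
```

In the formula above, each term of the sum multiplies this derivative by the upstream gradient and the kernel weight, so an error in the sampling position or in the pixel index feeds directly into the offset gradient.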

  1. After ruling out the issues above, a precision diff still remains when image_shape is large
test begin: paddle.vision.ops.deform_conv2d(x=Tensor([4, 22817014, 5, 5],"float32"), offset=Tensor([4, 18, 3, 3],"float32"), weight=Tensor([5, 22817014, 3, 3],"float32"), bias=None, stride=list[1,1,], padding=list[0,0,], dilation=list[1,1,], deformable_groups=1, groups=1, mask=Tensor([4, 9, 3, 3],"float32"), )
[accuracy error] backward paddle.vision.ops.deform_conv2d(x=Tensor([4, 22817014, 5, 5],"float32"), offset=Tensor([4, 18, 3, 3],"float32"), weight=Tensor([5, 22817014, 3, 3],"float32"), bias=None, stride=list[1,1,], padding=list[0,0,], dilation=list[1,1,], deformable_groups=1, groups=1, mask=Tensor([4, 9, 3, 3],"float32"), )
Not equal to tolerance rtol=0.01, atol=0.01
Tensor-likes are not close!
Mismatched elements: 1833583996 / 2281701400 (80.4%)
Greatest absolute difference: 1.451427698135376 at index (3, 22517627, 2, 1) (up to 0.01 allowed)
Greatest relative difference: inf at index (1, 13256072, 1, 4) (up to 0.01 allowed)
ACTUAL: (shape=torch.Size([4, 22817014, 5, 5]), dtype=torch.float32)
tensor([[[[-1.1867e-02, -9.4617e-02, -6.7302e-02, 4.5707e-02, 1.0042e-01],
[-1.4820e-03, 5.5286e-03, 2.1573e-02, -7.3028e-03, 7.3889e-02],

Matmul is suspected to be the cause; this case will be re-verified once Matmul is fixed.

Performance statistics

The fix in PR #74058 used int64_t for every index, which degraded performance on small shapes, so it has been reverted.
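One common way to avoid paying for 64-bit indexing on small inputs is to choose the index width per call. The sketch below illustrates that general technique only; it is an assumption, not a description of what the kernels in this PR do, and `pick_index_type` is a hypothetical helper.

```python
from math import prod

INT32_MAX = 2**31 - 1


def pick_index_type(*shapes):
    """Return "int32" when every tensor's element count fits in int32, else "int64"."""
    largest = max(prod(shape) for shape in shapes)
    return "int32" if largest <= INT32_MAX else "int64"


# Small-batch shapes can stay on the cheaper 32-bit path.
print(pick_index_type([5070, 3, 5, 5], [5070, 18, 5, 5]))        # int32
# The failing large-batch case needs 64-bit indices.
print(pick_index_type([5070448, 3, 5, 5], [5070448, 18, 5, 5]))  # int64
```

A real kernel would also have to account for intermediate buffers, such as the column buffer, whose element counts can exceed those of the inputs.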

| Input Config | Stage | Before fix (s) | After fix (s) | Earlier fix (PR #74058) (s) |
|---|---|---|---|---|
| Config1 | Forward | 1.1488 | 1.1348 | 1.1714 |
| | Backward | 2.5604 | 2.5957 | 2.8967 |
| Config2 | Forward | 0.0890 | 0.0805 | 0.0888 |
| | Backward | 0.2128 | 0.2000 | 0.2144 |
| Config3 | Forward | 0.0530 | 0.0531 | 0.0603 |
| | Backward | 0.1793 | 0.1802 | 0.2098 |
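The relative cost of each variant can be read off the table by dividing by the time measured before any fix. A small sketch using the backward timings above (the `percent_change` helper is illustrative):

```python
# Backward timings in seconds, copied from the table:
# (before fix, after fix, earlier fix from PR #74058)
backward = {
    "Config1": (2.5604, 2.5957, 2.8967),
    "Config2": (0.2128, 0.2000, 0.2144),
    "Config3": (0.1793, 0.1802, 0.2098),
}


def percent_change(baseline, measured):
    """Relative change of `measured` against `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


for name, (before, after, earlier_fix) in backward.items():
    print(f"{name}: this PR {percent_change(before, after):+.1f}%, "
          f"PR #74058 {percent_change(before, earlier_fix):+.1f}%")
```

On these numbers, the backward pass with the PR #74058 fix is roughly 13% slower for Config1 and 17% slower for Config3, while this PR stays within about 1.5% of the original time or is faster.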

Config1: paddle.vision.ops.deform_conv2d(Tensor([79220, 3, 4, 4],"float32"), Tensor([79220, 18, 4, 4],"float32"), Tensor([6, 3, 3, 3],"float32"), None, list[2,2,], list[3,3,], list[1,1,], 1, 1, None, )
Config2: paddle.vision.ops.deform_conv2d(Tensor([5070, 3, 5, 5],"float32"), Tensor([5070, 18, 5, 5],"float32"), Tensor([6, 3, 3, 3],"float32"), None, list[2,2,], list[3,3,], list[1,1,], 1, 1, None, )
Config3: paddle.vision.ops.deform_conv2d(x=Tensor([446, 128, 200, 200],"float32"), offset=Tensor([446, 36, 100, 100],"float32"), weight=Tensor([128, 128, 3, 3],"float32"), bias=None, stride=list[2,2,], padding=list[1,1,], dilation=list[1,1,], deformable_groups=2, groups=1, mask=Tensor([446, 18, 100, 100],"float32"), )
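As a sanity check on these configurations, the offset and mask shapes can be derived from the input, kernel, stride, padding and dilation. The sketch below uses the standard convolution output-size formula and the channel layout documented for `deform_conv2d` (`2 * deformable_groups * kh * kw` offset channels, `deformable_groups * kh * kw` mask channels); the helper names are hypothetical.

```python
def conv_out_size(size, kernel, stride, padding, dilation):
    """Standard convolution output size along one spatial dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def expected_offset_mask_shapes(x_shape, weight_shape, stride, padding,
                                dilation, deformable_groups):
    """Offset and mask shapes implied by the other deform_conv2d arguments."""
    n, _, h, w = x_shape
    _, _, kh, kw = weight_shape
    h_out = conv_out_size(h, kh, stride[0], padding[0], dilation[0])
    w_out = conv_out_size(w, kw, stride[1], padding[1], dilation[1])
    offset = [n, 2 * deformable_groups * kh * kw, h_out, w_out]
    mask = [n, deformable_groups * kh * kw, h_out, w_out]
    return offset, mask


# Config3 from the list above
offset, mask = expected_offset_mask_shapes(
    [446, 128, 200, 200], [128, 128, 3, 3], [2, 2], [1, 1], [1, 1], 2)
print(offset)  # [446, 36, 100, 100]
print(mask)    # [446, 18, 100, 100]
```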
Pcard-73263

@paddle-bot

paddle-bot bot commented Jul 30, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@codecov-commenter

codecov-commenter commented Jul 30, 2025

Codecov Report

❌ Patch coverage is 16.66667% with 85 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@deddb2b). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...dle/phi/kernels/cpu/deformable_conv_grad_kernel.cc | 0.00% | 49 Missing ⚠️ |
| ...hi/kernels/impl/deformable_conv_grad_kernel_impl.h | 0.00% | 36 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (16.66%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@ Coverage Diff @@
## develop #74324 +/- ##
==========================================
Coverage ? 16.66%
==========================================
Files ? 3
Lines ? 102
Branches ? 0
==========================================
Hits ? 17
Misses ? 85
Partials ? 0

@xingmingyyj xingmingyyj changed the title from "fix deform_conv2d" to "Fix paddle.vision.ops.deform_conv2d API big Tensor" on Jul 31, 2025
@xingmingyyj
Contributor Author

/re-run all-failed

Contributor

@wanghuancoder wanghuancoder left a comment

LGTM

@wanghuancoder wanghuancoder merged commit 460b539 into PaddlePaddle:develop Aug 5, 2025
70 of 75 checks passed

5 participants