Fix paddle.vision.ops.deform_conv2d API big Tensor #74324
Merged
PR Category
Operator Mechanism
PR Types
Bug fixes
Description
Bug fix
The backward formula for the offset is:
$$\frac{\partial L}{\partial \Delta p_n} = \sum_{p_0} \frac{\partial L}{\partial y(p_0)} \cdot w(p_n) \cdot \frac{\partial x (p_0 + p_n + \Delta p_n)}{\partial \Delta p_n}$$
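where:
Paddle and torch differ in how they compute the backward pass of bilinear interpolation. When the sampling point $p_0 + p_n + \Delta p_n$ is out of bounds, for example $p_0 + p_n + \Delta p_n=(-1,0)$, Paddle returns 0 in the forward pass and 0 in the backward pass. torch also returns 0 in the forward pass, but in the backward pass it still propagates gradients to the neighboring points that are in bounds. Once the Paddle code is aligned with the torch implementation, the precision matches. This PR keeps the Paddle implementation.
Paddle implementation: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/kernels/impl/deformable_conv_grad_kernel_impl.h#L62
torch implementation: https://github.com/pytorch/vision/blob/main/torchvision/csrc/ops/cuda/deform_conv2d_kernel.cu#L494
Suspected to be caused by Matmul; to be verified after Matmul is fixed.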
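The out-of-bounds difference described above can be illustrated with a minimal pure-Python sketch of the coordinate weight $\frac{\partial x(p)}{\partial \Delta p_n}$ used in the offset gradient. The function names `coord_weight_strict` and `coord_weight_neighbor_aware` are hypothetical, and the code is a simplification for illustration only; it is not copied from either kernel linked above.

```python
import math


def coord_weight_neighbor_aware(h, w, im, is_y_direction):
    """Derivative of the bilinear sample at (h, w) w.r.t. h or w.

    Each of the four neighbors is bounds-checked on its own, so in-bounds
    neighbors still contribute when the sampling point itself is out of
    bounds (the torch-like behavior described above).
    """
    height, width = len(im), len(im[0])
    h_low, w_low = math.floor(h), math.floor(w)
    h_high, w_high = h_low + 1, w_low + 1

    def at(r, c):
        # Out-of-bounds neighbors are treated as zero.
        return im[r][c] if 0 <= r < height and 0 <= c < width else 0.0

    v1, v2 = at(h_low, w_low), at(h_low, w_high)
    v3, v4 = at(h_high, w_low), at(h_high, w_high)
    if is_y_direction:
        dx = w - w_low
        return dx * (v4 - v2) + (1 - dx) * (v3 - v1)
    dy = h - h_low
    return dy * (v4 - v3) + (1 - dy) * (v2 - v1)


def coord_weight_strict(h, w, im, is_y_direction):
    """Same derivative, but zero whenever the sampling point is out of
    bounds (the Paddle-like behavior described above)."""
    height, width = len(im), len(im[0])
    if h <= -1 or h >= height or w <= -1 or w >= width:
        return 0.0
    return coord_weight_neighbor_aware(h, w, im, is_y_direction)


im = [[1.0, 2.0], [3.0, 4.0]]
strict_oob = coord_weight_strict(-1.0, 0.0, im, True)
aware_oob = coord_weight_neighbor_aware(-1.0, 0.0, im, True)
print(strict_oob, aware_oob)
```

At the sampling point $(-1, 0)$ from the example, the two variants disagree, while for points strictly inside the image they return the same value.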
Performance statistics
The fix in PR #74058 used int64_t for every index, which degraded performance on small shapes, so it has been reverted.
Config1: paddle.vision.ops.deform_conv2d(Tensor([79220, 3, 4, 4],"float32"), Tensor([79220, 18, 4, 4],"float32"), Tensor([6, 3, 3, 3],"float32"), None, list[2,2,], list[3,3,], list[1,1,], 1, 1, None, )
Config2: paddle.vision.ops.deform_conv2d(Tensor([5070, 3, 5, 5],"float32"), Tensor([5070, 18, 5, 5],"float32"), Tensor([6, 3, 3, 3],"float32"), None, list[2,2,], list[3,3,], list[1,1,], 1, 1, None, )
Config3: paddle.vision.ops.deform_conv2d(x=Tensor([446, 128, 200, 200],"float32"), offset=Tensor([446, 36, 100, 100],"float32"), weight=Tensor([128, 128, 3, 3],"float32"), bias=None, stride=list[2,2,], padding=list[1,1,], dilation=list[1,1,], deformable_groups=2, groups=1, mask=Tensor([446, 18, 100, 100],"float32"), )
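As a rough sanity check on the configs above, the following sketch counts the elements of each input tensor and compares the count against the largest value a signed 32-bit index can hold. The helper names `numel` and `exceeds_int32` are hypothetical, and the check only looks at input sizes; it makes no claim about how the kernels in this PR choose their index type.

```python
from math import prod

INT32_MAX = 2**31 - 1


def numel(shape):
    """Number of elements in a tensor of the given shape."""
    return prod(shape)


def exceeds_int32(shape):
    """True if a flat index into this tensor cannot fit in a signed int32."""
    return numel(shape) > INT32_MAX


# Input shapes taken from Config1, Config2, and Config3 above.
configs = {
    "Config1": {"x": [79220, 3, 4, 4], "offset": [79220, 18, 4, 4]},
    "Config2": {"x": [5070, 3, 5, 5], "offset": [5070, 18, 5, 5]},
    "Config3": {
        "x": [446, 128, 200, 200],
        "offset": [446, 36, 100, 100],
        "mask": [446, 18, 100, 100],
    },
}

for name, tensors in configs.items():
    for tensor_name, shape in tensors.items():
        print(name, tensor_name, numel(shape), exceeds_int32(shape))
```

Under this check, only the `x` input of Config3 is larger than the int32 range; the inputs of Config1 and Config2 are well below it.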
Pcard-73263