Grid_sampler optimization #39751

AshburnLee · 2022-02-20T09:13:05Z

PR types

Performance optimization

PR changes

OPs

Describe

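Function

After development and testing, the 3D kernel as a whole could not outperform the pre-optimization 1D kernel.
Analysis showed that the OP's implementation contained a redundant operation, which caused an EigenMetaKernel to be launched every time the op ran. That kernel's share of the run time is not negligible, so the operation was removed.
Further analysis showed that with a block size of 512, the grid size computed from the output img is far smaller than the SM count (a V100 has 80 SMs). For the same case, the competitor sets the block size to 256, giving a grid size of 74, which is close to the SM count. (With the block size set to 256, Paddle's actual performance was worse than the competitor's overall, so 512 is kept.) The code therefore adds a check and reset of the grid size when the block size is 512; LaunchConfig1D has similar handling. The results are as follows.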
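The idea of checking the grid size against the SM count can be sketched on the host side as follows. This is a minimal illustration, not the code merged in this PR: `LaunchConfig`, `DivUp`, `PickLaunchConfig`, the `min_block` floor, and the halving policy are all hypothetical names and assumptions introduced here. It only shows how a launch configuration that starts at a block size of 512 might be adjusted when the resulting grid would leave most SMs idle.

```cpp
#include <cstdint>

// Hypothetical launch configuration: threads per block and blocks per grid.
struct LaunchConfig {
  int block;
  int grid;
};

// Ceiling division, used to cover `numel` elements with blocks of `block` threads.
inline int DivUp(int64_t a, int64_t b) {
  return static_cast<int>((a + b - 1) / b);
}

// Assumed policy (not taken from the PR): start from `block` threads per block
// and halve the block size while the grid has fewer blocks than there are SMs,
// stopping at `min_block`.
inline LaunchConfig PickLaunchConfig(int64_t numel, int sm_count,
                                     int block = 512, int min_block = 64) {
  int grid = DivUp(numel, block);
  while (grid < sm_count && block > min_block) {
    block /= 2;
    grid = DivUp(numel, block);
  }
  return {block, grid};
}
```

For a problem of 18944 elements on a device with 80 SMs, a block size of 512 gives 37 blocks and 256 gives 74 blocks, which matches the grid size quoted above for the competitor. Under the assumed halving policy the sketch continues to 128 threads per block (148 blocks). Whether a smaller block is actually faster has to be measured, as the description notes for the 256 case.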

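Results on the 20 model cases

Forward

Backward

Conclusion

Forward: once the SM count is taken into account, model-case performance is better than develop. Except for case #7 (whose gap to the competitor narrowed from 10.79% to 8.38%), no case is worse than the competitor. For the 5 cases from the previous optimization whose output img is 300*4, the gap to the competitor shrank substantially (9.11% -> 2.14%, 10.79% -> 8.38%, 12.94% -> 3.32%, 13.93% -> 4.49%, and 10.24% -> 1.53%, respectively).
Backward: model-case performance is better than develop. However, the backward logic uses atomic operations, which mask the performance gain from the handling above (the forward and backward passes of the same case have the same problem size, yet their run times differ greatly; atomic operations are the bottleneck).
The op benchmark cases improve noticeably over the pre-optimization version (see CI-op-benchmark).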
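To illustrate why the backward pass depends on atomic operations, here is a minimal sequential sketch of the bilinear gradient scatter. It is not Paddle's kernel: `ScatterBilinearGrad` is a hypothetical helper, and it assumes unnormalized pixel coordinates, a single channel, and zero padding. Each output gradient is spread over the four input cells around its sampling location, so different output elements can write to the same input cell. On a GPU each call would run in its own thread, and every `+=` would have to become an atomic add.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: scatter one output gradient into the (up to) four
// input-gradient cells surrounding the sampling location (x, y), using
// bilinear weights. Out-of-range neighbours are skipped (zero padding).
inline void ScatterBilinearGrad(std::vector<float>& grad_in, int h, int w,
                                float x, float y, float grad_out) {
  const int x0 = static_cast<int>(std::floor(x));
  const int y0 = static_cast<int>(std::floor(y));
  const float fx = x - static_cast<float>(x0);
  const float fy = y - static_cast<float>(y0);
  for (int dy = 0; dy <= 1; ++dy) {
    for (int dx = 0; dx <= 1; ++dx) {
      const int xi = x0 + dx;
      const int yi = y0 + dy;
      if (xi < 0 || xi >= w || yi < 0 || yi >= h) continue;
      const float wx = dx ? fx : 1.0f - fx;
      const float wy = dy ? fy : 1.0f - fy;
      // In a parallel kernel this accumulation would need to be atomic,
      // because other output elements may target the same input cell.
      grad_in[yi * w + xi] += wx * wy * grad_out;
    }
  }
}
```

Two output elements that sample the same location both add into the same four cells, which is the kind of write conflict that serializes atomic updates and makes the backward pass slower than the forward pass for the same problem size.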


paddle-bot-old · 2022-02-20T09:13:10Z

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

ZzSean

LGTM

AshburnLee added 20 commits September 8, 2020 09:45

Merge pull request #1 from PaddlePaddle/develop

8f532b0

Update forked PaddlePaddle

Merge pull request #2 from PaddlePaddle/develop

5b5804d

Update my fork

Merge pull request #3 from PaddlePaddle/develop

cee2470

update from PaddlePaddle

Merge pull request #4 from PaddlePaddle/develop

5be3a45

Update forked paddle repo

Merge pull request #5 from PaddlePaddle/develop

a1d92b7

Update USERNAME/paddle

Merge pull request #6 from PaddlePaddle/develop

e674a5d

update Paddle USERNAME repo

Merge pull request #7 from PaddlePaddle/develop

855d00b

update username repo

Merge pull request #8 from PaddlePaddle/develop

7cb2c97

update local paddlepaddle

Merge pull request #9 from PaddlePaddle/develop

db9fc91

update paddlepaddle

Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into…

c7b68c8

… develop

Merge branch 'PaddlePaddle:develop' into develop

0fd630e

Merge branch 'PaddlePaddle:develop' into develop

4bbb33b

Merge branch 'PaddlePaddle:develop' into develop

30a1a89

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

ce3deec

… develop

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

925eb06

… develop

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

7fcf902

… develop

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

956bd69

… develop

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

5f5fb9e

… develop

init grid_sampler with mode=bilinear

02ad020

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

7d0baac

… grid_sampler_fw_bilinear

AshburnLee added 5 commits February 21, 2022 02:45

solve error

b9f7af8

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

91e7467

… grid_sampler_fw_bilinear

rm fill constant

bf3ef1a

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

9f0889f

… grid_sampler_fw_bilinear

rm head

78fed2b

AshburnLee mentioned this pull request Feb 21, 2022

optimize grid_sample with Mode=nearest #39739

Closed

AshburnLee added 3 commits February 21, 2022 14:28

change block size

207564d

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

0fc7705

… grid_sampler_fw_bilinear

change block size

755c48b

AshburnLee added 5 commits February 22, 2022 08:24

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

cbbb3cd

… grid_sampler_fw_bilinear

optimize

5a431f0

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

68c43f7

… grid_sampler_fw_bilinear

apply existing config

973cad0

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

625f725

… grid_sampler_fw_bilinear

AshburnLee changed the title ~~Grid sampler fw bilinear~~ Grid_sampler optimization Feb 25, 2022

ZzSean approved these changes Feb 28, 2022

ZzSean merged commit 2c66775 into PaddlePaddle:develop Feb 28, 2022

AshburnLee deleted the grid_sampler_fw_bilinear branch February 28, 2022 07:52
