Conversation

@AshburnLee
Contributor

@AshburnLee AshburnLee commented Dec 28, 2021

PR types

Performance optimization

PR changes

OPs

Describe

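Function

Optimized the forward computation of the nearest_interp operator.

Final result

Screenshot 2022-01-21 10 04 29

Result: faster than the competing product, and far faster than paddle-dev.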

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI result first. See Paddle CI Manual for details.

@paddle-bot-old

paddle-bot-old bot commented Jan 5, 2022

Sorry to inform you that b7fd119's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

@AshburnLee
Contributor Author

The FastDivMod initialization has been moved from the GPU side to the CPU side.
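For readers unfamiliar with the technique, here is a minimal host-side sketch of what "initializing on the CPU side" could look like. It is not the PR's code: `FastDivModSketch` and its members are hypothetical names, and the magic-number scheme shown is the standard multiply-high-and-shift method for division by a runtime constant. The idea is that the multiplier and shift are computed once on the host, so each GPU thread only needs a multiply, an add, and a shift instead of repeating the setup or performing a hardware integer division.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch, not Paddle's FastDivMod: division by a fixed divisor
// using a precomputed multiplier and shift.
struct FastDivModSketch {
  uint32_t divisor;
  uint32_t multiplier;
  uint32_t shift;

  // Runs on the host (CPU). In a CUDA program the finished struct would be
  // passed to the kernel by value, so no thread repeats this setup.
  explicit FastDivModSketch(uint32_t d) : divisor(d), multiplier(0), shift(0) {
    assert(d != 0);
    // shift = ceil(log2(d))
    while ((uint64_t{1} << shift) < d) ++shift;
    const uint64_t one = 1;
    // multiplier = floor(2^32 * (2^shift - d) / d) + 1, which fits in 32 bits
    multiplier = static_cast<uint32_t>(
        ((one << 32) * ((one << shift) - d)) / d + 1);
  }

  // n / divisor without a division instruction. The 64-bit intermediate
  // keeps the sum hi + n from overflowing.
  uint32_t Div(uint32_t n) const {
    const uint64_t hi = (static_cast<uint64_t>(n) * multiplier) >> 32;
    return static_cast<uint32_t>((hi + n) >> shift);
  }

  // n % divisor, derived from the quotient.
  uint32_t Mod(uint32_t n) const { return n - Div(n) * divisor; }
};
```

In a device version of this sketch, the high 32 bits of the product would typically come from the `__umulhi` intrinsic rather than a 64-bit multiply.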

@JamesLim-sy
Contributor

The FastDivMod initialization has been moved from the GPU side to the CPU side.

Please post the performance data measured after this change.

@AshburnLee
Contributor Author

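The FastDivMod initialization has been moved from the GPU side to the CPU side.

Please post the performance data measured after this change.

Done, see the PR description.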

Contributor Author

@AshburnLee AshburnLee left a comment

For NCHW, this PR provides a faster 3D kernel; for NHWC, this PR uses fast division to optimize the existing 1D kernel.
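The difference between the two indexing schemes can be sketched on the CPU as follows. This is an illustration under stated assumptions, not the kernels from this PR: the function names are hypothetical, the source index mapping assumes `align_corners = false`, and plain loops stand in for GPU threads. In the NCHW version the three loop variables play the role of a 3D thread index, so the output coordinates are available directly. In the NHWC version a flat 1D index has to be decomposed with div/mod, which is the part a fast-division helper would accelerate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: nearest source index, assuming align_corners = false.
inline int NearestSrcIndex(int dst, int in_size, int out_size) {
  assert(in_size > 0 && out_size > 0);
  const float ratio = static_cast<float>(in_size) / out_size;
  const int src = static_cast<int>(ratio * dst);
  return src < in_size ? src : in_size - 1;  // clamp to the last valid index
}

// NCHW: z, y, x emulate a 3D thread index (z = n*c plane, y = row, x = col),
// so no div/mod is needed to recover the output coordinates.
std::vector<float> NearestNCHW(const std::vector<float>& in, int nc,
                               int in_h, int in_w, int out_h, int out_w) {
  std::vector<float> out(static_cast<std::size_t>(nc) * out_h * out_w);
  for (int z = 0; z < nc; ++z) {
    for (int y = 0; y < out_h; ++y) {
      const int sy = NearestSrcIndex(y, in_h, out_h);
      for (int x = 0; x < out_w; ++x) {
        const int sx = NearestSrcIndex(x, in_w, out_w);
        out[(static_cast<std::size_t>(z) * out_h + y) * out_w + x] =
            in[(static_cast<std::size_t>(z) * in_h + sy) * in_w + sx];
      }
    }
  }
  return out;
}

// NHWC: tid emulates a 1D thread index that must be split into
// (batch, y, x, channel). Each / and % here is a candidate for fast division.
std::vector<float> NearestNHWC(const std::vector<float>& in, int n, int c,
                               int in_h, int in_w, int out_h, int out_w) {
  const std::size_t total = static_cast<std::size_t>(n) * out_h * out_w * c;
  std::vector<float> out(total);
  for (std::size_t tid = 0; tid < total; ++tid) {
    const std::size_t ch = tid % c;
    std::size_t rest = tid / c;
    const int x = static_cast<int>(rest % out_w);
    rest /= out_w;
    const int y = static_cast<int>(rest % out_h);
    const std::size_t b = rest / out_h;
    const int sy = NearestSrcIndex(y, in_h, out_h);
    const int sx = NearestSrcIndex(x, in_w, out_w);
    out[tid] = in[((b * in_h + sy) * in_w + sx) * c + ch];
  }
  return out;
}
```

With a single channel the two layouts coincide, so both functions should produce the same output for the same input, which makes a convenient sanity check.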

Contributor

@JamesLim-sy JamesLim-sy left a comment

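LGTM. The 3D convolution approach is fairly novel, and its performance gain over the 1D version is more pronounced. Please put together the materials from this optimization and give a sharing session within the team.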

Contributor

@ZzSean ZzSean left a comment

LGTM

@ZzSean ZzSean merged commit 232bbce into PaddlePaddle:develop Jan 25, 2022
@AshburnLee AshburnLee deleted the nearest_interp branch January 25, 2022 05:22