Contributor

@Zjq9409 Zjq9409 commented Dec 1, 2021

PR types

Performance optimization

PR changes

OPs

Describe

Implement the backward pass of broadcast sub using reduce. Performance data compared with the original implementation is as follows:

| Case | pytorch | paddle (before optimization) | Before optimization vs. pytorch | paddle (after optimization) | After optimization vs. pytorch | Speedup |
|---|---|---|---|---|---|---|
| [50, 128, 1000], [128, 1000] | 0.30086 | 0.18073 | better (39.93%) | 0.17129 | better (43.07%) | 1.06 |
| [50, 128, 1000], [1, 128, 1000] | 0.30359 | 0.17959 | better (40.84%) | 0.17206 | better (43.32%) | 1.04 |
| [16, 2048, 7, 7], [16, 2048] | 0.09000 | 0.07593 | better (15.63%) | 0.06041 | better (32.88%) | 1.26 |
| [16, 2048, 16, 16], [16, 2048, 16, 16] | 0.38284 | 0.25730 | better (32.79%) | 0.25788 | better (32.64%) | 1.00 |
| [6, 1, 80, 46080], [1] | 0.20880 | 1.85418 | worse (7.88x) | 0.11859 | better (43.2%) | 15.64 |
| [512, 896, 4, 12], [512, 896, 4, 1] | 1.11503 | 2.82007 | worse (1.53x) | 0.64902 | better (41.79%) | 4.35 |
| [512, 896, 4, 12], [512, 896, 4, 1] fp16 | 0.71971 | 2.73426 | worse (2.80x) | 0.43191 | better (39.99%) | 6.33 |
| [32, 12, 128, 128], [32, 1, 1, 128] fp16 | 0.18400 | 0.45639 | worse (1.48x) | 0.09958 | better (45.88%) | 4.58 |
| [32, 1, 1, 128], [1, 12, 128, 1] fp16 | 0.19077 | 0.31292 | worse (64.03%) | 0.10816 | better (43.3%) | 2.89 |
| [8,256,1,400],[8,1,512,400] | 15.51507 | 34.88378 | worse (1.25x) | 7.17431 | better (53.76%) | 4.86 |
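
For readers unfamiliar with why a reduce is the natural fit here, the following is a minimal pure-Python sketch of the math only, not the kernel code in this PR. For `out = x - y` with broadcasting, the gradient of `x` is `dout` summed over every axis along which `x` was broadcast, and the gradient of `y` is the negation of the same kind of sum. The helper names `reduce_to_shape` and `sub_backward` are hypothetical and chosen for illustration; tensors are assumed to be flat, row-major Python lists.

```python
from itertools import product


def reduce_to_shape(grad, out_shape, in_shape):
    """Sum a flat, row-major gradient of shape out_shape down to in_shape.

    Hypothetical helper: axes where the input was broadcast (missing or
    of size 1) are summed away.
    """
    # Left-pad in_shape with 1s so both shapes have the same rank.
    padded = (1,) * (len(out_shape) - len(in_shape)) + tuple(in_shape)

    # Row-major strides of the input; size-1 axes get stride 0 so every
    # output index along that axis accumulates into the same element.
    strides = []
    step = 1
    for size in reversed(padded):
        strides.append(step if size != 1 else 0)
        step *= size
    strides.reverse()

    result = [0.0] * step  # step now equals the input's element count
    # product() varies the last axis fastest, which matches row-major order.
    for flat_idx, idx in enumerate(product(*(range(n) for n in out_shape))):
        target = sum(i * s for i, s in zip(idx, strides))
        result[target] += grad[flat_idx]
    return result


def sub_backward(dout, out_shape, x_shape, y_shape):
    """Gradients of out = x - y under broadcasting (illustrative only)."""
    dx = reduce_to_shape(dout, out_shape, x_shape)
    dy = [-g for g in reduce_to_shape(dout, out_shape, y_shape)]
    return dx, dy


# x has shape [2, 3], y has shape [3]; y is broadcast along axis 0.
dx, dy = sub_backward([1, 2, 3, 4, 5, 6], (2, 3), (2, 3), (3,))
```

In this example `dx` keeps the shape of `dout` because `x` was not broadcast, while `dy` has three elements, each the negated column sum of `dout`. The same helper covers cases where both inputs are broadcast, such as `[32, 1, 1, 128]` and `[1, 12, 128, 1]` in the table above.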

paddle-bot-old bot commented Dec 1, 2021

Thanks for your contribution!
Please wait for the CI results first. See Paddle CI Manual for details.

Contributor

ZzSean commented Dec 2, 2021

In addition to the configurations in the benchmark, please also add the data for the configuration in the card.

Contributor Author

Zjq9409 commented Dec 2, 2021

> In addition to the configurations in the benchmark, please also add the data for the configuration in the card.

Added.

@Zjq9409 Zjq9409 force-pushed the broadcast_sub_bw branch 2 times, most recently from 805412b to 1a0fd54 Compare December 7, 2021 03:07
Contributor

@ZzSean ZzSean left a comment

LGTM

@ZzSean ZzSean merged commit 567e6bb into PaddlePaddle:develop Dec 8, 2021
Zjq9409 added a commit to Zjq9409/Paddle that referenced this pull request Dec 10, 2021