Conversation

@zhangbo9674
Contributor

PR types

Performance optimization

PR changes

APIs

Describe

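The AmpScaler class scales the loss during mixed-precision training. Its member attribute _found_inf marks whether any parameter gradient contains inf in a given training iteration.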
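
For readers unfamiliar with loss scaling, the role of a found-inf flag can be sketched in plain Python. This is only an illustration, not Paddle's implementation: ToyScaler, its methods, and the free function check_finite_and_unscale below are hypothetical stand-ins that borrow the op's name, and the sketch assumes NaN is treated the same way as inf.

```python
import math


def check_finite_and_unscale(grads, scale):
    """Divide each gradient by `scale` and report whether any is inf or NaN.

    Hypothetical pure-Python stand-in; the real op runs on device tensors.
    """
    found_inf = any(not math.isfinite(g) for g in grads)
    unscaled = [g / scale for g in grads]
    return unscaled, found_inf


class ToyScaler:
    """Illustrative loss scaler (not the Paddle AmpScaler API)."""

    def __init__(self, init_loss_scaling=1024.0):
        self._scale = init_loss_scaling
        self._found_inf = False  # refreshed on every unscale() call

    def scale(self, loss):
        # Multiply the loss so that small gradients stay representable in fp16.
        return loss * self._scale

    def unscale(self, grads):
        # Undo the scaling and record whether this iteration overflowed.
        unscaled, self._found_inf = check_finite_and_unscale(grads, self._scale)
        return unscaled

    def should_skip_step(self):
        # A training loop would skip the optimizer update when this is True.
        return self._found_inf
```

In this sketch, a training loop would call scale() on the loss before the backward pass, unscale() on the gradients afterwards, and skip the parameter update whenever should_skip_step() returns True.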

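Previously, each time the framework called the check_finite_and_unscale op, it allocated two bool tensors via to_variable. As a result, every training iteration incurred a cudaMemcpy at that point, which hurt GPU performance:
[Image]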

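After this optimization, the two bool tensors are declared and defined once, during AmpScaler initialization, which eliminates the cudaMemcpy during training:
[Image]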
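
The before/after difference can be sketched without Paddle by counting allocations. Everything here is hypothetical: FlagBuffer stands in for a device-side bool tensor (each construction representing one to_variable call and its host-to-device copy), and the class and attribute names are invented for illustration. The PR text does not say what the two tensors hold, so the sketch simply assumes one flag per gradient group.

```python
import math


class FlagBuffer:
    """Stand-in for a device-side bool tensor; counts how many are created."""

    created = 0  # class-wide allocation counter

    def __init__(self):
        FlagBuffer.created += 1
        self.value = False


def _has_non_finite(grads):
    return any(not math.isfinite(g) for g in grads)


class PerStepScaler:
    """Before: two fresh flag buffers are allocated on every check."""

    def check(self, grads_a, grads_b):
        flag_a, flag_b = FlagBuffer(), FlagBuffer()  # allocation in the hot path
        flag_a.value = _has_non_finite(grads_a)
        flag_b.value = _has_non_finite(grads_b)
        return flag_a.value or flag_b.value


class PreallocatedScaler:
    """After: the two flag buffers are allocated once, at construction."""

    def __init__(self):
        self._flag_a = FlagBuffer()
        self._flag_b = FlagBuffer()

    def check(self, grads_a, grads_b):
        # Reuse the existing buffers; only their contents are overwritten.
        self._flag_a.value = _has_non_finite(grads_a)
        self._flag_b.value = _has_non_finite(grads_b)
        return self._flag_a.value or self._flag_b.value
```

Under these assumptions, five calls to PerStepScaler.check create ten buffers, while PreallocatedScaler creates two in total regardless of how many iterations run.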

@paddle-bot-old

paddle-bot-old bot commented Dec 1, 2021

Thanks for your contribution!
Please wait for the CI result first. See Paddle CI Manual for details.

Contributor

@zhiqiu zhiqiu left a comment

LGTM

@zhiqiu zhiqiu merged commit cc2b466 into PaddlePaddle:develop Dec 2, 2021
Zjq9409 pushed a commit to Zjq9409/Paddle that referenced this pull request Dec 10, 2021
@zhangbo9674 zhangbo9674 deleted the dev/loss_scaler_found_inf branch March 2, 2023 02:57