@zhangbo9674 zhangbo9674 commented Feb 22, 2022

PR types

New features

PR changes

Others

Describe

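Refine the BF16 AMP-O1 implementation logic:

  • Previous logic: in O1 mode, white-list ops run in BF16 and all other ops run in FP32 (pr

This logic is incompatible with the view mechanism. For example, the unsqueeze op is not in the white list, so under AMP-O1, if its input is BF16, a cast is automatically inserted before execution to convert the input to FP32, which conflicts with the view mechanism.

  • New logic: white-list ops run in BF16. For an op that is not in the white list, the kernel type is determined by its input types: if all inputs are BF16, the BF16 kernel runs; otherwise the FP32 kernel runs. A black-list strategy is also provided.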
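
The selection rule described above can be sketched in plain Python. This is only an illustration and not the code in this PR: `DType`, `select_kernel_dtype`, and the example white/black lists are hypothetical names. It also assumes that black-list ops always run in FP32 and take priority over the white list, and that an op with no inputs falls back to FP32; the description does not state either point.

```python
from enum import Enum


class DType(Enum):
    """Hypothetical stand-in for the tensor dtypes involved in AMP-O1."""
    FP32 = "float32"
    BF16 = "bfloat16"


def select_kernel_dtype(op_name, input_dtypes, white_list, black_list):
    """Pick the kernel dtype for one op under the refined O1 rule.

    Assumptions (not confirmed by the PR description):
      * black-list ops always run in FP32 and are checked first;
      * an op with no inputs falls back to FP32.
    """
    if op_name in black_list:
        return DType.FP32
    if op_name in white_list:
        return DType.BF16
    # Not in either list: follow the inputs. BF16 only if every input is BF16.
    if input_dtypes and all(d is DType.BF16 for d in input_dtypes):
        return DType.BF16
    return DType.FP32


# Hypothetical lists, chosen only for the example.
WHITE_LIST = {"matmul"}
BLACK_LIST = {"softmax"}

# unsqueeze is in neither list, so a BF16 input keeps it on the BF16 kernel
# and no cast to FP32 is needed.
unsqueeze_dtype = select_kernel_dtype(
    "unsqueeze", [DType.BF16], WHITE_LIST, BLACK_LIST
)
```

Under this rule an op such as unsqueeze keeps the dtype of its BF16 input, which is the case that required an inserted cast under the previous logic.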

@paddle-bot-old
Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@zhiqiu zhiqiu left a comment

LGTM

@zhangbo9674 zhangbo9674 merged commit 18ee051 into PaddlePaddle:develop Feb 28, 2022
@zhangbo9674 zhangbo9674 deleted the dev/bf16_refine_o1 branch March 2, 2023 02:59