Add a TF32 tensor-core warning to the LBFGS optimizer #74188
Merged
+14 −0
PR Category
User Experience
PR Types
Improvements
Description
Pcard-66961

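Problem: When using the L-BFGS optimizer, the accuracy of some cases cannot be reproduced on certain machines. See PaddlePaddle/PaddleScience#1185.
Cause: Some NVIDIA GPUs use TF32 instead of FP32 by default to accelerate computation, which causes a loss of precision in some scenarios.
Reproduction: Follow the issue linked above. On an A100 machine, run the biharmonic2d case in PaddleScience.
Fix: Add a warning to the L-BFGS optimizer that guides users to run `export NVIDIA_TF32_OVERRIDE=0`.
Verification: Verified by the user, as shown in the issue above.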
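
For reviewers who want to see the shape of such a check, here is a minimal sketch using only the Python standard library. It is not the code merged in this PR: the helper names `tf32_may_be_enabled` and `warn_if_tf32_may_be_enabled` and the warning text are hypothetical, and the sketch assumes that any value of `NVIDIA_TF32_OVERRIDE` other than `0` leaves TF32 potentially active. It only inspects the environment variable and does not query the GPU or any Paddle API.

```python
import os
import warnings

# Environment variable named in the PR description.
TF32_ENV_VAR = "NVIDIA_TF32_OVERRIDE"


def tf32_may_be_enabled(environ=None):
    """Return True unless NVIDIA_TF32_OVERRIDE is explicitly set to "0".

    Hypothetical helper: it checks only the environment, not the hardware.
    """
    env = os.environ if environ is None else environ
    return env.get(TF32_ENV_VAR) != "0"


def warn_if_tf32_may_be_enabled(environ=None):
    """Emit a UserWarning when TF32 has not been explicitly disabled.

    Returns True if a warning was issued, False otherwise.
    """
    if not tf32_may_be_enabled(environ):
        return False
    warnings.warn(
        "L-BFGS may lose precision when TF32 tensor cores are used in place "
        "of FP32. If results cannot be reproduced, run "
        "`export NVIDIA_TF32_OVERRIDE=0` before starting the program.",
        UserWarning,
        stacklevel=2,
    )
    return True


if __name__ == "__main__":
    # With the variable unset, the sketch warns; after exporting
    # NVIDIA_TF32_OVERRIDE=0 in the shell, it stays silent.
    warn_if_tf32_may_be_enabled()
```

The variable should be exported in the shell before the process starts, as the warning suggests. Setting it from inside a running program may come too late if the NVIDIA libraries have already been initialized.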