@lugimzzz lugimzzz commented Dec 2, 2024

PR types

New features

PR changes

APIs

Description

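Adds Adam-mini. Training gains a new option, --optim "adamw_mini". Saving with unified checkpoint is not supported yet.
TP (tensor parallelism) is not supported yet, because computing moment2 requires a gather.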
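To illustrate why moment2 needs a gather under TP, here is a minimal sketch of one Adam-mini step as the method is generally described: the first moment is kept per element, while the second moment is a single scalar per parameter block, computed from the mean of the squared gradients over the whole block. This is not the code in paddlenlp/utils/optimizer.py. The function name `adam_mini_step`, the use of plain Python lists, and treating the whole list as one block are assumptions made for illustration.

```python
import math


def adam_mini_step(param, grad, m, v, step,
                   lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.0):
    """One Adam-mini update for a single parameter block (hypothetical helper).

    param, grad, m: lists of floats of equal length (one block).
    v: a single float, the block-wide second moment.
    step: 1-based step counter used for bias correction.
    Returns (new_param, new_m, new_v).
    """
    # Second moment: one scalar per block, from the mean of squared gradients.
    # If the block were sharded across TP ranks, this mean would need the
    # gradients (or their partial sums) from every rank.
    mean_sq = sum(g * g for g in grad) / len(grad)
    new_v = beta2 * v + (1.0 - beta2) * mean_sq

    # First moment: per element, exactly as in AdamW.
    new_m = [beta1 * mi + (1.0 - beta1) * gi for mi, gi in zip(m, grad)]

    # Bias correction for both moments.
    m_scale = 1.0 / (1.0 - beta1 ** step)
    v_hat = new_v / (1.0 - beta2 ** step)
    denom = math.sqrt(v_hat) + eps

    # Decoupled weight decay, then the shared-denominator update.
    new_param = [
        p * (1.0 - lr * weight_decay) - lr * mi * m_scale / denom
        for p, mi in zip(param, new_m)
    ]
    return new_param, new_m, new_v
```

Because `new_v` is one float for the whole block rather than one value per element, the optimizer state for moment2 shrinks to a scalar per block, but every element of the block has to contribute to it.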

paddle-bot bot commented Dec 2, 2024

Thanks for your contribution!

@@ -0,0 +1 @@
../../../../llm/docs/predict/speculative_decoding.md No newline at end of file
Contributor Author

Fixes a dead link.

codecov bot commented Dec 2, 2024

Codecov Report

Attention: Patch coverage is 18.18182% with 63 lines in your changes missing coverage. Please review.

Project coverage is 53.03%. Comparing base (9f237b4) to head (ba987b4).
Report is 283 commits behind head on develop.

Files with missing lines | Patch % | Lines
paddlenlp/utils/optimizer.py | 15.94% | 58 Missing ⚠️
paddlenlp/trainer/trainer.py | 0.00% | 4 Missing ⚠️
paddlenlp/trainer/training_args.py | 50.00% | 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9542      +/-   ##
===========================================
- Coverage    53.10%   53.03%   -0.07%
===========================================
  Files          704      704
  Lines       110967   110878      -89
===========================================
- Hits         58925    58809     -116
- Misses       52042    52069      +27

from ..utils import AdamWMini

optimizer_cls = AdamWMini
optimizer_kwargs.update(adam_kwargs)
Contributor

Could we add some restriction or warning here, for example that AdamWMini cannot be enabled when TP or sharding is in use?

Contributor Author

done
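The kind of guard requested above could look roughly like the sketch below. It is not the check that was actually added in this PR. The `ParallelConfig` class is a hypothetical stand-in for the training arguments, and the field names `tensor_parallel_degree` and `sharding_parallel_degree` are assumed here for illustration.

```python
from dataclasses import dataclass


@dataclass
class ParallelConfig:
    """Hypothetical stand-in for the trainer's arguments (illustration only)."""
    optim: str = "adamw"
    tensor_parallel_degree: int = 1
    sharding_parallel_degree: int = 1


def check_adamw_mini_supported(cfg):
    """Raise early if adamw_mini is combined with an unsupported parallel mode."""
    if cfg.optim != "adamw_mini":
        return  # Nothing to check for other optimizers.
    if cfg.tensor_parallel_degree > 1:
        raise ValueError(
            "adamw_mini does not support tensor parallelism: "
            "moment2 would have to be gathered across ranks."
        )
    if cfg.sharding_parallel_degree > 1:
        raise ValueError("adamw_mini does not support sharding.")


# Usage: a single-device run passes, a TP run fails fast with a clear message.
check_adamw_mini_supported(ParallelConfig(optim="adamw_mini"))
```

Failing at argument-validation time, rather than inside the optimizer step, keeps the error close to the flag that caused it.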

@@ -0,0 +1,53 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
Contributor

The copyright year should be 2024, not 2022.

Contributor Author

done

def _add_moments_pows(self, p):
    acc_dtype = p.dtype
    if self._is_dtype_fp16_or_bf16(acc_dtype):
        acc_dtype = DataType.FLOAT32 if in_pir_mode() else core.VarDesc.VarType.FP32
Contributor

The framework has now standardized on paddle.xxx dtypes; see https://github.com/PaddlePaddle/PaddleNLP/pull/9366/files for reference.

Contributor Author

done

Contributor

@wawltor wawltor left a comment

LGTM

@wawltor wawltor merged commit da41c4f into PaddlePaddle:develop Dec 16, 2024
8 of 11 checks passed
@lugimzzz lugimzzz deleted the adam-mini branch December 16, 2024 08:43