add flashmask rm #9154

lugimzzz · 2024-09-19T11:44:27Z

PR types

New features

PR changes

Others

Description

rm

CLAassistant · 2024-09-19T11:44:33Z

All committers have signed the CLA.

codecov · 2024-09-19T12:19:38Z

Codecov Report

Attention: Patch coverage is 0% with 12 lines in your changes missing coverage. Please review.

Project coverage is 53.02%. Comparing base (ad14dc4) to head (ee302b6).
Report is 254 commits behind head on develop.

Files with missing lines	Patch %	Lines
paddlenlp/datasets/zero_padding_dataset.py	0.00%	8 Missing ⚠️
paddlenlp/transformers/llama/fusion_ops.py	0.00%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #9154      +/-   ##
===========================================
- Coverage    53.06%   53.02%   -0.05%
===========================================
  Files          656      656
  Lines       106147   106162      +15
===========================================
- Hits         56324    56288      -36
- Misses       49823    49874      +51

gongel · 2024-09-25T04:16:08Z

llm/alignment/rm/flashmask/reward_model.py

+ rejected_indexes = paddle.to_tensor(
+ [[response_index[0], response_index[2]] for response_index in response_indexs]
+ )
+ chosen_hidden_states = hidden_states.gather_nd(chosen_indexes)


Are sequence parallel and similar features not supported here?
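
For readers following the hunk above, here is a minimal pure-Python sketch of what the `gather_nd` call selects. It is not the PR's code: `gather_nd_2d` is a hypothetical stand-in for the tensor op, and the layout of each `response_indexs` entry as (sequence index, chosen position, rejected position) is an assumption inferred from the visible `rejected_indexes` line. The hunk does not show how `chosen_indexes` is built.

```python
def gather_nd_2d(hidden_states, indexes):
    """Hypothetical stand-in for gather_nd with [batch, position] index pairs.

    For each pair (b, pos) it returns the hidden vector hidden_states[b][pos].
    """
    return [hidden_states[b][pos] for b, pos in indexes]


# Toy hidden states: 2 sequences, 4 positions each, hidden size 2.
hidden_states = [
    [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]],
    [[4.0, 4.1], [5.0, 5.1], [6.0, 6.1], [7.0, 7.1]],
]

# ASSUMPTION: each entry is (sequence index, chosen position, rejected position).
response_indexs = [(0, 1, 3), (1, 0, 2)]

# ASSUMPTION: chosen_indexes mirrors the rejected_indexes line, using element 1.
chosen_indexes = [[r[0], r[1]] for r in response_indexs]
rejected_indexes = [[r[0], r[2]] for r in response_indexs]

# One hidden vector per response: the input a reward head would score.
chosen_hidden_states = gather_nd_2d(hidden_states, chosen_indexes)
rejected_hidden_states = gather_nd_2d(hidden_states, rejected_indexes)
```

Each response contributes exactly one hidden vector, so the result has one row per preference pair rather than one row per token.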

gongel · 2024-09-25T04:20:51Z

llm/alignment/rm/flashmask/run_reward.py

+ """main"""
+ parser = PdArgumentParser((ModelArgument, DataArgument, TrainingArguments))
+ if len(sys.argv) == 2 and sys.argv[1].endswith(".json"):
+ model_args, data_args, training_args = parser.parse_json_file(json_file=os.path.abspath(sys.argv[1]))


Could we support both JSON and command-line arguments here? CE may add command-line arguments to override the JSON config. For reference, see: https://github.com/PaddlePaddle/PaddleNLP/blob/develop/llm/run_finetune.py#L77
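
The request above (a JSON file supplying the base config, with command-line flags overriding it) can be sketched with the standard library alone. This is an illustration of the pattern, not `PdArgumentParser`'s implementation: the function name `parse_json_and_cmd_lines`, the `defaults` dictionary, and the field names `max_steps` and `output_dir` are all hypothetical.

```python
import argparse
import json


def parse_json_and_cmd_lines(argv, defaults):
    """Merge config sources with priority: command line > JSON file > defaults.

    `defaults` maps option names to int/float/str values; the value's type is
    reused to convert command-line strings (bools are not handled here).
    """
    parser = argparse.ArgumentParser()
    for name, value in defaults.items():
        parser.add_argument(f"--{name}", type=type(value), default=value)

    cli_args = list(argv[1:])
    if cli_args and cli_args[0].endswith(".json"):
        # The JSON values replace the defaults, so explicit flags still win.
        with open(cli_args[0], encoding="utf-8") as f:
            parser.set_defaults(**json.load(f))
        cli_args = cli_args[1:]

    return vars(parser.parse_args(cli_args))
```

With a config file containing `{"max_steps": 100, "output_dir": "ckpt"}`, calling the function with `["run_reward.py", "config.json", "--max_steps", "5"]` keeps `output_dir` from the JSON and takes `max_steps` from the command line.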

gongel · 2024-09-25T04:21:53Z

llm/alignment/rm/flashmask/run_reward.py

+ logger.info("Start to create dataset")
+ trans_func = partial(preprocess_preference_data, tokenizer=tokenizer, data_args=data_args, model_args=model_args)
+ if data_args.lazy:
+ zero_padding_dataset = ZeroPaddingIterableDataset


If lazy is set, will saving by epoch raise an error?

Lazy mode only supports step-based saving.
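
One way to surface this constraint early is a fail-fast check before training starts. The sketch below is hypothetical and not part of the PR: `check_save_strategy` and its parameters are illustrative names, and the string values `"epoch"` and `"steps"` are assumed to match how the save strategy is configured.

```python
def check_save_strategy(lazy, save_strategy):
    """Reject epoch-based saving when the dataset is loaded lazily.

    ASSUMPTION: save_strategy is a plain string such as "steps" or "epoch".
    """
    if lazy and save_strategy == "epoch":
        raise ValueError(
            "lazy datasets only support step-based saving; "
            "set save_strategy to 'steps'"
        )


# Step-based saving passes silently with a lazy dataset.
check_save_strategy(lazy=True, save_strategy="steps")
```

Raising at argument-validation time gives a clearer message than a failure partway through training.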

ad flashmask rm

3016c80

lugimzzz changed the title ~~ad flashmask rm~~ add flashmask rm Sep 19, 2024

lugimzzz added 5 commits September 20, 2024 14:35

support rm

09ccf24

add test

5b6c159

fix format

cf2cbcf

fix bug

598694d

rm

958b18d

gongel reviewed Sep 25, 2024

lugimzzz added 2 commits September 26, 2024 14:28

Merge branch 'develop' of https://github.com/lugimzzz/PaddleNLP into rm

79e7222

fix

ee302b6

gongel approved these changes Sep 26, 2024

ZHUI merged commit b2e4db2 into PaddlePaddle:develop Sep 27, 2024

lugimzzz deleted the rm branch September 27, 2024 06:56
