
Conversation

Collaborator
@yuanlehome yuanlehome commented Oct 15, 2024

PR types

New features

PR changes

Models

Description

  • chatglm_v2 supports block_attn mode, but its accuracy still needs to be aligned
  • Re-enable and fix many previously disabled unit tests
  • Slightly clean up the model-building code
  • Add a USE_FASTER_TOP_P_SAMPLING environment variable for using the better-performing top_p_sampling operator
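The environment-variable toggle in the last bullet can be sketched roughly as follows. This is a minimal NumPy sketch of nucleus (top-p) filtering plus an env-var switch, not PaddleNLP's actual fused operator; the commented-out sampler names are hypothetical:

```python
import os
import numpy as np

def top_p_filter(probs, top_p):
    """Zero out tokens outside the top-p nucleus and renormalize.

    probs: 1-D array of token probabilities summing to 1.
    """
    order = np.argsort(probs)[::-1]       # most probable first
    cumulative = np.cumsum(probs[order])
    # Keep the smallest prefix whose cumulative mass reaches top_p.
    cutoff = np.searchsorted(cumulative, top_p) + 1
    mask = np.zeros_like(probs)
    mask[order[:cutoff]] = 1.0
    filtered = probs * mask
    return filtered / filtered.sum()

# A PR like this one typically switches between a reference kernel and a
# faster fused kernel based on an environment variable:
use_faster = os.getenv("USE_FASTER_TOP_P_SAMPLING", "0") == "1"
# sampler = top_p_sample_fast if use_faster else top_p_sample_reference
```

The filtering math is the same either way; the env var only selects which kernel computes it.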

paddle-bot bot commented Oct 15, 2024

Thanks for your contribution!

@yuanlehome yuanlehome marked this pull request as draft October 15, 2024 07:39

codecov bot commented Oct 15, 2024

Codecov Report

Attention: Patch coverage is 0% with 104 lines in your changes missing coverage. Please review.

Project coverage is 52.89%. Comparing base (7551730) to head (d19ed92).
Report is 263 commits behind head on develop.

Files with missing lines Patch % Lines
...p/experimental/transformers/chatglm_v2/modeling.py 0.00% 84 Missing ⚠️
...erimental/transformers/fused_transformer_layers.py 0.00% 15 Missing ⚠️
...enlp/experimental/transformers/generation_utils.py 0.00% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           develop    #9271      +/-   ##
===========================================
+ Coverage    52.80%   52.89%    +0.08%
===========================================
  Files          660      660
  Lines       106869   106929       +60
===========================================
+ Hits         56434    56561      +127
+ Misses       50435    50368       -67
===========================================


@yuanlehome yuanlehome reopened this Oct 24, 2024
@yuanlehome yuanlehome marked this pull request as ready for review October 24, 2024 06:55
@yuanlehome yuanlehome changed the title [LLM INFER] chatglm_v2 support block_attn [LLM INFER] Fix some bugs and chatglm_v2 support block_attn Oct 24, 2024
else:
    return 8192  # Maximum sequence length.

total_max_length: int = field(
    default=4096, metadata={"help": "Super parameter. Maximum sequence length (encoder+decoder)."}
)
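For context, `total_max_length` above is a dataclass `field` whose help string lives in `metadata`. A self-contained sketch of this pattern (the argument-class name here is illustrative, not the actual PaddleNLP class):

```python
from dataclasses import dataclass, field, fields

@dataclass
class PredictorArguments:
    # Upper bound on encoder + decoder tokens combined.
    total_max_length: int = field(
        default=4096,
        metadata={"help": "Super parameter. Maximum sequence length (encoder+decoder)."},
    )

args = PredictorArguments()
# Argument parsers typically harvest the help text from field metadata:
help_text = {f.name: f.metadata.get("help") for f in fields(PredictorArguments)}
```

The metadata dict is how dataclass-driven argument parsers surface `--help` descriptions without a separate registry.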
Contributor

Has this been confirmed with the NPU team?

Collaborator Author

Confirmed, no problem.

  arange_tensor_encoder = paddle.arange(self.config.total_max_length, dtype=self.config.dtype)
- alibi = alibi_slopes[None, :, None, None] * arange_tensor_encoder
+ alibi = (alibi_slopes[None, :, None, None] * arange_tensor_encoder).astype(self.config.dtype)
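The change under discussion builds the ALiBi bias by multiplying per-head slopes against a position ramp, then casts the result to the model dtype. A NumPy sketch of the same computation, assuming the standard ALiBi slope formula for a power-of-two head count (this illustrates the idea, not the file's exact code):

```python
import numpy as np

def alibi_bias(num_heads, max_length, dtype="float16"):
    """Per-head linear position bias: bias[h, j] = slope[h] * j."""
    # Standard ALiBi slopes for a power-of-two head count: 2^(-8(i+1)/n).
    slopes = np.array(
        [2 ** (-8.0 * (i + 1) / num_heads) for i in range(num_heads)]
    )
    positions = np.arange(max_length)
    # Broadcast [heads, 1] * [length] -> [heads, length], then cast so the
    # bias dtype matches the model dtype (the point of the fix above).
    return (slopes[:, None] * positions[None, :]).astype(dtype)

bias = alibi_bias(num_heads=8, max_length=4, dtype="float16")
```

Without the final `astype`, the multiply would promote to float32 even when the model runs in float16, which is exactly the mismatch the review comment worries about.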

Contributor

Hmm, is relying on the config dtype safe here? Users can change that value. How about using the dtype of one of the tensors inside instead?

Collaborator Author

This dtype does need to stay consistent with config.dtype.


model = Model.from_pretrained(
predictor_args.total_max_length = config.seq_length
if predictor_args.block_attn:
Contributor

Hmm, I'd suggest putting block_attn into the config's attributes and letting ChatGLMv2InferenceModel control it itself.
If we change it here, too many models will need the same kind of change later on.

Collaborator Author

Strictly speaking, though, this doesn't belong in each model's Config; if it were added to something like LlamaConfig, every model's Config would need it too. Let's keep it as is for now, and during the later refactor we'll see whether there's a better approach.
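The trade-off discussed above can be sketched as a simple fallback lookup, where a model-specific config attribute, when present, overrides the predictor-level flag (all names here are illustrative stand-ins, not PaddleNLP's actual classes):

```python
class ModelConfig:
    """Stand-in for a model config; block_attn may or may not be set on it."""
    pass

class PredictorArgs:
    block_attn = True  # predictor-level default shared by all models

def resolve_block_attn(config, predictor_args):
    # Prefer a model-specific config attribute when the model defines one;
    # otherwise fall back to the predictor-level flag. This is one way to
    # reconcile the reviewer's suggestion with the current per-model checks.
    return getattr(config, "block_attn", predictor_args.block_attn)

cfg = ModelConfig()
uses_block_attn = resolve_block_attn(cfg, PredictorArgs())
```

With this shape, models that need special handling set the attribute on their own config, and the predictor code never grows a new `if` branch per model.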

@qingqing01 qingqing01 merged commit 2e8b220 into PaddlePaddle:develop Oct 25, 2024
2 of 4 checks passed

Labels: None yet

3 participants