Contributor

lixcli commented Aug 28, 2024

PR types

New features

PR changes

APIs | Docs

Description

  1. Add a8w8 (fp8) and a8w8c8 (int8) quant_type support.
  2. Add PTQ configs for llama3.1 and qwen2.
  3. Update quantization.md.
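
For readers unfamiliar with the naming, a8w8c8 is read here as 8-bit activations, 8-bit weights, and an 8-bit KV cache (an interpretation based on the `CacheKVMatMul` import and the W8A8C8 label in this PR). The snippet below is a minimal, hypothetical sketch of symmetric per-tensor abs-max int8 quantization in plain Python. It is not the code added by this PR, which builds on PaddleSlim observers, and the function names are made up for illustration.

```python
from typing import List, Tuple

INT8_MAX = 127  # symmetric int8 range used here: [-127, 127]


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor abs-max quantization (illustrative only)."""
    abs_max = max((abs(v) for v in values), default=0.0)
    # The scale maps the largest magnitude onto INT8_MAX; fall back to 1.0 for all-zero input.
    scale = abs_max / INT8_MAX if abs_max > 0 else 1.0
    quantized = [max(-INT8_MAX, min(INT8_MAX, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Map int8 codes back to approximate floating-point values."""
    return [q * scale for q in quantized]


# Hypothetical activation values, chosen only for the demonstration.
activations = [1.27, -0.64, 0.0]
q_act, act_scale = quantize_int8(activations)
restored = dequantize_int8(q_act, act_scale)
```

The same routine could be applied to weights or cached key/value tensors. Real PTQ pipelines typically calibrate the scale over many batches and often use per-channel scales rather than a single per-tensor value.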

paddle-bot bot commented Aug 28, 2024

Thanks for your contribution!


codecov bot commented Aug 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 54.01%. Comparing base (34a71c8) to head (d21ace7).
Report is 229 commits behind head on develop.

Additional details and impacted files
@@            Coverage Diff             @@
##           develop    #9032      +/-   ##
===========================================
+ Coverage    53.81%   54.01%   +0.19%
===========================================
  Files          652      652
  Lines       104356   105208     +852
===========================================
+ Hits         56155    56823     +668
- Misses       48201    48385     +184


DrownFish19 changed the title from "add a8w8(fp8) a8w8c8(int8) quant_type support" to "[Inference] Add a8w8(fp8) a8w8c8(int8) quant_type support" on Aug 28, 2024
Collaborator

DrownFish19 left a comment

LGTM

DrownFish19 merged commit 19927ba into PaddlePaddle:develop on Aug 28, 2024

python run_finetune.py ./config/llama/ptq_argument.json

# Reference launch command for W8A8C8 (INT) quantization
python run_finetune.py ./config/llama/ptq_c8_argument.json
Collaborator

There is an extra space here.

@@ -0,0 +1,138 @@
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
Collaborator

2023 -> 2024

Comment on lines +47 to +49
# print(
# f"{index/len(subject_list)} Inference starts at {run_date} on {args.model_name_or_path} with subject of {subject_name}!"
# )
Collaborator

Is it still necessary to keep this debug output?

@@ -0,0 +1,61 @@
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
Collaborator

2023 -> 2024

@@ -0,0 +1,191 @@
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
Collaborator

2023 -> 2024

@@ -0,0 +1,94 @@
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
Collaborator

2023 -> 2024

import numpy as np
import paddle

# from paddleslim.quant.observers.channel_wise import ChannelWiseObserver
Collaborator

? Please delete this.

@@ -0,0 +1,105 @@
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
Collaborator

2023 -> 2024

@@ -0,0 +1,55 @@
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
Collaborator

2023 -> 2024

# from paddle.quantization.factory import ObserverFactory
from experimental.layers.cache_kv import CacheKVMatMul

# from paddleslim.quant.observers.mse import MSEObserverLayer
Collaborator

Delete everything that is not needed, and check the other files yourself for similar leftovers.
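
One way to act on this request across the touched files is a small scan for commented-out import lines. The following is a hypothetical helper, not part of the PR. The regular expression is a heuristic and can also flag ordinary comments that happen to begin with "import", so each hit still needs a manual look.

```python
import re
from typing import List, Tuple

# Matches lines such as "# from pkg import name" or "# import pkg".
COMMENTED_IMPORT = re.compile(r"^\s*#\s*(from\s+\S+\s+import\s+\S|import\s+\S)")


def find_commented_imports(source: str) -> List[Tuple[int, str]]:
    """Return (line_number, stripped_text) for each commented-out import line."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if COMMENTED_IMPORT.match(line):
            hits.append((number, line.strip()))
    return hits


# Sample text modeled on the excerpt quoted in this review thread.
sample = """import numpy as np
import paddle

# from paddleslim.quant.observers.channel_wise import ChannelWiseObserver
# A normal explanatory comment.
"""
hits = find_commented_imports(sample)
```

To check real files, read each file's text and pass it to `find_commented_imports`, then review the reported line numbers before deleting anything.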

Mangodadada pushed a commit to Mangodadada/PaddleNLP that referenced this pull request Sep 10, 2024
…le#9032)
* 1. add a8w8(fp8) a8w8c8(int8) quant_type support 2. add llama3.1 and qwen2 ptq config 3. update quantization.md
* fix load_quant_model bug
* fix load quant bug
* update ll/README.md