This repository was archived by the owner on Oct 25, 2024. It is now read-only.

Commit 1380d5e

Support PEFT model (#153)
Signed-off-by: Mengni Wang <mengni.wang@intel.com>
Co-authored-by: Haihao Shen <haihao.shen@intel.com>
1 parent 10af3ca commit 1380d5e

File tree

3 files changed (+14 lines, -2 lines)


examples/huggingface/pytorch/language-modeling/quantization/README.md

Lines changed: 8 additions & 2 deletions
@@ -30,13 +30,15 @@ Here is how to run the scripts:
 ```bash
 # "--sq" is used to enable smooth quant
 # "--int8_bf16_mixed" is used to enable int8-bf16 mixed mode for platform that natively supports bf16
+# "--peft_model_id" is used to load PEFT weights from peft_model_id
 python run_clm_no_trainer.py \
     --model EleutherAI/gpt-j-6B \
     --quantize \
     --sq \
     --alpha 1.0 \
     --output_dir "saved_results" \
     --ipex \
+    --peft_model_id "peft_model_id"
 ```

 ```bash
@@ -70,14 +72,16 @@ python run_clm_no_trainer.py \
 ```bash
 # "--sq" is used to enable smooth quant
 # "--int8_bf16_mixed" is used to enable int8-bf16 mixed mode for platform that natively supports bf16
+# "--peft_model_id" is used to load PEFT weights from peft_model_id
 python run_clm_no_trainer.py \
     --model facebook/opt-2.7b \
     --quantize \
     --sq \
     --alpha 0.5 \
     --ipex \
     --output_dir "saved_results" \
-    --int8_bf16_mixed
+    --int8_bf16_mixed \
+    --peft_model_id "peft_model_id"
 ```

 #### Accuracy with lm_eval
@@ -99,14 +103,16 @@ python run_clm_no_trainer.py \
 ```bash
 # "--sq" is used to enable smooth quant
 # "--int8_bf16_mixed" is used to enable int8-bf16 mixed mode for platform that natively supports bf16
+# "--peft_model_id" is used to load PEFT weights from peft_model_id
 python run_clm_no_trainer.py \
     --model decapoda-research/llama-7b-hf \
     --quantize \
     --sq \
     --alpha 0.8 \
     --ipex \
     --output_dir "saved_results" \
-    --int8_bf16_mixed
+    --int8_bf16_mixed \
+    --peft_model_id "peft_model_id"
 ```

 #### Accuracy with lm_eval

examples/huggingface/pytorch/language-modeling/quantization/requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -9,3 +9,4 @@ wandb
 einops
 neural-compressor
 git+https://github.com/EleutherAI/lm-evaluation-harness.git@83dbfbf6070324f3e5872f63e49d49ff7ef4c9b3
+git+https://github.com/huggingface/peft.git@6c44096c7b8d55a2ecf24be9bc68393467e1584a

examples/huggingface/pytorch/language-modeling/quantization/run_clm_no_trainer.py

Lines changed: 5 additions & 0 deletions
@@ -52,6 +52,7 @@
 parser.add_argument("--weight_only_group", type=int, default=-1)
 parser.add_argument("--weight_only_scheme", default="sym")
 parser.add_argument("--weight_only_sym_full_range", action="store_true")
+parser.add_argument("--peft_model_id", type=str, default=None, help="model_name_or_path of peft model")

 args = parser.parse_args()
 if args.ipex:
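The new argument is optional and defaults to None, so existing invocations are unaffected. A minimal, self-contained sketch of just this option (the real run_clm_no_trainer.py defines many more arguments; the adapter id below is a hypothetical placeholder):

```python
import argparse

# Minimal sketch of only the new --peft_model_id option; the real script
# defines many more arguments around it.
parser = argparse.ArgumentParser()
parser.add_argument("--peft_model_id", type=str, default=None,
                    help="model_name_or_path of peft model")

# Without the flag, the default None means PEFT loading is skipped.
args = parser.parse_args([])
print(args.peft_model_id)  # None

# With the flag, the value is later handed to PeftModel.from_pretrained.
# "some-org/some-lora-adapter" is a hypothetical placeholder id.
args = parser.parse_args(["--peft_model_id", "some-org/some-lora-adapter"])
print(args.peft_model_id)  # some-org/some-lora-adapter
```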
@@ -185,6 +186,10 @@ def get_user_model():
     )
     tokenizer = AutoTokenizer.from_pretrained(args.model)

+    if args.peft_model_id is not None:
+        from peft import PeftModel
+        user_model = PeftModel.from_pretrained(user_model, args.peft_model_id)
+
     # to channels last
     user_model = user_model.to(memory_format=torch.channels_last)
     user_model.eval()
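The control flow this hunk adds is a wrap-if-flag-set pattern: the base model is replaced by a PEFT wrapper only when --peft_model_id was supplied. A stand-in sketch, assuming placeholder classes (BaseModel and StubPeftModel substitute for the transformers model and peft.PeftModel, whose from_pretrained actually loads saved adapter weights):

```python
# Sketch of the wrap-if-flag-set pattern from the diff. BaseModel and
# StubPeftModel are stand-ins: the real code wraps a transformers model
# with peft.PeftModel.from_pretrained, which loads adapter weights.
class BaseModel:
    def eval(self):
        return self

class StubPeftModel:
    def __init__(self, base_model, model_id):
        self.base_model = base_model
        self.model_id = model_id

    @classmethod
    def from_pretrained(cls, model, model_id):
        # The real method fetches adapter weights for model_id;
        # the stub just records the wrapping.
        return cls(model, model_id)

    def eval(self):
        return self

def get_user_model(peft_model_id=None):
    user_model = BaseModel()
    # Mirrors the diff: only wrap when --peft_model_id was given.
    if peft_model_id is not None:
        user_model = StubPeftModel.from_pretrained(user_model, peft_model_id)
    user_model.eval()  # the script puts the model in eval mode afterwards
    return user_model

print(type(get_user_model()).__name__)                # BaseModel
print(type(get_user_model("some/adapter")).__name__)  # StubPeftModel
```

Because the wrapper is created only inside the branch, quantization paths that never pass the flag see the unmodified base model.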
