111 changes: 102 additions & 9 deletions examples/README.md
@@ -28,9 +28,102 @@ save_to_hf: false
```


## 1. Fine-tuning
## 1. Pre-training

### 1.1 Data Preparation
### 1.1. Data Preparation

#### 1.1.1. Online Data Flow

The supported pre-training data format is a JSON file with one dictionary per line; each dictionary contains the following field:

- `text` : `str, List(str)`, the pre-training text.

Sample data:

```text
{"text": ["一个需要连续输入值的分类问题的示例是房屋价格预测。房屋的价格通常基于诸如平方英尺、位置、卧室和浴室数量以及像后院或车库等功能这样的因素定价。为了准确预测房屋价格,这些标准必须作为连续输入值输入到分类模型中。"]}
...
```
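
For reference, here is a minimal sketch of producing a file in this format with Python's standard library (the file name and sample contents are illustrative, not part of any shipped dataset):

```python
import json

# Each line is one JSON dict with a "text" field, which may be a
# plain string or a list of strings.
samples = [
    {"text": ["An example of a problem requiring continuous input values is house price prediction."]},
    {"text": "How to plan personal finances correctly: it takes the following steps..."},
]

# Write one JSON object per line (JSONL), keeping non-ASCII characters readable.
with open("pretrain_demo.jsonl", "w", encoding="utf-8") as f:
    for sample in samples:
        f.write(json.dumps(sample, ensure_ascii=False) + "\n")
```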

#### 1.1.2. Offline Data Flow

You can also use an offline binary pre-training data flow, which is more memory-efficient. The offline data is created as follows:

Download a text dataset, e.g. https://modelscope.cn/datasets/BazingaLyn/mini_pretrain_dataset

The data must be in JSONL format, one record per line, e.g. BazingaLyn/mini_pretrain_dataset/pretrain_hq_v7.jsonl:
```text
{"text": "番茄炒蛋\n材料:\n鸡蛋3个、番茄1个、油、盐、糖、水淀粉\n做法:..."}
{"text": "请描述一下如何正确规划个人理财。正确规划个人理财需要以下几个步骤..."}
{"text": "请输入一段描述有关海洋保护的情景对话。Person A: 哇,这个海滩真..."}
{"text": "鉴别两种不同类型的葡萄酒。鉴别葡萄酒的方法因其类型和品种而异,下..."}
```

Run `examples/tools/create_pretraining_data.py`; the generated data will be saved as `./pretrain_data.bin` and `./pretrain_data.idx` in the current directory:
```bash
python -u examples/tools/create_pretraining_data.py \
--model_name_or_path "/path/to/your/Qwen3-0.6B-base" \
--data_format "JSON" \
--input_path "/path/to/your/BazingaLyn/mini_pretrain_dataset/pretrain_hq_v7.jsonl" \
--append_eos \
--output_prefix "./pretrain_data" \
--workers 1 \
--log_interval 10000 \
--data_impl "mmap"
```

- Parameter descriptions

| Parameter | Type | Description |
|------------------------|------------|-----------------|
| `--model_name_or_path` | str | Path to the model |
| `--data_format` | str | Input file format; currently only JSON is supported |
| `--input_path` | str | Path to the input JSON file |
| `--append_eos` | store_true | Whether to append an eos token at the end of each document |
| `--output_prefix` | str | Prefix of the output files |
| `--workers` | int | Number of worker processes |
| `--log_interval` | int | Logging interval |
| `--data_impl` | str | Dataset implementation; defaults to `mmap`, `lazy` is also available |

### 1.2. Full-Parameter PT

Pre-training requires setting `stage: PT` in the config file.

Online data flow
```bash
# Single GPU
paddleformers-cli train ./config/pt/full.yaml
# Multi GPU
paddleformers-cli train ./config/pt/full_tp_pp.yaml
```

Offline data flow

In the config file:

`input_dir` specifies the dataset prefix. For example, for the dataset `data-1-part0.bin`, set `input_dir: "1.0 ./data-1-part0"`, where `1.0` is the data mixing ratio;

`split` sets the allocation ratio between `train` and `eval`, e.g. `split: "998,2"`, where `train` is the training set and `eval` is the evaluation set;

`dataset_type` should be set to `pretrain`, e.g. `dataset_type: "pretrain"`.
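
Putting the three fields together, a minimal offline-data excerpt of the config might look like the following sketch (values are illustrative, reusing the examples above):

```yaml
# Illustrative excerpt of an offline-data pre-training config
stage: PT                        # pre-training stage (see Section 1.2)
input_dir: "1.0 ./data-1-part0"  # mixing ratio, then the dataset prefix (data-1-part0.bin/.idx)
split: "998,2"                   # train/eval split ratio
dataset_type: "pretrain"
```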

```bash
paddleformers-cli train ./config/pt/full_offline_data.yaml
```

### 1.3. LoRA PT

Reference launch commands for LoRA PT:
```bash
# Single GPU
paddleformers-cli train ./config/pt/lora.yaml
# Multi GPU
paddleformers-cli train ./config/pt/lora_tp_pp.yaml
```

## 2. Fine-tuning

### 2.1 Data Preparation

The supported fine-tuning data format is a JSON file with one dictionary per line; each dictionary contains the following fields:

@@ -51,7 +144,7 @@ wget https://bj.bcebos.com/paddlenlp/datasets/examples/alpaca_demo.gz
mkdir -p data/sft && tar -xf alpaca_demo.gz -C data/sft/ --strip-components=1
```

### 1.2 Full-Parameter SFT
### 2.2 Full-Parameter SFT

Single GPU
```bash
@@ -63,17 +156,17 @@ python -u run_finetune.py ./config/sft/full.yaml
python -u -m paddle.distributed.launch --devices "0,1,2,3,4,5,6,7" run_finetune.py ./config/sft/full_tp_pp.yaml
```

### 1.3 LoRA SFT
### 2.3 LoRA SFT

Reference launch command for LoRA SFT:
```bash
python -u run_finetune.py ./config/sft/lora.yaml
```


## 2. Alignment
## 3. Alignment

### 2.1 Data Preparation
### 3.1 Data Preparation

The supported alignment data format is a JSON file with one dictionary per line; each dictionary contains the following fields:

@@ -105,7 +198,7 @@ wget https://bj.bcebos.com/paddlenlp/datasets/examples/ultrafeedback_binarized.tar.gz
mkdir -p data/dpo && tar -zxf ultrafeedback_binarized.tar.gz -C data/dpo/ --strip-components=1
```

### 2.2 Full-Parameter DPO
### 3.2 Full-Parameter DPO

Single GPU
```bash
@@ -117,15 +210,15 @@ python -u ./alignment/dpo/run_dpo.py ./config/dpo/full.yaml
python -u -m paddle.distributed.launch --devices "0,1,2,3,4,5,6,7" ./alignment/dpo/run_dpo.py ./config/dpo/full_tp_pp.yaml
```

### 2.3 LoRA DPO
### 3.3 LoRA DPO

Reference launch command for LoRA DPO:
```bash
python -u ./alignment/dpo/run_dpo.py ./config/dpo/lora.yaml
```


## 3. Merging LoRA Parameters
## 4. Merging LoRA Parameters

After training a model with LoRA, we provide the script `tools/mergekit.py` to merge the LoRA parameters into the main model weights for easier inference.

33 changes: 21 additions & 12 deletions paddleformers/cli/README.md → examples/cli_usage.md
@@ -22,8 +22,8 @@ Expected output:
```
------------------------------------------------------------
| Usage: |
| paddleformers-cli train -h: model finetuning |
| paddleformers-cli export -h: model export |
| paddleformers-cli train : model finetuning |
| paddleformers-cli export : model export |
| paddleformers-cli help: show helping info |
------------------------------------------------------------
```
@@ -60,33 +60,42 @@ Examples using **Qwen/Qwen3-0.6B-Base** model:
## 1.1. Chat
To be added

## 1.2. Model Fine-tuning
## 1.2. Model Pre-training

### 1.2.1. SFT & LoRA Fine-tuning
```bash
# Example 1: Full-parameter PT using an online dataset
paddleformers-cli train examples/config/pt/full.yaml
# Example 2: Full-parameter PT using an offline dataset
paddleformers-cli train examples/config/pt/full_offline_data.yaml
```

## 1.3. Model Fine-tuning

### 1.3.1. SFT & LoRA Fine-tuning
```bash
# Example 1: SFT-LoRA
paddleformers-cli train examples/config/sft_lora.yaml
paddleformers-cli train examples/config/sft/lora.yaml
# Example 2: SFT-Full
paddleformers-cli train examples/config/sft_full.yaml
paddleformers-cli train examples/config/sft/full.yaml
```

### 1.2.2. DPO & LoRA Fine-tuning
### 1.3.2. DPO & LoRA Fine-tuning
```bash
# Example 1: 8K seq length, DPO
paddleformers-cli train examples/config/dpo_full.yaml
paddleformers-cli train examples/config/dpo/full.yaml
# Example 2: 8K seq length, DPO-LoRA
paddleformers-cli train examples/config/dpo_lora.yaml
paddleformers-cli train examples/config/dpo/lora.yaml
```

## 1.3 Model Eval
## 1.4. Model Eval
To be added

## 1.4. Model Export
## 1.5. Model Export
```bash
paddleformers-cli export examples/config/run_export.yaml
```

## 1.5. Multi-Node Training
## 1.6. Multi-Node Training
```bash
NNODES={num_nodes} MASTER_ADDR={your_master_addr} MASTER_PORT={your_master_port} CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 paddleformers-cli train examples/config/sft/full.yaml
```
101 changes: 101 additions & 0 deletions examples/cli_usage_zh.md
@@ -0,0 +1,101 @@
# Command Line Interface

## Overview

The CLI (Command Line Interface) provides terminal-based interaction with the toolkit, using parameterized configuration to run model training, inference, and evaluation tasks efficiently and flexibly.

## Quick Start

**Installation**

Run the following in the PaddleFormers root directory:
```bash
python -m pip install -e .
```

Verify the installation:
```bash
paddleformers-cli help
```

Expected output:
```
------------------------------------------------------------
| Usage: |
| paddleformers-cli train : model finetuning |
| paddleformers-cli export : model export |
| paddleformers-cli help: show helping info |
------------------------------------------------------------
```

**GPU Configuration**

By default, the CLI uses all available GPUs.
To use specific GPUs, set CUDA_VISIBLE_DEVICES before running the CLI:

```bash
# Single GPU
export CUDA_VISIBLE_DEVICES=0
# Multi GPUs
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7

# Single XPU
export XPU_VISIBLE_DEVICES=0
# Multi XPUs
export XPU_VISIBLE_DEVICES=0,1,2,3,4,5,6,7

# Single NPU
export ASCEND_RT_VISIBLE_DEVICES=0
# Multi NPUs
export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
```

* Note: in the `Chat` module, the number of GPUs configured via CUDA_VISIBLE_DEVICES should equal the `tensor_parallel_degree` set in the config.
Alternatively, you can leave CUDA_VISIBLE_DEVICES unset.
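
For example, a sketch with illustrative values:

```bash
# Expose two GPUs; tensor_parallel_degree in the YAML config should then be 2
export CUDA_VISIBLE_DEVICES=0,1
```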

# 1. CLI Usage

Examples using the **Qwen/Qwen3-0.6B-Base** model:

## 1.1. Chat
To be added

## 1.2. Model Pre-training

```bash
# Example 1: Full-parameter PT using an online dataset
paddleformers-cli train examples/config/pt/full.yaml
# Example 2: Full-parameter PT using an offline dataset
paddleformers-cli train examples/config/pt/full_offline_data.yaml
```

## 1.3. Model Fine-tuning

### 1.3.1. SFT & LoRA Fine-tuning
```bash
# Example 1: SFT-LoRA
paddleformers-cli train examples/config/sft/lora.yaml
# Example 2: SFT-Full
paddleformers-cli train examples/config/sft/full.yaml
```

### 1.3.2. DPO & LoRA Fine-tuning
```bash
# Example 1: 8K seq length, DPO
paddleformers-cli train examples/config/dpo/full.yaml
# Example 2: 8K seq length, DPO-LoRA
paddleformers-cli train examples/config/dpo/lora.yaml
```

## 1.4. Model Eval
To be added

## 1.5. Model Export
```bash
paddleformers-cli export examples/config/run_export.yaml
```

## 1.6. Multi-Node Training
```bash
NNODES={num_nodes} MASTER_ADDR={your_master_addr} MASTER_PORT={your_master_port} CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 paddleformers-cli train examples/config/sft/full.yaml
```
2 changes: 0 additions & 2 deletions examples/config/dpo/full.yaml
@@ -6,7 +6,6 @@ train_dataset_prob: "1.0"
eval_dataset_path: ./data/dpo/dev.jsonl
eval_dataset_prob: "1.0"
max_seq_len: 8192
num_samples_each_epoch: 6000000
packing: false
mix_strategy: concat

@@ -28,7 +27,6 @@ max_steps: -1
eval_steps: 100
evaluation_strategy: steps
save_steps: 100
save_total_limit: 1
save_strategy: steps
logging_steps: 1
gradient_accumulation_steps: 4
2 changes: 0 additions & 2 deletions examples/config/dpo/full_function_call.yaml
@@ -6,7 +6,6 @@ train_dataset_prob: "1.0"
eval_dataset_path: ./data/fc/function-call-eval.jsonl
eval_dataset_prob: "1.0"
max_seq_len: 8192
num_samples_each_epoch: 6000000
packing: false
mix_strategy: concat

@@ -30,7 +29,6 @@ max_steps: -1
eval_steps: 100
evaluation_strategy: steps
save_steps: 100
save_total_limit: 1
save_strategy: steps
logging_steps: 1
gradient_accumulation_steps: 4
1 change: 0 additions & 1 deletion examples/config/dpo/full_tp_pp.yaml
@@ -28,7 +28,6 @@ max_steps: -1
eval_steps: 100
evaluation_strategy: steps
save_steps: 100
save_total_limit: 1
save_strategy: steps
logging_steps: 1
gradient_accumulation_steps: 4
2 changes: 0 additions & 2 deletions examples/config/dpo/lora.yaml
@@ -6,7 +6,6 @@ train_dataset_prob: "1.0"
eval_dataset_path: ./data/dpo/dev.jsonl
eval_dataset_prob: "1.0"
max_seq_len: 8192
num_samples_each_epoch: 6000000
packing: false
mix_strategy: concat

@@ -30,7 +29,6 @@ max_steps: -1
eval_steps: 100
evaluation_strategy: steps
save_steps: 100
save_total_limit: 1
save_strategy: steps
logging_steps: 1
gradient_accumulation_steps: 4
2 changes: 0 additions & 2 deletions examples/config/dpo/lora_tp_pp.yaml
@@ -6,7 +6,6 @@ train_dataset_prob: "1.0"
eval_dataset_path: ./data/dpo/dev.jsonl
eval_dataset_prob: "1.0"
max_seq_len: 8192
num_samples_each_epoch: 6000000
packing: true
mix_strategy: concat

@@ -30,7 +29,6 @@ max_steps: -1
eval_steps: 100
evaluation_strategy: steps
save_steps: 100
save_total_limit: 1
save_strategy: steps
logging_steps: 1
gradient_accumulation_steps: 4