Commit 02977d0

Update causal_dataset.py
1 parent 31c3b55 commit 02977d0

1 file changed, +1 -1 lines changed

paddlenlp/data/causal_dataset.py

Lines changed: 1 addition & 1 deletion
@@ -149,7 +149,7 @@ def build_train_valid_test_datasets(
 prefixes, weights, datasets_train_valid_test_num_samples = output
 # NOTE: megatron/gpt_dataset.py has been updated. When creating BlendableDataset, we will use the raw train_val_test_num_samples instead of the expanded ones.
 # Please refer to https://github.com/NVIDIA/NeMo/blob/72f630d087d45655b1a069dc72debf01dfdbdb2d/nemo/collections/nlp/data/language_modeling/megatron/gpt_dataset.py#L74-L80 for more information
-train_num_samples, valid_num_samples, test_num_samples = datasets_train_valid_test_num_samples
+train_num_samples, valid_num_samples, test_num_samples = train_val_test_num_samples
 
 # Build individual datasets.
 train_datasets = []
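
The one-line change can be illustrated with a small sketch. It assumes, following the Megatron-style blending code that the NOTE points to, that `datasets_train_valid_test_num_samples` holds one expanded `[train, valid, test]` triple per dataset, while `train_val_test_num_samples` is the raw three-element request. The helper `expand_per_dataset` and its `pad` factor are hypothetical stand-ins written for this example, not the PaddleNLP implementation.

```python
import math


def expand_per_dataset(weights, train_val_test_num_samples, pad=1.005):
    """Hypothetical stand-in for the blending helper.

    Returns one [train, valid, test] triple per dataset, scaled by the
    normalized weight and padded slightly (the pad value is an assumption).
    """
    total = sum(weights)
    normalized = [w / total for w in weights]
    return [
        [int(math.ceil(n * w * pad)) for n in train_val_test_num_samples]
        for w in normalized
    ]


# Raw request: total train / valid / test sample counts.
train_val_test_num_samples = [1000, 100, 10]

# Expanded counts: a list with one triple per blended dataset (two here).
datasets_train_valid_test_num_samples = expand_per_dataset(
    [1, 3], train_val_test_num_samples
)

# The removed line unpacked the per-dataset list into three names.
# With two datasets that raises ValueError; with exactly three datasets
# it would silently bind three lists instead of three integers.
try:
    train_num_samples, valid_num_samples, test_num_samples = (
        datasets_train_valid_test_num_samples
    )
    old_unpack_ok = True
except ValueError:
    old_unpack_ok = False

# The added line unpacks the raw triple, which always has three integers.
train_num_samples, valid_num_samples, test_num_samples = train_val_test_num_samples
```

Under these assumptions the old unpacking depends on how many datasets are blended, whereas the new one always yields the three raw totals that the NOTE says should be used when creating the BlendableDataset.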
