@@ -60,42 +60,42 @@ Attention Is All You Need
  ```
  python make_dataset.py
  -mode train
- -source_input_path path/bpe_wmt17.en [ bpe applied document data]
- -source_out_path path/source_idx_wmt17_en.csv [ bpe idx data]
- -target_input_path path/bpe_wmt17.de [ bpe applied document data]
- -target_out_path path/source_idx_wmt17_de.csv [ bpe idx data
- -bucket_out_path ./bpe_dataset/train_set_wmt17 [ bucket trainset]
- -voca_path voca_path/voca_file_name [ bpe voca]
+ -source_input_path path/bpe_wmt17.en (source bpe applied document data)
+ -source_out_path path/source_idx_wmt17_en.csv (source bpe idx data)
+ -target_input_path path/bpe_wmt17.de (target bpe applied document data)
+ -target_out_path path/source_idx_wmt17_de.csv (target bpe idx data)
+ -bucket_out_path ./bpe_dataset/train_set_wmt17 ( bucket trainset from source bpe idx data, target bpe idx data)
+ -voca_path voca_path/voca_file_name ( bpe voca from bpe_learn.py)
  ```
  * make bucket valid_set newstest2014
  ```
  python make_dataset.py
  -mode infer
- -source_input_path path/bpe_newstest2014.en [ bpe applied document data]
- -source_out_path path/source_idx_newstest2014_en.csv [ bpe idx data]
- -target_input_path path/dev.tar/newstest2014.tc.de [ original raw data]
- -bucket_out_path ./bpe_dataset/valid_set_newstest2014 [ bucket validset]
- -voca_path voca_path/voca_file_name [ bpe voca]
+ -source_input_path path/bpe_newstest2014.en (source bpe applied document data)
+ -source_out_path path/source_idx_newstest2014_en.csv (source bpe idx data)
+ -target_input_path path/dev.tar/newstest2014.tc.de (target original raw data)
+ -bucket_out_path ./bpe_dataset/valid_set_newstest2014 ( bucket validset from source bpe idx data, target original raw data)
+ -voca_path voca_path/voca_file_name ( bpe voca from bpe_learn.py)
  ```
- * make bucket valid_set newstest2015
+ * make bucket test_set newstest2015
  ```
  python make_dataset.py
  -mode infer
- -source_input_path path/bpe_newstest2015.en [ bpe applied document data]
- -source_out_path path/source_idx_newstest2015_en.csv [ bpe idx data]
- -target_input_path path/dev.tar/newstest2015.tc.de [ original raw data]
- -bucket_out_path ./bpe_dataset/valid_set_newstest2015 [ bucket testset]
- -voca_path voca_path/voca_file_name [ bpe voca]
+ -source_input_path path/bpe_newstest2015.en (source bpe applied document data)
+ -source_out_path path/source_idx_newstest2015_en.csv (source bpe idx data)
+ -target_input_path path/dev.tar/newstest2015.tc.de (target original raw data)
+ -bucket_out_path ./bpe_dataset/test_set_newstest2015 ( bucket testset from source bpe idx data, target original raw data)
+ -voca_path voca_path/voca_file_name ( bpe voca from bpe_learn.py)
  ```
- * make bucket valid_set newstest2016
+ * make bucket test_set newstest2016
  ```
  python make_dataset.py
  -mode infer
- -source_input_path path/bpe_newstest2016.en [ bpe applied document data]
- -source_out_path path/source_idx_newstest2016_en.csv [ bpe idx data]
- -target_input_path path/dev.tar/newstest2016.tc.de [ original raw data]
- -bucket_out_path ./bpe_dataset/valid_set_newstest2016 [ bucket testset]
- -voca_path voca_path/voca_file_name [ bpe voca]
+ -source_input_path path/bpe_newstest2016.en (source bpe applied document data)
+ -source_out_path path/source_idx_newstest2016_en.csv (source bpe idx data)
+ -target_input_path path/dev.tar/newstest2016.tc.de (target original raw data)
+ -bucket_out_path ./bpe_dataset/test_set_newstest2016 ( bucket testset from source bpe idx data, target original raw data)
+ -voca_path voca_path/voca_file_name ( bpe voca from bpe_learn.py)
  ```
  * translation_train.py
  * en -> de translation train, validation, test
@@ -104,8 +104,8 @@ Attention Is All You Need
  python translation_train.py
  -train_path_2017 ./bpe_dataset/train_set_wmt17
  -valid_path_2014 ./bpe_dataset/valid_set_newstest2014
- -test_path_2015 ./bpe_dataset/valid_set_newstest2015
- -test_path_2016 ./bpe_dataset/valid_set_newstest2016
+ -test_path_2015 ./bpe_dataset/test_set_newstest2015
+ -test_path_2016 ./bpe_dataset/test_set_newstest2016
  -voca_path voca_path/voca_file_name
  ```

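After this change the bucket output paths follow a `{split}_set_newstest{year}` pattern, with `valid` for newstest2014 and `test` for newstest2015 and newstest2016. The following is a minimal sketch, not part of the repository, that assembles the three infer-mode `make_dataset.py` invocations from that pattern. The helper names `make_dataset_args` and `eval_set_args` are hypothetical. The flags and placeholder paths are copied from the commands in this diff, and the script only prints the commands rather than running them.

```python
def make_dataset_args(mode, source_in, source_out, target_in,
                      bucket_out, voca, target_out=None):
    """Return the argument list for one make_dataset.py call.

    Hypothetical helper: flag names are taken from the README commands,
    nothing here inspects make_dataset.py itself.
    """
    args = [
        "python", "make_dataset.py",
        "-mode", mode,
        "-source_input_path", source_in,
        "-source_out_path", source_out,
        "-target_input_path", target_in,
    ]
    # -target_out_path appears only in the train-mode command in the README.
    if target_out is not None:
        args += ["-target_out_path", target_out]
    args += ["-bucket_out_path", bucket_out, "-voca_path", voca]
    return args


def eval_set_args(split, year, voca="voca_path/voca_file_name"):
    """Arguments for an infer-mode bucket; `split` is 'valid' or 'test'."""
    name = f"newstest{year}"
    return make_dataset_args(
        mode="infer",
        source_in=f"path/bpe_{name}.en",
        source_out=f"path/source_idx_{name}_en.csv",
        target_in=f"path/dev.tar/{name}.tc.de",
        bucket_out=f"./bpe_dataset/{split}_set_{name}",
        voca=voca,
    )


if __name__ == "__main__":
    # Split assignment as used in this change: 2014 validates, 2015/2016 test.
    for split, year in [("valid", 2014), ("test", 2015), ("test", 2016)]:
        print(" ".join(eval_set_args(split, year)))
```

Deriving the bucket path and the label from one `split` argument keeps the two in sync, which is the mismatch this diff corrects by hand.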