Improving the Transformer Translation Model with Document-Level Context

Introduction

This is the implementation of our work, which extends the Transformer to integrate document-level context [paper]. The implementation is built on top of THUMT.

Usage

Note: The usage is not yet user-friendly and may be improved later.

Train a standard Transformer model; please refer to the user manual of THUMT for details. Suppose that model_baseline/model.ckpt-30000 performs best on the validation set.
Generate a dummy improved Transformer model with the following command:

    python THUMT/thumt/bin/trainer_ctx.py --inputs [source corpus] [target corpus] \
        --context [context corpus] \
        --vocabulary [source vocabulary] [target vocabulary] \
        --output model_dummy --model contextual_transformer \
        --parameters train_steps=1

Generate the initial model by merging the standard Transformer model into the dummy model, then create a checkpoint file:

    python THUMT/thumt/scripts/combine_add.py --model model_dummy/model.ckpt-0 \
        --part model_baseline/model.ckpt-30000 --output train
    printf 'model_checkpoint_path: "new-0"\nall_model_checkpoint_paths: "new-0"' > train/checkpoint
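For environments where `printf` is not available, the checkpoint file can also be written from Python. The following is a minimal sketch, assuming Python 3; the helper name `write_checkpoint_file` is hypothetical and not part of THUMT. It writes the same two lines as the `printf` command above.

```python
import os
import tempfile


def write_checkpoint_file(model_dir, checkpoint_name="new-0"):
    """Write a `checkpoint` index file in `model_dir` pointing at `checkpoint_name`.

    Hypothetical helper: it only reproduces the text written by the printf
    command in this README (no trailing newline, as with printf).
    """
    os.makedirs(model_dir, exist_ok=True)
    path = os.path.join(model_dir, "checkpoint")
    with open(path, "w") as f:
        f.write('model_checkpoint_path: "%s"\n' % checkpoint_name)
        f.write('all_model_checkpoint_paths: "%s"' % checkpoint_name)
    return path


# Demo in a throwaway directory; in the workflow above, model_dir would be "train".
demo_dir = os.path.join(tempfile.mkdtemp(), "train")
with open(write_checkpoint_file(demo_dir)) as f:
    print(f.read())
```

In the workflow described here, the call would be `write_checkpoint_file("train")`, run after `combine_add.py` has produced its output in `train`.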

Train the improved Transformer model with the following command:

    python THUMT/thumt/bin/trainer_ctx.py --inputs [source corpus] [target corpus] \
        --context [context corpus] \
        --vocabulary [source vocabulary] [target vocabulary] \
        --output train --model contextual_transformer \
        --parameters start_steps=30000,num_context_layers=1

Translate with the improved Transformer model:

    python THUMT/thumt/bin/translator_ctx.py --inputs [source corpus] --context [context corpus] \
        --output [translation result] \
        --vocabulary [source vocabulary] [target vocabulary] \
        --model contextual_transformer --checkpoints [model path] \
        --parameters num_context_layers=1
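Since the context corpus holds one context line per source sentence (see the FAQ), it may be worth checking that the source and context files have the same number of lines before training or translating. The sketch below is an optional sanity check, not part of THUMT; `count_lines` and `check_aligned` are hypothetical helper names, and the demo file names are placeholders.

```python
import os
import tempfile


def count_lines(path):
    """Count lines in a text file; an empty line still counts as a line."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def check_aligned(first, *rest):
    """Return the shared line count, or raise ValueError if the files differ."""
    counts = {p: count_lines(p) for p in (first,) + rest}
    if len(set(counts.values())) > 1:
        raise ValueError("line counts differ: %r" % counts)
    return counts[first]


# Demo on two throwaway files; the context file starts with an empty line.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "test.src")
ctx = os.path.join(tmp, "test.ctx")
with open(src, "w", encoding="utf-8") as f:
    f.write("source sentence #1\nsource sentence #2\n")
with open(ctx, "w", encoding="utf-8") as f:
    f.write("\nsource sentence #1\n")
print(check_aligned(src, ctx))  # both files have 2 lines
```

For training, the target corpus could be passed as a third argument, since it is line-aligned with the source as well.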

Citation

Please cite the following paper if you use the code:

@InProceedings{Zhang:18,
  author    = {Zhang, Jiacheng and Luan, Huanbo and Sun, Maosong and Zhai, Feifei and Xu, Jingfang and Zhang, Min and Liu, Yang},
  title     = {Improving the Transformer Translation Model with Document-Level Context},
  booktitle = {Proceedings of EMNLP},
  year      = {2018},
}

FAQ

What is the context corpus?

The context corpus file contains one context line per source sentence. Normally, the context consists of the several preceding source sentences within the same document. For example, if the original document-level corpus is:

    ==== source ====
    <document id=XXX>
    <seg id=1>source sentence #1</seg>
    <seg id=2>source sentence #2</seg>
    <seg id=3>source sentence #3</seg>
    <seg id=4>source sentence #4</seg>
    </document>

    ==== target ====
    <document id=XXX>
    <seg id=1>target sentence #1</seg>
    <seg id=2>target sentence #2</seg>
    <seg id=3>target sentence #3</seg>
    <seg id=4>target sentence #4</seg>
    </document>

The inputs to our system should be processed as (suppose that 2 preceding source sentences are used as context):

    ==== train.src ==== (source corpus)
    source sentence #1
    source sentence #2
    source sentence #3
    source sentence #4

    ==== train.ctx ==== (context corpus)
    (the first line is empty)
    source sentence #1
    source sentence #1 source sentence #2 (there is only a space between the two sentences)
    source sentence #2 source sentence #3

    ==== train.trg ==== (target corpus)
    target sentence #1
    target sentence #2
    target sentence #3
    target sentence #4
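The context corpus above can be generated mechanically from the source sentences. The following is a minimal sketch, not the script used by the authors; the function name `build_context_lines` and its `num_context` parameter are hypothetical. It assumes that context never crosses a document boundary, which is our reading of "preceding source sentences within a document".

```python
def build_context_lines(documents, num_context=2):
    """Return one context line per source sentence.

    `documents` is a list of documents, each a list of source sentences.
    The context of a sentence is the (up to) `num_context` preceding
    sentences of the same document, joined by a single space. The first
    sentence of each document gets an empty context line.
    """
    context_lines = []
    for sentences in documents:
        for i in range(len(sentences)):
            start = max(0, i - num_context)
            context_lines.append(" ".join(sentences[start:i]))
    return context_lines


# The four-sentence document from the example above.
doc = ["source sentence #%d" % n for n in range(1, 5)]
for line in build_context_lines([doc], num_context=2):
    print(repr(line))
# ''
# 'source sentence #1'
# 'source sentence #1 source sentence #2'
# 'source sentence #2 source sentence #3'
```

Writing the returned lines to `train.ctx`, one per line, keeps the file line-aligned with `train.src` and `train.trg`.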
