Commit 1c82a6b

Update README.md
Using pytorch-pretrained-BERT to convert a TensorFlow ckpt file to a PyTorch pt file does not work. It renames the state_dict keys from `*.LayerNorm.gamma` and `*.LayerNorm.beta` to `*.LayerNorm.weight` and `*.LayerNorm.bias`, and it also changes the dimensions somehow. `bert/convert_tf_checkpoint_to_pytorch.py` does not have these problems.
1 parent 286aaea commit 1c82a6b
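
The key-naming mismatch described in the commit message can be checked on any converted checkpoint by inspecting its state_dict keys. The following is a minimal sketch, not code from this repository: the helper names `layernorm_naming` and `uses_tf_style_names` are hypothetical, the example key names are illustrative, and the sketch assumes the two conventions at issue are `gamma`/`beta` (TensorFlow style) versus `weight`/`bias` (PyTorch style).

```python
from collections import Counter


# Hypothetical helpers for illustration; not part of sqlova or
# pytorch-pretrained-BERT.
def layernorm_naming(keys):
    """Count the parameter suffixes used by LayerNorm entries in a list of state_dict keys."""
    counts = Counter()
    for key in keys:
        module, _, suffix = key.rpartition(".")
        if module.endswith("LayerNorm"):
            counts[suffix] += 1
    return counts


def uses_tf_style_names(keys):
    """Return True if LayerNorm entries use only gamma/beta (assumed TensorFlow-style) names."""
    counts = layernorm_naming(keys)
    tf_style = counts["gamma"] + counts["beta"]
    torch_style = counts["weight"] + counts["bias"]
    return tf_style > 0 and torch_style == 0


# Illustrative key names only; a real checkpoint has many more entries.
example_keys = [
    "bert.embeddings.LayerNorm.gamma",
    "bert.embeddings.LayerNorm.beta",
    "bert.encoder.layer.0.attention.output.dense.weight",
]
print(layernorm_naming(example_keys))
print(uses_tf_style_names(example_keys))
```

With PyTorch installed, the keys of a converted file could be obtained with something like `torch.load(path, map_location="cpu").keys()`, assuming the file stores a plain state_dict.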

1 file changed, +12 -1 lines changed

README.md

Lines changed: 12 additions & 1 deletion
@@ -87,7 +87,18 @@
 
 #### Data
 - The data is annotated by using `annotate_ws.py`, which is based on [`annotate.py`](https://github.com/salesforce/WikiSQL) from the WikiSQL repository. The tokens of the natural language query, and the start and end indices of the where-conditions on the natural language tokens, are annotated.
-- Pre-trained BERT parameters can be downloaded from the BERT [official repository](https://github.com/google-research/bert) and can be converted to a `pt` file following the instructions from [huggingface-pytorch-pretrained-BERT](https://github.com/huggingface/pytorch-pretrained-BERT).
+- Pre-trained BERT parameters can be downloaded from the BERT [official repository](https://github.com/google-research/bert) and can be converted to a `pt` file using the following script. You need to install both PyTorch and TensorFlow and change `BERT_BASE_DIR` to your data directory.
+
+```sh
+cd sqlova
+export BERT_BASE_DIR=data/uncased_L-12_H-768_A-12
+python bert/convert_tf_checkpoint_to_pytorch.py \
+    --tf_checkpoint_path $BERT_BASE_DIR/bert_model.ckpt \
+    --bert_config_file $BERT_BASE_DIR/bert_config.json \
+    --pytorch_dump_path $BERT_BASE_DIR/pytorch_model.bin
+```
+
+`bert/convert_tf_checkpoint_to_pytorch.py` is inspired by [huggingface-pytorch-pretrained-BERT](https://github.com/huggingface/pytorch-pretrained-BERT), but `pytorch-pretrained-BERT` is not compatible with our BERT model.
 - For convenience, the annotated WikiSQL data and the PyTorch-converted pre-trained BERT parameters are available [here](https://drive.google.com/file/d/1iJvsf38f16el58H4NPINQ7uzal5-V4v4/view?usp=sharing).
 
 ### License
