Treebert: A tree-based pre-trained model for programming language
Source code can be parsed into the abstract syntax tree (AST) based on defined syntax
rules. However, in pre-training, little work has considered the incorporation of tree structure
into the learning process. In this paper, we present TreeBERT, a tree-based pre-trained
model for improving programming language-oriented generation tasks. To utilize tree
structure, TreeBERT represents the AST corresponding to the code as a set of composition
paths and introduces node position embedding. The model is trained by tree masked …
rules. However, in pre-training, little work has considered the incorporation of tree structure
into the learning process. In this paper, we present TreeBERT, a tree-based pre-trained
model for improving programming language-oriented generation tasks. To utilize tree
structure, TreeBERT represents the AST corresponding to the code as a set of composition
paths and introduces node position embedding. The model is trained by tree masked …
TreeBERT: A Tree-Based Pre-Trained Model for Programming Language Supplementary Material
X Jiang, Z Zheng, C Lyu, L Li, L Lv - proceedings.mlr.press
In this supplemental material, we first introduce the code tokenization in Section 1. Second,
we provide detailed statistical information of datasets used for the experiment in Section 2.
Then, we describe the metrics used to evaluate TreeBERT in Section 3. Finally, we show the
detailed results of some experiments in Section 4.
we provide detailed statistical information of datasets used for the experiment in Section 2.
Then, we describe the metrics used to evaluate TreeBERT in Section 3. Finally, we show the
detailed results of some experiments in Section 4.
Showing the best results for this search. See all results