We propose GypSum, a new deep learning model that learns hybrid representations using graph neural networks and a pre-trained programming and natural language model. GypSum uses two encoders to learn from the AST-based graph and the token sequence of source code, respectively, and modifies the encoder-decoder sublayer in the Transformer's decoder to fuse the representations.
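The fusion idea described above can be illustrated with a minimal PyTorch sketch. It is not the code in this repository: the class name `HybridDecoderLayer`, the layer sizes, and the choice to sum the two attended contexts are all assumptions made for illustration. It only shows how a decoder layer might attend to a graph encoder's output and a token encoder's output inside the encoder-decoder sublayer.

```python
import torch
import torch.nn as nn


class HybridDecoderLayer(nn.Module):
    """Hypothetical decoder layer with two encoder-decoder attention blocks."""

    def __init__(self, d_model=64, n_heads=4, d_ff=128):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads)
        # One cross-attention per encoder: graph nodes and code tokens.
        self.graph_attn = nn.MultiheadAttention(d_model, n_heads)
        self.token_attn = nn.MultiheadAttention(d_model, n_heads)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, tgt, graph_memory, token_memory):
        # All tensors use the (length, batch, d_model) layout.
        length = tgt.size(0)
        # Causal mask: True marks positions that must not be attended to.
        causal = torch.triu(
            torch.ones(length, length, dtype=torch.bool, device=tgt.device),
            diagonal=1,
        )
        x = self.norm1(tgt + self.self_attn(tgt, tgt, tgt, attn_mask=causal)[0])
        g = self.graph_attn(x, graph_memory, graph_memory)[0]
        t = self.token_attn(x, token_memory, token_memory)[0]
        # Assumed fusion rule: sum the two attended contexts.
        x = self.norm2(x + g + t)
        return self.norm3(x + self.ffn(x))


layer = HybridDecoderLayer()
tgt = torch.randn(5, 2, 64)            # 5 summary tokens, batch of 2
graph_memory = torch.randn(30, 2, 64)  # 30 graph nodes from the graph encoder
token_memory = torch.randn(40, 2, 64)  # 40 code tokens from the token encoder
out = layer(tgt, graph_memory, token_memory)
```

The output keeps the shape of `tgt`, so a layer like this could be stacked in the same way as a standard Transformer decoder layer.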
This repository was developed in the following environment:
- Python 3.7
- PyTorch 1.7.0
- How to construct graphs for Python and Java programs?

You can use the function `get_graph_from_source` from `preprocess/java(python)_graph_construction.py`. After processing the code, save the data as `pkl` files in the data folder.

- How to run?

`./run $GPU_ID$ $DATASET$ $TASK$`

For example, to train the Java model from scratch on `cuda:0`, run `./run 0 java train`.
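For the graph-construction step above, saving the processed data as `pkl` could look roughly like the sketch below. The helpers `save_graphs` and `load_graphs`, the dictionary layout of the example graph, and the file name `train_graphs.pkl` are hypothetical; the actual output of `get_graph_from_source` and the file names the training code expects may differ.

```python
import pickle
import tempfile
from pathlib import Path


def save_graphs(graphs, path):
    """Serialize a list of graph records to a .pkl file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(graphs, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_graphs(path):
    """Read back a list of graph records written by save_graphs."""
    with Path(path).open("rb") as f:
        return pickle.load(f)


# Stand-in for the output of get_graph_from_source; the real structure may differ.
example_graph = {
    "nodes": ["MethodDeclaration", "add", "ReturnStatement"],
    "edges": [(0, 1), (0, 2)],
}

# A temporary directory stands in for the data folder in this demo.
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "java" / "train_graphs.pkl"
    save_graphs([example_graph], out_path)
    restored = load_graphs(out_path)
```

In practice, the temporary directory would be replaced by the repository's `data/java` or `data/python` folder.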
    Gypsum-main
    ├── README.md
    ├── c2nl
    │   ├── __init__.py
    │   ├── __pycache__
    │   ├── config.py
    │   ├── decoders
    │   ├── encoders
    │   ├── eval
    │   ├── inputters
    │   ├── models
    │   ├── modules
    │   ├── objects
    │   ├── tokenizers
    │   ├── translator
    │   └── utils
    ├── config
    │   ├── general_config.yml
    │   ├── java_xxx_xxx.yml
    │   ├── ...
    ├── data
    │   ├── java
    │   └── python
    ├── evaluation
    │   ├── bleu
    │   ├── evaluate.py
    │   ├── meteor
    │   └── rouge
    ├── gypsum
    │   ├── __pycache__
    │   ├── data
    │   ├── metor.ipynb
    │   ├── model.py
    │   ├── modules
    │   ├── predict.py
    │   ├── train.py
    │   └── utils
    ├── modules
    │   ├── __pycache__
    │   └── attention_zoo.py
    ├── preprocess
    │   ├── generate_java_graph.ipynb
    │   ├── java_graph_construct.py
    │   ├── python_ast.ipynb
    │   └── python_graph.py
    └── run

We have uploaded our dataset to Google Drive: Dataset For Experiment
We borrowed and modified code from DrQA and OpenNMT. We would like to express our gratitude to the authors of these repositories.