going-doer/Paper2Code


📄 Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

PaperCoder Overview

📄 Read the paper on arXiv

PaperCoder is a multi-agent LLM system that transforms a paper into a code repository. It follows a three-stage pipeline of planning, analysis, and code generation, with each stage handled by specialized agents.
Our method outperforms strong baselines on both Paper2Code and PaperBench and produces faithful, high-quality implementations.
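
As a rough illustration of the three-stage flow described above, the sketch below chains planning, analysis, and code generation as plain functions that pass artifacts forward. Every name here (`plan`, `analyze`, `generate_code`, `run_pipeline`) is hypothetical and the stage bodies are placeholders. It is not PaperCoder's actual implementation, which drives LLM agents at each stage.

```python
from typing import Callable, Dict, List

# Hypothetical stage signature: each stage reads the shared artifacts dict
# and returns the new artifacts it produced.
Stage = Callable[[Dict[str, str]], Dict[str, str]]


def plan(artifacts: Dict[str, str]) -> Dict[str, str]:
    # Placeholder: a real planning agent would prompt an LLM with the paper.
    return {"plan": f"plan for {artifacts['paper']}"}


def analyze(artifacts: Dict[str, str]) -> Dict[str, str]:
    # Placeholder: analysis builds on the plan from the previous stage.
    return {"analysis": f"analysis of {artifacts['plan']}"}


def generate_code(artifacts: Dict[str, str]) -> Dict[str, str]:
    # Placeholder: code generation consumes the output of the analysis stage.
    return {"repo": f"code from {artifacts['analysis']}"}


def run_pipeline(paper: str, stages: List[Stage]) -> Dict[str, str]:
    """Run stages in order, merging each stage's output into one dict."""
    artifacts: Dict[str, str] = {"paper": paper}
    for stage in stages:
        artifacts.update(stage(artifacts))
    return artifacts


result = run_pipeline("Transformer", [plan, analyze, generate_code])
print(list(result))
```

The point of the sketch is only the ordering: each stage can see everything produced before it, so a later stage never runs without the earlier artifacts.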


πŸ—ΊοΈ Table of Contents


⚡ Quick Start

Using OpenAI API

  • 💵 Estimated cost for using o3-mini: $0.50–$0.70
    pip install openai
    export OPENAI_API_KEY="<OPENAI_API_KEY>"
    cd scripts
    bash run.sh

Using Open Source Models with vLLM

  • If you encounter any issues installing vLLM, please refer to the official vLLM repository.
  • The default model is deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct.
    pip install vllm
    cd scripts
    bash run_llm.sh

Output Folder Structure (Only Important Files)

outputs
├── Transformer
│   ├── analyzing_artifacts
│   ├── coding_artifacts
│   └── planning_artifacts
└── Transformer_repo  # Final output repository
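
After a run, it can be handy to confirm that the folders listed above were actually produced. The following is a minimal sketch, assuming the layout shown in this section. The helper `missing_outputs` is hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import List

# Directory names taken from the tree above; the helper itself is hypothetical.
ARTIFACT_DIRS = ("planning_artifacts", "analyzing_artifacts", "coding_artifacts")


def missing_outputs(outputs_root: Path, paper_name: str) -> List[Path]:
    """Return the expected output directories that do not exist yet."""
    expected = [outputs_root / paper_name / name for name in ARTIFACT_DIRS]
    expected.append(outputs_root / f"{paper_name}_repo")
    return [path for path in expected if not path.is_dir()]


if __name__ == "__main__":
    for path in missing_outputs(Path("outputs"), "Transformer"):
        print(f"missing: {path}")
```

An empty result means all three artifact folders and the final repository folder are present.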

📚 Detailed Setup Instructions

πŸ› οΈ Environment Setup

  • 💡 To use the o3-mini version, make sure you have the latest openai package installed.
  • 📦 Install only what you need:
    • For OpenAI API: openai
    • For open-source models: vllm
      pip install openai
      pip install vllm
  • Or, if you prefer, you can install all dependencies using pip:
      pip install -r requirements.txt

📄 (Optional) Convert PDF to JSON

The following process describes how to convert a paper PDF into JSON format.
If you have access to the LaTeX source and plan to use it with PaperCoder, you may skip this step and proceed to 🚀 Running PaperCoder.
Note: In our experiments, we converted all paper PDFs to JSON format.

  1. Clone the s2orc-doc2json repository to convert your PDF file into a structured JSON format.
    (For detailed configuration, please refer to the official repository.)
      git clone https://github.com/allenai/s2orc-doc2json.git
  2. Run the PDF processing service.
      cd ./s2orc-doc2json/grobid-0.7.3
      ./gradlew run
  3. Convert your PDF into JSON format.
      mkdir -p ./s2orc-doc2json/output_dir/paper_coder
      python ./s2orc-doc2json/doc2json/grobid2json/process_pdf.py \
          -i ${PDF_PATH} \
          -t ./s2orc-doc2json/temp_dir/ \
          -o ./s2orc-doc2json/output_dir/paper_coder
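
If you convert several papers, the conversion command from step 3 can be assembled programmatically. The sketch below only builds the argument list using the `-i`, `-t`, and `-o` flags and paths shown above. The function name `build_convert_command` and its `doc2json_root` parameter are hypothetical.

```python
from pathlib import Path
from typing import List


def build_convert_command(pdf_path: str, doc2json_root: str = "./s2orc-doc2json") -> List[str]:
    """Assemble the process_pdf.py call from step 3 as an argument list."""
    root = Path(doc2json_root)
    return [
        "python",
        str(root / "doc2json" / "grobid2json" / "process_pdf.py"),
        "-i", pdf_path,                                  # input PDF
        "-t", str(root / "temp_dir"),                    # temporary directory
        "-o", str(root / "output_dir" / "paper_coder"),  # JSON output directory
    ]


cmd = build_convert_command("paper.pdf")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the PDF processing service from step 2 is running and the output directory exists.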

🚀 Running PaperCoder

  • Note: The following commands run PaperCoder on the example paper (Attention Is All You Need).
    To run PaperCoder on your own paper, modify the environment variables accordingly.

Using OpenAI API

  • 💵 Estimated cost for using o3-mini: $0.50–$0.70
    # Using the PDF-based JSON format of the paper
    export OPENAI_API_KEY="<OPENAI_API_KEY>"
    cd scripts
    bash run.sh

    # Using the LaTeX source of the paper
    export OPENAI_API_KEY="<OPENAI_API_KEY>"
    cd scripts
    bash run_latex.sh

Using Open Source Models with vLLM

  • The default model is deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct.
    # Using the PDF-based JSON format of the paper
    cd scripts
    bash run_llm.sh

    # Using the LaTeX source of the paper
    cd scripts
    bash run_latex_llm.sh
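
The four run scripts in this section differ along two axes: the model backend and the paper format. A small lookup makes that mapping explicit. This is only a sketch: the labels `openai`/`vllm` and `json`/`latex` and the helper `pick_script` are chosen for illustration, while the script names come from the commands above.

```python
from typing import Dict, Tuple

# (backend, paper format) -> script in scripts/, as listed in this section.
# The keys are labels chosen for this sketch, not options of the scripts.
RUN_SCRIPTS: Dict[Tuple[str, str], str] = {
    ("openai", "json"): "run.sh",
    ("openai", "latex"): "run_latex.sh",
    ("vllm", "json"): "run_llm.sh",
    ("vllm", "latex"): "run_latex_llm.sh",
}


def pick_script(backend: str, paper_format: str) -> str:
    """Return the run script for a backend and paper format, or raise ValueError."""
    try:
        return RUN_SCRIPTS[(backend, paper_format)]
    except KeyError:
        raise ValueError(f"unsupported combination: {backend}/{paper_format}") from None


print(pick_script("vllm", "latex"))  # prints "run_latex_llm.sh"
```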

📦 Paper2Code Benchmark Datasets

  • Hugging Face dataset: paper2code

  • You can find the description of the Paper2Code benchmark dataset in data/paper2code.

  • For more details, refer to Section 4.1 "Paper2Code Benchmark" in the paper.


📊 Model-based Evaluation of Repositories Generated by PaperCoder

  • We evaluate repository quality using a model-based approach, supporting both reference-based and reference-free settings.
    The model critiques key implementation components, assigns severity levels, and generates a 1–5 correctness score averaged over 8 samples using o3-mini-high.

  • For more details, please refer to Section 4.3.1 (Paper2Code Benchmark) of the paper.

  • Note: The following examples evaluate the sample repository (Transformer_repo).
    Please modify the relevant paths and arguments if you wish to evaluate a different repository.
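
To make the scoring step concrete, the sketch below averages a set of 1–5 scores over the sampled critiques. It assumes that a sample which could not be parsed is represented as `None` and left out of the average. That handling, and the helper name `average_score`, are assumptions for illustration rather than a description of `eval.py`.

```python
from statistics import mean
from typing import List, Optional, Tuple


def average_score(samples: List[Optional[int]]) -> Tuple[float, int]:
    """Average the valid 1-5 scores; return (mean score, number of valid samples).

    Assumption: a sample that failed to parse is represented as None and is
    excluded from the average.
    """
    valid = [s for s in samples if s is not None and 1 <= s <= 5]
    if not valid:
        raise ValueError("no valid samples to average")
    return mean(valid), len(valid)


# Illustrative scores only, not taken from a real run.
score, n_valid = average_score([5, 4, 5, 4, 5, 4, 5, 4])
print(f"Score: {score:.4f}  Valid: {n_valid}/8")
```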

πŸ› οΈ Environment Setup

    pip install tiktoken
    export OPENAI_API_KEY="<OPENAI_API_KEY>"

πŸ“ Reference-free Evaluation

  • target_repo_dir is the generated repository.
    cd codes/
    python eval.py \
        --paper_name Transformer \
        --pdf_json_path ../examples/Transformer_cleaned.json \
        --data_dir ../data \
        --output_dir ../outputs/Transformer \
        --target_repo_dir ../outputs/Transformer_repo \
        --eval_result_dir ../results \
        --eval_type ref_free \
        --generated_n 8 \
        --papercoder

πŸ“ Reference-based Evaluation

  • target_repo_dir is the generated repository.
  • gold_repo_dir should point to the official repository (e.g., author-released code).
    cd codes/
    python eval.py \
        --paper_name Transformer \
        --pdf_json_path ../examples/Transformer_cleaned.json \
        --data_dir ../data \
        --output_dir ../outputs/Transformer \
        --target_repo_dir ../outputs/Transformer_repo \
        --gold_repo_dir ../examples/Transformer_gold_repo \
        --eval_result_dir ../results \
        --eval_type ref_based \
        --generated_n 8 \
        --papercoder
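
The two evaluation commands differ only in `--eval_type` and in whether `--gold_repo_dir` is passed. The sketch below builds either argument list from the flags shown above for the Transformer example. The helper `build_eval_args` is hypothetical, and the check that the reference-based setting requires a gold repository is an assumption of this sketch.

```python
from typing import List, Optional


def build_eval_args(eval_type: str, gold_repo_dir: Optional[str] = None) -> List[str]:
    """Build the eval.py argument list for the Transformer example."""
    if eval_type not in ("ref_free", "ref_based"):
        raise ValueError(f"unknown eval_type: {eval_type}")
    if eval_type == "ref_based" and gold_repo_dir is None:
        raise ValueError("ref_based evaluation needs gold_repo_dir")

    args = [
        "python", "eval.py",
        "--paper_name", "Transformer",
        "--pdf_json_path", "../examples/Transformer_cleaned.json",
        "--data_dir", "../data",
        "--output_dir", "../outputs/Transformer",
        "--target_repo_dir", "../outputs/Transformer_repo",
    ]
    if eval_type == "ref_based":
        # Only the reference-based setting passes the official repository.
        args += ["--gold_repo_dir", gold_repo_dir]
    args += [
        "--eval_result_dir", "../results",
        "--eval_type", eval_type,
        "--generated_n", "8",
        "--papercoder",
    ]
    return args


print(" ".join(build_eval_args("ref_free")))
```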

📄 Example Output

========================================
🌟 Evaluation Summary 🌟
📄 Paper name: Transformer
🧪 Evaluation type: ref_based
📁 Target repo directory: ../outputs/Transformer_repo
📊 Evaluation result:
  📈 Score: 4.5000
  ✅ Valid: 8/8
========================================
🌟 Usage Summary 🌟
[Evaluation] Transformer - ref_based
🛠️ Model: o3-mini
📥 Input tokens: 44318 (Cost: $0.04874980)
📦 Cached input tokens: 0 (Cost: $0.00000000)
📤 Output tokens: 26310 (Cost: $0.11576400)
💵 Current total cost: $0.16451380
🪙 Accumulated total cost so far: $0.16451380
============================================
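
The cost figures in the sample output can be reproduced with simple per-token arithmetic. The token counts and costs shown are consistent with rates of $1.10 per million input tokens and $4.40 per million output tokens. These rates are inferred from the numbers above rather than stated in this README, and current pricing may differ. The helper `usage_cost` is hypothetical and ignores cached input tokens, which are zero in this example.

```python
from decimal import Decimal

# Rates in USD per one million tokens, inferred from the sample output above
# (an assumption, not an official price list).
INPUT_RATE = Decimal("1.10")
OUTPUT_RATE = Decimal("4.40")
MILLION = Decimal(1_000_000)


def usage_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the total cost in USD for uncached input and output tokens."""
    input_cost = Decimal(input_tokens) * INPUT_RATE / MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_RATE / MILLION
    return input_cost + output_cost


total = usage_cost(44318, 26310)
print(f"${total:.8f}")  # prints "$0.16451380"
```

`Decimal` is used instead of floats so that the eight-decimal figures match exactly.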
