
Rapid Generation of Rare Event Pathways Using Direction Guided Adaptive Sampling: From Ligand Unbinding to Protein (Un)Folding


TeamSuman/PathGennie


🧬 PathGennie: Rapid Rare Event Pathway Generator

This repository contains code and examples for the article: “Rapid Generation of Rare Event Pathways Using Direction-Guided Adaptive Sampling: From Ligand Unbinding to Protein (Un)Folding.” 📄 DOI: https://doi.org/10.1021/acs.jctc.5c01244

PathGennie is a general-purpose steering framework that guides molecular simulations along data-driven (or physical) collective variables (CVs) to rapidly sample rare event transitions such as:

  • Ligand unbinding
  • Protein folding and unfolding

It leverages high-performance libraries like OpenMM and MDAnalysis, and includes tooling for CV construction, adaptive sampling, and optimized trajectory generation.


🧩 Core Components

    pathgennie/
    │
    ├── README.md            # Documentation and usage guide
    ├── LICENSE              # MIT or compatible license
    ├── environment.yml      # Conda environment for reproducibility
    │
    ├── Scripts/             # Scripts for various path generation tasks
    │   ├── unbind           # Ligand unbinding module
    │   ├── unfold           # Protein unfolding module
    │   └── fold             # Protein folding / reverse folding module
    │
    ├── examples/            # example systems
    │   ├── 3PTB/            # Example: Bovine Trypsin Inhibitor
    │   │   ├── native.pdb
    │   │   ├── start.gro
    │   │   ├── topol.top
    │   │   └── system.py
    │   │
    │   └── 2JOF/            # Example: Trp-cage protein system
    │       ├── native.pdb
    │       ├── start.gro
    │       ├── topol.top
    │       └── system.py

🛠️ Installation

Clone the repository and set up the Conda environment:

    # Clone the repository
    git clone https://github.com/dmighty007/PathGennie.git

    # Navigate to the project folder
    cd PathGennie

    # Create and activate the environment
    conda env create -f environment.yml
    conda activate pathgennie

This installs all required dependencies, including:

  • openmm
  • mdanalysis
  • numpy, numba, tqdm, and more

✅ Note: Ensure Miniconda or Anaconda is installed before proceeding.
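After activating the environment, a quick import check can confirm that the packages listed above are visible to Python. The snippet below is a minimal sketch, not part of the repository: the helper name `check_dependencies` is hypothetical, and it only tests that each module can be located, not that its version matches `environment.yml`.

```python
import importlib.util

# Import names for the packages listed above (MDAnalysis is capitalized on import).
REQUIRED_MODULES = ["openmm", "MDAnalysis", "numpy", "numba", "tqdm"]


def check_dependencies(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Any entry reported as missing usually means the `pathgennie` environment is not active in the current shell.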


🚀 Use Case: Ligand Unbinding via PCA-Guided Steering

This framework enables ligand unbinding simulations by steering MD along principal components derived from distance features between a ligand and its binding site. These components form a low-dimensional CV space for guided sampling.


🧪 Step 0: Prepare Your System

  1. Create a working directory:

    mkdir Test && cd Test
  2. Add your input files:

    pbcmol.gro   # Structure with solvent and PBC
    topol.top    # GROMACS-compatible topology
  3. Perform energy minimization and equilibration using GROMACS or OpenMM. These outputs will be used as initial configurations.


🔧 Step 1: Generate PCA-Based Collective Variables

Generate a PCA model based on ligand–protein contacts:

    python pcagen.py pbcmol.gro \
        --ligand_sel "resname LIG" \
        --output pca.pkl

This script:

  • Computes distance features between the ligand and surrounding protein atoms
  • Performs PCA on the feature matrix
  • Stores the principal components in pca.pkl
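The three steps above can be illustrated with a small NumPy sketch. It is not the code in `pcagen.py`: the function names are hypothetical, the coordinates are synthetic stand-ins, and how the real script obtains multiple frames from a single structure file is not described here, so treat the multi-frame input as an assumption.

```python
import numpy as np


def distance_features(ligand_xyz, pocket_xyz):
    """Flattened ligand-pocket pairwise distances for every frame.

    ligand_xyz: (n_frames, n_lig, 3); pocket_xyz: (n_frames, n_pocket, 3).
    Returns an array of shape (n_frames, n_lig * n_pocket).
    """
    diff = ligand_xyz[:, :, None, :] - pocket_xyz[:, None, :, :]
    return np.linalg.norm(diff, axis=-1).reshape(len(ligand_xyz), -1)


def fit_pca(features, n_components=2):
    """Return (mean, components) with components as rows, computed via SVD."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(features, mean, components):
    """Project feature vectors onto the principal components."""
    return (features - mean) @ components.T


# Synthetic stand-in for real coordinates: a 3-atom ligand drifting away
# from a fixed 5-atom pocket over 20 frames (not data from the article).
rng = np.random.default_rng(0)
pocket = np.tile(rng.normal(size=(1, 5, 3)), (20, 1, 1))
shift = np.linspace(0.0, 2.0, 20)[:, None, None] * np.array([1.0, 0.0, 0.0])
ligand = rng.normal(scale=0.05, size=(20, 3, 3)) + shift

feats = distance_features(ligand, pocket)
mean, comps = fit_pca(feats, n_components=2)
cvs = project(feats, mean, comps)
print(cvs.shape)  # (20, 2)
```

Each row of `cvs` is one frame expressed in the low-dimensional CV space. A model file such as `pca.pkl` would need to keep at least the mean and the components so that new frames can be projected the same way.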

⚙️ Step 2: Define the Simulation Object

Create a system.py file to build the OpenMM simulation system:

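    from openmm.app import *
    from openmm import *
    from openmm.unit import *


    class Simulation_obj:
        def __init__(self):
            # Load coordinates and box vectors from the GROMACS structure file
            gro = GromacsGroFile('pbcmol.gro')
            # Parse the topology; includeDir must point to the local GROMACS force-field files
            top = GromacsTopFile('topol.top',
                                 periodicBoxVectors=gro.getPeriodicBoxVectors(),
                                 includeDir='/usr/local/gromacs/share/gromacs/top')
            # PME electrostatics, 1 nm cutoff, constrained bonds to hydrogen
            system = top.createSystem(nonbondedMethod=PME,
                                      nonbondedCutoff=1*nanometer,
                                      constraints=HBonds)
            # Langevin dynamics at 300 K, 1/ps friction, 4 fs time step
            integrator = LangevinMiddleIntegrator(300*kelvin, 1/picosecond, 0.004*picoseconds)
            self.simulation = Simulation(top.topology, system, integrator)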

📘 Refer to the OpenMM documentation for advanced setup options.
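Before launching a long run, it can help to confirm that `system.py` is importable and defines `Simulation_obj`. The helper below is a hypothetical pre-flight check using only the standard library. It makes no claim about how the PathGennie scripts load the file themselves.

```python
import importlib.util
from pathlib import Path


def load_simulation_class(path="system.py", class_name="Simulation_obj"):
    """Import a system.py file by path and return the named class."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found (cwd: {Path.cwd()})")
    # The module name is arbitrary; it only labels the loaded file.
    spec = importlib.util.spec_from_file_location("user_system", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    try:
        return getattr(module, class_name)
    except AttributeError:
        raise AttributeError(f"{path} does not define {class_name}") from None
```

With OpenMM installed and the input files in place, `load_simulation_class()().simulation` would build the `Simulation` object defined above.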


🚦 Step 3: Run the Unbinding Simulation

Run the unbinding driver with:

    ../../Scripts/unbind \
        --structure_file pbcmol.gro \
        --verbose \
        --relax1 10 \
        --relax2 15 \
        --max_probes 50 \
        --temperature 300 \
        --model_file pca.pkl

Output:

  • trajectory.xtc: Reactive trajectory file showing unbinding progression
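The probe-and-relax idea behind `--relax1`, `--relax2`, and `--max_probes` can be pictured with a toy model. The sketch below is an assumption-laden illustration, not the algorithm implemented in `Scripts/unbind`: MD is replaced by a Gaussian random walk, the CV is a projection onto a fixed direction, and the rule "keep the probe that advances furthest" is a guess based only on the option descriptions in this README.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def short_md(state, n_steps, rng, step=0.1):
    """Toy stand-in for MD: an unbiased Gaussian random walk."""
    state = list(state)
    for _ in range(n_steps):
        state = [x + rng.gauss(0.0, step) for x in state]
    return state


def guided_path(start, direction, target, relax1=10, relax2=15,
                max_probes=50, max_cycles=1000, seed=0):
    """Advance `start` until its projection on `direction` reaches `target`."""
    rng = random.Random(seed)
    current, path = list(start), [list(start)]
    for _ in range(max_cycles):
        if dot(current, direction) >= target:
            return path
        # Launch short trial probes and keep the one that moves furthest forward.
        probes = [short_md(current, relax1, rng) for _ in range(max_probes)]
        best = max(probes, key=lambda p: dot(p, direction))
        if dot(best, direction) <= dot(current, direction):
            continue  # no probe made progress; try a fresh batch
        # Accept the probe, then relax it with a longer unbiased segment.
        current = short_md(best, relax2, rng)
        path.append(current)
    raise RuntimeError("target not reached within max_cycles")


path = guided_path([0.0, 0.0], direction=[1.0, 0.0], target=5.0)
print(len(path), round(dot(path[-1], [1.0, 0.0]), 2))
```

No biasing force is applied in this toy: progress comes only from selecting among unbiased short segments, which is why the number of probes per cycle matters.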

🔄 CLI Options

| Parameter | Description | Default |
|---|---|---|
| --ligand_name | Ligand residue name | LIG |
| --selection_radius | Distance (Å) to include nearby protein atoms | 20.0 |
| --relax1 | MD steps for trial probe | 10 |
| --relax2 | MD steps for relaxation after acceptance | 15 |
| --max_probes | Number of parallel probes per cycle | 50 |
| --temperature | Temperature in Kelvin | 290 |
| --no_save | If set, disables output trajectory | False |
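For scripting around the driver, the documented options and defaults can be mirrored in an `argparse` parser. This is a hypothetical reconstruction from the table only. The actual parser in `Scripts/unbind` may differ, and it also accepts options not listed here, such as `--structure_file`, `--model_file`, and `--verbose`.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options documented in the table."""
    parser = argparse.ArgumentParser(description="Sketch of the unbind CLI options")
    parser.add_argument("--ligand_name", default="LIG",
                        help="Ligand residue name")
    parser.add_argument("--selection_radius", type=float, default=20.0,
                        help="Distance (Angstrom) to include nearby protein atoms")
    parser.add_argument("--relax1", type=int, default=10,
                        help="MD steps for trial probe")
    parser.add_argument("--relax2", type=int, default=15,
                        help="MD steps for relaxation after acceptance")
    parser.add_argument("--max_probes", type=int, default=50,
                        help="Number of parallel probes per cycle")
    parser.add_argument("--temperature", type=float, default=290.0,
                        help="Temperature in Kelvin")
    parser.add_argument("--no_save", action="store_true",
                        help="If set, disables output trajectory")
    return parser


args = build_parser().parse_args(["--temperature", "300", "--max_probes", "20"])
print(args.temperature, args.max_probes, args.no_save)  # 300.0 20 False
```

Note that the documented default temperature is 290 K, while the Step 3 example passes `--temperature 300` explicitly.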

💡 For full options, run:

    ../../Scripts/unbind -h

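🔁 Use Case: Protein Folding and Unfolding

The same logic applies to protein folding and unfolding simulations. In place of a PCA model file, these commands take a reference configuration (--ref_config) and a starting configuration (--start_config).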

Unfolding Example

    ../../Scripts/unfold \
        --ref_config folded.gro \
        --start_config folded.gro \
        --verbose \
        --relax1 10 \
        --relax2 15 \
        --max_probes 50 \
        --temperature 290

Folding Example

    ../../Scripts/fold \
        --ref_config folded.gro \
        --start_config equili.gro \
        --verbose \
        --relax1 10 \
        --relax2 15 \
        --max_probes 50 \
        --temperature 290

📌 Make sure system.py is present in the working directory for these commands.


📂 Example Datasets

The examples/ directory includes two protein systems with:

  • Native PDB structure
  • Starting gro configuration
  • GROMACS topology (topol.top)
  • Compatible system.py

You can directly run folding/unfolding or unbinding simulations on these examples.


📜 License

This project is licensed under the MIT License.


🙋 Questions?

Feel free to open an issue or submit a pull request!

