This repository contains code and examples for the article: "Rapid Generation of Rare Event Pathways Using Direction-Guided Adaptive Sampling: From Ligand Unbinding to Protein (Un)Folding."

DOI: https://doi.org/10.1021/acs.jctc.5c01244
PathGennie is a general-purpose steering framework that guides molecular simulations along data-driven (or physical) collective variables (CVs) to rapidly sample rare-event transitions such as:
- Ligand unbinding
- Protein folding and unfolding
It leverages high-performance libraries like OpenMM and MDAnalysis, and includes tooling for CV construction, adaptive sampling, and optimized trajectory generation.
    pathgennie/
    │
    ├── README.md            # Documentation and usage guide
    ├── LICENSE              # MIT or compatible license
    ├── environment.yml      # Conda environment for reproducibility
    │
    ├── Scripts/             # Scripts for various path generation tasks
    │   ├── unbind           # Ligand unbinding module
    │   ├── unfold           # Protein unfolding module
    │   └── fold             # Protein folding / reverse folding module
    │
    └── examples/            # Example systems
        ├── 3PTB/            # Example: bovine trypsin-inhibitor complex
        │   ├── native.pdb
        │   ├── start.gro
        │   ├── topol.top
        │   └── system.py
        │
        └── 2JOF/            # Example: Trp-cage protein system
            ├── native.pdb
            ├── start.gro
            ├── topol.top
            └── system.py

Clone the repository and set up the Conda environment:
    # Clone the repository
    git clone https://github.com/dmighty007/PathGennie.git

    # Navigate to the project folder
    cd PathGennie

    # Create and activate the environment
    conda env create -f environment.yml
    conda activate pathgennie

This installs all required dependencies, including:
- openmm
- mdanalysis
- numpy, numba, tqdm, and more
Note: Ensure Miniconda or Anaconda is installed before proceeding.
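After activating the environment, a quick check can confirm that the listed packages are importable. The snippet below is a minimal sketch and not part of the repository: `missing_modules` is a hypothetical helper, and the import names (`openmm`, `MDAnalysis`, `numpy`, `numba`, `tqdm`) are assumed to match the packages named above.

```python
from importlib.util import find_spec

# Import names assumed for the packages listed above.
REQUIRED = ("openmm", "MDAnalysis", "numpy", "numba", "tqdm")


def missing_modules(names=REQUIRED):
    """Return the names that cannot be located for import (hypothetical helper)."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages were found.")
```

If anything is reported missing, recreating the environment from environment.yml is usually the simplest fix.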
This framework enables ligand unbinding simulations by steering MD along principal components derived from distance features between a ligand and its binding site. These components form a low-dimensional CV space for guided sampling.
- Create a working directory:

      mkdir Test && cd Test

- Add your input files:

      pbcmol.gro   # Structure with solvent and PBC
      topol.top    # GROMACS-compatible topology

- Perform energy minimization and equilibration using GROMACS or OpenMM. The resulting outputs are used as the initial configurations.
Generate a PCA model based on ligand-protein contacts:
    python pcagen.py pbcmol.gro \
        --ligand_sel "resname LIG" \
        --output pca.pkl

This script:
- Computes distance features between the ligand and surrounding protein atoms
- Performs PCA on the feature matrix
- Stores the principal components in pca.pkl
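The three steps above can be illustrated with a small NumPy sketch. This is not the implementation in pcagen.py: the function names are hypothetical, the coordinates are synthetic, periodic boundary conditions are ignored, and PCA is done with a plain SVD. It only shows how ligand-pocket distances can become a feature matrix and how frames are projected onto the leading components.

```python
import numpy as np


def distance_features(ligand_xyz, pocket_xyz):
    """Flattened ligand-pocket pairwise distances for one frame (no PBC handling)."""
    diff = ligand_xyz[:, None, :] - pocket_xyz[None, :, :]
    return np.linalg.norm(diff, axis=-1).ravel()


def fit_pca(features, n_components=2):
    """PCA via SVD of the centred feature matrix; components are returned as rows."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(features, mean, components):
    """Project feature vectors onto the principal components (the CV space)."""
    return (features - mean) @ components.T


# Synthetic example: a 3-atom "ligand" drifting away from a 6-atom "pocket".
rng = np.random.default_rng(0)
pocket = rng.normal(size=(6, 3))
ligand0 = rng.normal(size=(3, 3))
frames = [
    ligand0 + np.array([0.1 * k, 0.0, 0.0]) + 0.01 * rng.normal(size=(3, 3))
    for k in range(20)
]

X = np.array([distance_features(lig, pocket) for lig in frames])  # (20, 18)
mean, components = fit_pca(X, n_components=2)
cv = project(X, mean, components)                                 # (20, 2)
```

In a real workflow the mean and components would be serialized (for example with pickle) so that a driver can project new configurations into the same CV space.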
Create a system.py file to build the OpenMM simulation system:
    from openmm.app import *
    from openmm import *
    from openmm.unit import *


    class Simulation_obj:
        def __init__(self):
            # Structure file provides coordinates and periodic box vectors
            gro = GromacsGroFile('pbcmol.gro')
            # Adjust includeDir to the GROMACS topology directory on your machine
            top = GromacsTopFile('topol.top',
                                 periodicBoxVectors=gro.getPeriodicBoxVectors(),
                                 includeDir='/usr/local/gromacs/share/gromacs/top')
            system = top.createSystem(nonbondedMethod=PME,
                                      nonbondedCutoff=1*nanometer,
                                      constraints=HBonds)
            # 300 K, 1/ps friction coefficient, 4 fs time step
            integrator = LangevinMiddleIntegrator(300*kelvin, 1/picosecond, 0.004*picoseconds)
            self.simulation = Simulation(top.topology, system, integrator)

Refer to the OpenMM documentation for advanced setup options.
Run the unbinding driver with:
    ../../Scripts/unbind \
        --structure_file pbcmol.gro \
        --verbose \
        --relax1 10 \
        --relax2 15 \
        --max_probes 50 \
        --temperature 300 \
        --model_file pca.pkl

Output:
trajectory.xtc: Reactive trajectory file showing unbinding progression
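To give a feel for how the `--relax1`, `--relax2`, and `--max_probes` options interact, here is a toy probe-and-accept loop on a 2D harmonic well. It is a sketch under stated assumptions, not the algorithm implemented in the unbind script: the dynamics are overdamped Langevin steps in NumPy rather than OpenMM MD, the CV is simply the distance from the well centre, and the acceptance rule (keep the probe that advanced furthest, if it advanced at all) is an illustrative guess. All function names are hypothetical.

```python
import numpy as np


def force(x):
    """Force of a harmonic well U = 0.5*|x|^2 (toy stand-in for a bound state)."""
    return -x


def md_steps(x, n_steps, rng, dt=0.01, kT=1.0):
    """Overdamped Langevin (Euler-Maruyama) steps; a stand-in for real MD."""
    for _ in range(n_steps):
        x = x + dt * force(x) + np.sqrt(2.0 * kT * dt) * rng.normal(size=x.shape)
    return x


def cv(x):
    """Toy collective variable: distance from the well centre."""
    return float(np.linalg.norm(x))


def steer(x0, target, relax1=10, relax2=15, max_probes=50, max_cycles=2000, seed=0):
    """Advance along the CV by repeated short probes followed by relaxation."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    path = [x.copy()]
    for _ in range(max_cycles):
        if cv(x) >= target:
            break
        # Launch several short trial trajectories from the current state.
        probes = [md_steps(x, relax1, rng) for _ in range(max_probes)]
        best = max(probes, key=cv)
        # Accept the most advanced probe only if it made progress, then relax it.
        if cv(best) > cv(x):
            x = md_steps(best, relax2, rng)
            path.append(x.copy())
    return np.array(path)


path = steer(np.zeros(2), target=2.0)
print(f"{len(path) - 1} accepted cycles, final CV = {cv(path[-1]):.2f}")
```

More probes per cycle make it likelier that at least one trial moves forward, while longer relaxation lets the accepted configuration settle before the next round of probes.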
| Parameter | Description | Default |
|---|---|---|
| --ligand_name | Ligand residue name | LIG |
| --selection_radius | Cutoff distance (Å) for including nearby protein atoms | 20.0 |
| --relax1 | MD steps for each trial probe | 10 |
| --relax2 | MD steps for relaxation after acceptance | 15 |
| --max_probes | Number of parallel probes per cycle | 50 |
| --temperature | Temperature in Kelvin | 290 |
| --no_save | If set, disables output trajectory | False |
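When scanning several parameter settings, it can be convenient to assemble the command line from Python. The helper below is a hypothetical convenience wrapper, not something shipped with the repository; it uses only flags that appear in this README and assumes the script path used in the example command.

```python
import shlex


def build_unbind_command(script="../../Scripts/unbind", **options):
    """Turn keyword options into an argument list (hypothetical helper).

    True adds a bare flag, False/None skips the option, and any other
    value is passed as "--name value".
    """
    cmd = [script]
    for name, value in options.items():
        flag = f"--{name}"
        if value is True:
            cmd.append(flag)
        elif value is False or value is None:
            continue
        else:
            cmd.extend([flag, str(value)])
    return cmd


cmd = build_unbind_command(
    structure_file="pbcmol.gro",
    model_file="pca.pkl",
    relax1=10,
    relax2=15,
    max_probes=50,
    temperature=300,
    verbose=True,
)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the working directory that contains system.py.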
For full options, run:
    ../../Scripts/unbind -h

The same logic applies to protein folding/unfolding simulations.
Unfolding:

    ../../Scripts/unfold \
        --ref_config folded.gro \
        --start_config folded.gro \
        --verbose \
        --relax1 10 \
        --relax2 15 \
        --max_probes 50 \
        --temperature 290

Folding:

    ../../Scripts/fold \
        --ref_config folded.gro \
        --start_config equili.gro \
        --verbose \
        --relax1 10 \
        --relax2 15 \
        --max_probes 50 \
        --temperature 290

Make sure system.py is present in the working directory for these commands.
The examples/ directory includes two protein systems with:
- Native PDB structure
- Starting .gro configuration
- GROMACS topology (topol.top)
- Compatible system.py
You can directly run folding/unfolding or unbinding simulations on these examples.
This project is licensed under the MIT License.
Feel free to open an issue or submit a pull request!