Commit 17bd745

Merge pull request #27 from akalpokas/feature_protein_fep
Add protein FEP tutorial
2 parents 163ca44 + 2b74c8c commit 17bd745

8 files changed

+886
-2
lines changed

04_fep/04_PFEP/01_setup_pfep.ipynb

Lines changed: 553 additions & 0 deletions
Large diffs are not rendered by default.

04_fep/04_PFEP/01_setup_pfep.md

Lines changed: 272 additions & 0 deletions
@@ -0,0 +1,272 @@
1+
# Alchemical Protein Mutations
2+
3+
In this tutorial you will learn how to use BioSimSpace's mapping functionality to set up alchemical calculations that compute the change in the binding affinity of a ligand caused by a protein mutation. Specifically, we are going to focus on two proteins: first, setting up a single alchemical point mutation in ubiquitin, and second, setting up a mutation in [aldose reductase](https://en.wikipedia.org/wiki/Aldose_reductase) (AR), a drug target for the treatment of diabetic nephropathy. It is recommended to complete the [previous BioSimSpace tutorials](https://github.com/OpenBioSim/biosimspace_tutorials) before attempting this one.
4+
5+
The relative change in the binding affinity as a result of a mutation, $\Delta \Delta G_{mut}$, can be calculated from the difference between the free energies of mutation in the holo (bound) and apo (unbound) simulation legs, i.e.:
6+
7+
![pfep_tutorial_tcycle](images/pfep_tutorial_tcycle.png)
8+
9+
$$
10+
\Delta \Delta G_{mut} = \Delta G_{holo} - \Delta G_{apo}
11+
$$
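
As a minimal sketch of the arithmetic in the equation above, the snippet below combines the two leg free energies into $\Delta \Delta G_{mut}$. The function name `ddg_mutation`, the numerical values and the quadrature error propagation (which assumes the two legs are statistically independent) are illustrative assumptions and are not part of BioSimSpace.

```python
import math


def ddg_mutation(dg_holo, dg_apo, err_holo=0.0, err_apo=0.0):
    """Return (ddG_mut, error) from the holo and apo leg free energies.

    ddG_mut = dG_holo - dG_apo. The errors are combined in quadrature,
    which assumes the two legs are statistically independent.
    """
    ddg = dg_holo - dg_apo
    err = math.sqrt(err_holo**2 + err_apo**2)
    return ddg, err


# Hypothetical leg free energies in kcal/mol, for illustration only.
ddg, err = ddg_mutation(dg_holo=-3.0, dg_apo=-4.5, err_holo=0.3, err_apo=0.4)
print(f"ddG_mut = {ddg:.2f} +/- {err:.2f} kcal/mol")
```

With this sign convention, a positive value means the mutation is less favourable in the bound state than in the unbound state, i.e. the mutation weakens binding.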
12+
13+
To get started, let's go through a simple example of generating the required input files in order to set up an alchemical mutation.
14+
15+
## Simple Case - Input File Generation
16+
17+
In order to create an alchemical protein system in BioSimSpace, we need two input protein structures: a wild-type and a mutant. We also need to make sure that the atom ordering between the two proteins is identical. Don't worry, this is an easy requirement to satisfy. We will load the structure `1UBQ` via [sire](https://sire.openbiosim.org/), which comes bundled with BioSimSpace:
18+
19+
20+
```python
21+
import BioSimSpace as BSS
22+
import sire as sr
23+
mols = sr.load("1UBQ")
24+
```
25+
26+
There are multiple ways of generating a mutant structure from a wild-type protein. Some examples are:
27+
- [PyMOL Mutagenesis Plugin](https://pymolwiki.org/index.php/Mutagenesis) (when exporting the mutant structure, make sure you select 'retain atom ids' under 'PDB Options', or pass both input structures through *pdb4amber*)
28+
- [HTMD](https://software.acellera.com/htmd/tutorials/system-building-protein-protein.html#mutate-modified-residues)
29+
- [FoldX](https://foldxsuite.crg.eu/command/BuildModel)
30+
- [pdb4amber](https://ambermd.org/tutorials/basic/tutorial9/index.php)
31+
32+
For this simple case we are going to use *pdb4amber* to mutate the threonine at position 9 to an alanine residue. First we are going to pass the wild-type protein from the crystal structure through *pdb4amber* in order to create a consistent atom ordering between the wild-type and mutant structures:
33+
34+
35+
```python
36+
!pdb4amber --reduce --dry --add-missing-atoms -o 1UBQ_dry_wt.pdb 1UBQ.pdb
37+
```
38+
39+
Next, we are going to create a mutant structure:
40+
41+
42+
```python
43+
!pdb4amber --reduce --dry -o 1UBQ_dry_t9a.pdb -m "9-ALA" --add-missing-atoms 1UBQ_dry_wt.pdb
44+
```
45+
46+
<div class="alert alert-block alert-warning">
47+
<b>Warning:</b> This is a simple but ultimately crude way of generating a mutant structure. Factors such as side-chain rotamers, packing and protonation states need to be taken into account in order to accurately and robustly describe the mutant end state.
48+
</div>
49+
50+
## Simple Case - Alchemical System Generation
51+
52+
Now that the correct input files have been created, we can proceed to create an alchemical protein in BioSimSpace. Let's load our two proteins:
53+
54+
55+
```python
56+
protein_wt = BSS.IO.readMolecules("1UBQ_dry_wt.pdb")[0]
57+
protein_mut = BSS.IO.readMolecules("1UBQ_dry_t9a.pdb")[0]
58+
```
59+
60+
Next, we want to parametrise them with our forcefield of choice:
61+
62+
63+
```python
64+
protein_wt = BSS.Parameters.ff14SB(protein_wt).getMolecule()
65+
protein_mut = BSS.Parameters.ff14SB(protein_mut).getMolecule()
66+
```
67+
68+
Now we want to compute the mapping between the two proteins. First, let's figure out the residue index of our residue of interest (ROI):
69+
70+
71+
```python
72+
protein_wt.getResidues()[7:10]
73+
```
74+
75+
76+
```python
77+
protein_mut.getResidues()[7:10]
78+
```
79+
80+
We can see that the residues at index 8 differ between the two proteins. Let's pass this value to the [`BioSimSpace.Align.matchAtoms`](https://biosimspace.openbiosim.org/api/generated/BioSimSpace.Align.matchAtoms.html#BioSimSpace.Align.matchAtoms) function:
81+
82+
83+
```python
84+
mapping = BSS.Align.matchAtoms(molecule0=protein_wt, molecule1=protein_mut, roi=[8])
85+
```
86+
87+
<div class="alert alert-block alert-info">
88+
<b>Note:</b> You can also pass multiple residue-of-interest indices to the mapping if you wish to mutate several residues simultaneously.
89+
</div>
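
The ROI values used above are zero-based residue indices rather than PDB residue numbers, which is why threonine 9 corresponds to index 8. The helper below is a small sketch of that conversion. `residue_number_to_index` is a hypothetical name, and the function assumes that residues are numbered contiguously from `first_resnum` with no gaps or insertion codes, which is not true of every PDB file.

```python
def residue_number_to_index(resnum, first_resnum=1):
    """Convert a PDB residue number to a zero-based residue index.

    Assumes contiguous numbering starting at first_resnum, with no gaps
    or insertion codes.
    """
    if resnum < first_resnum:
        raise ValueError("resnum is below the first residue number")
    return resnum - first_resnum


# Threonine 9 in ubiquitin (numbering starts at 1) has index 8.
roi = [residue_number_to_index(9)]
print(roi)
```

When the numbering is not contiguous, inspecting the residues directly, as done with `getResidues()` above, is the safer route.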
90+
91+
Now that the mapping has been computed, we can visualise it:
92+
93+
94+
```python
95+
BSS.Align.viewMapping(protein_wt, protein_mut, mapping, roi=8, pixels=500)
96+
```
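
The same information can also be read off in text form. The sketch below uses a toy stand-in for the threonine-to-alanine side-chain mapping, expressed as a dictionary from wild-type atom index to mutant atom index. The atom lists, the indices and the helper names `element` and `summarise` are hypothetical and local to this example. It is not the output of `matchAtoms`, and the first-letter element guess is a deliberate simplification.

```python
# Toy side-chain atoms for threonine (wild-type) and alanine (mutant).
# The indices are hypothetical and local to this example.
thr_atoms = ["CB", "HB", "OG1", "HG1", "CG2", "HG21", "HG22", "HG23"]
ala_atoms = ["CB", "HB1", "HB2", "HB3"]

# Hypothetical mapping of the form {wild-type index: mutant index}.
toy_mapping = {0: 0, 1: 1, 2: 2, 4: 3}


def element(atom_name):
    """Crude element guess: the first letter of the atom name."""
    return atom_name[0]


def summarise(mapping, atoms0, atoms1):
    """Return mapped pairs that change element, and unmapped atoms of atoms0."""
    changed = [
        (atoms0[i], atoms1[j])
        for i, j in sorted(mapping.items())
        if element(atoms0[i]) != element(atoms1[j])
    ]
    unmapped = [name for i, name in enumerate(atoms0) if i not in mapping]
    return changed, unmapped


changed, unmapped = summarise(toy_mapping, thr_atoms, ala_atoms)
print("Change element:", changed)
print("No partner in the mutant:", unmapped)
```

In this toy case the hydroxyl oxygen and the methyl carbon map onto alanine hydrogens, while the hydrogens attached to them have no partner in the mutant.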
97+
98+
The computed atom mapping shows that both the hydroxyl and methyl groups in the threonine side chain will be transformed into hydrogen atoms. We can now proceed to align the two residues of interest:
99+
100+
101+
```python
102+
aligned_wt = BSS.Align.rmsdAlign(molecule0=protein_wt, molecule1=protein_mut, roi=[8])
103+
```
104+
105+
Finally, we can create a merged alchemical protein system:
106+
107+
108+
```python
109+
merged_protein = BSS.Align.merge(aligned_wt, protein_mut, mapping, roi=[8])
110+
```
111+
112+
The alchemical protein can now be solvated, ionised and exported to different file formats, for example GROMACS or [SOMD2, our OpenMM-based FEP engine](https://github.com/OpenBioSim/somd2):
113+
114+
115+
```python
116+
merged_system = merged_protein.toSystem()
117+
118+
# solvate the system with the padding of 15 angstroms
119+
padding = 15 * BSS.Units.Length.angstrom
120+
box_min, box_max = merged_system.getAxisAlignedBoundingBox()
121+
box_size = [y - x for x, y in zip(box_min, box_max)]
122+
box_sizes = [x + padding for x in box_size]
123+
124+
box, angles = BSS.Box.rhombicDodecahedronHexagon(max(box_sizes))
125+
solvated_system = BSS.Solvent.tip3p(molecule=merged_system, box=box, angles=angles, ion_conc=0.15)
126+
```
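
The box-size arithmetic in the cell above can be checked in isolation with plain floats. This is a sketch only: `box_length` is a hypothetical helper, the bounding-box corners are made-up values in angstroms, and no BioSimSpace unit objects are involved.

```python
def box_length(box_min, box_max, padding):
    """Return the largest axis extent plus padding (plain floats, angstroms)."""
    extents = [hi - lo for lo, hi in zip(box_min, box_max)]
    return max(extent + padding for extent in extents)


# Hypothetical bounding-box corners in angstroms.
length = box_length([0.0, 0.0, 0.0], [30.0, 42.0, 36.0], padding=15.0)
print(length)
```

As written, the padding is added once to each axis extent, so a centred solute would have half of the padding on either side along its longest axis.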
127+
128+
129+
```python
130+
# export the solvated system to GROMACS input files
131+
BSS.IO.saveMolecules("apo_ubiquitin_t9a", solvated_system, ["gro87", "grotop"])
132+
```
133+
134+
135+
```python
136+
# export the solvated system to SOMD2 input file
137+
BSS.Stream.save(solvated_system, "apo_ubiquitin_t9a")
138+
```
139+
140+
# Aldose Reductase - Alchemical System Generation
141+
142+
## Apo System
143+
144+
Now we are going to focus on the aldose reductase system and set up an alchemical transformation in both the apo and holo forms of the protein. The input files (2PDG_8.0) were taken from the SI of a [paper by Aldeghi et al.](https://pubs.acs.org/doi/10.1021/acscentsci.8b00717); residue 47 was mutated via PyMOL (V47I) and the structures were standardised via *pdb4amber*.
145+
146+
147+
```python
148+
protein_wt = BSS.IO.readMolecules(BSS.IO.expand(BSS.tutorialUrl(), "aldose_reductase_dry.pdb"))[0]
149+
protein_mut = BSS.IO.readMolecules(BSS.IO.expand(BSS.tutorialUrl(), "aldose_reductase_v47i_dry.pdb"))[0]
150+
```
151+
152+
We can use `ensure_compatible=False` in order to get tLEaP to re-add the hydrogens for us:
153+
154+
155+
```python
156+
protein_wt = BSS.Parameters.ff14SB(protein_wt, ensure_compatible=False).getMolecule()
157+
protein_mut = BSS.Parameters.ff14SB(protein_mut, ensure_compatible=False).getMolecule()
158+
```
159+
160+
161+
```python
162+
protein_wt.getResidues()[44:47]
163+
```
164+
165+
166+
```python
167+
protein_mut.getResidues()[44:47]
168+
```
169+
170+
This time we are going to automatically detect the different residues between the two proteins:
171+
172+
173+
```python
174+
roi = []
175+
for i, res in enumerate(protein_wt.getResidues()):
176+
    if res.name() != protein_mut.getResidues()[i].name():
177+
        print(res, protein_mut.getResidues()[i])
178+
        roi.append(res.index())
179+
```
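
The comparison in the loop above relies on both proteins having the same number of residues in the same order. A slightly more defensive sketch, operating on plain lists of residue names, is shown below. `find_mutated_indices` and the example lists are hypothetical. In the notebook, the names could be collected with `[res.name() for res in protein_wt.getResidues()]`.

```python
def find_mutated_indices(names_wt, names_mut):
    """Return zero-based indices at which the residue names differ.

    Raises ValueError if the sequences differ in length, because an
    index-by-index comparison would then be meaningless.
    """
    if len(names_wt) != len(names_mut):
        raise ValueError("proteins have different numbers of residues")
    return [i for i, (a, b) in enumerate(zip(names_wt, names_mut)) if a != b]


# Hypothetical residue-name lists, for illustration only.
wt = ["MET", "GLN", "VAL", "PHE"]
mut = ["MET", "GLN", "ILE", "PHE"]
roi = find_mutated_indices(wt, mut)
print(roi)
```

Because only names are compared, residues whose names differ purely through protonation-state naming (for example HIS versus HIE) would also be reported, so the resulting list is worth checking by eye.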
180+
181+
We can then pass these residue indices to the mapping function as before:
182+
183+
184+
```python
185+
mapping = BSS.Align.matchAtoms(molecule0=protein_wt, molecule1=protein_mut, roi=roi)
186+
```
187+
188+
189+
```python
190+
BSS.Align.viewMapping(protein_wt, protein_mut, mapping, roi=roi[0], pixels=500)
191+
```
192+
193+
The mapping shows that the perturbation will transform a hydrogen to a methyl group. Is this what we would expect for a valine to isoleucine transformation? If we are happy, we can proceed with the rest of the set up as before:
194+
195+
196+
```python
197+
aligned_wt = BSS.Align.rmsdAlign(molecule0=protein_wt, molecule1=protein_mut, roi=roi)
198+
merged_protein = BSS.Align.merge(aligned_wt, protein_mut, mapping, roi=roi)
199+
```
200+
201+
202+
```python
203+
merged_system = merged_protein.toSystem()
204+
```
205+
206+
207+
```python
208+
padding = 15 * BSS.Units.Length.angstrom
209+
210+
box_min, box_max = merged_system.getAxisAlignedBoundingBox()
211+
box_size = [y - x for x, y in zip(box_min, box_max)]
212+
box_sizes = [x + padding for x in box_size]
213+
```
214+
215+
216+
```python
217+
box, angles = BSS.Box.rhombicDodecahedronHexagon(max(box_sizes))
218+
solvated_system = BSS.Solvent.tip3p(molecule=merged_system, box=box, angles=angles, ion_conc=0.15)
219+
```
220+
221+
222+
```python
223+
# export the solvated system to GROMACS input files
224+
BSS.IO.saveMolecules("apo_aldose_reductase_v47i", solvated_system, ["gro87", "grotop"])
225+
```
226+
227+
228+
```python
229+
# export the solvated system to SOMD2 input file
230+
BSS.Stream.save(solvated_system, "apo_aldose_reductase_v47i")
231+
```
232+
233+
## Holo System
234+
235+
To set up a holo (bound) system, we are going to load in the associated ligand and the cofactor of aldose reductase:
236+
237+
238+
```python
239+
ligand_47d = BSS.IO.readMolecules(BSS.IO.expand(BSS.tutorialUrl(), ["ligand_47_gaff2.gro", "ligand_47_gaff2.top"]))[0]
240+
cofactor_nap = BSS.IO.readMolecules(BSS.IO.expand(BSS.tutorialUrl(), ["cofactor_nap_gaff2.gro", "cofactor_nap_gaff2.top"]))[0]
241+
```
242+
243+
We can use BioSimSpace's Amber parametrisation pipeline if we wish to, but in this case the ligands have been parametrised for us so we can skip the following cell. If you uncomment and run this cell it may take several minutes to complete.
244+
245+
246+
```python
247+
# ligand_47d = BSS.Parameters.gaff2(ligand_47d, charge_method="BCC", net_charge=-1).getMolecule()
248+
# cofactor_nap = BSS.Parameters.gaff2(cofactor_nap, charge_method="BCC", net_charge=-4).getMolecule()
249+
```
250+
251+
We can simply add the ligands to our alchemical protein in order to create an alchemical holo system. This way we are assuming that the ligands are already placed correctly with respect to the protein:
252+
253+
254+
```python
255+
merged_system = merged_protein + ligand_47d + cofactor_nap
256+
```
257+
258+
As before, we can now proceed to solvate, ionise and export our prepared system, or use BioSimSpace's functionality to [further set up and execute the alchemical simulations](https://github.com/OpenBioSim/biosimspace_tutorials/tree/main/04_fep).
259+
260+
261+
```python
262+
padding = 15 * BSS.Units.Length.angstrom
263+
264+
box_min, box_max = merged_system.getAxisAlignedBoundingBox()
265+
box_size = [y - x for x, y in zip(box_min, box_max)]
266+
box_sizes = [x + padding for x in box_size]
267+
268+
box, angles = BSS.Box.rhombicDodecahedronHexagon(max(box_sizes))
269+
solvated_system = BSS.Solvent.tip3p(molecule=merged_system, box=box, angles=angles, ion_conc=0.15)
270+
271+
BSS.IO.saveMolecules("holo_aldose_reductase_v47i", solvated_system, ["gro87", "grotop"])
272+
```

04_fep/README.md

Lines changed: 7 additions & 0 deletions
@@ -28,3 +28,10 @@ Authors:
2828
- [Finlay Clark -- @fjclark](https://github.com/fjclark)
2929

3030
Functionality for running alchemical absolute binding free energy calculations is currently present in the Exscientia Sandpit within BioSimSpace. These notebooks discuss the idea of Sandpits for including experimental features in BioSimSpace, and detail setting up and analysing absolute binding free energy calculations.
31+
32+
## 03. Protein Free Energy Calculations
33+
34+
Authors:
35+
- [Audrius Kalpokas -- @akalpokas](https://github.com/akalpokas)
36+
37+
An introduction to alchemical protein free energy calculations. Specifically, this includes an overview of BioSimSpace region-of-interest (ROI) functionality and examples of how to set up, view and export alchemical protein systems.

LIVECOMS/04_fep/livecoms.tex

Lines changed: 15 additions & 1 deletion
@@ -14,7 +14,7 @@ \subsubsection{Introduction}
1414
Computational chemists can support the study of structure-activity relationships in medicinal chemistry by making computer models that can predict the binding affinity of ligands to proteins. Alchemical free energy (AFE) methods are a popular class of methodologies to do so. Some
1515
introductory reading is recommended ~\cite{mey2020best, cournia_allen_sherman_2017, kuhn_firth-clark_tosco_mey_mackey_michel_2020, Hahn2022}.
1616

17-
This tutorial covers the basic principles of alchemical free energy calculations with BioSimSpace; how to setup, simulate, and analyze alchemical Relative Binding Free Energy (RBFE) calculations for congeneric series of protein-ligand complexes; how to set up and analyze alchemical ABFE calculations of a ligand bound to a protein.
17+
This tutorial covers the basic principles of alchemical free energy calculations with BioSimSpace; how to set up, simulate, and analyze alchemical Relative Binding Free Energy (RBFE) calculations for congeneric series of protein-ligand complexes; how to set up and analyze alchemical ABFE calculations of a ligand bound to a protein; how to set up alchemical RBFE calculations involving protein mutations.
1818
The notebooks prompt the readers to complete a series of exercises that typically involve completing cells to test their understanding of the material presented.
1919

2020
\subsubsection{Setting up AFE calculations using BioSimSpace}
@@ -85,3 +85,17 @@ \subsubsection{ABFE calculations}
8585
\\
8686

8787
The fifth notebook \href{https://github.com/OpenBioSim/biosimspace_tutorials/blob/main/04_fep/03_ABFE/02_analysis_abfe.ipynb}{02-analysis-abfe} describes how to analyse an ABFE calculation to estimate the free energy of binding of a ligand. Sample output simulation data are provided for each leg of the double decoupling thermodynamic cycle and analysed using \textit{BSS.FreeEnergy.AlchemicalFreeEnergy.Analyse} to plot potentials of mean forces. The standard free energy of binding is then obtained by summing the free energy changes from each leg and adding a standard state correction term for the use of Boresch restraints, along with any symmetry corrections required. The notebook also describes how to carry out convergence analyses (see Figure \ref{abfe_fig}B) to assess the robustness of the ABFE estimates.
88+
89+
\subsubsection{Protein FEP calculations}
90+
91+
The sixth notebook \href{https://github.com/OpenBioSim/biosimspace_tutorials/blob/main/04_fep/04_PFEP/01_setup_pfep.ipynb}{01-setup-pfep} describes how to set up alchemical RBFE calculations with BSS that involve alchemical protein modifications. This free energy functionality was introduced in the 2024.2 release of BSS, which added changes to the maximum common substructure, alignment and merge functionalities in order to support the setting up of alchemical protein calculations.
92+
93+
While AFE methodologies have traditionally been developed with a focus on ligand modifications, AFE protocols that focus on protein modifications, specifically side-chain mutations, are less established. These protein free energy perturbation (PFEP) protocols allow, for example, the calculation of the change in the binding free energy as a result of a protein mutation, which can then be used to optimize the efficacy of the ligand against drug resistance, or to tune the selectivity profile against off-targets \cite{doi:10.1021/acscentsci.8b00717}. In addition, PFEP protocols also allow for estimating the effect of amino acid mutations on protein-protein binding affinities \cite{doi:10.1021/acs.jctc.3c00333}. These types of calculations involve alchemically transforming the protein side-chains in both holo and apo stages in the thermodynamic cycle, as shown in Figure \ref{fig:pfep_tcycle}.
94+
95+
\begin{figure}[htp]
96+
\includegraphics[width=\linewidth]{LIVECOMS/04_fep/pfep-tutorial_tcycle.png}
97+
\caption{Thermodynamic cycle for calculating change in the binding affinity of a ligand due to a protein mutation. The change in the binding affinity as a result of a mutation ($\Delta\Delta G_{mut}$) can be calculated by taking the difference between $\Delta G_{holo}$ and $\Delta G_{apo}$.}
98+
\label{fig:pfep_tcycle}
99+
\end{figure}
100+
101+
The notebook describes how to prepare protein input files in order to make use of the new region-of-interest (ROI) mapping functionality of \textit{BSS.Align.matchAtoms}. This is illustrated by using ubiquitin as the example protein system. Next, the notebook shows how to use the ROI features of the \textit{BSS.Align.rmsdAlign} and \textit{BSS.Align.merge} functions in order to set up an alchemical side-chain modification. Another, more realistic example with an aldose reductase system taken from a study by \citeauthor{doi:10.1021/acscentsci.8b00717} is also demonstrated, where both the apo and holo parts of the thermodynamic cycle are prepared.

LIVECOMS/main.tex

Lines changed: 3 additions & 1 deletion
@@ -51,6 +51,7 @@
5151
\author[4]{Benjamin P. Cossins}
5252
\author[3]{Adele Hardie}
5353
\author[3]{Anna M. Herz}
54+
\author[3]{Audrius Kalpokas}
5455
\author[5]{Dominykas Lukauskis}
5556
\author[3]{Antonia S.J.S. Mey}
5657
\author[2,3*]{Julien Michel}
@@ -80,6 +81,7 @@
8081
\orcid{Benjamin P. Cossins}{0000-0002-6699-8833}
8182
\orcid{Adele Hardie}{0009-0002-9943-9920}
8283
\orcid{Anna M. Herz}{0000-0003-2831-6691}
84+
\orcid{Audrius Kalpokas}{0000-0002-4579-070X}
8385
\orcid{Dominykas Lukauskis}{0000-0002-4999-2691}
8486
\orcid{Antonia Mey}{0000-0001-7512-5252}
8587
\orcid{Julien Michel}{0000-0003-0360-1760}
@@ -217,7 +219,7 @@ \section{Author Contributions}
217219
% See the policies ``Policies on Authorship'' section of https://livecoms.github.io
218220
% for more information on deciding on authorship and author order.
219221
%%%%%%%%%%%%%%%%
220-
LH prepared Tutorial 1, DL and LH prepared Tutorial 2, AH and LH prepared Tutorial 3, JS, LH, AH, FC, and JM prepared Tutorial 4, which was built on older tutorial material by AM and SB and on contributions from ZW, MS, and BC. MB and CW reviewed and tested all tutorials and ported the tutorials to a web server. The authors are listed in alphabetical order, with the exception of the first coauthor.
222+
LH prepared Tutorial 1, DL and LH prepared Tutorial 2, AH and LH prepared Tutorial 3, JS, AK, LH, AH, FC, and JM prepared Tutorial 4, which was built on older tutorial material by AM and SB and on contributions from ZW, MS, and BC. MB and CW reviewed and tested all tutorials and ported the tutorials to a web server. The authors are listed in alphabetical order, with the exception of the first coauthor.
221223
% We suggest you preserve this comment:
222224
For a more detailed description of author's contributions,
223225
see the GitHub issue tracking and changelog at \githubrepository.

LIVECOMS/references.bib

Lines changed: 36 additions & 0 deletions
@@ -561,4 +561,40 @@ @book{Bussi2019
561561
booktitle = {Biomolecular Simulations},
562562
chapter = {21},
563563
doi={10.1007/978-1-4939-9608-7_21}
564+
}
565+
566+
@article{doi:10.1021/acs.jctc.3c00333,
567+
author = {Zhang, Ivy and Rufa, Dominic A. and Pulido, Iván and Henry, Michael M. and Rosen, Laura E. and Hauser, Kevin and Singh, Sukrit and Chodera, John D.},
568+
title = {Identifying and Overcoming the Sampling Challenges in Relative Binding Free Energy Calculations of a Model Protein:Protein Complex},
569+
journal = {Journal of Chemical Theory and Computation},
570+
volume = {19},
571+
number = {15},
572+
pages = {4863-4882},
573+
year = {2023},
574+
doi = {10.1021/acs.jctc.3c00333},
575+
note ={PMID: 37450482},
576+
URL = {
577+
https://doi.org/10.1021/acs.jctc.3c00333
578+
},
579+
eprint = {
580+
https://doi.org/10.1021/acs.jctc.3c00333
581+
}
582+
}
583+
584+
@article{doi:10.1021/acscentsci.8b00717,
585+
author = {Aldeghi, Matteo and Gapsys, Vytautas and de Groot, Bert L.},
586+
title = {Accurate Estimation of Ligand Binding Affinity Changes upon Protein Mutation},
587+
journal = {ACS Central Science},
588+
volume = {4},
589+
number = {12},
590+
pages = {1708-1718},
591+
year = {2018},
592+
doi = {10.1021/acscentsci.8b00717},
593+
note ={PMID: 30648154},
594+
URL = {
595+
https://doi.org/10.1021/acscentsci.8b00717
596+
},
597+
eprint = {
598+
https://doi.org/10.1021/acscentsci.8b00717
599+
}
564600
}
