An efficient and cost-effective means to identify potential therapeutic candidates is molecular
docking and dynamics. Molecular docking is a computational technique to predict binding
geometry for compounds within the binding site of a target protein model. Protein structure
models are typically held rigid and ligands remain flexible to sample energetically favorable
conformers within the binding cavity. Molecular dynamics (MD) involves examining and
interpreting computationally predicted motions of proteins and bound ligands over time. MD
simulations typically cover short periods of real time (nano- to microseconds), but require a
significant amount of computation time (days to months) to perform. Despite that cost, a
large amount of data can be mined from these simulations, which helps to streamline more
costly experimental studies. Initial protein models for docking and
dynamics are typically derived from crystallography and nuclear magnetic resonance (NMR)
data available in the RCSB Protein Data Bank (PDB). Homology modeling can also be used to
build structural models based on similarity to known protein structures.

For the purposes of this tutorial, you will use crystal structures to perform re-docking and
cross-docking. Re-docking tests whether docking can reproduce the co-crystallized binding
geometry and orientation of a ligand in its own rigid macromolecule structure. Cross-docking
takes ligands isolated from multiple PDB files of the same protein and docks them into a
single rigid protein structure. In both cases, success is measured by the root-mean-square
deviation (RMSD) between the docked pose and the respective crystal conformer. RMSD values
less than 2.0 Å are considered good, and the closer the value is to zero, the better.
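
As a minimal sketch of the RMSD measure described above (separate from the RMSD script used later in this tutorial), the following Python computes a heavy-atom RMSD between a crystal ligand and a docked pose. It assumes that both are in PDB format, that they share one coordinate frame, and that atoms can be paired by atom name. The function names `read_heavy_atoms` and `rmsd` are illustrative, and the sketch does not account for symmetry-equivalent atoms.

```python
import math


def read_heavy_atoms(pdb_text):
    """Return {atom_name: (x, y, z)} for non-hydrogen ATOM/HETATM records."""
    coords = {}
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        name = line[12:16].strip()
        # Element symbol sits in columns 77-78; fall back to the atom name.
        element = line[76:78].strip().upper()
        if not element:
            element = name.lstrip("0123456789")[:1].upper()
        if element == "H":
            continue
        coords[name] = (
            float(line[30:38]),
            float(line[38:46]),
            float(line[46:54]),
        )
    return coords


def rmsd(reference, pose):
    """RMSD in Å over the atom names present in both structures."""
    shared = sorted(set(reference) & set(pose))
    if not shared:
        raise ValueError("the two structures share no atom names")
    total = sum(
        (a - b) ** 2
        for name in shared
        for a, b in zip(reference[name], pose[name])
    )
    return math.sqrt(total / len(shared))


# Example use with hypothetical file names:
# with open("1fm6Lig_crystal.pdb") as ref, open("1fm6Lig_docked.pdb") as docked:
#     value = rmsd(read_heavy_atoms(ref.read()), read_heavy_atoms(docked.read()))
#     print(f"RMSD = {value:.2f} Å")
```

Pairing by atom name rather than by file order is a deliberate choice here, on the assumption that atom order may differ between the crystal file and the docked output.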

Start with re-docking. This will allow you to identify protein structures that may be ideal for
screening multiple compounds and will allow you to test the docking parameters. For this step,
you will examine five crystal structures for the protein peroxisome proliferator-activated
receptor-gamma (PPARg). The five crystal structures all contain rosiglitazone as the ligand. The
PDB IDs for these structures are:
• 1FM6
• 1ZGY
• 2PRG
• 3CS8
• 3DZY
Build a table (either written in your lab notebook or as a spreadsheet on the computer) that has
the PDB codes as headers for both the rows and columns. Rows will be the protein structure;
columns will be the ligands. (See example below)
             | 1fm6 ligand | 1zgy ligand | 2prg ligand | 3cs8 ligand | 3dzy ligand
1fm6 protein |             |             |             |             |
1zgy protein |             |             |             |             |
2prg protein |             |             |             |             |
3cs8 protein |             |             |             |             |
3dzy protein |             |             |             |             |
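
If you keep the table as a spreadsheet, a short script can generate the empty grid. This is only a sketch: the function names and the CSV layout are assumptions, not part of the tutorial.

```python
import csv
import io

# PDB codes from the re-docking list above.
pdb_ids = ["1fm6", "1zgy", "2prg", "3cs8", "3dzy"]


def empty_rmsd_table(ids):
    """Build a header row of ligands plus one blank row per protein."""
    header = [""] + [f"{pdb_id} ligand" for pdb_id in ids]
    rows = [[f"{pdb_id} protein"] + [""] * len(ids) for pdb_id in ids]
    return [header] + rows


def to_csv(table):
    """Serialize the table as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(table)
    return buffer.getvalue()


print(to_csv(empty_rmsd_table(pdb_ids)))
```

Saving the printed text to a `.csv` file gives a grid with one cell per protein and ligand pair, ready for the RMSD values you record.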

Even though all of these structures are composed of the same protein and ligand, you should
notice differences between them. You can use Chimera to examine and compare the structures
by superimposing the coordinates (see the “MatchMaker” tool in Chimera to learn how to do
this). You will use the AutoDock tutorial I provided to go through the docking process with
each of these structures.
1. I recommend that your first step should be superimposing the five structures and saving
the coordinates relative to a single reference structure. This will prove helpful when
comparing the docking results later.
2. In Chimera, fetch by ID 1FM6.
Note: Not all PDB files start with chain A. Use the PDB page to check which chains are
PPARg and save the first in the series (e.g., if the PDB page for the ID lists chains D, E,
and F as PPARg, you would save chain D).
a. Select chain X, where X is the chain identifier for PPARg on the PDB page for
the ID. (Select → Chain → X)
b. Invert the selection. (Select → Invert (selected models))
c. Delete everything except chain X. (Actions → Atoms/Bonds → Delete) This should also
delete water molecules.
d. Save as PDB. (File → Save PDB…)
e. Close session
f. Repeat for remaining PDB IDs.
3. You should have 5 PDB files now saved. Each contains the protein and a ligand (ligand
abbreviation BRL). Now we will superimpose the structures and save the coordinates.
a. Open all 5 PDB files.
b. Open the MatchMaker widget. (Tools → Structure Comparison → MatchMaker)
i. Note: You can add this widget to your toolbar as follows.
ii. Favorites → Preferences…
iii. Beside “Category:” click the button and select “Tools” from the list.
iv. The window should now display all tools with check boxes for where to
display each. Under “Structure Comparison” check “On Toolbar” for
MatchMaker
v. Click Save and Close.
vi. The image on the toolbar for this tool is a couple kissing.
c. Select one of the structures as your reference.
d. Select all the others (using the Shift key) under structures to match.
e. Leave the default selections (take a look at these options so you know what the
tool is doing) and select Apply.
f. Wait until you see all the proteins superimposed in the view window (this may
take a few minutes).
i. You can reset your viewing window:
1. Tools → Viewing Controls → Side View
2. Click “View All”
ii. You can also put this tool on your toolbar (see instructions above under
MatchMaker).
g. Select each chain separately and save the new coordinates.
i. Select → Chain → X (or some other letter) [if more than one chain X is
present, select the appropriate filename]
ii. File → Save PDB…
1. Check “Save selected atoms only”.
2. Make sure the “save relative to” option shows the structure you used
as your reference for the MatchMaker tool.
3. Give the structure a new name (e.g., 2prg_1fm6Position.pdb)
4. Save
h. Once you have saved all the new coordinate files, close your session.
4. Now add hydrogen atoms to each file and save the components (protein and ligand)
separately. This is covered in the tutorial. Make sure your file names make it clear
which PDB file each protein and ligand came from (e.g.,
1fm6Prot_wH_2prgPosition.pdb and 1fm6Lig_wH_2prgPosition.pdb).
5. You should now have 5 protein files with hydrogen atoms added and 5 ligand files with
hydrogen atoms added.
6. Open ADT and prepare the macromolecule (protein) and ligand files (See tutorial).
7. You will also need to record box coordinates and dimensions to enter into Vina. Using
the same grid box tool as before, change the grid spacing from 0.375 to 1.000 so that the
number of points in each dimension equals the box length in Å, which is the unit Vina
expects. Reduce the size of your box so it covers just the binding cavity, as with
re-docking. Write down the X, Y, and Z coordinates for the center of your box and the
X, Y, and Z dimensions for your box.
8. Now create a configuration file.
a. In your working directory:
b. vi config.txt (This will be generated as a new file.)
c. If you don’t know how to use the vi tool, you can look up the commands online
by doing a search for “vi commands”. The contents of the file should be (where
## are the box coordinates and dimensions you recorded from ADT):
receptor = 1fm6Prot_2prgPos.pdbqt
center_x = ##.###
center_y = ##.###
center_z = ##.###
size_x = ##
size_y = ##
size_z = ##
d. You can also type this into a text editor and save it as config.txt, but a file written
this way does not always work with the vina command. If a configuration file written in a
text editor gives you an error, delete the file and write it again using vi.
9. In your working directory, you will run vina twice for each ligand to be tested. You
will need to set the output file name each time so it does not overwrite the previous file.
10. The command for vina is (entered as one line):
vina --config config.txt --ligand ligand.pdbqt --out protname_ligand_out.pdbqt --log log_protein_ligand.txt
Replace ligand.pdbqt with your ligand filename, and name the output and log files in a way
that lets you know what the protein and ligand were.
11. Running vina should take 30 seconds to 2 minutes per ligand. You will see a progress bar
that lets you know when the run is finished. Up to 3 runs can execute at the same time
without freezing the machine or killing performance.
12. The output is a PDBQT file that contains multiple poses. Open each in ADT and save
molecule #1 as a PDB (you will need to select molecule #1 from the list of molecules at
the bottom before saving).
13. Now you want to visualize all of your docked results in Chimera and compare the poses.
a. Open one of the protein files. It does not matter which one if you superimposed the
structures before docking.
b. Open the lowest energy pose that you saved for each protein.
c. Look at the RMSD values you recorded for each. Compare the docked poses for
each ligand to the co-crystallized position. Vina does not give you RMSD values
relative to the starting pose. The RMSD values in the log files are relative to the
lowest energy pose. These are useful to see how much sampling was carried out
for the pose. You will need to calculate RMSD values by hand or by script.
i. Run the RMSD script and write down the returned RMSD value for each
pose. You will need to compare against the crystal structure pose for the
ligand.
ii. Usage: perl rmsd_calculator_SingleUse.pl
1. Enter the file names for the reference crystal structure pose and
the docked pose when prompted.
2. The RMSD value will be displayed.
d. Any pose with an RMSD greater than 2.0 Å is not favorable. Any pose that does
not position the ligand in the same orientation as the crystal structure (e.g.,
reactive head group pointing out of the cavity rather than into it) is not
favorable.
14. Decide which of the five structures worked best and should be used for
cross-docking.
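
Steps 7 and 8 above read the box from ADT and type the configuration file by hand. As an alternative sketch, the Python below derives a box from the coordinates of the co-crystallized ligand and formats the same `key = value` lines shown in step 8. The 5 Å padding on each side and the helper names `ligand_box` and `vina_config` are assumptions for illustration, not values from the tutorial, so check the resulting box against the binding cavity in ADT before using it.

```python
import math


def ligand_box(pdb_text, padding=5.0):
    """Return (center, size) of a box around all ATOM/HETATM coordinates.

    padding is the extra space in Å added on every side (an assumed default).
    """
    points = [
        (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        for line in pdb_text.splitlines()
        if line.startswith(("ATOM", "HETATM"))
    ]
    if not points:
        raise ValueError("no ATOM/HETATM records found")
    axes = list(zip(*points))  # three tuples: all x, all y, all z
    center = tuple((min(axis) + max(axis)) / 2.0 for axis in axes)
    size = tuple(math.ceil(max(axis) - min(axis) + 2.0 * padding) for axis in axes)
    return center, size


def vina_config(receptor, center, size):
    """Format the configuration text in the layout shown in step 8."""
    lines = [f"receptor = {receptor}"]
    lines += [f"center_{axis} = {value:.3f}" for axis, value in zip("xyz", center)]
    lines += [f"size_{axis} = {value}" for axis, value in zip("xyz", size)]
    return "\n".join(lines) + "\n"


# Example use with hypothetical file names:
# with open("1fm6Lig_wH_2prgPosition.pdb") as handle:
#     center, size = ligand_box(handle.read())
# with open("config.txt", "w", newline="\n") as handle:
#     handle.write(vina_config("1fm6Prot_2prgPos.pdbqt", center, size))
```

Writing the file from a script produces plain text with Unix line endings, which may help if a configuration file saved from a text editor is rejected by vina.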

Cross-docking: Here you will see if the selected structure(s) can accurately accommodate non-
native ligands, which are different ligands found in other crystal structures for the same protein.
With the crystal structures, you have a reference for where the ligand should dock if the binding
cavity of your selected protein structure(s) can accommodate it. Using a single structure for
cross-docking is fine, but comparing results across more than one protein structure
strengthens your case and makes it easier to see what successful cross-docking looks like
compared to unsuccessful cross-docking. Given the amount of docking to be carried out for this step, you will
use AutoDock Vina instead of the autodock4 command. You will use the ligands found in the
following PDB structures:
• 1FM9
• 2F4B
• 2HWQ
• 2I4J
• 2I4P
• 2VSR
• 2VST
• 3ET1
Either expand the table you started for re-docking, or start a new one for the cross-docking
RMSD values.
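
Because cross-docking multiplies the number of Vina runs, it can help to generate the commands rather than type each one. The sketch below assembles the vina command from the re-docking section for every ligand and run, giving each output and log file a unique name. The naming pattern, the function names, and the file-matching pattern `*Lig*.pdbqt` are assumptions. The `--log` option is included only because the tutorial's command uses it, so check `vina --help` if your Vina version rejects it.

```python
from pathlib import Path


def vina_command(config, protein_tag, ligand_file, run):
    """Build one vina command line as a list of arguments."""
    ligand_tag = Path(ligand_file).stem
    return [
        "vina",
        "--config", config,
        "--ligand", ligand_file,
        "--out", f"{protein_tag}_{ligand_tag}_run{run}_out.pdbqt",
        "--log", f"log_{protein_tag}_{ligand_tag}_run{run}.txt",
    ]


def all_commands(config, protein_tag, ligand_files, runs=2):
    """Two runs per ligand by default, matching the re-docking instructions."""
    return [
        vina_command(config, protein_tag, ligand, run)
        for ligand in ligand_files
        for run in range(1, runs + 1)
    ]


if __name__ == "__main__":
    # Hypothetical pattern: every prepared ligand PDBQT in the working directory.
    ligands = sorted(str(path) for path in Path(".").glob("*Lig*.pdbqt"))
    for command in all_commands("config.txt", "1fm6Prot", ligands):
        # Print each command; paste it into the terminal or pass the list
        # to subprocess.run(command, check=True) to execute it.
        print(" ".join(command))
```

Including the run number in each file name keeps the second run from overwriting the first.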

15. Fetch the first PDB ID in Chimera.
16. Save one of the PPARg chains as before.
17. Close session.
18. Repeat for other IDs.
19. Superimpose these structures to the same reference you used in the re-docking step.
20. Save these coordinates as before.
21. Now save just the ligand as a separate PDB file.
a. Select just the ligand you want to keep. Use the PDB page information to identify
the three-letter code for the ligand. Select → Residue → XXX (where XXX
is the PDB three-letter code).
b. There may be more than one copy of your ligand in the file. Once you identify the
ligand, you can select one of the atoms (control-click) and press the “up” arrow
key to select the rest of the atoms for that ligand.
22. Add hydrogen atoms to the ligand and save.
23. Prepare the PDBQT file for the ligand using ADT (see tutorial).
24. Repeat steps 6 through 12 from the re-docking section to run Vina and save the lowest
energy poses.
25. Compare the two poses (crystal and docked) for each ligand tested. If the poses are
practically identical, you will only need to use one. If not, run vina two more times for
that ligand and see which pose shows up as the lowest energy pose more than once. If
none of the poses are the same, the ligand most likely did not dock well to the protein.
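
Saving the lowest energy pose from each output file is done by hand in ADT in the steps above. As a rough sketch of the same idea in Python, the code below splits a Vina output PDBQT into its poses, reads the affinity from each `REMARK VINA RESULT` line, and returns the lowest energy pose. It assumes the output file wraps each pose in `MODEL` and `ENDMDL` records. The function names are illustrative, and the returned text is still in PDBQT format.

```python
def split_models(pdbqt_text):
    """Return a list of poses, each as the list of lines between MODEL and ENDMDL."""
    models, current = [], None
    for line in pdbqt_text.splitlines():
        if line.startswith("MODEL"):
            current = []
        elif line.startswith("ENDMDL"):
            if current is not None:
                models.append(current)
            current = None
        elif current is not None:
            current.append(line)
    return models


def affinity(model_lines):
    """Return the first number on the REMARK VINA RESULT line, or None."""
    for line in model_lines:
        if line.startswith("REMARK VINA RESULT:"):
            return float(line.split()[3])
    return None


def lowest_energy_model(pdbqt_text):
    """Return (affinity, pose_text) for the pose with the lowest affinity value."""
    scored = [
        (affinity(model), index, model)
        for index, model in enumerate(split_models(pdbqt_text))
        if affinity(model) is not None
    ]
    if not scored:
        raise ValueError("no scored poses found")
    best_score, _, best_model = min(scored, key=lambda item: (item[0], item[1]))
    return best_score, "\n".join(best_model) + "\n"
```

Running this on the output of each repeated run gives one lowest energy pose per run, which you can then compare against each other and against the crystal pose.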

Now that you have re- and cross-docking results, analyze what you have and determine which
single protein structure would be the best for doing a docking study with a diverse ligand set.
Ideally, an appropriate protein structure model should pass both the re-docking and
cross-docking steps. Not every ligand has to pass cross-docking, but more than one should,
and the ligands that pass should ideally be structurally diverse.
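
One way to summarize the RMSD table for this decision is to count, for each protein structure, how many ligands docked under the 2.0 Å cutoff. The sketch below does that. The function names are assumptions, and the numbers in the example are placeholders invented for illustration, not docking results.

```python
CUTOFF = 2.0  # Å, the threshold used in this tutorial


def summarize(rmsd_table, cutoff=CUTOFF):
    """For each protein, list the ligands with RMSD below the cutoff."""
    summary = {}
    for protein, ligands in rmsd_table.items():
        passed = sorted(name for name, value in ligands.items() if value < cutoff)
        fraction = len(passed) / len(ligands) if ligands else 0.0
        summary[protein] = {"passed": passed, "fraction": fraction}
    return summary


def rank(rmsd_table, cutoff=CUTOFF):
    """Order protein structures from most to fewest ligands passing the cutoff."""
    summary = summarize(rmsd_table, cutoff)
    return sorted(summary, key=lambda protein: summary[protein]["fraction"], reverse=True)


# Placeholder values only, to show the expected input shape.
example = {
    "proteinA": {"ligand1": 0.8, "ligand2": 3.1},
    "proteinB": {"ligand1": 1.2, "ligand2": 1.9},
}
print(rank(example))
```

The count does not replace visual inspection: a pose under the cutoff can still have the wrong orientation, and the diversity of the ligands that pass has to be judged separately.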
