1.

Aim:
1. Predict the secondary structure of the given sequence using any online
prediction tool. The output should be a string over the alphabet {H, E, L}
with the same length as the given query sequence.
2. Using bioinformatics tools, find the annotations available for this protein.
3. BLAST against PDB to find a suitable template from mouse.
4. Construct 3 homology models using one of the best hits from the above
search.
5. Compare the Ramachandran plots of the template and the 3 constructed
models.
6. Report a cartoon diagram of the final model, and a superposition diagram
of the best model and the chosen template.
7. Compare the predicted secondary structure with that of the model.

Principle:
Homology modelling is the technique of predicting a protein's three-dimensional
structure from its sequence, with the aim of matching what experiments would
show. The sequence of amino acids in a protein determines its shape. Protein
structures can be determined by X-ray crystallography or NMR spectroscopy, or
predicted by theoretical approaches built on experimental data, such as
homology modelling. However, experimental methods, NMR included, have not
produced high-resolution structures for the majority of proteins, partly because
of protein size. Structure is more conserved than sequence during evolution:
both closely and distantly related proteins fold into the same structure when
they have similar amino acid sequences.
In this method, a target protein sequence is aligned with known template
structures to build a three-dimensional model. The protein sequence can be
obtained from NCBI or UniProt. The quality of the model depends on how well the
target sequence aligns with the most similar structure in the database.
Sequence identity is often described in zones: the safe zone (>30%), the
twilight zone below that, and the midnight zone (around 10%). Homology
modelling is generally not considered reliable below 30% sequence identity,
although a cutoff of 20% is occasionally used. Homology modelling is a
multi-step process that includes database searches, sequence alignment,
structural modification, energy minimization, and structure assessment.

Tools Used:
Chimera, JPred, MODELLER, CCP4i, ClustalW, Coot, PyMOL, PredictProtein, InterPro,
gedit, BLAST
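The identity thresholds quoted in the Principle can be checked on any pairwise alignment. The sketch below is a minimal illustration, not part of the original protocol: the function names are hypothetical, and identity is computed over columns where both sequences have a residue, which is only one of several common conventions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_band(identity: float) -> str:
    """Place an identity value relative to the 30% and 20% cutoffs from the text."""
    if identity >= 30.0:
        return "at or above 30%"
    if identity >= 20.0:
        return "between 20% and 30%"
    return "below 20%"


# Toy alignment (invented for illustration, not the real query/template pair)
print(percent_identity("MALRA-AV", "MALKAGAV"))
print(identity_band(25.0))
```

BLAST reports its own identity percentage for each hit; a helper like this is only useful for checking an alignment produced elsewhere, for example the ClustalW output.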

Procedure:

I. Online secondary structure prediction:


The query sequence was submitted to JPred, an online secondary structure
prediction tool, to predict the secondary structure of the given
sequence.
Query sequence:

>query
MALRAAVFDLVGVLTQPSVTSFWDRAEEELALPRGFLSKAFLKGG
PDGPSTRVMKGEITFSQWVPFMEEDCRKCSKDSGICLPENFSIKQIF
EKVLSARKINYPMLQAAVTLKQKGFTTCILTNNWLDDSPERGSMA
QVLCELKPHFDFLIESSQIGMVKPDPQIYKFVLDTLKTSPSEVVFLD
NFETNLQPAREMGMVTIFVRDIDAALKELEKVTGVQLLQT
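The Aim asks for a prediction string over {H, E, L} with the same length as the query. A minimal sketch of that check follows. It assumes the prediction tool writes coil as "-" or "C" (an assumption about the tool's output format), and the function names are hypothetical.

```python
def to_hel(prediction: str) -> str:
    """Map a 3-state prediction onto the alphabet H (helix), E (strand), L (loop)."""
    # Assumption: coil may appear as '-', 'C' or 'L' depending on the tool.
    mapping = {"H": "H", "E": "E", "L": "L", "C": "L", "-": "L"}
    states = []
    for symbol in prediction.upper():
        if symbol not in mapping:
            raise ValueError(f"unexpected state symbol: {symbol!r}")
        states.append(mapping[symbol])
    return "".join(states)


def check_prediction(query: str, prediction: str) -> str:
    """Return the H/E/L string, or raise if its length differs from the query."""
    states = to_hel(prediction)
    if len(states) != len(query):
        raise ValueError(
            f"prediction has {len(states)} states but query has {len(query)} residues"
        )
    return states


# Toy example using the first eight residues of the query; the states are invented.
print(check_prediction("MALRAAVF", "--HHHHEE"))
```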

II. Using bioinformatics tools, find the annotations available for this
protein:
InterPro was used for annotation.
III. BLAST against PDB to find a suitable template from mouse:
A. The sequence was searched against the PDB database with BLASTp,
with the organism restricted to Mus musculus.
B. The search produced only one hit, 1CQZ_A.

IV. Construct 3 homology models using one of the best hits from the above
search:
A. The structure was downloaded from the PDB.
1. Chain A was selected and exported using PyMOL.
2. PDB cleaning was done:
a) egrep "ATOM|HETATM|TER" 1CQZ_a.pdb > 1cqz_A_clean.pdb
b) Alternate conformers were removed using CCP4i.
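The two cleaning steps above can also be sketched in one script. This is an illustration under stated assumptions, not a replacement for the egrep and CCP4i steps actually used: it matches on the record name in columns 1-6 (egrep matches the pattern anywhere in the line), keeps only the blank or "A" alternate location, and leaves occupancies unchanged. The function name is hypothetical.

```python
def clean_pdb_lines(lines):
    """Keep ATOM/HETATM/TER records and drop alternate conformers other than 'A'."""
    kept = []
    for raw in lines:
        line = raw.rstrip("\n")
        record = line[:6].strip()
        if record == "TER":
            kept.append(line)
            continue
        if record not in ("ATOM", "HETATM"):
            continue
        padded = line.ljust(80)
        alt_loc = padded[16]  # PDB column 17: alternate location indicator
        if alt_loc not in (" ", "A"):
            continue
        # Blank the altLoc flag so downstream tools see a single conformer.
        kept.append((padded[:16] + " " + padded[17:]).rstrip())
    return kept


if __name__ == "__main__":
    # File names follow the report; adjust to the files actually present.
    with open("1CQZ_a.pdb") as handle:
        cleaned = clean_pdb_lines(handle)
    with open("1cqz_A_clean.pdb", "w") as handle:
        handle.write("\n".join(cleaned) + "\n")
```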
B. The file was opened in Chimera and the structure sequence was
saved using Tools > sequence > sequence > file > aligned
fasta.
C. The aligned FASTA and the query sequence were aligned in ClustalW.
The output was downloaded in PIR format.
D. The output was then modified to suit MODELLER:
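The exact edits made to the PIR file are not shown in the report. As a hedged illustration of what MODELLER expects, the sketch below writes PIR entries whose second line has ten colon-separated fields (entry type, code, first residue, chain, last residue, chain, then four descriptive fields left empty here). The alignment codes, residue numbers and sequences are placeholders, and the helper name is hypothetical.

```python
def pir_entry(code, aligned_seq, kind="sequence",
              first="", chain_first="", last="", chain_last="", width=75):
    """Format one MODELLER-style PIR entry for an aligned sequence."""
    fields = [kind, code, str(first), chain_first, str(last), chain_last,
              "", "", "", ""]
    header = ":".join(fields)      # ten fields, nine colons
    body = aligned_seq + "*"       # '*' terminates the sequence
    wrapped = [body[i:i + width] for i in range(0, len(body), width)]
    return "\n".join([">P1;" + code, header] + wrapped) + "\n"


# Placeholder template entry: residue range 1-8 of chain A is invented.
template = pir_entry("template", "MALKAGAV", kind="structureX",
                     first=1, chain_first="A", last=8, chain_last="A")
# Target entry: the residue range is left empty for a sequence without structure.
target = pir_entry("query", "MALRA-AV")
print(template + target)
```

In a real alignment file the template code must correspond to the PDB file MODELLER will read, and the residue range must match the numbering in that file.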
E. Corresponding changes were also made in model-default.py.

F. mod10.3 model-default.py was run:
1. This generated 3 model files.
G. The energies were compared:
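The edited model-default.py is not reproduced in the report, so the following is only a sketch of what such a script typically contains. It assumes the MODELLER 10.x class names `Environ` and `AutoModel`; the alignment file name and the two alignment codes are placeholders. The ranking helper sorts successful models by the `molpdf` objective function (lower is better), which is an assumed criterion for "comparing the energies".

```python
def rank_models(outputs, key="molpdf"):
    """Sort successfully built models by a score stored in AutoModel.outputs."""
    ok = [o for o in outputs if o.get("failure") is None]
    return sorted(ok, key=lambda o: o[key])


def build_models(alnfile, template_code, target_code, n_models=3):
    """Build n_models homology models and return them ranked by molpdf."""
    # Imported here so this file can be loaded without MODELLER installed.
    from modeller import Environ
    from modeller.automodel import AutoModel, assess

    env = Environ()
    env.io.atom_files_directory = ["."]  # where the template PDB file lives
    a = AutoModel(env, alnfile=alnfile, knowns=template_code,
                  sequence=target_code, assess_methods=(assess.DOPE,))
    a.starting_model = 1
    a.ending_model = n_models
    a.make()
    return rank_models(a.outputs)


if __name__ == "__main__":
    # Placeholder names: substitute the real PIR file and alignment codes.
    for model in build_models("alignment.pir", "template", "query"):
        print(model["name"], model["molpdf"], model.get("DOPE score"))
```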

V. Compare the Ramachandran plots of the template and the 3 constructed
models.
Ramachandran plots were generated for the template and all 3 models:
A. Template:
B. Model 1:
RMSD between 157 pruned atom pairs is 0.389 angstroms
(across all 167 pairs: 1.298); the Ramachandran plot shows 18 outliers
and 190 residues in favoured regions.

C. Model 2:
RMSD between 157 pruned atom pairs is 0.367 angstroms
(across all 167 pairs: 1.063); the Ramachandran plot shows 14 outliers
and 188 residues in favoured regions.

D. Model 3:
RMSD between 156 pruned atom pairs is 0.406 angstroms
(across all 167 pairs: 1.371); the Ramachandran plot shows 15 outliers
and 190 residues in favoured regions.
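The figures reported above can be collected and sorted to make the comparison explicit. The values are those given in the text; the sort key (fewest Ramachandran outliers, then lowest all-pair RMSD) is an assumption made for illustration and is not stated in the report as the selection criterion.

```python
from typing import NamedTuple


class ModelStats(NamedTuple):
    name: str
    pruned_pairs: int
    pruned_rmsd: float   # angstroms, over pruned atom pairs
    all_rmsd: float      # angstroms, over all 167 pairs
    outliers: int        # Ramachandran outliers
    favoured: int        # residues in favoured regions


REPORTED = [
    ModelStats("Model 1", 157, 0.389, 1.298, 18, 190),
    ModelStats("Model 2", 157, 0.367, 1.063, 14, 188),
    ModelStats("Model 3", 156, 0.406, 1.371, 15, 190),
]


def rank(stats):
    """Order models by outlier count, breaking ties on all-pair RMSD (assumed key)."""
    return sorted(stats, key=lambda s: (s.outliers, s.all_rmsd))


for s in rank(REPORTED):
    print(f"{s.name}: {s.outliers} outliers, {s.favoured} favoured, "
          f"RMSD {s.pruned_rmsd} / {s.all_rmsd}")
```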

VI. Report a cartoon diagram of the final model: 


A. Model 1:
B. Model 2:

C. Model 3:
VII. Compare the predicted secondary structure with that of the model.
Predicted secondary structure of the query:

Secondary structure of the model:


The online tool PredictProtein was used for the secondary structure
determination and comparison.
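Once both secondary structures are written as H/E/L strings of equal length, the comparison can be expressed as a per-residue agreement. The sketch below is illustrative only: the function names are hypothetical and the example strings are invented, not the actual PredictProtein output.

```python
def agreement(predicted: str, observed: str) -> float:
    """Percentage of positions where the two H/E/L strings carry the same state."""
    if len(predicted) != len(observed):
        raise ValueError("strings must have equal length")
    if not predicted:
        raise ValueError("strings must not be empty")
    same = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * same / len(predicted)


def mismatches(predicted: str, observed: str):
    """List (1-based position, predicted state, observed state) for each disagreement."""
    return [(i, p, o)
            for i, (p, o) in enumerate(zip(predicted, observed), start=1)
            if p != o]


# Invented example strings
pred = "HHHLLEEE"
model = "HHLLLEEE"
print(agreement(pred, model))
print(mismatches(pred, model))
```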

Conclusion:
The secondary structure predictions show that the chosen model has
the same secondary structure elements as those predicted for
the query.
AutoDock
PROTOCOL:

I. Using any online or offline tools, identify the potential ligand binding
sites in the given molecule (model.pdb). Show snapshots depicting the
binding sites.
A. egrep "ATOM|HETATM" model.pdb > model_working.pdb
B. Coot > select residue > mutate > MSE > MET
C. CCP4i > remove alternate conformers.

D. Identify the potential ligand binding sites in the given molecule
(model.pdb).
1. Discovery Studio > file > open > <cleaned pdb file for the
protein>
2. Add hydrogens to the protein for better site detection:
Chemistry > Hydrogens > Add
3. In the Tools Explorer, expand Receptor-Ligand Interactions |
Define and Edit Binding Site and click "Define Receptor".
II. Using AutoDock for lig1.pdb, and subsequently VINA for all three ligands
(including lig1.pdb) provided, find out which ligand binds the strongest at
the best binding site identified. Report their estimated binding energy values.
A. AutoDock was opened using adt.
B. File > read molecule > <protein file cleaned>.pdb
C. Edit > hydrogen > add > polar only

D. The pdbqt files for both the protein and the ligand were then generated:
1. Ligand:
a) Ligand > input > choose > lig1.pdb
b) Ligand > output > save as .pdbqt
2. Protein:
a) Grid > macromolecule > choose > <protein molecule>
> save as .pdbqt
E. Selection of ligand:
1. All the .pdbqt files were loaded
a) Ligand file
(1) Grid > set map types > choose ligand >
ligand selected
b) Protein file
(1) Grid > macromolecule > choose
(2) Grid > grid box > set parameters (dials) to
engulf the full protein (blind docking)
(3) File > close saving current
(4) Save the grid parameters using Grid > output >
save GPF
2. Run > autogrid (gave the gpf file as input, saved the log file as a glg
file)
3. Docking > macromolecule > set rigid filename >
<protein to be docked>.pdbqt
4. Docking > ligand > ligand parameters > choose and set
lig1.pdbqt
5. Docking > search parameters > genetic algorithm
6. Docking > output > Lamarckian GA
7. Saved the file as a .dpf file
8. Run > autodock (gave the dpf file as input, saved the log file as a dlg
file)
9. grep "Estimated Free" lig1.dlg
10. Analyze > dockings > open > choose the ligand.pdbqt
file
11. Analyze > conformations > play, ranked by energy
12. Molecule > MS (for molecular surface)
13. Ligand > B (for visual depiction)
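The grep step in the AutoDock run above prints every "Estimated Free Energy of Binding" line in the dlg file. A small script can pull out the numbers and report the lowest one. The line layout assumed here (label, "=", value, "kcal/mol") and the sample text are assumptions for illustration, and the sample values are invented rather than taken from the actual run.

```python
import re

ENERGY_RE = re.compile(
    r"Estimated Free Energy of Binding\s*=\s*([-+]?\d+(?:\.\d+)?)\s*kcal/mol"
)


def binding_energies(dlg_text: str):
    """Return every estimated free energy of binding found in a dlg file, in order."""
    return [float(value) for value in ENERGY_RE.findall(dlg_text)]


def best_energy(dlg_text: str) -> float:
    """Lowest (most favourable) estimated free energy of binding."""
    energies = binding_energies(dlg_text)
    if not energies:
        raise ValueError("no binding energy lines found")
    return min(energies)


# Invented sample lines in the assumed layout
SAMPLE_DLG = """\
DOCKED: USER    Estimated Free Energy of Binding    =  -5.10 kcal/mol  [=(1)+(2)+(3)-(4)]
DOCKED: USER    Estimated Free Energy of Binding    =  -6.25 kcal/mol  [=(1)+(2)+(3)-(4)]
"""
print(binding_energies(SAMPLE_DLG), best_energy(SAMPLE_DLG))
```

For the real run, the text would come from reading lig1.dlg, for example with `open("lig1.dlg").read()`.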
III. Using AutoDock for lig1.pdb, and subsequently VINA for all three ligands
(including lig1.pdb) provided, find out which ligand binds the strongest at
the best binding site identified. Report their estimated binding energy values.


A. The binding sites were ranked by energy, the grid was designed
around the top-ranked site, and config.txt was updated
accordingly.

B. To get the output of the best ligand conformation, run vina:
1. Make a config.txt file containing all the grid box parameters
and run the command.
2. These values were obtained from the grid sizes in the
docking run.
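The actual config.txt is not reproduced in the report. The sketch below shows one way such a file could be produced: it computes a box centre and size from the coordinates of a set of atoms (the whole protein for blind docking, or only the residues of the chosen site) and writes them using Vina's `receptor`, `center_*` and `size_*` keys. The padding value, file name and function names are assumptions; the report instead took its values from the grid set up in ADT.

```python
def bounding_box(pdb_lines, padding=5.0):
    """Centre and edge lengths (angstroms) of a box enclosing all ATOM/HETATM records."""
    xs, ys, zs = [], [], []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            xs.append(float(line[30:38]))  # PDB columns 31-38
            ys.append(float(line[38:46]))  # PDB columns 39-46
            zs.append(float(line[46:54]))  # PDB columns 47-54
    if not xs:
        raise ValueError("no atom records found")
    axes = (xs, ys, zs)
    center = tuple((max(v) + min(v)) / 2.0 for v in axes)
    size = tuple(max(v) - min(v) + 2.0 * padding for v in axes)
    return center, size


def vina_config(receptor, center, size):
    """Render the search-space part of a Vina configuration file."""
    lines = [f"receptor = {receptor}"]
    lines += [f"center_{axis} = {value:.3f}" for axis, value in zip("xyz", center)]
    lines += [f"size_{axis} = {value:.3f}" for axis, value in zip("xyz", size)]
    return "\n".join(lines) + "\n"


# Two invented atoms, just to show the arithmetic
demo = [
    f"ATOM  {1:5d}  CA  ALA A{1:4d}    {0.0:8.3f}{0.0:8.3f}{0.0:8.3f}",
    f"ATOM  {2:5d}  CA  ALA A{2:4d}    {10.0:8.3f}{20.0:8.3f}{30.0:8.3f}",
]
center, size = bounding_box(demo)
print(vina_config("protein.pdbqt", center, size))
```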
C. vina --config config.txt --log lig1.log --ligand lig1.pdbqt
D. vina --config config.txt --log lig2.log --ligand lig2.pdbqt
E. vina --config config.txt --log lig3.log --ligand lig3.pdbqt
F. Based on the Vina results, the second ligand was the best
ligand, with its model 1 being the best pose.
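The ligand ranking can be made reproducible by reading the affinities from Vina's output. The sketch below assumes the output pdbqt files contain `REMARK VINA RESULT:` lines whose first number is the affinity in kcal/mol. Only the -7.1 value for ligand 2 comes from this report; the other numbers are invented for illustration, and the function names are hypothetical.

```python
import re

RESULT_RE = re.compile(r"^REMARK VINA RESULT:\s+(-?\d+(?:\.\d+)?)", re.MULTILINE)


def vina_affinities(pdbqt_text: str):
    """Affinities (kcal/mol) of all poses in a Vina output file, in file order."""
    return [float(v) for v in RESULT_RE.findall(pdbqt_text)]


def rank_ligands(outputs: dict):
    """Sort ligands by their best (lowest) affinity; expects {name: file text}."""
    best = {name: min(vina_affinities(text)) for name, text in outputs.items()}
    return sorted(best.items(), key=lambda item: item[1])


# Invented sample texts, except the -7.1 for lig2 quoted in the report
SAMPLES = {
    "lig1": "REMARK VINA RESULT:      -6.0      0.000      0.000\n",
    "lig2": ("REMARK VINA RESULT:      -7.1      0.000      0.000\n"
             "REMARK VINA RESULT:      -6.8      1.532      2.871\n"),
    "lig3": "REMARK VINA RESULT:      -5.5      0.000      0.000\n",
}
print(rank_ligands(SAMPLES))
```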

IV. Using Ligplot or Discovery Studio or any other tool, generate a 2D
ligand-receptor interaction diagram and describe the differences in the
interactions shown by the best poses of the top two ligands.
A. The pdbqt for lig2 was converted to pdb using the following
command:
1. openbabel.obabel -ipdbqt lig2_out.pdbqt -opdb -O lig2_works.pdb
2. The best of the 9 models, model 1, was extracted
and the molecule was saved to a new pdb file using
gedit.
a) lig2_works.pdb > lig2_1.pdb
3. Terminal > ds
a) Both the ligand and the receptor were loaded.
b) View interaction plot > show 2D interaction
c) Ligand 2:
d) Ligand 1:
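The manual extraction of model 1 in gedit (step A.2 above) can also be scripted. This sketch assumes the converted file delimits poses with `MODEL` and `ENDMDL` records, as multi-model PDB files normally do; the function name is hypothetical and the file names are the ones used in the report.

```python
def extract_model(lines, model_number=1):
    """Return the records between 'MODEL n' and its 'ENDMDL' for the requested n."""
    selected = []
    inside = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("MODEL"):
            parts = line.split()
            inside = len(parts) > 1 and parts[1].isdigit() and int(parts[1]) == model_number
            continue
        if line.startswith("ENDMDL"):
            if inside:
                break  # requested model is complete
            continue
        if inside:
            selected.append(line)
    return selected


if __name__ == "__main__":
    with open("lig2_works.pdb") as handle:
        pose = extract_model(handle, model_number=1)
    with open("lig2_1.pdb", "w") as handle:
        handle.write("\n".join(pose) + "\nEND\n")
```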
V. Suggest chemical modifications in the best ligand / best pose that might
increase its binding affinity.
A. Both files (the model pdb as the receptor, and the ligand) were
loaded in ds.
B. The different interactions were visualised using the toolbox.
C. H bonds:

D. Aromatic interactions:

E. Hydrophobic interactions:

F. Based on these observations, the original structure given below:

G. was changed to:

H. The basis of modification:
1. The area around the modified aromatic ring was shown to be
a donor, so an acceptor was added
in that area for better hydrogen bonding.
2. The OH group next to the modified ring was causing an
acceptor-acceptor interaction, and was therefore deleted to
produce an acceptor-donor interaction.
VI. Generate the ligand with your suggested modifications, perform docking
again and report the list of all binding affinity values.
A. Using the SMILES:
1. The SMILES was saved and converted to a 3D structure using
ligand > smiles > simple 3D, which was then saved
as a pdb file.
B. The file was then converted to pdbqt using openbabel:
1. openbabel.obabel lig8.pdb -O lig8.pdbqt -p7.0 --partialcharges gasteiger
C. Docking was done using vina:
1. vina --config config.txt --log lig8.log --ligand lig8.pdbqt
D. Results:
1. The binding energy came out to be -7.5 kcal/mol,
compared with the -7.1 kcal/mol seen earlier for ligand 2.
