Professional Documents
Culture Documents
Received February 11, 2010; Revised April 15, 2010; Accepted May 2, 2010
*To whom correspondence should be addressed. Tel: +44 20 7594 5212; Fax: +44 20 7594 5264; Email: m.sternberg@imperial.ac.uk
Selecting residues
The chosen cluster is used to predict the binding site in the
query protein. The number of ligands within a xed
distance (distance cut o) of a residue is used to determine
if it is predicted to form part of the binding site. The
threshold number of ligands is a proportion of the total
number of ligands in the cluster and is set using Equation
(1) (where m is a constant that determines the proportion
of the ligands that need to be within the distance cut o to
be predicted as part of the binding site). The threshold
needs to account for variation between the modeled and
real structure and between the ligands in the cluster, so a
range of distances from 0.2A to 2.0A (Evaluating
3DLigandSite performance section and Figure 2) and a
range of m values in the equation between 0.10 and 0.35
were considered. The server uses a distance setting of 0.8 A
and Equation (1) with m set to 0.24.
Threshold m cluster size+1 1
in CASP8 (14). The FINDSITE data set was ltered using the rst step of the prediction process is to model the
our list of accepted ligands, which resulted in a set of 617 structure of the protein using Phyre (16).
test structures. 3DLigandSite made predictions for all but
three of these structures. 3DLigandSite performance was Results output
assessed using a range of distance cut os between 0.2A
and 2.0A at 0.2A intervals and with m in Equation (1) set 3DLigandSite output is split into four main sections. The
to values in the range 0.100.35. We assessed the predic- rst provides details of the phyre model used (only where
a sequence has been submitted) and details of the search
tions using the MCC (15), and coverage and accuracy, all
against the structural library. This information provides
of which have been used for assessment in recent CASP
details of condence in two individual steps of the predic-
experiments (14,23). The results for the m setting of 0.24
tion process, which can aid the user in deciding their con-
are displayed in Figure 2 (the full set of results is shown in
dence in the prediction.
Supplementary Figure S1). At low-distance cut os, high
The second section shows a table of the ligand clusters
accuracy and lower coverage are obtained and as the
identied. The cluster containing the greatest number of
distance cut o increases the accuracy lowers while the ligands is automatically selected for prediction by
coverage increases. The maximum MCC of 0.68 is 3DLigandSite. This table provides details of the other
obtained at a 0.8A distance cut o. The MCC decreases clusters and allows the user to view the potential sites
at lower and higher cut os (Figure 2). At this distance cut associated with these clusters. Links are provided to
o 70% coverage and accuracy are obtained. This setting Jmol applets for each cluster, similar to that of the main
was selected for use in the 3DLigandSite server and for the prediction (see below and Figure 3).
analysis of the CASP8 targets. The nal two sections display the 3DLigandSite predic-
To make our predictions of the 28 CASP8 targets com- tion. A table lists all of the predicted binding-site residues
parable with the predictions made during CASP8, the with details of the number of ligands that they contact, the
structural library was restricted to structures present in average distance between the residue and the residue con-
the PDB before May 2008 (the start of CASP8). Using servation score (JSD). A table of the heterogens present in
the settings described earlier, 3DLigandSite obtained a the cluster is provided together with details of the source
MCC of 0.64 and coverage and accuracy of 71 and structures from the structural library. A Jmol (www.jmol
60%, respectively. These results are comparable to our .org) applet enables visualization of the modeled protein,
human performance in CASP8, where we obtained the ligand cluster and the predicted binding site (Figure 3).
MCC of 0.63, 83% coverage and 56% accuracy (13). Jmol is java based and only requires users to have a java
runtime environment installed on their machine. By
default, the protein is displayed in cartoon format with
metallic ligands in spacell and non-metallic ligands in
THE 3DLIGANDSITE WEB SERVER
wireframe representation. A table to the right of the
The 3DLigandSite server is available at http://www.sbg applet provides controls for users to modify the display
.bio.ic.ac.uk/3dligandsite. Users can submit either a in the applet. Options are available to modify the display
protein sequence or a structure. For sequence submission, of the whole protein, predicted residues and ligands.
W472 Nucleic Acids Research, 2010, Vol. 38, Web Server issue
proteins: applications to assessing the validity of inheriting 14. Lopez,G., Ezkurdia,I. and Tress,M.L. (2009) Assessment of ligand
protein function from homology in genome annotation and to binding residue predictions in CASP8. Proteins, 77(Suppl. 9),
protein docking. J. Mol. Biol., 311, 395408. 138146.
6. Capra,J.A., Laskowski,R.A., Thornton,J.M., Singh,M. and 15. Matthews,B.W. (1975) Comparison of the predicted and observed
Funkhouser,T.A. (2009) Predicting protein ligand binding sites by secondary structure of T4 phage lysozyme. Biochim. Biophys.
combining evolutionary sequence conservation and 3D structure. Acta., 405, 442451.
PLoS Comput. Biol., 5, e1000585. 16. Kelley,L.A. and Sternberg,M.J. (2009) Protein structure prediction
7. Glaser,F., Morris,R.J., Najmanovich,R.J., Laskowski,R.A. and on the Web: a case study using the Phyre server. Nat. Protoc., 4,
Thornton,J.M. (2006) A method for localizing ligand binding 363371.
pockets in protein structures. Proteins, 62, 479488. 17. Ortiz,A.R., Strauss,C.E. and Olmea,O. (2002) MAMMOTH
8. Huang,B. and Schroeder,M. (2006) LIGSITEcsc: predicting ligand (matching molecular models obtained from theory): an automated
binding sites using the Connolly surface and degree of method for model comparison. Protein Sci., 11, 26062621.
conservation. BMC Struct. Bio., 6, 19. 18. Berman,H.M., Westbrook,J., Feng,Z., Gilliland,G., Bhat,T.N.,
9. Hernandez,M., Ghersi,D. and Sanchez,R. (2009) Weissig,H., Shindyalov,I.N. and Bourne,P.E. (2000) The Protein
SITEHOUND-web: a server for ligand binding site Data Bank. Nucleic Acids Res., 28, 235242.
identication in protein structures. Nucleic Acids Res., 17, 19. The UniProt Consortium. (2009) The Universal Protein Resource
W413W416. (UniProt) 2009. Nucleic Acids Res., 37, D169D174.
10. Lopez,G., Valencia,A. and Tress,M.L. (2007) restarprediction 20. Capra,J.A. and Singh,M. (2008) Characterization and prediction
of functionally important residues using structural templates and of residues determining protein functional. Bioinformatics, 24,