
Presented By Shri Vaishnavi & Pinky

CATH DATABASE

HIERARCHICAL DOMAIN CLASSIFICATION OF PROTEIN STRUCTURES

CATH

• Class: Secondary structure packing within the protein structure. The classes are alpha-helices, beta-sheets and alpha-beta; the alpha-beta class includes both alpha/beta and alpha+beta structures.

• Architecture: Distinguishes structures that are in the same class but have different architectures. Groupings can sometimes be rather broad, as they describe general features of protein-fold shape, e.g. the number of layers in an αβ sandwich (Orengo C.A. et al., 1997). Ex: TIM barrel.

• Topology: Structures have the same number, arrangement and connectivity of secondary structure elements. Within the topology level, structures are the same but may differ in function. Ex: globin or immunoglobulin fold.

• Homology: Structures are grouped by their high structural similarity and similar functions. They may have evolved from a common ancestor. Ex: the α non-bundle globin-like folds (the erythrocruorins, colicins, phycocyanins and domain 1 of diphtheria toxin) all have the same CAT number (1.10.340), but are differentiated by their H numbers 10, 20, 30 and 40, respectively.
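The C.A.T.H numbering in the homology example can be illustrated with a short Python sketch. It is not taken from any CATH software: the names CathNumber and parse_cath_number are hypothetical, and the four codes are simply the CAT number 1.10.340 from the slide combined with the H numbers 10, 20, 30 and 40.

```python
from typing import NamedTuple


class CathNumber(NamedTuple):
    """The four numeric levels of a CATH code (hypothetical helper type)."""
    class_: int
    architecture: int
    topology: int
    homology: int

    @property
    def cat(self) -> str:
        """The shared Class.Architecture.Topology prefix, e.g. '1.10.340'."""
        return f"{self.class_}.{self.architecture}.{self.topology}"


def parse_cath_number(code: str) -> CathNumber:
    """Split a dotted code such as '1.10.340.20' into its C, A, T, H levels."""
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated levels, got {code!r}")
    return CathNumber(*(int(p) for p in parts))


# Codes built from the slide's example: one CAT number, four H numbers.
codes = ["1.10.340.10", "1.10.340.20", "1.10.340.30", "1.10.340.40"]
parsed = [parse_cath_number(c) for c in codes]

shared_cat = {p.cat for p in parsed}        # all four share one fold
h_numbers = [p.homology for p in parsed]    # but differ at the H level
```

Grouping by the cat property collects domains with the same fold, while the homology field separates the superfamilies within that fold.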

• Sequence family (the S of the SOLID sub-levels): Domains have sequence identities >35%. They are presumed to have extremely similar structures and functions; they may be slightly different examples of the same protein from different species belonging to the same sequence superfamily.
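The >35% identity cut-off can be made concrete with a minimal Python sketch. It assumes the two sequences are already aligned to equal length with '-' as the gap character and counts identity over columns where neither sequence has a gap; the exact definition used by CATH may differ, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over columns without a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def same_sequence_family(aligned_a: str, aligned_b: str,
                         threshold: float = 35.0) -> bool:
    """True when identity exceeds the threshold (35% on the slide)."""
    return percent_identity(aligned_a, aligned_b) > threshold


# Toy aligned sequences, made up for illustration only.
identity = percent_identity("ACDE", "ACDF")       # 3 of 4 columns match
related = same_sequence_family("ACDE", "ACDF")
unrelated = same_sequence_family("AAAA", "CCCC")
```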

FLOW CHART OF CATH DATABASE

CATH SERVER PROTOCOL (FRANCES PEARL ET AL., 2005)

• Input structure to server (http://www.biochem.ucl.ac.uk/cgi-bin/cath/CathServer.pl)
• Generate derived data from the PDB coordinate files
• Identify more remote homologues
• Set threshold E-value from validated structural homologs
• If a match is found, the matched superfamilies are structurally compared with the query structure using the SSAP structure alignment program
• Any query structure left unmatched is scanned against a library of representative structures from each close sequence family in CATH
• The top 10 matches are displayed
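The thresholding and "top 10" steps of the protocol can be sketched in Python as follows. This is only an illustration of filtering and ranking hits, not the server's code: the Hit layout, the function top_matches, the labels and the E-values are all hypothetical, and the threshold value is a placeholder rather than the one derived from validated structural homologs.

```python
from typing import List, Tuple

# A hit is (label of the matched superfamily or domain, E-value).
Hit = Tuple[str, float]


def top_matches(hits: List[Hit], evalue_threshold: float,
                limit: int = 10) -> List[Hit]:
    """Keep hits at or below the E-value threshold, best (lowest) first."""
    accepted = [hit for hit in hits if hit[1] <= evalue_threshold]
    accepted.sort(key=lambda hit: hit[1])
    return accepted[:limit]


# Made-up hits for illustration only.
example_hits = [("sf_a", 1e-20), ("sf_b", 0.5), ("sf_c", 1e-5)]
reported = top_matches(example_hits, evalue_threshold=1e-3)

# With more than ten passing hits, only the ten best are kept.
many_hits = [(f"sf_{i}", 10.0 ** -(i + 4)) for i in range(15)]
displayed = top_matches(many_hits, evalue_threshold=1e-3)
```

A lower E-value means a less likely chance match, so sorting in ascending order puts the most significant matches first.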

DICTIONARY OF HOMOLOGOUS SUPERFAMILIES (DHS) (J.E. BRAY ET AL., 2000)

• Database of validated multiple structural alignments annotated with consensus functional information for evolutionary protein families.
• A powerful resource to validate, examine and visualize key structural and functional features of each homologous superfamily.
• Also provides a tool for examining sequence-structure relationships for proteins within each fold group.

GENERATION OF DATA FOR THE DHS

• Generation of structure comparison data using SSAP
  - Comparisons provide a complete data set for analyzing analogues and homologues and for checking for incorrect classifications
• Automatic validation of structural relatives (DHS-VALID)
  - The DHS-VALID program is used to check automatically all the pairwise sequence and structure comparison data generated for each fold group and homologous superfamily in CATH
• Generation of multiple structural alignment using CORA
  - CORA: Conserved Residue Attributes
  - Uses the pairwise structural comparison data from SSAP to determine the initial set of proteins to be aligned
  - Identifies conserved characteristics and expresses them as a 3D structural profile
  - Profiles encapsulate the 'core'
• Annotation of structural alignments
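The idea of picking out conserved positions from a multiple alignment can be sketched in Python. This is not the CORA algorithm, which works on structural attributes and 3D profiles; it is a much simpler stand-in that scores each alignment column by the fraction of sequences sharing the most common residue. The function names and the toy alignment are hypothetical.

```python
from collections import Counter
from typing import List


def column_conservation(alignment: List[str], gap: str = "-") -> List[float]:
    """Per column: fraction of all sequences carrying the commonest residue."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != gap]
        if not residues:
            scores.append(0.0)  # a column made only of gaps
            continue
        commonest_count = Counter(residues).most_common(1)[0][1]
        scores.append(commonest_count / len(column))
    return scores


def conserved_core(alignment: List[str], min_fraction: float = 1.0) -> List[int]:
    """Indices of columns whose conservation reaches min_fraction."""
    return [i for i, score in enumerate(column_conservation(alignment))
            if score >= min_fraction]


# Toy alignment, made up for illustration only.
toy_alignment = ["ACDE", "ACGE", "AC-E"]
scores = column_conservation(toy_alignment)
core = conserved_core(toy_alignment)
```

Dividing by the total number of sequences, rather than by the number of non-gap residues, means that gapped columns are penalised and rarely count as part of the core.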

GENE3D (DANIEL W.A. BUCHAN ET AL., 2002)

• It is focused on providing structural annotation for protein sequences without structural representatives.
• The protein sequences have also been clustered into whole-chain families so as to aid functional prediction.
• The structural annotation is generated using hidden Markov models (HMMs) based on the CATH domain families.

Applications:
• Annotate hypothetical proteins and genes (Corin Yeats et al., 2006)
• Examine the functions of homologous superfamilies that are multiply expanded within genomes or sets of genomes

APPLICATIONS OF CATH DATABASE

• The CATH database was used as a guide to select proteins from a wide variety of protein families (Jonathan G. Lees et al., 2006)
• To capture evolutionary divergence (Lesley H. Greene et al., 2007)
• For identifying remote homologs (J.E. Bray et al., 2000)
• The organization of proteins by global structural similarity helps improve prediction algorithms based on fold recognition
• Allows the distribution of common motifs to be explored more easily
• Gives insights into which combinations of motifs generate stable protein architectures
• Allows newly determined structures to be easily examined for recognizable folds (CA Orengo et al., 1997)

Crossword (grid not reproduced; answers read from the grid letters)

1. Boundary assignment by inheriting from other chain: CHOPCLOSE
2. Predicts hypothetical proteins: GENE3D
3. Database of validated multiple structural alignments: DHS
4. Scores used for identifying matches: SSAP

REFERENCES

• CA Orengo et al., 1997. CATH - a hierarchic classification of protein domain structures
• J.E. Bray et al., 2000. The CATH Dictionary of Homologous Superfamilies (DHS): a consensus approach for identifying distant structural homologues
• CA Orengo et al., 1999. The CATH Database provides insights into protein structure/function relationships
• Lesley H. Greene et al., 2007. The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution
• Frances Pearl et al., 2005. The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis
• Daniel W.A. Buchan et al., 2002. Gene3D: Structural Assignment for Whole Genes and Genomes Using the CATH Domain Structure Database
• Corin Yeats et al., 2006. Gene3D: modelling protein structure, function and evolution