Functional Genomics – Protein sequence analysis, Modeling and visualization

BY

Dr. C. Amrutha Valli, Head of the Department of Bioinformatics, Center for Information Science and Technology

University of Mysore
CIST UNIVERSITY OF MYSORE

Structural genomics
• The genome-wide structural study of genes, proteins, and other biomolecules, including genome mapping, sequencing, and organization, as well as protein structure characterization.
• The large-scale determination of macromolecular structures, principally those of proteins (structural proteomics) (Brenner, 2001; Burley, 2000).

From DNA to biological function
Diagram labels: Modeling & Inference; High-throughput DNA Sequencing; Gene Model; Functional Assignments; Basic Understanding/Applications (e.g. therapeutics); Structure Determination & Experimental Analysis

Developing a gene model
• Genome sequencing
• Genome assembly
• Regulatory elements
• Identification of ORFs
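
The ORF identification step can be illustrated with a minimal sketch. It assumes a plain DNA string, scans only the forward strand in three reading frames, and uses the standard start codon ATG and stop codons TAA, TAG, TGA. The function name `find_orfs` and the `min_codons` cutoff are hypothetical, chosen for illustration; real gene finders also handle the reverse strand, introns, and statistical signals.

```python
# Toy ORF scan: forward strand only, three reading frames.
# find_orfs and min_codons are illustrative names, not part of any library.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(dna, min_codons=3):
    """Return sorted (start, end, frame) tuples for open reading frames.

    start is the index of the A in ATG, end is exclusive and includes
    the stop codon, and min_codons counts both start and stop codons.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through the frame one codon at a time.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ATG in this frame
    return sorted(orfs)


if __name__ == "__main__":
    for start, end, frame in find_orfs("CCATGAAATTTTAGGG"):
        print(frame, start, end)
```

An ORF found this way is only a candidate gene; the gene model still needs regulatory elements and other evidence.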

The “sequence Analysis” of Proteins
Software: BLASTP, FASTA, PSI-BLAST, ClustalW, HMM, and many others…
Databases: Pfam, ProDom, PROSITE
Universe of all protein sequences
HYSIELNASLLERGV…
HLNIEDNPSCNAMGV…
PLNIELNASLNEPGV…
WERIELNASLNER--…
HQRIEL--SLMMRG-…
HQRIELK-SLMMRG-…
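
Family-based methods summarize aligned sequences column by column. The sketch below computes a simple majority consensus, which is far cruder than the profiles and HMMs used by PSI-BLAST or Pfam but shows the idea. It assumes the rows are already aligned to equal length; the input is only the visible 15-column fragments from the slide (the truncated remainder is not filled in), and `column_consensus` is a hypothetical helper name.

```python
from collections import Counter

# Toy data: the visible fragments only; "-" marks an alignment gap.
ALIGNED = [
    "HYSIELNASLLERGV",
    "HLNIEDNPSCNAMGV",
    "PLNIELNASLNEPGV",
    "WERIELNASLNER--",
    "HQRIEL--SLMMRG-",
]


def column_consensus(rows, gap="-"):
    """Return the most common non-gap residue in each alignment column.

    Ties are resolved by Counter.most_common ordering; a column that
    contains only gaps yields the gap character.
    """
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all aligned rows must have the same length")
    consensus = []
    for column in zip(*rows):
        residues = [c for c in column if c != gap]
        if residues:
            consensus.append(Counter(residues).most_common(1)[0][0])
        else:
            consensus.append(gap)
    return "".join(consensus)


if __name__ == "__main__":
    print(column_consensus(ALIGNED))
```

Columns where every row agrees (for example the I and E in columns 4 and 5) are the conserved positions that database searches rely on.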

Flow of information from DNA to functional understanding
Diagram labels: Modeling & Inference; High-throughput DNA Sequencing; Gene Model; Functional Assignments; Basic Understanding/Applications; Structure Determination & Experimental Analysis

Crystallography reveals the locations of the atoms' electron ‘clouds’, so the polypeptide chain can be traced through space.

The “fold-space” of proteins
SCOP, CATH

Universe of all protein structures

Glimpses of the “fold space” of proteins

Flow of information from DNA to functional understanding
Diagram labels: Modeling & Inference; High-throughput DNA Sequencing; Gene Model; Functional Assignments; Basic Understanding/Applications (e.g. therapeutics); Structure Determination & Experimental Analysis

Connections between sequence and structure

Universe of sequences

Universe of structures

Connections between sequence and structure
?

Universe of sequences

Universe of structures

At what level of homology can one trust a structural inference?

What is structural genomics?
• Experimental determination of key structures (target selection is a key part of the idea)
• Modeling of protein family members
• Inferring function
• Making direct use of the new structures

Protein Sequences and Folds
• ~100,000 families of proteins that cannot be reliably modeled at present (modeling families: <30% identity over a large fraction to a known structure)
• ~50% of all domain families can be assigned to a structure under CATH
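
The modeling criterion above can be sketched as a simple check on a pairwise alignment between a query and a protein of known structure. The 30% identity cutoff comes from the slide; the 0.8 coverage value standing in for "a large fraction" is an assumption, as are the function names `identity_and_coverage` and `is_modelable`. The inputs are assumed to be two already-aligned strings of equal length with "-" for gaps.

```python
def identity_and_coverage(query, template, gap="-"):
    """Return (percent identity, query coverage) for two aligned strings.

    Identity is measured over columns where both sequences have a
    residue; coverage is the fraction of query residues in such columns.
    """
    if len(query) != len(template):
        raise ValueError("aligned sequences must have the same length")
    paired = [(q, t) for q, t in zip(query, template) if q != gap and t != gap]
    query_length = sum(1 for q in query if q != gap)
    if not paired or query_length == 0:
        return 0.0, 0.0
    identical = sum(1 for q, t in paired if q == t)
    return 100.0 * identical / len(paired), len(paired) / query_length


def is_modelable(query, template, min_identity=30.0, min_coverage=0.8):
    """Apply the slide's 30% identity rule; min_coverage is an assumed value."""
    identity, coverage = identity_and_coverage(query, template)
    return identity >= min_identity and coverage >= min_coverage


if __name__ == "__main__":
    print(identity_and_coverage("ACDEFGHIKL", "ACDEYGHI--"))
    print(is_modelable("ACDEFGHIKL", "ACDEYGHI--"))
```

Below the identity cutoff, sequence similarity alone is usually too weak to trust a model, which is why new experimental structures are targeted at such families.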

Protein Structure
“To make the three-dimensional atomic level structures of most proteins easily available from knowledge of their corresponding DNA sequences.”

Generation of new structures

Chandonia and Brenner, Science 311:347 2006.

Structural Genomics is working!
- Structural genomics is driving the development of parallel protein technologies
- Target selection is a key to downstream scientific impact
- “Purifiable and crystallisable clones” will constitute a key resource for future biomedical research and for industry
- Traditional structural biology projects and structural genomics projects will have to find efficient interfaces

Pipeline details: cell-based and cell-free protein production for X-ray and NMR

Note: project involves sequencing, which aids gene modeling!

Pfam B: 13 and 136 matches to #’s 7198 and 11634
>>Alignment of GalP_UDP_transf vs 1Z84:A|PDBID|CHAIN|SEQUENCE/15-196

                *->kkfsplDhvhrrynpLtlvwilVsphrakRPikqsqsLidlkkeLwq
                   ++ ++ + +r p t +w+ sp+rakRP
  1Z84:A|PDB    15 GDSVENQSPELRKDPVTNRWVIFSPARAKRP---------------- 45

                   gavetpkvptdplhdp.dcysakLcpg........atratgevNPdyest
                   + ++k p+ p p++c+ c g++++ ++ r++ ++ P +
  1Z84:A|PDB    46 -TDFKSKSPQNPNPKPsSCP---FCIGreqecapeLFRVP-DHDPNWKLR 90

                   yvLkspkkftndFyalseDnpyikvsvSNeaIaknplfqlksvrGhelci
                   + +n ++als+ +++ +++++ G +++
  1Z84:A|PDB    91 VI-------ENLYPALSRN---LETQ------------STQPETG--TSR 116

                   VI...CF......SKPehDptlpalakeeirevvdaWqlcteelGyegre
                   +I + F++ +S P h+ l + i+ ++ a + +
  1Z84:A|PDB   117 TIvgfGFhdvvieS-PVHSIQLSDIDPVGIGDILIAYKKRINQIA----- 160

                   nhpayqnvqIFEmNkGaemGcsnpHPYaYFnEHGQvwatsfiP<-*
                   h + + q+F N Ga G s H H Q a++ +P
  1Z84:A|PDB   161 QHDSINYIQVFK-NQGASAGASMSHS------HSQMMALPVVP 196

http://www.sanger.ac.uk/Software/Pfam/

Flow of information from DNA to functional understanding
Diagram labels: Modeling & Inference; High-throughput DNA Sequencing; Gene Model; Functional Assignments; Basic Understanding/Applications (e.g. therapeutics); Structure Determination & Experimental Analysis

Function space of proteins
• KEGG = Kyoto Encyclopedia of Genes and Genomes
• The Gene Ontology project (GO)
• Don’t forget to consider protein-protein interactions

Metabolism
Enzymes

Cellular Processes

Signal Processing

Related to a human protein associated with Hallervorden-Spatz syndrome, a neurological disorder?

Flow of information from DNA to functional understanding
Diagram labels: Modeling & Inference; High-throughput DNA Sequencing; Gene Model; Functional Assignments; Basic Understanding/Applications (e.g. therapeutics); Structure Determination & Experimental Analysis

Enzyme of unknown specificity.

A functional annotation lesson

Functional Annotation by Inference
From raw DNA sequences, one looks for genomic features such as promoters, alternative splicing of mRNAs, retrotransposons, pseudogenes, tandem duplications, synteny, and homology. It is homology, both from sequence and from structure, that allows functional inferences to be made.
Parasite, Dali, VAST, FFAS03
Some tools integrate knowledge from many sources into one place, acting as meta-servers of clues.
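
Annotation by inference can be sketched as transferring the annotation of the best acceptable homology hit. Everything in this sketch is hypothetical: the `Hit` record, the `transfer_annotation` function, and the E-value and identity thresholds are illustrative assumptions, not values from the slides or from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One homology search hit (hypothetical record layout)."""
    subject_id: str
    percent_identity: float
    evalue: float
    annotation: str


def transfer_annotation(hits, max_evalue=1e-5, min_identity=40.0):
    """Return (annotation, subject_id) from the best acceptable hit, or None.

    Hits must pass both assumed thresholds; the best hit has the lowest
    E-value, with higher identity breaking ties.
    """
    acceptable = [
        h for h in hits
        if h.evalue <= max_evalue and h.percent_identity >= min_identity
    ]
    if not acceptable:
        return None
    best = min(acceptable, key=lambda h: (h.evalue, -h.percent_identity))
    return best.annotation, best.subject_id


if __name__ == "__main__":
    example_hits = [
        Hit("s1", 35.0, 1e-20, "kinase"),
        Hit("s2", 62.0, 1e-12, "transferase"),
        Hit("s3", 90.0, 0.5, "hydrolase"),
    ]
    print(transfer_annotation(example_hits))
```

An annotation transferred this way is a hypothesis: a homolog may share the fold and general chemistry while differing in substrate specificity, so the result should be recorded as inferred rather than experimentally confirmed.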

Connections between structure and function

Universe of functions

Universe of structures

Connections between structure and function
Convergent evolution

Universe of functions

Universe of structures

Connections between structure and function

Divergent evolution

Universe of functions

Universe of structures

Structure led to discovery of function.

Flow of information from DNA to functional understanding
Diagram labels: Modeling & Inference; High-throughput DNA Sequencing; Gene Model; Functional Assignments; Basic Understanding/Applications (e.g. therapeutics); Structure Determination & Experimental Analysis

Summary
Structural genomics efforts are gaining momentum, helping to assign new functions to ORFs and to fill in the space of all possible protein folds.

Thank you for your attention!
