
What is bioinformatics?

• Interface of biology and computers

• Analysis of proteins, genes and genomes using computer algorithms and computer databases

• Genomics is the analysis of genomes.


The tools of bioinformatics are used to make
sense of the billions of base pairs of DNA
that are sequenced by genomics projects.
Growth of GenBank

[Fig. 2.1: base pairs of DNA (billions) and sequences (millions) in GenBank by year, 1982 to 2002. Updated 8-12-04: >40 billion base pairs.]
Page 17
Central dogma of molecular biology

DNA → RNA → protein

genome transcriptome proteome
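
The flow from DNA to RNA to protein can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function names `transcribe` and `translate`, the example sequence, and the deliberately tiny codon table are all assumptions chosen for the example (a real codon table has 64 entries).

```python
# Partial codon table: only the codons needed for the example sequence.
# "*" marks a stop codon. A complete table would list all 64 codons.
CODON_TABLE = {
    "AUG": "M",  # methionine (start)
    "GCC": "A",  # alanine
    "AAG": "K",  # lysine
    "UGG": "W",  # tryptophan
    "UAA": "*",  # stop
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCAAGTGGTAA"        # hypothetical example sequence
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)     # AUGGCCAAGUGGUAA
print(protein)  # MAKW
```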

Central dogma of bioinformatics and genomics

DNA → RNA → protein → phenotype

[Fig. 2.2: databases at each level. DNA: genomic DNA databases. RNA: cDNA, ESTs, UniGene. Protein: protein sequence databases.]
Page 20
There are three major public DNA databases

EMBL GenBank DDBJ

The underlying raw DNA sequences are identical

Page 16
There are three major public DNA databases

• EMBL: housed at EBI (European Bioinformatics Institute)
• GenBank: housed at NCBI (National Center for Biotechnology Information)
• DDBJ: housed in Japan

Page 16
>100,000 species are represented in GenBank

all species 128,941


viruses 6,137
bacteria 31,262
archaea 2,100
eukaryota 87,147

Table 2-1
Page 17
The most sequenced organisms in GenBank

Organism                    Bases (billions)
Homo sapiens                10.7
Mus musculus                6.5
Rattus norvegicus           5.6
Danio rerio                 1.7
Zea mays                    1.4
Oryza sativa                0.8
Drosophila melanogaster     0.7
Gallus gallus               0.5
Arabidopsis thaliana        0.5

Updated 8-12-04
GenBank release 142.0
National Center for Biotechnology Information (NCBI)

www.ncbi.nlm.nih.gov
PubMed is…
• National Library of Medicine's search service
• 12 million citations in MEDLINE
• links to participating online journals
• PubMed tutorial (via “Education” on side bar)
Entrez integrates…
• the scientific literature;
• DNA and protein sequence databases;
• 3D protein structure data;
• population study data sets;
• assemblies of complete genomes
Entrez is a search and retrieval system that integrates NCBI databases
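
Besides the web interface, Entrez can be queried from a script. The slides do not cover this, so treat what follows as a minimal sketch under stated assumptions: it uses NCBI's E-utilities `esearch.fcgi` endpoint with its `db`, `term` and `retmax` parameters, and the helper name `esearch_url` is hypothetical. The code only builds the query URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities (an assumption of this sketch,
# not something stated in the slides).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 5) -> str:
    """Build an ESearch URL for one Entrez database.

    db     -- Entrez database name, e.g. "pubmed" or "nucleotide"
    term   -- the search query
    retmax -- maximum number of record IDs to return
    """
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# The same query mechanism covers literature and sequence databases.
literature_url = esearch_url("pubmed", "bioinformatics")
sequence_url = esearch_url("nucleotide", "Homo sapiens[Organism]")
print(literature_url)
print(sequence_url)
```

Opening either URL in a browser, or fetching it with `urllib.request.urlopen`, should return an XML list of matching record identifiers.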
BLAST is…

• Basic Local Alignment Search Tool


• NCBI's sequence similarity search tool
• supports analysis of DNA and protein databases
• 80,000 searches per day
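
To give a feel for what "local alignment" means, here is a small sketch of Smith-Waterman scoring. This is not BLAST's own algorithm: BLAST uses a faster heuristic, while Smith-Waterman is the exact dynamic-programming method for the same problem. The function name and the scoring values (match +2, mismatch -1, gap -2) are arbitrary assumptions for illustration.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A score never drops below 0: a local alignment may start anywhere.
            H[i][j] = max(0, diagonal, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best


# The shared segment "ACGT" is found despite unrelated flanking bases.
print(smith_waterman_score("TTACGTTT", "GGACGTGG"))  # 8
print(smith_waterman_score("AAAA", "TTTT"))          # 0
```

The reset to zero is what makes the alignment local: the unrelated flanking regions do not drag down the score of the matching segment.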
