Databases in Bioinformatics
Why?
3D structure databases
Ontologies
Biological databases: Why?
Data:
Type of data:
nucleotide sequences
protein sequences
3D structures
metabolic pathways
or marked
error checking
consistency, updates
flat files
Relational databases
Object-oriented databases
Curators:
Commercial company
Availability:
Commercial
Identifiers and Accession numbers
3 main databases
EMBL: www.ebi.ac.uk/embl
GenBank: www.ncbi.nlm.nih.gov/GenBank
DDBJ: www.ddbj.nig.ac.jp
11/30/2005
Nucleotide Sequence Databases
Example: TPIS_CHICK
UniGene is an experimental system for automatically partitioning
GenBank sequences into a non-redundant set of gene-oriented clusters.
Each UniGene cluster contains sequences that represent a unique gene,
as well as related information such as the tissue types in which the gene
has been expressed and map location.
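The idea of partitioning sequences into non-redundant clusters can be illustrated with a toy sketch. The slides do not describe how UniGene actually builds its clusters, so everything below is an assumption made for illustration: the sequence names are hypothetical, and the `links` list stands in for whatever evidence says two sequences come from the same gene. The grouping uses a simple union-find, which is not claimed to be UniGene's method.

```python
def cluster_sequences(ids, links):
    """Partition ids into clusters; ids joined by a chain of links share a cluster."""
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent pointers to the cluster representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    # Sort for a stable, readable result.
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical sequence names and hypothetical "same gene" evidence.
ids = ["seqA", "seqB", "seqC", "seqD", "seqE"]
links = [("seqA", "seqB"), ("seqB", "seqC"), ("seqD", "seqE")]

clusters = cluster_sequences(ids, links)
print(clusters)
```

Each resulting cluster plays the role of one gene-oriented group: every sequence appears in exactly one cluster, which is what makes the set non-redundant.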
Other Nucleotide Sequence Databases
UniGene: www.ncbi.nlm.nih.gov/UniGene/
Genome databases:
SGD (Saccharomyces cerevisiae): genome-www.stanford.edu/Saccharomyces/
EBI Genomes: www.ebi.ac.uk/genomes/
Genome Biology: www.ncbi.nlm.nih.gov/Genomes/
TIGR: http://www.tigr.org/db.shtml
Ensembl (eukaryotic genomes): www.ensembl.org
Amino Acid Composition
Size of SwissProt
SwissProt: Statistics
Biomolecule Structure Databases
PDB: http://www.rcsb.org
SCOP: http://scop.berkeley.edu
CATH: http://biochem.ucl.ac.uk/bsm/CATH
ASTRAL: http://astral.berkeley.edu
HOMSTRAD: http://www-cryst.bioc.cam.ac.uk/data/align/
Interfaces to PDB:
PDB at a glance: http://cmm.info.nih.gov/modeling/pdb_at_a_glance.html
Molecules to go: http://molbio.info.nih.gov/cgi-bin/pdb/
PDBSum: http://www.ebi.ac.uk/thorntonsrv/databases/pdbsum
Application of GO
[Figure: GO term graph with the nodes "Transmembrane receptor" and "Protein tyrosine kinase", linked by two Is_a relationships]
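The Is_a relationships in the GO figure can be sketched as a small directed graph in which one term may have more than one parent. The two parent terms come from the slide; the child term name used below is an assumption, since the extracted text does not show which term the two Is_a edges start from. This is a minimal illustration, not the GO data model or any GO library's API.

```python
# Each term maps to the list of terms it "is_a" (its direct parents).
# The child term name is assumed for illustration only.
IS_A = {
    "transmembrane receptor protein tyrosine kinase": [
        "transmembrane receptor",
        "protein tyrosine kinase",
    ],
}


def ancestors(term, is_a):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a_kind_of(term, candidate, is_a):
    """True if `candidate` is `term` itself or one of its ancestors."""
    return term == candidate or candidate in ancestors(term, is_a)


child = "transmembrane receptor protein tyrosine kinase"
parents = ancestors(child, IS_A)
print(sorted(parents))
```

Because a term can have several parents, a gene product annotated with the child term is also found by a query for either parent term, which is one practical use of the ontology.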
SYSTEMS for SEARCHING
http://www.ncbi.nlm.nih.gov/gquery/gquery.fcgi