A/Prof. Ly Le
School of Biotechnology
Email: ly.le@hcmiu.edu.vn
Office: Rm 705, HCM International University
OBJECTIVES
Bioinformatics is one solution to this problem: a way of coping with large data sets and making sense of genomic-scale data.
IBM 7090 computer: 32 Kbytes RAM, 2.18 µs cycle time, $2,900,000 in 1960
By comparison: 1 GB RAM, 2.4 GHz, $1199 in 2008
1) Database
– structured
– searchable (index) -> table of contents
– updated periodically (release) -> new edition
– cross-referenced (hyperlinks) -> links with
other db
2) Resource: also includes the associated tools
(software) needed for db access, db updating,
db information insertion, and db information
deletion.
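The properties listed above (structured, indexed, released periodically, cross-referenced) and the resource operations (access, insertion, deletion) can be sketched as a toy in-memory model. This is only an illustration, not how any real sequence database is implemented. The names `Entry` and `ToyDatabase`, the accessions, and the descriptions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    accession: str                               # unique key used by the index
    description: str                             # structured, searchable text
    xrefs: dict = field(default_factory=dict)    # other db name -> identifier there


@dataclass
class ToyDatabase:
    name: str
    release: int = 1                             # bumped at each periodic update
    _index: dict = field(default_factory=dict)   # accession -> Entry

    def insert(self, entry: Entry) -> None:
        """db information insertion."""
        self._index[entry.accession] = entry

    def delete(self, accession: str) -> None:
        """db information deletion (silently ignores unknown accessions)."""
        self._index.pop(accession, None)

    def get(self, accession: str):
        """db access by index key; returns None if the entry is absent."""
        return self._index.get(accession)

    def search(self, word: str) -> list:
        """Return the accessions whose description contains the word."""
        word = word.lower()
        return sorted(a for a, e in self._index.items()
                      if word in e.description.lower())

    def new_release(self) -> int:
        """db updating: publish a new edition and return its number."""
        self.release += 1
        return self.release


# Hypothetical entries, for illustration only.
db = ToyDatabase("toydb")
db.insert(Entry("X0001", "example kinase gene", {"otherdb": "P0001"}))
db.insert(Entry("X0002", "example transporter gene"))
```

The `xrefs` dictionary plays the role of the hyperlinks to other databases, and `release` plays the role of the edition number.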
DATABASE ENTRIES OFTEN
PRESENTED AS FLATFILES
Description:
DE Homo sapiens truncated breast and ovarian cancer susceptibility protein
DE (BRCA1) gene, partial cds.
Keyword:
KW .
Organism Source:
OS Homo sapiens (human)
Organism Classification:
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Primates; Catarrhini; Hominidae; Homo.
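Because each flatfile line starts with a two-letter code (DE, KW, OS, OC), an entry can be read with very little code. The following is a minimal sketch that assumes every line begins with its code followed by whitespace, and that continuation lines repeat the code, as in the excerpt above. The function name `parse_flatfile` is hypothetical, and full flatfile entries contain more line types than this handles.

```python
from collections import defaultdict

# The excerpt shown on the slide, one line code per line.
FLATFILE = """\
DE   Homo sapiens truncated breast and ovarian cancer susceptibility protein
DE   (BRCA1) gene, partial cds.
KW   .
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.
"""


def parse_flatfile(text):
    """Group lines by their two-letter code and join continuation lines."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue                      # skip blank lines
        code, _, value = line.partition(" ")
        fields[code].append(value.strip())
    return {code: " ".join(values) for code, values in fields.items()}


record = parse_flatfile(FLATFILE)

# Split the OC (organism classification) field into a lineage list.
lineage = [t.strip() for t in record["OC"].rstrip(".").split(";") if t.strip()]
```

Here `record["OS"]` holds the organism source and `lineage` runs from the broadest taxon (Eukaryota) to the narrowest (Homo).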
THE WORLD'S BIOINFORMATICS
CENTERS AND ON-LINE SERVICES
• ExPASy
• EBI
• EMBNet
• GenomeNet
• NCBI
BIOINFORMATICS CENTERS AND ON-LINE SERVICES
ExPASy : http://www.expasy.org/
EBI: http://www.ebi.ac.uk/
EMBNet: http://www.ch.embnet.org/
GenomeNet: http://www.genome.ad.jp/
NCBI (NATIONAL CENTER FOR
BIOTECHNOLOGY INFORMATION)
Remember the server, the database, and the program version used
Write down sequence identification numbers
Write down the program parameters
Save your internet results the right way
(use screenshots or PDFs if necessary)
Databases are not like good wine
(use up-to-date builds)
Use local installs when it becomes necessary
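The record-keeping advice above can be sketched as a small helper that stores the server, database, release, program version, sequence identifiers, and parameters of each search in one JSON record. The function name `make_search_record`, the field names, and all the example values are placeholders invented for this illustration. They do not describe any real search.

```python
import json
from datetime import date


def make_search_record(server, database, release, program, version,
                       sequence_ids, parameters, run_date=None):
    """Collect what is needed to reproduce a database search later."""
    return {
        "date": (run_date or date.today()).isoformat(),
        "server": server,
        "database": database,
        "release": release,              # databases change between builds
        "program": program,
        "program_version": version,
        "sequence_ids": list(sequence_ids),
        "parameters": dict(parameters),
    }


# Placeholder values only; replace with the details of the actual search.
record = make_search_record(
    server="NCBI",
    database="example_db",
    release="example_release",
    program="example_program",
    version="0.0",
    sequence_ids=["SEQ_ID_1"],
    parameters={"example_parameter": "value"},
    run_date=date(2008, 1, 1),
)

# Plain text that can be saved next to the screenshots or PDFs of the results.
text = json.dumps(record, indent=2, sort_keys=True)
```

Keeping one such record per search makes it possible to rerun the search on a newer build and compare the results.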
• Sequence Databases
• Bibliographic Databases
• Clinical Databases
• Integrated Databases
• Structural Databases
SEQUENCE DATABASES
Nucleotide Databases:
Amos Bairoch and Rolf Apweiler, "The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 2000", Nucleic Acids Res. 28:45-48 (2000).
Protein Databases:
TrEMBL: Translated EMBL
Current Release: 632,013 entries
Acts as a supplement to SwissProt and contains translated EMBL sequences with automatic annotation. TrEMBL entries are manually annotated before being entered into SwissProt.
SpTrEMBL & RemTrEMBL
Protein Databases:
PIR: Protein Information Resource
Current Release: 283,175 entries
The PIR is a computer system offering both peptide and nucleotide sequences, designed to aid protein identification.
Protein Databases:
CATH: http://www.biochem.ucl.ac.uk/bsm/cath_new/
Protein Data Bank (PDB)
[Figure: growth in entries, total and yearly]
BIBLIOGRAPHIC DATABASES
Used for searching for reference articles
MEDLINE currently holds over 12 million entries.
http://www.ncbi.nlm.nih.gov/Entrez
PubCrawler: http://www.pubcrawler.ie
Free to academics. It searches journals and sequences daily, weekly, or monthly
and alerts the user when results matching their search are found.
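The idea of a recurring search restricted to recent entries can be sketched with the NCBI E-utilities ESearch endpoint, whose `reldate` and `datetype` parameters limit results to the last n days. This is not how PubCrawler itself is implemented; it is a substitute illustration. The function name `alert_query_url`, the day count used for each period, and the search term are assumptions. The code only builds the URL and does not contact any server.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Assumed mapping from alert frequency to a look-back window in days.
PERIOD_DAYS = {"daily": 1, "weekly": 7, "monthly": 30}


def alert_query_url(term, period="weekly", db="pubmed", retmax=20):
    """Build an ESearch URL for records added within the chosen period."""
    params = {
        "db": db,
        "term": term,
        "reldate": PERIOD_DAYS[period],   # only items from the last n days
        "datetype": "edat",               # filter on the Entrez entry date
        "retmax": retmax,                 # maximum number of IDs returned
    }
    return ESEARCH + "?" + urlencode(params)


url = alert_query_url("BRCA1", period="weekly")
```

Running the same query on a schedule and comparing the returned identifiers with those already seen gives a simple alert for new results.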
CLINICAL DATABASES
InterPro: http://www.ebi.ac.uk/interpro
Integrates the individual protein resources PRINTS,
PROSITE, SMART, ProDom, Pfam, and TIGRFAMs into one
database. A search scans the entries of each and outputs
the results.
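The idea of one search scanning several member resources can be sketched as a lookup across per-database tables. This toy model is not InterPro's actual implementation or API. The function name `integrated_search`, the sequence identifiers, and every signature name below are hypothetical placeholders, not real database entries.

```python
# Hypothetical member tables: database name -> {sequence id -> matched signatures}.
MEMBER_DATABASES = {
    "PRINTS": {"SEQ_A": ["hypothetical_fingerprint_1"]},
    "PROSITE": {"SEQ_A": ["hypothetical_pattern_1"],
                "SEQ_B": ["hypothetical_pattern_2"]},
    "Pfam": {"SEQ_B": ["hypothetical_family_1"]},
}


def integrated_search(sequence_id, members=MEMBER_DATABASES):
    """Scan every member database and collect the hits in one result."""
    hits = {}
    for name, table in members.items():
        matches = table.get(sequence_id, [])
        if matches:                       # keep only databases with a hit
            hits[name] = list(matches)
    return hits


result = integrated_search("SEQ_A")
```

The result maps each member database that matched to its hits, so the user queries once instead of visiting each resource separately.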
INTEGRATED DATABASES